Gofer's Philosophy
Things should be simple, easy, and fast. For if they are not, people will look for an alternate solution.
Gofer focuses on using common docker containers to run workloads that don't belong as long-running applications. The ability to run containers easily is a powerful tool for users who need to run various short-term workloads and don't want to care about the idiosyncrasies of the tooling they run on top of.
How do I use Gofer? What's a common workflow?
- Create a docker container with the workload/code you want to run.
- Create a configuration file (kept with your workload code) in which you tell Gofer what containers to run and when they should be run.
- Gofer takes care of the rest!
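As a rough illustration of the first step, here is a minimal sketch of a workload that could serve as a container's entry point. The function names (`count_records`, `run`) are hypothetical and not part of Gofer. The sketch assumes only the usual container convention that a non-zero exit code signals failure. The business logic is a plain function, so it can be tested locally without any scheduler involved.

```python
import sys


def count_records(lines):
    """Stand-in business logic: count the non-blank lines in an iterable."""
    return sum(1 for line in lines if line.strip())


def run(lines, out=sys.stdout, err=sys.stderr):
    """Run the workload once and return a process exit code (0 = success)."""
    try:
        total = count_records(lines)
    except Exception as exc:
        # Report the failure and signal it through the exit code.
        print(f"workload failed: {exc}", file=err)
        return 1
    print(f"processed {total} records", file=out)
    return 0
```

Inside the container, the entry point would call something like `sys.exit(run(sys.stdin))`. Locally, `run(["a", "b"])` can be called directly from a test or a shell.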
What problem is Gofer attempting to solve?
The current landscape for running short-term jobs is heavily splintered and could do with some centralization and sanity.
1) Tooling in this space is often CI/CD focused and treats gitops as a core tenet.
Initially this is really good; gitops is something most companies should embrace. But eventually, as your workload grows, you'll notice that you want or need a little more control over your short-term workloads without setting up complicated release scheduling.
2) Tooling in this space can lack testability.
Ever set up a CI/CD pipeline for your team and ended up with a string of commits that simply test or fix bugs in your assumptions about the system? This is usually due to not understanding how the system works, what values it will produce, or testing being difficult.
These are issues because most CI/CD systems make it hard to test locally. In order to support a wide array of job types (and lean toward being fully gitops focused), most of them run custom agents which in turn run the jobs you want.
This can be bad, since it's usually non-trivial to understand exactly what these agents will do once they handle your workload. Dealing with these agents can also be an operational burden. Operators are generally unfamiliar with these custom agents, and running them doesn't play to the strengths of an ops team that is already focused on other complex systems.
Gofer leverages schedulers which work locally and are already native to your environment, so testing locally is never far away!
3) Tooling in this space can lack simplicity.
Some user experience issues I've run into using other common CI/CD tooling:
- 100 line bash script (filled with sed and awk) to configure the agent's environment before my workload was loaded onto it.
- Debugging docker in docker issues.
- Reading the metric shit ton of documentation just to get a project started, only to realize everything is proprietary.
- Trying to understand a groovy script nested so deep it got into one of the layers of hell.
- Dealing with the security issues of a way too permissive plugin system.
- Agents giving vague and indecipherable errors as to why my job failed.
Gofer aims to use tooling that users are already familiar with and then get out of the way. Running containers should be easy. Forgoing things like custom agents, and being opinionated about how workloads should be run, allows users to understand the system immediately and be productive quickly.
Familiar with the logging, metrics, and container orchestration of a system you already use? Great! Gofer will fit right in.
Why should you not use Gofer?
1) You need to simply run tests for your code.
While Gofer can do this, the gitops process really shines here. I'd recommend using any of the widely available gitops-focused tools. Attempting to do this with Gofer will require you to recreate some of the things these tools give you for free, namely git repository management and automatic deployments.
2) The code you run is not idempotent.
Gofer does not guarantee a single run of a container. Even though it does a good job in best effort, a perfect storm of operator error, extension errors, or sudden shutdowns could cause multiple runs of the same container.
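One common way to make a workload safe to run more than once is to derive its output location from a deterministic key and overwrite it atomically. The sketch below shows that pattern. It is not something Gofer provides: `write_report`, `out_dir`, and `run_key` are hypothetical names, and in a real deployment the output directory would need to live on storage that outlives the container.

```python
import json
import os
import tempfile


def write_report(out_dir, run_key, rows):
    """Write rows to <out_dir>/<run_key>.json; repeating the call leaves the same state."""
    final_path = os.path.join(out_dir, f"{run_key}.json")
    # Write to a temporary file in the same directory first, so a crash
    # mid-write never leaves a half-written report at the final path.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "w") as handle:
        json.dump(sorted(rows), handle)
    # os.replace overwrites any existing file in a single step.
    os.replace(tmp_path, final_path)
    return final_path
```

Because the file name depends only on `run_key` (for example, the date the job covers) and not on when the container started, a duplicate run overwrites the earlier result instead of producing a second one.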
3) The code you run does not follow cloud native best practices.
The easiest primer on cloud native best practices is the 12-factor guide, specifically the configuration section. Gofer provides tooling for containers to operate following these guidelines, the most important being that your code will need to take its configuration via environment variables.
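A minimal sketch of what taking configuration via environment variables can look like inside a workload is shown here. The variable names (`APP_API_URL`, `APP_BATCH_SIZE`, `APP_DRY_RUN`) and the `Config` fields are made up for illustration; they are not names that Gofer defines.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    api_url: str
    batch_size: int
    dry_run: bool


def load_config(env=os.environ):
    """Build a Config from environment variables, failing fast on missing input."""
    try:
        api_url = env["APP_API_URL"]  # required, no default (hypothetical name)
    except KeyError:
        raise ValueError("APP_API_URL must be set") from None
    # Optional settings fall back to defaults when the variable is absent.
    batch_size = int(env.get("APP_BATCH_SIZE", "100"))
    dry_run = env.get("APP_DRY_RUN", "false").lower() in ("1", "true", "yes")
    return Config(api_url=api_url, batch_size=batch_size, dry_run=dry_run)
```

Passing the environment in as a parameter keeps the function easy to test: `load_config({"APP_API_URL": "http://example.test"})` works without touching the real process environment.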
4) The scheduling you need is precise.
Gofer makes a best effort to start jobs on their defined timeline, but it is at the mercy of many parts of the system (scheduling lag, image download time, competition with other pipelines). If you need runs that are precise down to the second or minute, Gofer does not guarantee that.
Gofer works better when jobs are expected to run within 1 to 5 minutes after their scheduled event/time.
Why not use <insert favorite tool> instead?
Tool | Category | Why not? |
---|---|---|
Jenkins | General thing-doer | Supports generally anything you might want to do ever, but because of this it can be operationally hard to manage, usually has massive security issues and isn't by default opinionated enough to provide users a good interface into how they should be managing their workloads. |
Buildkite/CircleCI/Github actions/etc | Gitops cloud builders | Gitops focused cloud build tooling is great for most situations and probably what most companies should start out using. The issue is that running your workloads can be hard to test since these tools use custom agents to manage those jobs. This causes local testing to be difficult as the custom agents generally work very differently locally. Many times users will fight with yaml and make commits just to test that their job does what they need due to there being no way to determine that beforehand. |
ArgoCD | Kubernetes focused CI/CD tooling | In the right direction with its focus on running containers on already established container orchestrators, but Argo is tied to gitops, making it hard to test locally, and is also closely tied to Kubernetes. |
ConcourseCI | Container focused thing-doer | Concourse is great and where much of the inspiration for this project comes from. It sports a sleek CLI, great UI, and cloud-native primitives that make sense. The drawback of Concourse is that it uses a custom way of managing docker containers that can be hard to reason about. This makes testing locally difficult, and running in production means that your short-lived containers exist on a platform that the rest of your company is not used to running containers on. |
Airflow | ETL systems | I haven't worked with large scale data systems enough to know deeply about how ETL systems came to be, but (maybe naively) they seem to fit into the same paradigm of "run x thing every time y happens". Airflow was particularly rough to operate in the early days of its release, with security and UX around DAG runs/management being nearly non-existent. As an added bonus, the scheduler regularly crashed from poorly written user workloads, making it a reliability nightmare. Additionally, Airflow's model of combining the execution logic of your DAGs with your code led to issues with testing and iterating locally. Instead of having tooling specifically for data workloads, it might be easier for both data teams and ops teams to work in the model of distributed cron as Gofer does. Write your stream processing using dedicated tooling/libraries like Benthos (or in whatever language you're most familiar with), wrap it in a Docker container, and use Gofer to manage which containers should run when, where, and how often. This gives you easy testing, separation of responsibilities, and no Python decorator spam around your logic. |
Cadence | ETL systems | I like Uber's Cadence; it does a great job at providing a platform that does distributed cron and has some really nifty features by choosing to interact with your workflows at the code level. The ability to bake in sleeps and polls just like you would in regular code is awesome. But just like Airflow, I don't want to marry my scheduling platform with my business logic. I write the code as I would for a normal application context and I just need something to run that code. When we unmarry the business logic and the scheduling platform, we are able to treat it just like we treat all our other code, which means the code workflows (testing, for example) we are all already used to and the ability to foster code reuse for these same processes. To test Uber's Cadence you'll need to bring up a copy of it. To test Gofer you can simply test the code in the container. Gofer doesn't force you to change anything about your code at all. |