Production @ Home
For the last 18 months, I’ve been intermittently working on a project I call “production at home.” The initial motivation was to better understand the behavior of a web application running on a Platform as a Service (PaaS). Since I had very little visibility into the service’s implementation, debugging was difficult.
The triggering issue was unacceptable logging performance in an ELK stack, and I’d seen the exact same issue at a previous company that used the same PaaS.
Lacking access to the actual production infrastructure, and unwilling to risk crashing a monolith in the quest for knowledge, I figured that building a model of the system would be an effective way to get more insight. By model, I mean a working, scale facsimile. Sort of like how, if you want to understand how a panhead engine works, you don’t need to spend $10,000 on the real thing; you can build a model panhead for $600. Same chug, less money.
So, build a model, what next?
There are a couple of options. The first, and much less desirable, is to treat one’s working computer as the hosting system: install all the tools and applications, then configure the host machine itself. This has the advantage of leveraging (say) Homebrew for a fast start. I have attempted this in the past. With care, it can work for very simple systems, for example an application and its database.
I found it difficult to move much past very simple systems, though. Configuration is usually best done at the system level, and system-level configuration files are typically not managed under version control. Throw a few more copies of Postgres, Redis, and an ELK stack into the mix, and it’s likely not going to end well.
The next option is to build the whole simulation locally using Docker containers. This is almost surely how the PaaS works anyway.
Another option is building on a substrate such as AWS, but then we’re back to spending a lot of money overprovisioning a system that is, by design, supposed to be small.
The locally hosted Docker option seemed the most attractive. I reasoned that any initial difficulty learning Docker would be offset by 1) going deeper into a portable skill set, and 2) avoiding the pain of local configuration management. Besides, having everything in Docker allows moving the whole system to AWS later, should I want to run more than my MacBook can handle.
However, learning Docker well enough was its own hurdle, and progress remained more or less at the “wouldn’t it be cool” stage.
Mid-2023…
I didn’t mention that my initial exposure to the motivating issue was sometime in 2019; I saw it again at a different company in 2021.
Some cursory examination of logs indicated that the issue might not lie solely with the PaaS. From the provider’s side, everything was configured correctly, and many other customers on the same technical stack had no problems.
In addition, I wanted to better understand how to configure Postgres for logical replication, to support an internal proposal for easing the read load generated by long-running cron jobs.
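For a flavor of what that configuration involves, here’s a minimal sketch of the publication/subscription wiring. It assumes two Postgres instances started with wal_level=logical; the hostnames, ports, credentials, and the reports table are hypothetical placeholders, not anything from the project.

```python
# Minimal logical replication sketch: primary on port 5432, replica on 5433.
# Both instances must be running with wal_level=logical. All names and
# credentials below are placeholders.
import psycopg2

def run(dsn, statements):
    conn = psycopg2.connect(dsn)
    conn.autocommit = True  # CREATE SUBSCRIPTION cannot run inside a transaction
    with conn.cursor() as cur:
        for stmt in statements:
            cur.execute(stmt)
    conn.close()

# On the primary: publish changes to the table the cron jobs read.
run("host=localhost port=5432 dbname=app user=postgres password=secret", [
    "CREATE TABLE IF NOT EXISTS reports (id serial PRIMARY KEY, body text)",
    "CREATE PUBLICATION cron_reads FOR TABLE reports",
])

# On the replica: the same schema must already exist (logical replication
# does not copy DDL), then subscribe to the primary's publication.
# 'host=primary' assumes the replica can resolve the primary by name,
# e.g. over a shared Docker network.
run("host=localhost port=5433 dbname=app user=postgres password=secret", [
    "CREATE TABLE IF NOT EXISTS reports (id serial PRIMARY KEY, body text)",
    """CREATE SUBSCRIPTION cron_reads_sub
       CONNECTION 'host=primary port=5432 dbname=app user=postgres password=secret'
       PUBLICATION cron_reads""",
])
```

The detail that bites most people: logical replication does not copy DDL, so the table definitions have to exist on the replica before the subscription starts syncing.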
It’s mid-2023. You know what’s coming next.
At some point I asked ChatGPT (3.5) to explain Docker in more detail, and it was off to the races! Vibe coding before it was even a thing.
In short order (weeks) I had three Postgres databases running, along with Telegraf, InfluxDB, and Grafana, all launched from scripts. Not quite push button, but pretty close. Logical replication worked. I built a toy Rails application with an ELK stack and dug into Rails logging. Now I could crash a Rails monolith with impunity.
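To give a flavor of “running from scripts,” here’s a stripped-down sketch using the Docker SDK for Python. It is not the project’s actual tooling; the image tags, container names, network name, and credentials are placeholders.

```python
# Sketch of scripted container startup. Assumes the Docker SDK for Python
# (pip install docker) and a running Docker daemon. All names are placeholders.
import docker

client = docker.from_env()

# A user-defined network lets containers resolve each other by name,
# e.g. the replica reaching the primary at host "primary".
client.networks.create("prod_at_home", driver="bridge")

# Primary and replica Postgres, with wal_level=logical for replication.
for name, host_port in [("primary", 5432), ("replica", 5433)]:
    client.containers.run(
        "postgres:15",
        name=name,
        detach=True,
        network="prod_at_home",
        environment={"POSTGRES_PASSWORD": "secret", "POSTGRES_DB": "app"},
        ports={"5432/tcp": host_port},
        command=["postgres", "-c", "wal_level=logical"],
    )

# Grafana for dashboards, reachable at http://localhost:3000.
client.containers.run(
    "grafana/grafana",
    name="grafana",
    detach=True,
    network="prod_at_home",
    ports={"3000/tcp": 3000},
)
```

The user-defined bridge network is what lets the containers find each other by name, which matters once the replica needs to subscribe to the primary.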
Show me the code!
The project repo is hosted on GitHub. It’s usually public access; if it isn’t, reach out.
It is not polished even a little bit.
Cleaning up the documentation is an ongoing challenge, as the scope has grown considerably beyond the original intent of demonstrating logical replication. For example, adding both container and application logging reshaped the entire project.
The potential for future expansion is limited only by imagination and hardware. Hardware capabilities are increasing faster than the minimum viable footprint of the toolchain, which allows packing ever more tools into the container system.
What sort of things can you imagine running on Production@Home?