I’ve heard a lot of misconceptions about Internal Developer Platforms (IDPs) lately. Most of them come from practitioners who have not actually taken the time to understand this approach in detail, be it due to prejudice, a categorical preference for self-built tools or ill-guided interpretations.
Most of these views are held so religiously that an open conversation will likely not get us far. If you believe you should build something as complex as an IDP yourself, you should probably do so. I would just ask why you’re not doing the same for your continuous integration tool. Sounds like a fun exercise too.
But there is another critique that, if true, would sway my deep technical conviction in this category: developers working with Internal Developer Platforms are losing the concept of how their code runs in production. That would, in fact, be fatal. It would be a price too high to pay for enabling developer self-service.
Is that the case? Are all these large tech companies taking context on how applications run in production away from their developers? Would that be sustainable? The answer is “no”.
An interesting exercise is looking at the setup a team runs before such a platform. Let’s play through a scenario I see very often: Jenkins is used to script pipelines, maybe GitLab. Terraform is used for Infrastructure as Code (IaC). The setup is running on Kubernetes, you’re utilizing Helm for app config management and Argo CD to sync your cluster. State-of-the-art, one would assume. And yes, you need all of those tools, no questions asked. But just throwing this setup at developers doesn’t help them understand how their apps run in production. On the contrary. Expecting your average developer to dive into your infra repositories to understand why this manifest is necessary to spin up a side-car proxy to get your Google Cloud SQL instance running not only provides little value, it’s naive to assume anybody would ever do this. I’m talking about the average developer. There are those senior engineers in every team who want (and frankly need) every little detail, and keeping things abstracted away from them would be deadly. The key mistake we’re making: designing platforms for a few experts, while leaving the rest without any context and dependent on others.
Because what actually happens is a pattern I’ve observed literally hundreds of times, and we’ve even captured it as data as we analysed 1850 in a recent benchmarking study.
As developers get overwhelmed by the complexity of their setups, they stop trying to understand the context. Worst case, they go straight to the one senior developer who gets it and let her run all deployments, do all the config changes, do all the debugging. Liveness what? Horizontal pod auto-what? I call that “shadow operations”. The senior engineer is blocked, you are blocked, nobody wins. It’s not DevOps, and it’s the case in a whopping 78.8% of cases. Do you think these developers know how their apps run in production? Really?
Not only do they not take ownership of getting their app from idea to prod, they slow down the entire herd. So what are the options? “Train your developers; if you don’t, you are evil.” Yes, point taken, you should always do that. The reality is that your engineering manager is under pressure to deliver and she’s paid to provide business logic. She might get to training eventually.
The path that Internal Developer Platforms and platform teams in general are taking is, in my opinion, by far the most sustainable. Simplify, yet provide context. Structure, yet embrace the unstructured. Match the ability and the preference of the developer with the right level of abstraction. A senior developer who wants to go low level? Let them go wild with Terraform and restrict the platform’s job in this case to referencing the files, so it’s well documented for the next engineer to look at. A junior who wants to deploy with confidence? Provide golden paths so that they get the job done in self-service. Point them at the underlying Helm charts, YAML files and manifests to get the actual context of what’s happening under the hood.
With Humanitec, as an example, developers can change the underlying baseline charts or simply use Helm charts for their respective service. They can use Terraform, Pulumi, CloudFormation or something else to get the exact networking configs they require. And for app configs, a good old Helm chart might do the job for them.
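To make the idea of a baseline that individual services can adjust a bit more tangible, here is a minimal Python sketch of layering a service-level override on top of baseline values. The names `BASELINE`, `SERVICE_OVERRIDE` and `deep_merge`, as well as all keys and values, are hypothetical. This is not how Humanitec or Helm implement it; it only mirrors the general idea that nested maps are merged while scalars and lists are replaced.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict: nested maps are merged, everything else is replaced."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are maps: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists and new keys: the override wins.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical baseline values maintained by the platform team.
BASELINE = {
    "replicaCount": 1,
    "image": {"repository": "example/app", "tag": "latest"},
    "resources": {"limits": {"cpu": "250m", "memory": "256Mi"}},
}

# Hypothetical override owned by a single service team.
SERVICE_OVERRIDE = {
    "replicaCount": 3,
    "image": {"tag": "1.4.2"},
    "resources": {"limits": {"memory": "512Mi"}},
}

effective = deep_merge(BASELINE, SERVICE_OVERRIDE)
print(effective)
```

The point of the sketch is that a developer only has to state what differs for their service, while the full effective configuration stays inspectable for anyone who wants to go deeper.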
But to make sure every developer can operate everything, golden paths help everybody run even the most complex deployments by surfacing which environment variable connects to what resource, making sure a test DB never points at a production workload and providing transparent guardrails. So the developer might want to visualize the (exact same) chart, maybe in a more structured way.
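As a rough illustration of such a guardrail, the following Python sketch lists which environment variable connects to which resource and flags any binding whose resource lives in a different environment than the workload. Everything here (`Resource`, `Binding`, `describe`, `check_bindings`, the resource names) is made up for the example and is an assumption, not any platform’s actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str          # e.g. "postgres" or "redis" (illustrative only)
    environment: str   # e.g. "test" or "production"


@dataclass(frozen=True)
class Binding:
    env_var: str
    resource: Resource


def describe(bindings: list[Binding]) -> list[str]:
    """Surface which environment variable connects to which resource."""
    return [
        f"{b.env_var} connects to {b.resource.kind} "
        f"'{b.resource.name}' ({b.resource.environment})"
        for b in bindings
    ]


def check_bindings(workload_env: str, bindings: list[Binding]) -> list[str]:
    """Return one message per binding that crosses environments."""
    return [
        f"{b.env_var} points at '{b.resource.name}' in "
        f"{b.resource.environment}, but the workload runs in {workload_env}"
        for b in bindings
        if b.resource.environment != workload_env
    ]


# Hypothetical bindings for a production workload; the first one is wrong.
test_db = Resource("orders-db-test", "postgres", "test")
prod_cache = Resource("orders-cache-prod", "redis", "production")
bindings = [Binding("DATABASE_URL", test_db), Binding("CACHE_URL", prod_cache)]

for line in describe(bindings):
    print(line)

violations = check_bindings("production", bindings)
for line in violations:
    print("BLOCKED:", line)
```

A check like this is transparent by design: the developer sees exactly which binding was rejected and why, instead of a deployment silently failing or, worse, succeeding against the wrong database.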
Both approaches end up creating a full representation as code sitting in your repository. The former leads to most developers just executing files they don’t really understand; the latter helps everybody gain context. But both approaches work in tandem. And they form a sustainable setup to meet the individual preferences of each team member.
Only self-service, paired with the ability to go deep at will, leads to a sustainable, well-scaling and productive toolchain. Internal Developer Platforms bridge both worlds. Context all the way, paired with the right level of abstraction for the individual preference of the contributor. All for an informed, healthy and productive engineering team.