Vol. 61: Turbocharge deployment with a platform reference architecture

Hey,

Hope you enjoyed PlatformCon 2023!

22,000 people attended. What a number! I particularly enjoyed the presentation by Stephan Schneider and Mike Gatto on Platform as Code. They showcased their Internal Developer Platform (IDP) reference architecture, which they packaged as code, and for me three things jumped out:

The idea of not only treating your platform as a product but thinking about it as code

The way Stephan and Mike think is code-first. That covers everything from obvious things such as resources that are “IaC’ed” with Terraform, all the way to the portal layer (Backstage in their case). This has a ton of positive implications: you can test, you can version (huge at large scale), and you have disaster recovery, to name just a few.

Understanding the partition into five planes

They then divide the concept of an IDP into five planes. This was such a revelation for me and really helped me understand which plane is responsible for which “job to be done”. The more I use this with teams, the more I realize its beauty. The current planes are the Developer Control Plane, the Integration and Delivery Plane, the Resource Plane, the Observability Plane, and the Security Plane. After using this narrative for a while, I’m wondering whether we’re missing a Governance and Policy Plane. But nonetheless it’s an awesome start.

The approach to configuration and resource management

One of the design principles the team used was “dynamic over static configuration management”. They really excelled at designing the system to yield the full impact of Dynamic Configuration Management. With every single release they do a full rebuild of all app and infrastructure configs. I am, as you might be aware, a HUGE proponent of this approach because it leads to an unparalleled degree of standardization and reduces maintenance effort significantly. I bet that if you ran an analysis now, you would find dozens if not hundreds of Postgres DBs, all of them configured slightly differently. But why? Do you really need 400 ways of configuring Postgres? Is that even possible? No, you need maybe a handful. The rest is only “config drift by mis-design”. And who’s maintaining all of those configs? Who ensures they’re secure? Who updates them? Missing standards today are the technical debt of tomorrow. Their design is exceptional at solving this: the final configs are regenerated from the developer’s abstract request, which is written in Score. That request is then matched to baseline configs (a resource definition for Postgres, for instance) based on the context of the deployment (env type = staging, for example).
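
To make the idea concrete, here is a minimal sketch in Python of regenerating configs from an abstract request plus a baseline chosen by deployment context. It is an illustration only: it does not use the real Score file format or any platform API, and the names `BASELINES`, `render_config`, the `orders-api` workload and all config values are hypothetical.

```python
from copy import deepcopy

# Hypothetical baseline resource definitions, keyed by (resource type, env type).
# The platform team maintains a handful of these instead of hundreds of variants.
BASELINES = {
    ("postgres", "staging"): {"version": "15", "size": "small", "backups": False},
    ("postgres", "production"): {"version": "15", "size": "large", "backups": True},
}


def render_config(workload, env_type):
    """Rebuild the full config for a workload from scratch for one env type."""
    resources = {}
    for name, request in workload["resources"].items():
        key = (request["type"], env_type)
        if key not in BASELINES:
            raise KeyError(f"no baseline defined for {key}")
        # Copy the baseline so a rendered config can never mutate the standard.
        resources[name] = deepcopy(BASELINES[key])
    return {
        "workload": workload["name"],
        "env_type": env_type,
        "resources": resources,
    }


# The developer's abstract request: "I need a Postgres", nothing more.
workload = {"name": "orders-api", "resources": {"db": {"type": "postgres"}}}

staging = render_config(workload, "staging")
production = render_config(workload, "production")
print(staging["resources"]["db"])
print(production["resources"]["db"])
```

Because the final config is derived on every deployment rather than stored and hand-edited, the same request always yields the same result for a given context, and changing a baseline once updates every workload that uses it on its next deployment.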

I was so excited about this talk that I wrote reference architecture whitepapers for various ecosystems such as AWS, GCP and Azure (arguably too long). They can be found here.

As always I look forward to your feedback. I’d also be happy to show you the reference architecture in action, so let me know if you’re interested.

Best

Kaspar