I am always fascinated by the extent to which Platform Engineering is the “practice of intelligent structuring of repositories and the roles and permissions on those repositories”. That is, if you believe in the mantra of “code first”. One of our most read articles is on this exact topic.
A lot of the debates about “how do I standardize”, “how do I avoid abstracting my users too much”, and “how do I ensure security” are at their core debates about how you structure repositories.
So I would like to walk you through how I like to set up repositories so that they achieve all the good stuff one wants from a platform engineering initiative. And, if you allow me that comment, call it whatever you want. It’s “good hygiene” regardless of what name you put on the practice.
First order of business: differentiate between app and infrastructure repositories. I’m aware that the majority of teams out there likely just stuff everything that “belongs to one app” into one repository. But that’s poison for any attempt to standardize. Why? Because if you want to get anywhere you need to break the cycle of “every single Postgres running in staging is configured slightly differently”. That means many different apps and services should all “get their dependent Postgres DB in staging configured exactly the same way”. To achieve this you need to have those things pulled apart. You literally need a default repo where that “universal definition for Postgres in Staging” resides.
That, in turn, means that the app repos cannot include any hint of how that Postgres is defined in staging. They just need to contain the general information that the workload needs a Postgres. Which is why the app-repo is environment agnostic.
If we are thinking of a containerized application, the app-repo would be structured as follows:
- App-repo
  - Service 1
    - Service source code
    - Dockerfile
    - Score (or any agnostic workload configuration)
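To make “environment agnostic” a bit more concrete, here is a minimal Python sketch. The dictionary loosely mirrors the shape of a workload file such as Score (containers plus named resources), but the field values, the list of environment names, and the helper `find_environment_leaks` are all my own illustrative assumptions, not part of any tool.

```python
# Hypothetical, simplified workload spec, loosely modelled on a Score-style file.
# It only states that the workload needs a Postgres, not how it is configured.
WORKLOAD = {
    "metadata": {"name": "service-1"},
    "containers": {"main": {"image": "service-1:latest"}},
    "resources": {"db": {"type": "postgres"}},
}

# Assumed environment names; adjust to whatever your organisation uses.
ENVIRONMENT_NAMES = {"development", "staging", "production"}


def find_environment_leaks(node, path=""):
    """Return the paths of keys or string values that mention an environment."""
    leaks = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if str(key).lower() in ENVIRONMENT_NAMES:
                leaks.append(child)
            leaks.extend(find_environment_leaks(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            leaks.extend(find_environment_leaks(value, f"{path}[{index}]"))
    elif isinstance(node, str):
        if any(env in node.lower() for env in ENVIRONMENT_NAMES):
            leaks.append(path)
    return leaks


if __name__ == "__main__":
    print(find_environment_leaks(WORKLOAD))
```

A check like this could run in the app-repo's CI to flag a hard-coded staging hostname before it spreads. It is a naive substring match, so treat it as a starting point rather than a complete linter.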
The infrastructure (or Platform Source Code) repositories are structured as follows:
- Infra-repo
  - Resource of Type Postgres
    - Postgres definition for Staging
      - IaC file representing the configuration
      - Resource Definition file telling the executing Orchestrator what IaC configuration to use in what situation (e.g. “for Postgres in Staging use this parameterized Terraform file”)
    - Postgres definition for Development
    - Postgres definition 1 for Production
    - Postgres definition N for Production
  - …
    - DNS definition for environment n
So yes, you got it right, there is no repo that you actually touch as a human that contains the environment specific configuration. This configuration gets generated by the platform backend with every deployment by reading the app-repo, matching the right definition on the infrastructure side, and generating the resource graph and final configuration. This is done by a Platform Orchestrator. This “final definition” can be stored by the Orchestrator in a “Target State File Repository” which can then be executed in “GitOps” fashion by something like ArgoCD.
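The flow described above (read the workload, match a definition, emit the final configuration) can be sketched in a few lines of Python. This is not how any particular Platform Orchestrator is implemented. `ResourceDefinition`, `match_definition`, `generate_target_state`, and the file paths are hypothetical names I made up for illustration, and the sketch assumes exactly one definition per resource type and environment, which is simpler than the “definition 1 … N for Production” case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceDefinition:
    """One entry from the infra repo: which IaC file applies in which situation."""
    resource_type: str  # e.g. "postgres"
    environment: str    # e.g. "staging"
    iac_path: str       # hypothetical path to the parameterized IaC file


# Hypothetical contents of the infra repo.
DEFINITIONS = [
    ResourceDefinition("postgres", "staging", "postgres/staging/main.tf"),
    ResourceDefinition("postgres", "development", "postgres/development/main.tf"),
]


def match_definition(resource_type, environment, definitions):
    """Pick the single definition matching the resource type and environment."""
    matches = [
        d for d in definitions
        if d.resource_type == resource_type and d.environment == environment
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one definition for {resource_type!r} "
            f"in {environment!r}, found {len(matches)}"
        )
    return matches[0]


def generate_target_state(workload, environment, definitions):
    """Combine an environment-agnostic workload with the matching infra definitions."""
    resources = {}
    for name, spec in workload["resources"].items():
        definition = match_definition(spec["type"], environment, definitions)
        resources[name] = {"type": spec["type"], "iac": definition.iac_path}
    return {
        "workload": workload["name"],
        "environment": environment,
        "resources": resources,
    }


if __name__ == "__main__":
    workload = {"name": "service-1", "resources": {"db": {"type": "postgres"}}}
    print(generate_target_state(workload, "staging", DEFINITIONS))
```

The important design point is that the environment is an input to the deployment, not something stored in the app-repo. The same workload produces a different target state depending on which environment you pass in.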
Why is this design so advantageous?
- You are driving standardization. All of a sudden you don’t need to maintain 100 ways of configuring Postgres in Staging but 2-3.
- You simplify the life of the developer because they can just stay on the level of the workload definition like Score and enjoy the defaults.
- You don’t abstract them away or shut them out because you can just give them read or even write access to the infra repositories.
- You encourage inner-source: your infra repos are your golden paths. Missing one? Allow people to fork and contribute to the central repository. This way you don’t need to guess which infra definition is missing, users will show you in action.
- The final production infra definitions can be gated. That allows your security team to review them once, and a system then generates the final configs.
- Because everything is code we have nice clean backups and we can policy check everything.
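As one illustration of the “policy check everything” point, here is a small Python sketch that validates a generated configuration before it is applied. The shape of `target_state`, the two rules, and the function name `check_policies` are assumptions made up for this example. Real setups would more likely use a dedicated policy engine.

```python
def check_policies(target_state):
    """Return a list of human-readable policy violations (empty means compliant)."""
    violations = []
    environment = target_state["environment"]
    for name, resource in target_state["resources"].items():
        # Hypothetical rule 1: nothing may be exposed publicly.
        if resource.get("publicly_accessible", False):
            violations.append(f"{name}: must not be publicly accessible")
        # Hypothetical rule 2: production resources need backups.
        if environment == "production" and not resource.get("backups_enabled", False):
            violations.append(f"{name}: backups must be enabled in production")
    return violations


if __name__ == "__main__":
    example = {
        "environment": "production",
        "resources": {"db": {"type": "postgres", "publicly_accessible": True}},
    }
    for violation in check_policies(example):
        print(violation)
```

Because the check runs against the generated configuration, it applies to every workload automatically instead of depending on each team remembering the rules.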
Result: higher standardization with a lower burden on central ops. Developer self-service without context being taken away. More secure design and governance.
Disagree? Tell me why!
Want to go deeper? Join next week’s workshop.