Knock, knock. Who’s there? It’s platform engineering, the next big thing knocking on the door of the infrastructure and operations (I&O) team, screaming for their attention. You’ve “shifted left.” You’ve let them “build it and run it.” You went “declarative,” you did “DevOps,” you moved to the cloud and kept the lights on in the data center while the crowd cheered in Vegas at re:Invent and Werner Vogels announced an API endpoint for S3 that you had already been offering for years. And now, here we are again.
It ain’t easy in infrastructure. You’re often blamed, always responsible, cleaning up the crap in the background while others get the praise for building the “things that power the world.”
While I hear you and acknowledge all of that, give me five minutes to convince you of something: the era of platform engineering (which is here whether we like it or not) can have a transformational impact on the daily lives of infrastructure and operations professionals.
Self-service? I’ll believe it when I see it
For the last four years, platform engineering has been defined as the “active development of a platform that enables developer self-service and reduces their cognitive load.” But after piling generations of platforms and service catalogs on top of each other, the industry realized that somebody needs to maintain all those resources created in a “fire and forget” manner by Backstage and friends. Who secures and updates resources as they drift? Who builds the business logic that makes for sophisticated automation? Who builds platforms that actually scale and go beyond the frontend?
Platform engineering is more than developer experience (DevEx), i.e., making life better for app devs; it’s about IOex too (no, it’s not a term, and it shouldn’t be because it sounds terrible, but you get what I mean). This matters because infrastructure land has plenty of problems of its own.
App developers have pain points, infrastructure and operations have pain points, and platform teams can help align those pain points to provide value for all.
So here’s how I’ll make the case: a) I’ll describe which parts of an Internal Developer Platform (IDP) should be “owned” by the infrastructure team, and who builds the interfaces that I&O teams work against. And b) I’ll outline how this changes the daily “job to be done” for the I&O team for the better.
How I&O teams functionally interface with a platform
Well-designed platform engineering teams are a little bit like an alliance of different countries. Each country sends heralds to align on standards, ensure interoperability of equipment, and make sure missions can be executed predictably. The heralds sent by the application developers are usually referred to as “DevEx platform engineers.” They focus on developer experience, reducing cognitive load for developers and improving their productivity.
The “heralds” of the infrastructure and operations teams focus on the discipline Gartner refers to as “Infrastructure Platform Engineering” (paywalled link). These folks make sure that the IDP (the final product of the platform engineering team) interfaces well and sustainably with the work of the I&O teams.
A really simple way of thinking about it: platform engineers focusing on DevEx care about the frontend of the platform, while infrastructure platform engineers care about the backend of the platform.
Platforms, dare I say it, productize separation of concerns. Application developers tend to care more about the code they write than about the configuration of infrastructure. I&O teams care more about the configuration of infrastructure and the security and codification of workflows.
But of course, it’s not a “strict” separation of concerns; it is a fluid one. Some product teams may choose to operate more of the infrastructure themselves than others. In platform design, we always follow the mantra of golden paths rather than golden cages. Developers choose where the cut-off point is.
Here’s a quick visual on this:
How do I&O teams technically interface with a platform?
To answer this, let’s make things more tangible. We will take the most commonly used reference architecture for IDPs, as first described by Schneider, Deleat et al. (you can find it here and explore the details for yourself), and zoom in right away to explore how this design “produces separation of concerns.”
To do this, consider the following scenario: an application developer requires a Redis cache for an existing workload in an environment of type staging. The developer might use a portal, a workload spec as code, or even a simple CLI; this part of the interface is usually supplied by the DevEx platform engineers. On this abstract level, they would add the abstract request “I need a resource of type redis.”
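To make that abstraction concrete, here is a minimal sketch, in Python and purely for illustration, of the information such an abstract request carries. The class and field names are invented for this example; in practice the same information would live in a portal form, a workload spec, or CLI flags.

```python
from dataclasses import dataclass

# Purely illustrative model of the abstract, developer-facing request.
# Field names are hypothetical; a real workload spec or portal form would
# carry the equivalent information in its own format.
@dataclass(frozen=True)
class ResourceRequest:
    workload: str        # the workload that needs the dependency
    resource_type: str   # e.g. "redis", "postgres", "s3"
    env_type: str        # e.g. "staging", "production"

# "I need a resource of type redis" for an existing workload in staging.
request = ResourceRequest(
    workload="checkout-service",
    resource_type="redis",
    env_type="staging",
)
print(request)
```

Note what the developer does not say: which Redis, where it runs, or how it is configured. That is exactly the separation of concerns the backend exploits.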
This request would make its way through the CI pipeline and hit the backend of the platform (usually a Platform Orchestrator). The Orchestrator is a graph-based backend. It reads the abstract request (“this workload needs a resource of type redis and is being deployed to the context of a staging environment”) and matches the correct resource pack. It then creates a resource graph and all relevant app and infra configs based on the baseline configs in the resource pack, fetches the credentials, creates the new workload configs, and injects them as secrets into the container at runtime. Finally, it runs sign-offs and checks, and deploys.
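To illustrate the matching step, here is a rough conceptual sketch in Python. It is not Humanitec’s implementation and none of the names are real APIs; it only shows the idea that the backend holds rules keyed by resource type and context and picks the most specific one for each request.

```python
from dataclasses import dataclass, field
from typing import Optional

# Conceptual sketch of the matching step a graph-based backend performs.
# All names and structures are hypothetical.
@dataclass
class ResourceDefinition:
    resource_type: str       # what this definition provisions, e.g. "redis"
    env_type: Optional[str]  # context it applies to; None means "any environment"
    baseline_config: dict = field(default_factory=dict)

DEFINITIONS = [
    ResourceDefinition("redis", None, {"tier": "basic", "tls": True}),
    ResourceDefinition("redis", "staging", {"tier": "basic", "tls": True, "eviction": "allkeys-lru"}),
    ResourceDefinition("s3", "staging", {"versioning": True, "encryption": "aws:kms"}),
]

def match_definition(resource_type: str, env_type: str) -> ResourceDefinition:
    """Pick the most specific definition: an exact env match beats the wildcard."""
    candidates = [
        d for d in DEFINITIONS
        if d.resource_type == resource_type and d.env_type in (env_type, None)
    ]
    if not candidates:
        raise LookupError(f"no definition for {resource_type} in {env_type}")
    return max(candidates, key=lambda d: d.env_type is not None)

# The abstract request "redis in staging" resolves to concrete baseline config.
definition = match_definition("redis", "staging")
print(definition.baseline_config)
```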
The beauty is that every time any developer needs a resource of a certain type for a staging environment, the backend consumes your universal rules for that resource type. That means that if you update those rules centrally, every resource graph consuming them will automatically update. Think this through: when configured the right way, all your S3 buckets in staging can suddenly be configured the exact same way, and you can enforce, update, and maintain them centrally.
Those rules are called Resource Definitions and are packaged into Resource Packs, which are essentially infrastructure configurations shared across the organization so that many resources of the same type, in, for instance, the same type of environment, can be maintained and secured as a fleet. They can be reusable Terraform modules plus the rules for when and how to use them to orchestrate infrastructure.
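As a sketch of that idea (again illustrative Python, not Humanitec’s actual schema), a resource pack can be thought of as a set of rules pairing a reusable Terraform module with the criteria for when to apply it. The module source, inputs, and field names below are made up:

```python
# Illustrative only: a "resource pack" modeled as shared rules that pair a
# reusable Terraform module with the criteria for when to apply it.
# Module sources, inputs, and field names are hypothetical.
RESOURCE_PACK_REDIS = {
    "type": "redis",
    "rules": [
        {
            "criteria": {"env_type": "staging"},
            "module": "git::https://example.com/modules/redis?ref=v1.4.0",
            "inputs": {"sku": "basic", "tls_enabled": True},
        },
        {
            "criteria": {"env_type": "production"},
            "module": "git::https://example.com/modules/redis?ref=v1.4.0",
            "inputs": {"sku": "premium", "tls_enabled": True, "replicas": 2},
        },
    ],
}

def resolve(pack: dict, env_type: str) -> dict:
    """Return the module and inputs that every matching resource graph consumes."""
    for rule in pack["rules"]:
        if rule["criteria"].get("env_type") == env_type:
            return rule
    raise LookupError(f"no rule for env_type={env_type}")

# Bumping ?ref=v1.4.0 or changing an input here changes what every matching
# resource graph consumes on its next deployment: the "fleet" effect above.
print(resolve(RESOURCE_PACK_REDIS, "staging")["inputs"])
```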
The backend (and possibly the pipeline logic) is developed by the infrastructure platform engineers, while the resource packs are created by the infrastructure and operations teams. There’s no hard demarcation line, of course, and it differs by organization.
How does this empower I&O teams?
You go from one-by-one changes to global rules
By setting global rules for how resources get created (and letting the system eliminate drift from those rules), you can truly double down on nailing the configuration of each resource for its context and keep them all secure and up to date.
You have a common interface to work against as an I&O team
Operating against a common interface and format allows you to align and streamline your internal work. You now have an API to query which resources are running, in what state and version, and which workloads consume them. This lets you work proactively and stay ahead of the curve.
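As a hypothetical illustration of what that can look like in practice (the endpoint, parameters, and response shape below are invented, not a real platform API), an I&O engineer could script fleet-wide questions instead of clicking through consoles:

```python
import json
import os
import urllib.request

# Hypothetical endpoint and token; substitute your platform backend's real API.
PLATFORM_API = os.environ.get("PLATFORM_API", "https://platform.example.com/api")
TOKEN = os.environ.get("PLATFORM_TOKEN", "dummy-token")

def list_resources(resource_type: str, env_type: str) -> list:
    """Return all active resources of one type in one environment type."""
    url = f"{PLATFORM_API}/resources?type={resource_type}&env_type={env_type}"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {TOKEN}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# e.g. find every staging Redis instance, the definition version it was built
# from, and the workloads that consume it: the raw material for proactive
# upgrades instead of reactive firefighting.
for res in list_resources("redis", "staging"):
    print(res["id"], res["definition_version"], res["consumed_by"])
```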
Unify public and private cloud against a common interface
Remember that feeling when AWS announces something you’ve been supporting for years, and nobody notices how strange that is? With this approach, that doesn’t matter anymore, since users rarely interact with the cloud interface directly; they work against a higher-level layer. In the end, all that matters is how smoothly the underlying infrastructure runs; the experience is leveled out.
From nanny to self-service
By providing DevEx teams with a common interface (the backend API) to develop against, you enable developers to request things in self-service. Previously, you’d have had to fulfill those requests manually in response to tickets filed by developers.
So how do Humanitec’s products support this?
I hope I was able to convince you, at least a little, that platform engineering, applied correctly, is something for developers and I&O professionals alike. It helps you focus, standardize, and automate consistently.
Humanitec provides tools for DevEx teams (Score and the Portal) as well as for infrastructure platform engineers (the Platform Orchestrator) and for infrastructure and operations teams (Resource Packs). Our components are battle-tested at scale, secure, and optimally support every step into the world of platform engineering.
A good way to dip your toes in is to start with the reference architectures, which are essentially pre-packaged starting points for every flavor of setup. You can also try the guided MVP program, where our experienced platform architects help you build a Minimum Viable Platform in four simple steps.