For almost a decade, Puppet has published an annual State of DevOps Report. Over the last few years we’ve seen relentless hype about the ability of DevOps to transform teams and workplaces (and the craft of software engineering). By contrast, the report provides an honest look at the realities of DevOps. More than 35,000 technical professionals worldwide have contributed to the body of research, the longest-running and most widely referenced DevOps research in the industry. Puppet is soon to release its latest findings for 2021, and Humanitec's Kaspar von Grünberg spoke with Nigel Kersten, Field CTO at Puppet and one of the report's authors, about the growth of DevOps and platform engineering.
Why research the practice of DevOps?
There's plenty written about the philosophies underpinning the DevOps movement. Still, Puppet saw a lack of documentation to help those evolving their DevOps practices, especially, as Nigel explains, as "it moves from small startups to scaling companies and traditional companies."
Nigel notes that people understood the ideals and end goal promised by DevOps: "small cross-functional teams that were doing really well singing in harmony, application developers wearing the pager, happily fixing issues. But almost no company out there in the whole world is actually like that." Instead, Puppet found a demand for hands-on advice about where to get started.
In response, Puppet created an evolutionary model that maps out low, medium and high levels of evolution, representing distinct clusters in behaviour. "For example, how often do you deploy new changes to your application stack? If that's every six months or twelve months, then you're at the low level. Mid-level, once or twice a week (pretty dramatic improvement). High-level or elite groups are deploying on-demand," explains Nigel.
In general, Puppet's research found that high-performing teams can release software much more frequently than low performers thanks to automation and more collaborative, less bureaucratic working methods. At the same time, high performers enjoy greater system stability. They treat infrastructure as code and design it as part of their software development process.
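To make the deployment-frequency example concrete, here is a minimal Python sketch of how a team might bucket itself. The function name, the `Evolution` enum and the weekly cutoff are assumptions made for illustration; Nigel only gives example cadences, and Puppet's actual model looks at many more behaviours than deploy frequency.

```python
from enum import Enum


class Evolution(Enum):
    LOW = "low"
    MID = "mid"
    HIGH = "high"


def classify_deploy_cadence(deploys_per_year: float, on_demand: bool = False) -> Evolution:
    """Bucket a team by how often it deploys.

    Assumed cutoffs (not from the report): on-demand deployment counts as
    high, roughly weekly or more often counts as mid, anything slower is low.
    """
    if on_demand:
        return Evolution.HIGH
    if deploys_per_year >= 52:  # about once a week or more
        return Evolution.MID
    return Evolution.LOW


# A team shipping twice a year versus one shipping twice a week.
print(classify_deploy_cadence(2).value)
print(classify_deploy_cadence(104).value)
print(classify_deploy_cadence(0, on_demand=True).value)
```

A single number like this is only a conversation starter. It says nothing about stability or how changes are approved, which the report treats as just as important.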
The importance of the platform model
In 2020 Puppet decided to examine the structural issues that are holding organizations back and new approaches to achieving DevOps agility that allow you to maintain careful governance. One significant structural change they see more often is the shift to internal platform teams. Unlike DevOps teams or product teams responsible for end-to-end delivery of their product, internal platform teams are responsible for providing a platform that provides the infrastructure, environments, deployment pipelines, and other internal services. This platform enables internal customers — usually application development teams — to build, deploy and run their applications.
The platform model is a fairly new approach to enabling application teams. Done right, it simply works, resulting in the faster, more efficient delivery of high-quality software that meets an organization's business needs — and at scale.
Nigel describes a recurring problem in DevOps: companies copy a model espoused by another, typically massive and successful, company like Amazon or Netflix, attempt to shoehorn it unedited into their own DevOps practice, and fail.
He notes, “If you look at Netflix, they really only build one big major service. They hire an awful lot of developers, and they pay them really, really, really well. They have really high expectations of their developers. You're working on a modern codebase. Netflix is not that old, as an organisation, you're not a bank, you're not subject to the regulatory pressures. You're not scattered, all over the world in the same way. So Netflix can do things in terms of deployment that are just fun.”
Most companies don’t have budgets that are anywhere near that of Amazon or Netflix and often struggle to get comparatively experienced and talented platform engineers.
Further, many people work with fundamental constraints specific to their organisation and industry, and it is a mistake to attempt a wholesale approach without accommodating those differences. Nigel asserts that while there’s the notion of “move fast and break things, I don't think you should move fast and break things when it's your bank account, your pension, the Telecommunication System. Different businesses have different constraints.”
He notes that platform teams are on the rise in response to the challenges of successfully scaling out DevOps practices into an enterprise. He explains:
"We found a really high correlation between well-run platform teams and the higher levels of DevOps evolution. And in fact, it's so strong that I think it's just about impossible in a large organization of 10,000 plus people to scale-out DevOps practices without having some kind of investment in platform teams, even if you don't call it a platform."
Self-service
According to Nigel, companies are evolving from long-lived, large, stand-alone test environments to self-service. He notes that shared test environments correlate with poor DevOps performance. Even better, companies are progressing their software practices as they reap the benefits of DevOps done well, in particular moving on to higher-level self-service offerings like authentication, gRPC load balancing and geolocation.
He details, "Once you've solved the infrastructure problems, you don't stop there. So I think many people think of an internal platform team as being, it's like EC2 plus S3, plus CloudFront, or something like that. But as you start moving to a more highly evolved world, you keep building much more developer-centric, self-service practices."
It follows that readers and respondents of Puppet's survey are keen to know how to move from low to high performance. Nigel notes that elite teams have both cultural and structural practices that facilitate their success. For example, "there's a version control system that everyone can look into to see if someone else has solved a problem. There's a culture of the inner source. It's about having CI/CD, using configuration management systems." He asserts that "Automation is the social contract between different teams. Implement, automatic, expand, self-service."
A product mindset is key to scaling DevOps and your internal platforms
One of the most critical attributes of a successful platform team is being agile, listening to your users and getting their feedback to find out their problems. It's the correlation between lean methodology and DevOps or, more specifically, according to Nigel, "The idea of building a small experiment, testing it with real-world users getting their feedback, using it to modify the actual products that you're building. And you've got to have a product mindset, which means you have to evangelize it to them, you have to market it, you have to listen to their problems and go, here's the thing that actually solves your problems. Pure top-down mandating to use a platform doesn't work."
This corresponds with what Manuel Pais, co-author of Team Topologies, has shared with us. Traditionally, decision making around the introduction and management of platforms was led by the CTO/CIO and heads of department, with external product management, product teams and platform leads, as well as those who will use the products, further down the food chain when it comes to influence and decisions. As Manuel notes, these managers tend to think about high-level goals: “We want to be Spotify”.
By comparison, modern platforms are led by platform teams or platform product teams who both strongly influence and inform. Manuel asserts, “In terms of deciding what we build, the decisions should be with the people who do the work and those who consume it.” He further notes that having the guidance of experienced product managers is crucial as platform building is not easy.
An API-first approach is increasingly relevant to devs
Nigel suggests that an API-first approach is desirable as your audience is intrinsically technical, and if you make something API-first, "it means your service developers can interact with it. It means your teams can build their own Slack workflows inside to deal with it."
Customers are already embracing approaches like IFTTT and "If you give someone an API endpoint, they can start being creative in solving the problem that they have in front of them, which might require using two or three different APIs. This is the de facto world, and the modern developer is glueing together lots of different services."
He asserts that getting the service offering right is essential, specifically the level of abstraction. A complete black-box solution fails to take advantage of the creativity of your user base. You also want to give people access to the layer of workings beneath, like VMs, networks, and firewall ports. Being able to recompose these things to solve new problems means that the platform team and IT team are not bugged for absolutely everything. This also builds a highly attractive walled garden that enables developers to build things and provides a common base for business processes like security, auditing, and compliance.
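As a rough illustration of that level-of-abstraction point, the Python sketch below shows a platform that exposes small primitives which a development team then recomposes into its own workflow. `FakePlatform`, `create_vm`, `open_port` and `provision_web_service` are all hypothetical names backed by an in-memory stand-in; they do not correspond to any real platform API or to anything Puppet or Humanitec ships.

```python
from dataclasses import dataclass, field


@dataclass
class FakePlatform:
    """In-memory stand-in for an internal platform API (hypothetical)."""
    vms: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def create_vm(self, name: str, size: str) -> str:
        # Every call goes through the platform, so it can be audited centrally.
        self.vms[name] = {"size": size, "open_ports": set()}
        self.audit_log.append(f"create_vm {name} {size}")
        return name

    def open_port(self, vm: str, port: int) -> None:
        self.vms[vm]["open_ports"].add(port)
        self.audit_log.append(f"open_port {vm} {port}")


def provision_web_service(platform: FakePlatform, name: str) -> str:
    """A team-owned workflow composed from the platform's primitives."""
    vm = platform.create_vm(name, size="small")
    platform.open_port(vm, 443)
    return vm


platform = FakePlatform()
provision_web_service(platform, "checkout")
print(platform.vms["checkout"])
print(platform.audit_log)
```

The team never files a ticket, yet every action still lands in one audit log. That is the walled-garden idea in miniature: freedom to compose, with a common base for security and compliance.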
What is the goal of security teams with internal platforms?
While the notion of shifting left has gained great traction in DevOps, Nigel stresses that deep engagement at every stage is ever more important. He shared: "Security is fundamentally different to a platform team, but a well-run platform team should work with security to collaborate around design and function. You want to ideally, get into a world where you are doing continuous compliance."
How to get employer buy-in on Internal Developer Platforms
Getting employer support for self-service IDPs can be a challenge. Nigel suggests building a case that compares the current toil and manual workload with the time a platform would save. He notes, "Developers only started managing infrastructure themselves in general because it was better than waiting for a slow IT department that couldn't be responsive. I think this is the joy of an internal platform; you're actually providing the infrastructure and various things that they want. Focus on hours saved to make a really easy business case." Another aspect is security as a selling point. He expands, "Your security teams are probably frantic and just as overloaded as the infrastructure teams. The idea of giving a single common point of control, to simplify their world is something that will get a lot of buy-in, and for better or for worse, security often has a larger budget than infrastructure."
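The "hours saved" argument is simple arithmetic, and a back-of-the-envelope Python sketch might look like the following. The function name, its parameters and all of the figures are made up for illustration; substitute your own ticket volumes and timings.

```python
def annual_hours_saved(
    requests_per_week: int,
    manual_minutes_per_request: float,
    self_service_minutes_per_request: float,
    working_weeks_per_year: int = 48,
) -> float:
    """Estimate the hours per year saved by moving a request type to self-service."""
    minutes_saved_per_request = manual_minutes_per_request - self_service_minutes_per_request
    total_minutes = requests_per_week * minutes_saved_per_request * working_weeks_per_year
    return total_minutes / 60


# Hypothetical inputs: 40 environment requests a week, 90 minutes of manual
# handling each, reduced to 10 minutes with self-service.
print(annual_hours_saved(40, 90, 10))  # 2560.0
```

Multiplying the result by a loaded hourly cost turns it into a budget figure, though waiting time and context switching for the requesting developers are usually the larger, harder-to-measure cost.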
What will the next year mean for Internal Developer Platforms and DevOps effectiveness?
It will be interesting to see how IDPs have grown since Puppet’s 2020 report, in which 63 percent of survey respondents were in companies with at least one self-service internal platform, and 60 percent had between two and four IDPs. How will these companies have progressed, and how has it informed and changed their workflow and output?
Nigel asserts that Puppet is focused on the practices that make DevOps work, detailing, “We've been trying to focus on the things that unblock people. And it's almost all what we've traditionally called culture problems, like organizational buy-in, culture around risk... But the thing is, as you evolve, once you start identifying these things, you stop thinking of them as culture, and you start thinking of them as the problems that need to be solved.”
It's a shift to an almost post-DevOps world beyond labels where you stop labelling the thing you are trying to do, and "you start being focused on the work that you're actually doing instead."
Puppet will release the State of DevOps 2021 very soon. In the meantime, you can take a deep dive into the discussion covered here by watching Kaspar's interview with Nigel on Humanitec’s YouTube channel.