With cloud providers like Amazon Web Services (AWS) now a standard part of the DevOps toolkit in many businesses, DevOps is facing an identity crisis.
On the one hand, the rise of cloud computing is demonstrating that Infrastructure as Code (IaC) can, and should, be integrated into development teams. On the other hand, many businesses still see the need for dedicated system admins and operations professionals to handle IT hardware, even in instances where that hardware is managed through a third-party vendor.
In September, we were lucky enough to sit down with Barry O’Reilly and Kief Morris to talk more about how IaC is changing DevOps and how this new industry standard will power the future. Barry is the author of Unlearn: Let Go of Past Success to Achieve Extraordinary Results. He helps some of the world's leading companies, from disruptive startups to Fortune 500 behemoths, break the vicious cycles that spiral businesses toward death by enabling a culture of experimentation. Kief Morris is the author of the book Infrastructure as Code, which was recently published in its second edition. He heads the global Infrastructure and Cloud Engineering community of practice at ThoughtWorks, where he works across commercial and public organizations to help them adopt new approaches that maximize the potential of modern cloud and automation technologies.
The history of IaC
Given how widespread IaC is now, it’s surprising to think that this concept was on the bleeding edge of DevOps less than 10 years ago.
In the very early days, when AWS was still on its first and second iterations, the standard way of implementing new IT infrastructure was through physically purchasing and configuring servers and networks whenever your business needed more capacity.
Of course, this came with its own challenges, but given that this was the way that all businesses had operated for years, it was one that businesses were used to. Challenges like not correctly estimating how much RAM you’d need were expected, and they were costly mistakes that sys-admins and ops wouldn’t likely make twice.
When cloud server providers like AWS stormed the market, many businesses still held to this old method of working. Instead of integrating operations into engineering teams, there still remained specialized operations professionals who would handle services like AWS.
However, this led to a significant change in how businesses viewed servers.
Previously, servers were treated almost as pets. Operations would know what OS they ran on and the manufacturer of each specific part, and would have their own preferences for where to source hardware. Now, servers became livestock. Businesses started to regard them as beasts of burden that they could buy, sell, and switch out as needed.
While keeping operations separate meant that engineering teams could get on with building software in the ways they always had before, it effectively canceled out the main benefit of cloud-based servers.
Operations teams could easily reconfigure servers as needed with only a few clicks, but devs would still have to raise tickets for changes like increasing capacity or reallocating resources. So, they’d often be waiting for weeks for a change that could be made in less than a minute if they had access to the same systems.
Around this time, AWS and no-code solutions were starting to gain traction in entrepreneurial circles. More businesses than ever before were using IaC and cloud computing to patch together prototypes of new products, find investors, and bring new products to market that were far more innovative than products from the previous generation.
With the DevOps bar being constantly raised higher, businesses were forced to step up their game or risk falling by the wayside. So, the concept of using IaC to integrate operations into an engineering environment was born.
Introducing IaC as a viable DevOps solution
In 2012, Barry and Kief were contacted by Channel 4, a UK TV station, to build and implement a solution that would help their servers manage a massive influx of demand.
At the time, the channel was airing a show hosted by Jamie Oliver, who would tell viewers at the end of each episode to visit the Channel 4 website for recipes. So, every week, the website would crash because the servers couldn’t handle the load - but Channel 4 didn’t get enough demand at other times to warrant fully upgrading their server capacity.
At the time, the concept of IaC and using cloud server providers was still relatively new. However, creating an environment where developers could control the website’s infrastructure from the engineering department presented a logical and cost-effective solution.
For Channel 4, this presented a clear benefit, even if it was hard to convince them as an older business to unlearn the concept of keeping engineering and operations separate.
Fortunately, a significant benefit of IaC is that it’s surprisingly easy to communicate value across different levels of the organization.
Developers understand the need for having a toolkit for infrastructure and a way to easily scale server usage. IT and operations don’t need to spend all of their time purchasing and configuring hardware. And, you can tell the board that they only have to pay for the server load they need and that they don’t have to fork out for server capacity that’ll never get used.
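To make the board-level argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (the hourly price, the server counts, the length of the weekly peak) is invented for illustration and does not come from the Channel 4 project. It only shows the shape of the comparison between sizing for the peak all week and scaling up for the peak alone.

```python
# Hypothetical figures for illustration only; none come from the Channel 4 project.
HOURS_PER_WEEK = 7 * 24
PRICE_PER_SERVER_HOUR = 0.10  # assumed flat price, in arbitrary currency units


def fixed_capacity_cost(peak_servers: int) -> float:
    """Weekly cost when capacity is sized for the peak all week long."""
    return peak_servers * HOURS_PER_WEEK * PRICE_PER_SERVER_HOUR


def on_demand_cost(baseline_servers: int, peak_servers: int, peak_hours: int) -> float:
    """Weekly cost when the extra servers only run during the peak window."""
    off_peak_hours = HOURS_PER_WEEK - peak_hours
    server_hours = baseline_servers * off_peak_hours + peak_servers * peak_hours
    return server_hours * PRICE_PER_SERVER_HOUR


if __name__ == "__main__":
    fixed = fixed_capacity_cost(peak_servers=20)
    elastic = on_demand_cost(baseline_servers=2, peak_servers=20, peak_hours=2)
    print(f"sized for peak: {fixed:.2f}, scaled on demand: {elastic:.2f}")
```

With these made-up inputs, the on-demand figure is roughly a ninth of the fixed one. Real pricing is more complicated, but the gap between "pay for the peak" and "pay for what you use" is the part that matters to the board.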
Up until this point, however, IaC was about using software like Puppet and Chef to manage your infrastructure. There was a degree of automation involved but, more often than not, Ops would still need to write custom scripts to manage server loads and configure preferences.
Around the time of the Channel 4 project, IaC as we understand it today began to emerge. It was projects like this that highlighted the need for adaptability and flexibility, and the desire to continuously push this next generation of DevOps to its limit. A whole new generation of IaC approaches and tooling emerged, with tech like CloudFormation, Terraform, and Pulumi allowing operators to easily manage whole cloud infrastructures, such as Kubernetes clusters and serverless setups, across multiple cloud providers.
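One idea many of these newer tools share is that you describe the infrastructure you want and let the tool work out the steps to get there. The Python sketch below is a toy model of that idea only. It is not how CloudFormation, Terraform, or Pulumi are implemented, and the resource names and sizes are hypothetical.

```python
from typing import Dict, List, Tuple

# Resource name -> size. Both the names and the sizes are hypothetical.
State = Dict[str, str]


def plan(current: State, desired: State) -> List[Tuple[str, str]]:
    """Return the (action, resource) steps needed to turn current into desired."""
    steps = []
    for name in sorted(desired):
        if name not in current:
            steps.append(("create", name))
        elif current[name] != desired[name]:
            steps.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            steps.append(("delete", name))
    return steps


if __name__ == "__main__":
    current = {"web-1": "small", "cache": "small"}
    desired = {"web-1": "large", "web-2": "large"}
    for action, resource in plan(current, desired):
        print(action, resource)
```

Because the desired state is just data, it can live in version control and be reviewed like any other change, which is a large part of what lets infrastructure sit inside an engineering workflow.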
What you need to unlearn
So, with all of that in mind, what do we need to take forward into our own businesses?
As Barry says, it’s extremely easy for businesses to fall into the sunk cost fallacy and become resistant to change.
Most businesses want to play it safe when it comes to bleeding-edge technology. Without having the technical knowledge to understand whether the benefits a new tech stack promises are worth the paper they’re written on, C-suite executives generally don’t want to invest a good chunk of cash into retraining their employees and switching their tech stack to something new.
However, developers want to experiment with new technologies. The industry is constantly moving forwards, and the last thing your developers want is to be stuck performing mundane tasks with an out-of-date tech stack.
It’s a big reason why many companies that fail to implement modern concepts like IaC can’t retain their talent or hire high-quality developers.
It’s an unfortunate fact that many companies see adopting new technologies and concepts like IaC as investing only in that technology when, in fact, we need to look at it as investing in the ability to use technology.
Investing in AWS for your business, and training your talent to use AWS in their tech stack, isn’t locking your business into only ever using AWS for cloud servers.
There is something to be said for having an over-reliance on one vendor, and that’s something that deserves a blog post of its own. But as Gregor Hohpe clearly explains in his article, most teams should really worry about velocity rather than vendor lock-in.
However, when you’re investing in, for example, AWS for your business, you’re not saying that AWS is all you’ll ever use. You’re investing in the ability to adapt to new technologies in your business, to move forward, and to learn and adapt as you need to.
When we talk about unlearning, it can put people on the defensive. It’s easy to see why. Having your ideas challenged can feel like a personal attack, and so the concept of unlearning has come to be seen as synonymous with “everything you know is wrong, and that makes you a bad person”.
That couldn’t be further from the truth.
Unlearning is a conscious process of letting go of what you think you know in the light of new information. In the context of DevOps, it’s understanding that as technology moves forwards, both you and your product need to as well.
As one of the attendees rightly put it - “Are you a Python developer, or are you a developer who currently works with Python?”
The biggest thing we have to unlearn here is using what previously worked as a blueprint for the future because it’s a surefire way to get locked into a way of working that fails to adapt to modern technology and DevOps principles.
The future of IaC and DevOps
As we move forwards with IaC in DevOps, both Barry and Kief agree that we’re going to see more instances of teams using the “golden paths” concept in development.
Because IaC allows developers to build infrastructure elements that they can quickly reuse, the most logical step is to build the elements that are most commonly needed within company projects and make sure “reinvent the wheel” syndrome is stopped in its tracks.
However, this doesn’t mean forcing your developers to work with a specific set of tools - you need to give your developers room to go “off-road” and find better ways to achieve what the project requires.
That’s why the “golden path” concept focuses primarily on the foundational elements of a developer workflow. Ideally, IaC in DevOps will have room for your talent to build custom infrastructure elements, but they’ll be able to use these pre-built elements around 80% of the time.
This has two significant advantages. The first is that organizations, naturally, want to have a great deal of harmony within their projects, and building common elements helps to build a language to make the entire development process more efficient.
Secondly, this removes a significant amount of cognitive load from developers, as it gives them the building blocks they need to work the rest of the project around. So, they spend less time working on mundane tasks, and more time on complex and creative problem solving that enhances the final product.
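As a rough illustration of what a “golden path” element could look like in practice, here is a minimal Python sketch. The `ServiceTemplate` class, its fields, and its default values are all hypothetical and are not taken from any real tool or from the conversation. The point is only that the common case needs no decisions, while going “off-road” is still a one-line override.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceTemplate:
    """A hypothetical pre-built infrastructure element with sensible defaults."""
    name: str
    instance_size: str = "small"
    min_instances: int = 2
    max_instances: int = 4
    monitoring: bool = True


def golden_path(name: str) -> ServiceTemplate:
    """The default route: a team only has to pick a name."""
    return ServiceTemplate(name=name)


def off_road(base: ServiceTemplate, **overrides) -> ServiceTemplate:
    """Return a copy of the template with specific settings changed."""
    return replace(base, **overrides)


if __name__ == "__main__":
    standard = golden_path("recipes-site")
    custom = off_road(standard, max_instances=20)
    print(standard)
    print(custom)
```

Keeping the template immutable and returning a modified copy means one team's customization never changes the shared default that other teams rely on.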
This will, of course, look different for every business. Different developer teams have different needs to work directly with infrastructure elements, and it’s only natural that some businesses will need to work with custom-built solutions versus stitching together elements from low- and no-code solutions.
Kief describes it as comparing a predetermined Lego kit versus a box of Lego bricks. Some teams will want a blueprint to work from, and others will want complete freedom to build what they need when they need it.
Infrastructure as Code allows hardware to work in greater harmony with your software. At its core, it’s about flexibility, modernity, and the courage to constantly push the boundaries of new technologies. If there’s one key “unlearning” point you need to remember, it’s that we can’t allow the sunk cost fallacy to creep into modern DevOps.