In traditional development models, SRE and Ops teams are, more often than not, the biggest source of blockage for development teams. With this ticket-based pipeline, developers have to wait on these teams when they need to requisition infrastructure. This leaves developers waiting days or weeks for SRE or Ops teams to complete a task that takes only a few minutes.
With developer experience (DevEx) becoming a more pressing issue, more organizations than ever before are turning to GitOps and Internal Developer Platforms (IDPs) to improve and simplify developer workflows.
In August, we were joined by Stefan Kolesnikowicz, Principal Site Reliability Engineer (SRE) at Achievers, to discuss how he helped move the company to microservices, helped it scale, and improved the developer experience through GitOps.
Moving away from ticket-based Ops
When Stefan first joined Achievers, the company was using a traditional engineering approach, with development, operations, and SRE teams all acting independently of each other. As a result, the SRE team often blocked other engineering teams, because work stalled until SRE could get to each individual ticket.
In addition, high availability and reliability were rarely on the engineers’ radar. Development teams would develop their code and business logic, and anything to do with production or deployment was handled by Ops and SRE teams.
There was also the problem of creating an engineering environment that could handle Achievers’ growth as a business.
Stefan explains:
“We had this giant monolith and had to split it into a distributed system. We needed to be able to scale across the world and meet data residency requirements.”
Because Achievers was looking at growing the business outside of Canada, they needed an engineering environment that would allow for teams around the world to deploy quickly and adapt to local laws without needing to run everything through SRE first.
This led Stefan and the team to adopt Kubernetes to build a self-service application for the development teams.
Improving DevEx with GitOps
Building internal tools like IDPs takes a lot of time, planning, dedication, and ongoing maintenance. It's important for businesses to treat their IDPs with the same respect and resources as they would any other product.
Stefan says:
“We need to follow the same mindset for developing new tools and workflows for engineers. We want engineers to be happy because they’re our customers, and we want them to continue to use our tools and improve on that process.”
The key philosophy for the GitOps project was that no SRE should block an engineer from managing the lifecycle of their microservice, and this was something that drove the team towards making their internal platforms as easy to navigate as possible. So, it was decided early on to use common elements and templates to make the system easy to adopt.
However, the system also had to adapt to different countries and DevOps environments where teams might not need the same feature suite. This led to techniques like feature flags, so engineers could turn features on and off without having to involve SRE to handle that process for them.
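As a rough illustration of the feature-flag idea, here is a minimal sketch in Python. It assumes platform-wide defaults that each team can override in its own service configuration. The flag names, the `ServiceConfig` shape, and the `resolve_flags` helper are all hypothetical and are not taken from Achievers’ platform.

```python
from dataclasses import dataclass, field

# Hypothetical platform defaults; the flag names are illustrative only.
DEFAULT_FLAGS = {
    "autoscaling": True,
    "data_residency_checks": False,
    "canary_deploys": False,
}


@dataclass
class ServiceConfig:
    """Per-service settings a team could keep in its own repository (assumed shape)."""
    name: str
    region: str
    flag_overrides: dict = field(default_factory=dict)


def resolve_flags(config: ServiceConfig) -> dict:
    """Merge a team's overrides onto the platform defaults.

    Unknown flag names are rejected so a typo cannot silently
    switch nothing on or off.
    """
    unknown = set(config.flag_overrides) - set(DEFAULT_FLAGS)
    if unknown:
        raise ValueError(f"Unknown feature flags: {sorted(unknown)}")
    return {**DEFAULT_FLAGS, **config.flag_overrides}
```

With this shape, a team in a region with residency rules could set `{"data_residency_checks": True}` in its own configuration and leave every other flag at the default, without opening a ticket with SRE.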
This led to the development of three main steps to handle every part of the development pipeline:
For each step, Stefan said it was important to make everything as simple for engineers as possible:
“Try to keep everything fluent. If the engineers are deploying something to development, it should be a similar process to deploying to UAT and production. It shouldn’t veer too much.”
Specifically, the team used Tilt to handle these processes for two reasons. One was that it integrates well with Kubernetes, so it could fit seamlessly into this workflow. More importantly, its configuration is written in Starlark, a Python dialect, and most of the engineers were already familiar with Python, so adopting the tool had a very low barrier to entry.
Key learnings from adopting GitOps
Achievers’ focus on creating a positive developer experience directly shaped the way the team developed its internal platform and GitOps workflows. The concept of using golden paths instead of golden cages, something that Kaspar, the CEO of Humanitec, has mentioned in previous webinars, resonated heavily with Stefan. He says:
“A key part of this was empowering the engineering team to have more control. We didn’t give our engineers a lot of access to stuff in the past, so we’ve opened that up a lot more to give them more flexibility without going off the rails.”
An important part of this was building these internal systems with the knowledge that GitOps wouldn’t be appropriate for everything. By still using traditional pipelines in some cases, the DevOps environment could run effectively without needing to maintain GitOps in workflows where it didn’t belong.
The team also learned that it was vital to use the same languages and structures that engineers were already familiar with. This made it as easy as possible for engineers to adopt this new technology because it lowered the barrier to entry.
However, there’s no point in reinventing the wheel when there are tools on the market that already fulfill the role teams need. In this case, Argo CD met the team’s needs for a flexible and decentralized tool that supported the use of GitHub as a source of truth.
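To make the “repository as source of truth” idea concrete, the sketch below builds an Argo CD `Application` manifest as a Python dictionary. It only illustrates the general shape of such a manifest and is not Achievers’ actual configuration: the `argo_application` helper, the default project, and the automated sync policy are assumptions, and any repository URL passed in would be a placeholder.

```python
import json


def argo_application(name: str, repo_url: str, path: str, namespace: str) -> dict:
    """Build an Argo CD Application manifest as a plain dictionary.

    The repository at `repo_url` is treated as the source of truth:
    Argo CD syncs the cluster to whatever manifests live under `path`.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": "HEAD",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Assumed policy: apply changes automatically and revert manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; JSON is also valid YAML, so kubectl can apply it."""
    return json.dumps(manifest, indent=2)
```

A team could generate one such manifest per microservice from a shared template, which fits the emphasis on common elements and templates described earlier.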
Measuring success
Very early in the project, Stefan understood it was vital to use internal Service Level Objectives (SLOs) to measure the project’s success and adapt the technology as it was implemented. He explains:
“A great thing to measure is the rate of adoption, which is the number of new users versus regular users, and how long it’s taken teams to adapt something over time.”
Similarly, teams should consider tracking other metrics, like time to production and the number of tickets and amount of manual work, to understand how well their platform automations are being implemented and utilized.
Customer satisfaction was also a significant focus of this project, so the team used Achievers’ software to run quarterly surveys to gauge how engineers used the tool and gather feedback. They also hold SRE office hours to talk to engineers directly, run troubleshooting sessions, and open up about their process as a team.
With that in mind, Stefan says teams implementing GitOps and Kubernetes should be prepared to tackle a flurry of feature requests in the early days of their product launch.
“Feature requests came fast and furious, and it got a bit out of control. A lot of people didn’t know what they wanted early on, so it was an evolution and we had to make changes as people asked.”
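As a minimal sketch of how these measurements might be computed, the Python below works from per-period usage counts. The `PeriodUsage` fields and both helper functions are hypothetical, and the formulas are only one reasonable reading of “new users versus regular users” and of ticket volume over time.

```python
from dataclasses import dataclass


@dataclass
class PeriodUsage:
    """Platform usage for one reporting period (assumed shape)."""
    new_users: int       # engineers who used the platform for the first time
    regular_users: int   # engineers who had already used it before this period
    tickets_opened: int  # manual SRE tickets raised in the same period


def new_user_ratio(usage: PeriodUsage) -> float:
    """New users as a fraction of all active users in the period."""
    total = usage.new_users + usage.regular_users
    return usage.new_users / total if total else 0.0


def ticket_change(previous: PeriodUsage, current: PeriodUsage) -> float:
    """Relative change in ticket volume; a negative value means fewer tickets."""
    if previous.tickets_opened == 0:
        return 0.0
    delta = current.tickets_opened - previous.tickets_opened
    return delta / previous.tickets_opened
```

Tracked quarter over quarter, a falling ticket count alongside a growing base of regular users would suggest the self-service path is replacing manual requests.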
Finally, the team put procedures in place to handle the culture change that came with implementing this new tool. This meant that, if internal SLOs weren’t met for this project, engineers would focus on improving reliability over pushing new features, which resulted in a far greater focus on quality over quantity across engineering teams.
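The rule described above can be sketched as a simple check, shown here in Python. It assumes the SLO is expressed as a target success ratio over some window. The function names and the event-count inputs are hypothetical, and the source does not say how Achievers actually encodes this policy.

```python
def measured_availability(good_events: int, total_events: int) -> float:
    """Fraction of events that met the service's success criteria."""
    if total_events == 0:
        return 1.0  # no traffic in the window, so nothing violated the SLO
    return good_events / total_events


def next_focus(slo_target: float, good_events: int, total_events: int) -> str:
    """Decide what the team prioritizes next under the assumed policy."""
    if measured_availability(good_events, total_events) >= slo_target:
        return "features"
    return "reliability"
```

For example, with a 99% target, 995 good requests out of 1,000 would keep the team on feature work, while 980 out of 1,000 would shift the focus to reliability.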
GitOps and DevEx: In summary
While GitOps can be a fantastic approach for improving developer experience, it’s not the only tool you can use to make your DevOps environment more efficient. By understanding how and where GitOps should fit into your workflow, the limitations of this philosophy, and what tools your teams need to succeed, you can build internal platforms that keep developers happy.
Thanks again to Stefan Kolesnikowicz for joining us for this webinar. If you haven’t already, make sure to watch the full webinar to get the full story of how Stefan improved the developer experience at Achievers.