I’ll allow myself a little bit of self-advertisement: I recently wrote an article I’m really proud of about Dynamic Configuration Management, something I’m irrationally excited about.
Why? Because I believe it’s the architectural answer to all the confusion, anger and sorrow that occur in larger engineering teams whenever developers need to make a change to their services and dependencies. Ticket ops, waiting times, high maintenance overhead, high change failure rates: all of these can be fundamentally improved through dynamic configuration management.
In the article, I argue that the current static approach to application and infrastructure configuration management is the root evil in most cloud-native delivery setups. “Static” here refers to the way the wiring between microservices and their dependent resources is manually scripted by humans for each environment, against a fixed set of infrastructure components. 99% of all setups today follow this approach, which I refer to as “static configuration management”. Your “classic” Helm chart and Terraform setup is almost certainly designed precisely this way.
This static approach causes problems every time a team wants to do anything beyond a simple image update (e.g. rolling back, changing the configuration or the application architecture, adding infrastructure). Every change in a static setup requires aligning silos within the team, adds unnecessary cognitive load, and is hard to maintain or hand over. The problem gets worse as a function of team size and the number of services.
The solution I am highlighting is one we’ve seen adopted well at several high-performing organizations over the last 2-3 years, especially those building dynamic Internal Developer Platforms (approaching configuration management dynamically correlates heavily with being part of the platform engineering movement). We call this approach “dynamic configuration management”. Following this method, configurations are split into environment-agnostic and environment-specific elements. The developer describes the workload and its relationship to the rest of the architecture (several workloads plus dependent resources such as databases, file storage, DNS, etc.) in a single file. The actual configuration and representation of the application are then created dynamically with every deployment and executed. To use an ever-apt cooking analogy: rather than delivering a pre-baked cake where almost nothing (filling, toppings, etc.) can be swapped easily, we deliver a recipe and bake a fresh cake with every deployment.
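To make that split concrete, here is a minimal sketch in Python of how a deployment could combine an environment-agnostic workload description with environment-specific resource bindings at deploy time. All names here (the workload spec, `resolve_deploy`, the hosts and zones) are illustrative assumptions, not any particular platform’s API:

```python
# Minimal sketch of dynamic configuration management; illustrative, not a
# specific platform's implementation.

# Environment-agnostic: the developer declares the workload and what it
# depends on, with no environment-specific values baked in.
workload_spec = {
    "name": "orders-api",
    "image": "registry.example.com/orders-api",  # tag injected per deployment
    "resources": {
        "db": {"type": "postgres"},
        "dns": {"type": "dns"},
    },
}

# Environment-specific: the platform knows how each resource type is
# satisfied in each environment (hypothetical values).
environments = {
    "staging": {
        "postgres": {"host": "staging-db.internal", "port": 5432},
        "dns": {"zone": "staging.example.com"},
    },
    "production": {
        "postgres": {"host": "prod-db.internal", "port": 5432},
        "dns": {"zone": "example.com"},
    },
}

def resolve_deploy(spec: dict, env: str, image_tag: str) -> dict:
    """Bake a fresh 'cake' per deployment: combine the environment-agnostic
    spec with environment-specific bindings to produce the configuration
    that actually gets applied."""
    bindings = environments[env]
    resolved = {
        "workload": spec["name"],
        "image": f"{spec['image']}:{image_tag}",
        "env_vars": {},
    }
    for name, resource in spec["resources"].items():
        for key, value in bindings[resource["type"]].items():
            resolved["env_vars"][f"{name.upper()}_{key.upper()}"] = str(value)
    return resolved

# The same spec yields different concrete configurations per environment.
print(resolve_deploy(workload_spec, "staging", "1.4.2"))
print(resolve_deploy(workload_spec, "production", "1.4.2"))
```

The point of the sketch: the developer maintains only the spec at the top; everything environment-specific is resolved by the platform at deployment time, so rollbacks, config changes, and added resources don’t require rewriting per-environment scripts.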
Enjoy it, criticize it, attack my opinion! While you are at it, check out all the cool webinars we have coming up:
- Infrastructure GitOps for EKS - Rafay project leader Praveen Kashimsetty demonstrates how to build a multi-stage pipeline for integrating EKS with your GitOps deployments.
- Palantir’s GitOps journey with Apollo - Greg DeArment from Palantir shares his experience from the last 7 years of developing the Apollo platform and hands out nuggets on the do's and don'ts of scaling GitOps.
- Kubernetes Antipatterns: CPU Limits - Robusta’s Natan Yellin explains how CPU limits work, goes over best practices for using Kubernetes, and discusses whether or not you should apply limits to your own workloads.
Enjoy your summer holidays and all the best from overheated Germany!
All the best,
Kaspar