In this talk, we will cover the architectural details of Kube-Prometheus-Stack that you must understand to run it successfully. We will then dive into best practices and techniques for making sure that no critical issue goes undetected and that your team is not overwhelmed with alerts, no matter how large your environment is or how many clusters you run.
- What the key components of Kube-Prometheus-Stack are, what an ideal setup looks like, and which common gotchas to avoid
- How you can reduce the chance of missing critical alerts and take control of alerting volume
- Which other tools in the open-source Prometheus ecosystem can help you
Audience - who should join?
Platform engineers, platform architects, DevOps engineers, site reliability engineers (SREs), infrastructure and operations teams, security engineers, enterprise and solution architects, application developers with an affinity for platform engineering, and technical managers focused on improving DevEx and ops efficiency.