Operating Kubernetes at Scale: Autoscaling and Remediation Best Practices

Date

Jun 11, 2025

Time

3:30 PM EDT - 4:00 PM EDT

Location

TBD

Kubernetes provides powerful autoscaling capabilities, but scaling alone is not enough—teams must also address performance bottlenecks, resource contention, and service reliability challenges.

In this session, we’ll dive into strategies for using Kubernetes Autoscaling and Active Remediation effectively. We will demonstrate how Datadog’s platform provides the insights and direct control needed to act on data-driven scaling decisions that address over-provisioning, under-utilization, and unpredictable traffic spikes. Using event-driven alerting and automation tools, we will show how you can set up self-healing workflows that mitigate performance degradations and infrastructure failures before they impact users.

We will cover:

Configuring and tuning Kubernetes autoscalers for efficiency and cost savings
Use cases demonstrating automated scaling and troubleshooting in action
Live remediation workflows to reduce downtime and improve resolution speed

Whether you're managing a growing Kubernetes deployment or looking to enhance operational resilience, this session will equip you with the knowledge and tools to scale with confidence.
