Configure Grafana Mimir autoscaling with Helm
Warning
Autoscaling support in the Helm chart is currently experimental. Use with caution in production environments and thoroughly test in a non-production environment first.
You can configure autoscaling for Mimir components using Kubernetes Event-driven Autoscaling (KEDA) and the Kubernetes Horizontal Pod Autoscaler (HPA).
Before you begin
- Ensure you have a running Mimir cluster deployed with Helm.
- Verify you have the required permissions to modify Helm deployments.
- Familiarize yourself with Kubernetes HPA concepts.
Prerequisites
To use autoscaling, you need:
- KEDA installed in your Kubernetes cluster
- Prometheus metrics available for scaling decisions
Warning
Don’t use the same Mimir or Grafana Enterprise Metrics cluster for storing and querying autoscaling metrics. Using the same cluster can create a dangerous feedback loop.
For instance, if the Mimir or GEM cluster becomes unavailable, autoscaling stops working, because it cannot query the metrics. This prevents the cluster from automatically scaling up during high load or recovery. This inability to scale further exacerbates the cluster’s unavailability, which might, in turn, prevent the cluster from recovering.
Instead, use a separate Prometheus instance or a different metrics backend for autoscaling metrics.
Supported components
The Mimir Helm chart supports autoscaling for the following components:
About KEDA
KEDA is a Kubernetes operator that simplifies the setup of HPA with custom metrics from Prometheus. It consists of:
- An operator and external metrics server
- Support for multiple metric sources, including Prometheus
- Custom resources (`ScaledObject`) that define scaling parameters
- Automatic HPA resource management
For more information, refer to the KEDA documentation.
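For illustration, a `ScaledObject` broadly similar to the ones the chart generates might look like the following. This is a hand-written sketch, not the chart's exact output: the resource names, namespace, query, and threshold are assumptions.

```yaml
# Illustrative ScaledObject sketch. The names, namespace, query, and
# threshold are assumptions, not the Helm chart's exact output.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: mimir-querier
  namespace: mimir
spec:
  scaleTargetRef:
    name: mimir-querier        # Deployment that KEDA/HPA scales
  minReplicaCount: 2
  maxReplicaCount: 10
  pollingInterval: 10          # How often KEDA queries the metric source
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(cortex_query_scheduler_inflight_requests{namespace="mimir"})
        threshold: "12"
```

KEDA reconciles this resource into an HPA that scales the target Deployment whenever the Prometheus query result crosses the threshold.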
Configure autoscaling for a new installation
Follow these steps to enable autoscaling when deploying Mimir for the first time.
Steps
Configure the Prometheus metrics source in your values file:
```yaml
kedaAutoscaling:
  prometheusAddress: "http://prometheus.monitoring:9090"
  pollingInterval: 10
```
Enable and configure autoscaling for the desired components:
```yaml
querier:
  kedaAutoscaling:
    enabled: true
    minReplicaCount: 2
    maxReplicaCount: 10
```
Deploy Mimir using Helm:
```shell
helm upgrade --install mimir grafana/mimir-distributed -f values.yaml
```
Expected outcome
After deployment:
- KEDA creates `ScaledObject` resources for configured components.
- HPA resources are automatically created and begin monitoring metrics.
- Components scale based on configured thresholds and behaviors.
Migrate existing deployments to autoscaling
Follow these steps to enable autoscaling for an existing Mimir deployment.
Warning
Autoscaling support in the Helm chart is currently experimental. Migrating to autoscaling carries risks for cluster availability.
Enabling autoscaling removes the `replicas` field from deployments. If KEDA/HPA hasn't started autoscaling a deployment yet, Kubernetes interprets a missing `replicas` field as 1 replica. This can cause an outage if the transition is not handled carefully. If you're using GitOps tools like FluxCD or ArgoCD, you might need to take additional steps to manage the transition. Consider testing the migration in a non-production environment first.
Before you begin
- Back up your current Helm values.
- Plan for potential service disruption.
- Consider testing in a non-production environment first.
- Ensure you have a rollback plan ready.
- Consider migrating one component at a time to minimize risk.
Steps
Add the autoscaling configuration with `preserveReplicas` enabled:

```yaml
querier:
  kedaAutoscaling:
    enabled: true
    preserveReplicas: true  # Maintains stability during migration
    # ... autoscaling configuration ...
```
Apply the changes and verify the KEDA setup:
```shell
# Apply changes
helm upgrade mimir grafana/mimir-distributed -f values.yaml

# Verify setup
kubectl get hpa
kubectl get scaledobject
kubectl describe hpa
```
Wait 2-3 polling intervals (for example, about 30 seconds with `pollingInterval: 10`) to confirm that KEDA is managing scaling.
Remove `preserveReplicas`:

```yaml
querier:
  kedaAutoscaling:
    enabled: true
    # Remove preserveReplicas
```
Apply the updated configuration:
```shell
helm upgrade mimir grafana/mimir-distributed -f values.yaml
```
Troubleshooting
If pods scale down to 1 replica after removing `preserveReplicas`:
Revert changes:
```yaml
querier:
  kedaAutoscaling:
    enabled: true
    preserveReplicas: true
```
Verify KEDA setup:
- Check HPA status
- Verify metrics are being received
- Check for conflicts with other tools
- Ensure enough time was given for KEDA to take control (at least 2-3 polling intervals)
Try migrating again after resolving issues.
Note
If you’re using GitOps tools like FluxCD or ArgoCD, they might try to reconcile the state and conflict with HPA’s scaling decisions. Consult your GitOps tool’s documentation for handling HPA transitions.
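For example, with ArgoCD one common approach is to tell the Application to ignore the `replicas` field on Deployments so it doesn't revert HPA scaling decisions. The following is a sketch; the application name is hypothetical and the fragment must be merged into your actual Application spec.

```yaml
# Hypothetical ArgoCD Application fragment: ignore replicas on Deployments
# so ArgoCD does not fight HPA's scaling decisions during reconciliation.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mimir
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
```

FluxCD has analogous mechanisms; consult your tool's documentation for the exact configuration.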
Monitor autoscaling health
The following conditions indicate unhealthy autoscaling:
- KEDA operator is down: `ScaledObject` changes don't propagate to HPA.
- KEDA metrics server is down: HPA can't receive updated metrics.
- HPA is unable to scale: the `MimirAutoscalerNotActive` alert fires.
For production deployments, configure high availability for KEDA.
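For instance, if you installed KEDA with its Helm chart, running multiple replicas of the operator and metrics server might look like the following. The value names here are assumptions; verify them against the values of the KEDA chart version you use.

```yaml
# Sketch of KEDA Helm chart values for high availability.
# Value names are assumptions; check your chart version's values.yaml.
operator:
  replicaCount: 2
metricsServer:
  replicaCount: 2
```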
For more information about monitoring autoscaling, refer to Monitor Grafana Mimir.