
KEDA Reference: Event-Driven Autoscaling — Kafka, SQS, Redis, Prometheus & Scale to Zero

KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF-graduated project that scales Kubernetes workloads based on external event sources — Kafka lag, queue depth, HTTP request rate, cron, or any custom metric — rather than just CPU/memory.

1. KEDA vs HPA

When KEDA, when standard HPA
| Feature | KEDA | Standard HPA |
|---|---|---|
| Scale to zero | Yes — scales to 0 when there are no events | No — minimum 1 replica |
| Scale triggers | 80+ scalers: Kafka, RabbitMQ, SQS, Redis, Prometheus, HTTP, cron, custom | CPU, memory, custom metrics via HPA v2 |
| External events | Native — polls the source directly (no metrics pipeline needed) | Requires a custom metrics adapter (e.g. Prometheus adapter) |
| CRD approach | ScaledObject / ScaledJob CRDs | HorizontalPodAutoscaler CRD |
| Use case | Event-driven workloads, batch jobs, queue consumers | Web services with CPU/memory load patterns |
# Install KEDA:
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace keda --create-namespace

kubectl get pods -n keda               # keda-operator + keda-metrics-apiserver
kubectl get crds | grep keda           # ScaledObject, ScaledJob, TriggerAuthentication

2. Kafka Scaler

Scale consumers based on consumer group lag
# ScaledObject: binds KEDA to your Deployment
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: my-kafka-consumer            # Deployment to scale
  minReplicaCount: 0                   # scale to zero when no lag
  maxReplicaCount: 20
  pollingInterval: 15                  # check every 15 seconds
  cooldownPeriod: 300                  # wait 5 min before scaling down
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-consumer-group
        topic: orders
        lagThreshold: "50"             # scale up when lag > 50 messages per partition
        offsetResetPolicy: latest      # where to start if no offset exists

      # With SASL authentication, reference a TriggerAuthentication:
      authenticationRef:
        name: kafka-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-auth
spec:
  secretTargetRef:
    - parameter: sasl
      name: kafka-credentials
      key: sasl
    - parameter: username
      name: kafka-credentials
      key: username
    - parameter: password
      name: kafka-credentials
      key: password
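To build intuition for how `lagThreshold` maps lag to replicas, here is a hedged sketch of the math: the HPA targets roughly ceil(totalLag / lagThreshold), and KEDA caps the result at the partition count unless `allowIdleConsumers` is set (extra consumers beyond the partition count would sit idle). This is a simplification of the real controller loop, not its literal implementation.

```python
import math

def desired_replicas(total_lag, lag_threshold, partitions, max_replicas,
                     allow_idle_consumers=False):
    """Sketch of the Kafka scaler's replica math (simplified, not the
    actual KEDA source). Scale-to-zero is handled by KEDA activation,
    approximated here by the total_lag == 0 branch."""
    if total_lag == 0:
        return 0  # minReplicaCount: 0 -> scale to zero when idle
    desired = math.ceil(total_lag / lag_threshold)
    if not allow_idle_consumers:
        # More consumers than partitions would be idle, so cap there
        desired = min(desired, partitions)
    return min(desired, max_replicas)

# 500 messages of lag, threshold 50 -> 10 replicas
print(desired_replicas(total_lag=500, lag_threshold=50, partitions=12, max_replicas=20))  # 10
```

With the manifest above (lagThreshold "50", maxReplicaCount 20), 5,000 messages of lag on a 12-partition topic would still settle at 12 replicas, since one consumer per partition is the ceiling.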

3. SQS, Redis, Prometheus & Other Scalers

AWS SQS queue depth, Redis list length, Prometheus metrics, RabbitMQ & HTTP
# AWS SQS scaler:
triggers:
  - type: aws-sqs-queue
    authenticationRef:
      name: aws-auth-kiam          # KIAM/IRSA for AWS IAM
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789/my-queue
      queueLength: "10"            # target: 10 messages per replica
      awsRegion: us-east-1
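The `aws-auth-kiam` reference above points at a TriggerAuthentication. For IAM-role-based auth (IRSA/KIAM) that object can use pod identity instead of static keys — a sketch, with the provider value depending on your KEDA version:

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-auth-kiam              # must match the authenticationRef above
  namespace: production
spec:
  podIdentity:
    provider: aws                  # newer KEDA; older versions use aws-eks (IRSA) or aws-kiam
```

The scaler then assumes the IAM role attached to the KEDA operator's (or workload's) service account, so no AWS credentials live in a Secret.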

# Redis list length:
triggers:
  - type: redis
    metadata:
      address: redis:6379
      listName: job-queue
      listLength: "20"             # scale when list > 20 items
      enableTLS: "false"

# Prometheus metric (most flexible — any custom metric):
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      metricName: http_requests_pending
      threshold: "100"             # scale when metric > 100
      query: sum(http_requests_pending{service="my-service"})
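One operational concern with the Prometheus scaler: if Prometheus is unreachable, scaling stalls. KEDA's `fallback` field pins a replica count after repeated failed polls. A sketch of a complete ScaledObject wrapping the trigger above (object and service names are hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-service-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: my-service
  minReplicaCount: 1
  maxReplicaCount: 30
  fallback:                        # applies when the scaler errors
    failureThreshold: 3            # after 3 consecutive failed polls...
    replicas: 5                    # ...hold 5 replicas until Prometheus recovers
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        threshold: "100"
        query: sum(http_requests_pending{service="my-service"})
```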

# RabbitMQ queue depth:
triggers:
  - type: rabbitmq
    authenticationRef:
      name: rabbitmq-auth
    metadata:
      protocol: amqp
      queueName: task-queue
      mode: QueueLength             # or MessageRate
      value: "20"                   # scale when depth > 20

# HTTP scaling (scale on pending HTTP requests — great for scale-to-zero APIs):
# Requires the KEDA HTTP Add-on: helm install http-add-on kedacore/keda-add-ons-http
# Note: core KEDA has no plain "http" trigger type. The add-on runs an interceptor
# that proxies traffic, counts pending requests, and is configured through its own
# HTTPScaledObject CRD rather than a trigger on a ScaledObject.
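A sketch of an HTTPScaledObject for the HTTP Add-on. Field names have shifted between add-on versions (e.g. `host` vs `hosts`, `targetPendingRequests` vs `scalingMetric`), so treat this as illustrative and check the version of the add-on you install; all names below are hypothetical:

```yaml
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: my-api
  namespace: production
spec:
  hosts:
    - api.example.com              # Host header the interceptor routes on
  scaleTargetRef:
    name: my-api                   # Deployment to scale
    service: my-api-svc            # Service fronting it
    port: 8080
  replicas:
    min: 0                         # scale-to-zero for idle APIs
    max: 10
  targetPendingRequests: 100       # newer add-on versions use spec.scalingMetric instead
```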

4. ScaledJob

Spawn K8s Jobs from a queue — one job per message
# ScaledJob: creates K8s Jobs (not Deployment replicas) per event
# Use when: each job is stateless, processes one item, should run to completion

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor
  namespace: production
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 600      # job timeout
    backoffLimit: 2
    template:
      spec:
        containers:
          - name: processor
            image: my-image-processor:latest
            env:
              - name: QUEUE_URL
                value: https://sqs.us-east-1.amazonaws.com/123/images
  pollingInterval: 10
  maxReplicaCount: 50               # max concurrent jobs
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123/images
        queueLength: "1"            # 1 job per message
        awsRegion: us-east-1
  scalingStrategy:
    strategy: accurate              # "default" batches jobs across polls; "accurate" spawns jobs 1:1 with messages
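The difference between the two strategies can be sketched in a few lines. This paraphrases the pseudocode in the KEDA docs rather than the literal implementation, and assumes queueLength "1" (one job per message) so the queue depth maps directly to a job count:

```python
def jobs_to_spawn(queue_length, running, pending, max_replicas, strategy="default"):
    """Hedged sketch of ScaledJob scaling strategies (a paraphrase of the
    docs' pseudocode, not the actual KEDA source)."""
    # With queueLength "1", each message wants one job, capped at maxReplicaCount
    max_scale = min(queue_length, max_replicas)
    if strategy == "accurate":
        if max_scale + running > max_replicas:
            return max_replicas - running   # only the remaining headroom
        return max_scale - pending          # 1:1 with messages not yet picked up
    # "default" subtracts jobs already running, which can under-spawn
    return max(max_scale - running, 0)

# 30 queued messages, 5 jobs running, cap 50
print(jobs_to_spawn(queue_length=30, running=5, pending=0, max_replicas=50, strategy="accurate"))  # 30
```

With a backlog of 100 and 5 jobs already running, "accurate" spawns 45 (filling the cap of 50), while "default" would compute from the same headroom but is prone to lagging the queue when jobs finish between polls.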

5. Cron & Operations

Scheduled scaling, status checks, and debugging
# Cron scaler (scale up at peak hours, down at night):
triggers:
  - type: cron
    metadata:
      timezone: Europe/London
      start: 30 9 * * 1-5       # 9:30am weekdays
      end: 0 18 * * 1-5         # 6pm weekdays
      desiredReplicas: "10"      # scale to 10 during business hours

# Combine multiple triggers (KEDA scales to whichever trigger demands the most replicas):
triggers:
  - type: cron              # baseline of 5 replicas during business hours
    metadata: {timezone: UTC, start: "0 9 * * 1-5", end: "0 17 * * 1-5", desiredReplicas: "5"}
  - type: kafka             # lag-based scaling on top; the higher of the two wins
    metadata: {bootstrapServers: kafka:9092, consumerGroup: grp, topic: events, lagThreshold: "100"}
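The combination rule follows from how the HPA handles multiple metrics: each trigger produces its own desired replica count and the HPA takes the maximum. A minimal sketch of that arithmetic for the cron + Kafka pair above:

```python
import math

def desired_from_triggers(kafka_lag, lag_threshold, in_cron_window, cron_desired):
    """Sketch: with multiple triggers, the HPA created by KEDA takes the
    MAX of the per-trigger desired replica counts, so the busiest wins."""
    kafka_desired = math.ceil(kafka_lag / lag_threshold)
    cron = cron_desired if in_cron_window else 0
    return max(kafka_desired, cron)

# 800 lag / threshold 100 -> 8, vs cron baseline 5 -> 8 wins
print(desired_from_triggers(kafka_lag=800, lag_threshold=100, in_cron_window=True, cron_desired=5))  # 8
```

During business hours with little lag the cron trigger holds the floor at 5; a lag spike overrides it, and outside the window only lag drives scaling.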

# Check ScaledObject status:
kubectl get scaledobject -n production
kubectl describe scaledobject kafka-consumer-scaler -n production
# Look for: READY=True, Active=True/False, Trigger counts

# Get current replica count from KEDA perspective:
kubectl get hpa -n production    # KEDA creates an HPA behind the scenes

# Pause scaling (maintenance, testing):
kubectl annotate scaledobject kafka-consumer-scaler autoscaling.keda.sh/paused=true -n production
# Resume:
kubectl annotate scaledobject kafka-consumer-scaler autoscaling.keda.sh/paused- -n production

# Debug: check KEDA operator logs:
kubectl logs -n keda deploy/keda-operator --tail=50 | grep -i error
