Velero Reference: K8s Backup, Restore, Disaster Recovery & Cross-Cluster Migration

Velero is the standard tool for Kubernetes cluster backup and disaster recovery — it backs up cluster resources and persistent volumes, enables cross-cluster migration, and supports scheduled backups to object storage (S3, GCS, Azure Blob).

1. Installation & Concepts

Install Velero and connect to object storage
# Install velero CLI:
brew install velero    # macOS
# Or download from: https://github.com/vmware-tanzu/velero/releases

# Install Velero server on K8s (AWS S3 example).
# The secret file holds AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY:
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket my-velero-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./aws-credentials

# Credentials file format (~/.aws/credentials compatible):
# [default]
# aws_access_key_id=AKIAIOSFODNN7EXAMPLE
# aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

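# AWS IAM permissions needed (attach to the IAM user/role):
# S3 objects: s3:GetObject, s3:PutObject, s3:DeleteObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts
# S3 bucket: s3:ListBucket
# EC2 EBS snapshots: ec2:DescribeVolumes, ec2:DescribeSnapshots, ec2:CreateTags, ec2:CreateVolume, ec2:CreateSnapshot, ec2:DeleteSnapshot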

# GCS example (the secret file is a service account key):
velero install \
  --provider gcp \
  --plugins velero/velero-plugin-for-gcp:v1.9.0 \
  --bucket my-velero-backups \
  --secret-file ./gcp-credentials.json

# Verify:
kubectl get pods -n velero
velero backup-location get                   # should show "Available"
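
The credentials file shown above can be generated from environment variables instead of being written by hand. A minimal sketch, assuming the keys are already exported in the shell; the function name write_aws_credentials is illustrative and not part of Velero:

```shell
# write_aws_credentials FILE
# Hypothetical helper: writes an AWS-style credentials file from the
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
write_aws_credentials() {
  local file="$1"
  # Abort with a message if either variable is unset or empty.
  : "${AWS_ACCESS_KEY_ID:?must be set}"
  : "${AWS_SECRET_ACCESS_KEY:?must be set}"
  # Run in a subshell so the restrictive umask applies only to this file
  # (a newly created file ends up readable by its owner only).
  (
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$file"
  )
}
```

Usage: write_aws_credentials ./aws-credentials, then pass the file to velero install with --secret-file.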

2. Backups

Create, schedule, and manage backups
# Create a one-off backup:
velero backup create my-backup                             # backup everything
velero backup create my-backup --include-namespaces production  # specific namespace
velero backup create my-backup --include-namespaces production,staging
velero backup create my-backup --exclude-namespaces kube-system,velero
velero backup create my-backup   --include-resources deployments,services,configmaps,secrets  # specific resource types
velero backup create my-backup --snapshot-volumes=false    # skip PV snapshots (faster)

# Pre/post backup hooks (exec commands in pods before/after backup):
# Add annotations to the pod (one kubectl command, continued with backslashes):
kubectl annotate pod my-pod \
  pre.hook.backup.velero.io/container=mysql \
  pre.hook.backup.velero.io/command='["/bin/bash", "-c", "mysqldump -u root -p$MYSQL_ROOT_PASSWORD mydb > /backup/dump.sql"]' \
  post.hook.backup.velero.io/command='["/bin/bash", "-c", "rm /backup/dump.sql"]'

# Monitor backup:
velero backup describe my-backup
velero backup logs my-backup
velero backup get                   # list all backups with status
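
In scripts it can help to block until a backup reaches a final phase rather than re-running describe by hand. A rough sketch, assuming the Backup object lives in the velero namespace as in the commands in this reference; wait_for_backup is a hypothetical helper, and the phases it treats as final are an assumption to check against your Velero version:

```shell
# wait_for_backup NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: polls the Backup object's status.phase.
# Prints the final phase; returns 0 on Completed, 1 on a failed phase,
# 2 if the timeout expires first.
wait_for_backup() {
  local name="$1" timeout="${2:-600}" interval="${3:-10}" waited=0 phase
  while [ "$waited" -le "$timeout" ]; do
    phase="$(kubectl get backups.velero.io "$name" -n velero \
      -o jsonpath='{.status.phase}')"
    case "$phase" in
      Completed) echo "$phase"; return 0 ;;
      PartiallyFailed|Failed|FailedValidation) echo "$phase"; return 1 ;;
    esac
    # Any other phase (or an empty result) means "keep waiting".
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timeout"
  return 2
}
```

Usage: velero backup create my-backup, then wait_for_backup my-backup 900 before moving on to the next step.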

# Scheduled backups (cron syntax). This one runs daily at 2am and keeps
# each backup for 30 days (720h), then auto-deletes it:
velero schedule create daily-backup \
  --schedule="0 2 * * *" \
  --include-namespaces production \
  --ttl 720h

velero schedule get                  # list schedules
velero schedule describe daily-backup
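
The --ttl value is given in hours (720h for 30 days), so retention expressed in days has to be converted. A small sketch of that arithmetic; ttl_from_days is an illustrative helper name, not a Velero command:

```shell
# ttl_from_days DAYS
# Hypothetical helper: prints a --ttl value in hours for a retention
# period given in whole days.
ttl_from_days() {
  local days="$1"
  case "$days" in
    # Reject empty input, leading zeros, and anything that is not a digit.
    ''|0*|*[!0-9]*)
      echo "ttl_from_days: expected a positive whole number of days" >&2
      return 1
      ;;
  esac
  echo "$((days * 24))h"
}
```

Usage: velero schedule create daily-backup --schedule="0 2 * * *" --ttl "$(ttl_from_days 30)"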

3. Restores

Restore from backup — full, namespace, or specific resources
# Restore everything from a backup:
velero restore create --from-backup my-backup

# Restore specific namespace(s):
velero restore create --from-backup my-backup   --include-namespaces production

# Restore to a different namespace:
velero restore create --from-backup my-backup   --namespace-mappings production:production-restored

# Restore only specific resource types:
velero restore create --from-backup my-backup   --include-resources deployments,services

# Exclude PVs from restore (restore config but not data):
velero restore create --from-backup my-backup   --restore-volumes=false

# Check restore status:
velero restore describe my-backup-TIMESTAMP
velero restore logs my-backup-TIMESTAMP
velero restore get                   # list all restores

# Restore from a scheduled backup (pick specific run):
velero backup get | grep daily-backup   # find the backup name
velero restore create --from-backup daily-backup-20260314020000

Test your restores. Backup completion means data was stored; it does not guarantee that a restore will succeed. Run a quarterly restore drill to a separate namespace to verify your backups are actually usable.
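
A restore drill can be scripted. The sketch below picks the newest backup of a schedule by name and restores it into a scratch namespace. It assumes scheduled backups are named SCHEDULE-TIMESTAMP with a 14-digit timestamp, as in daily-backup-20260314020000 above, and that Backup objects live in the velero namespace. Both function names are hypothetical, and the sketch does not check whether the chosen backup actually completed:

```shell
# latest_backup_for SCHEDULE
# Hypothetical helper: reads backup names on stdin (one per line) and
# prints the newest one belonging to SCHEDULE. The fixed-width timestamp
# suffix means a plain sort is also a chronological sort.
latest_backup_for() {
  local schedule="$1"
  grep -E "^${schedule}-[0-9]{14}\$" | sort | tail -n 1
}

# restore_drill SCHEDULE SOURCE_NS DRILL_NS
# Restores SOURCE_NS from the newest backup of SCHEDULE into DRILL_NS.
restore_drill() {
  local schedule="$1" source_ns="$2" drill_ns="$3" backup
  backup="$(kubectl get backups.velero.io -n velero \
      -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
    | latest_backup_for "$schedule")"
  if [ -z "$backup" ]; then
    echo "no backup found for schedule $schedule" >&2
    return 1
  fi
  velero restore create --from-backup "$backup" \
    --include-namespaces "$source_ns" \
    --namespace-mappings "$source_ns:$drill_ns"
}
```

Usage: restore_drill daily-backup production restore-drill, then inspect the restore-drill namespace and delete it when done.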

4. Disaster Recovery & Migration

Cross-cluster restore and cluster migration pattern
# Cross-cluster migration pattern:
# Source cluster:
velero backup create cluster-migration   --include-namespaces production   --snapshot-volumes=true            # include PV data

# Destination cluster (point Velero at the SAME S3 bucket as the source,
# with credentials that can read it):
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket my-velero-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./aws-credentials

# Destination will sync and see the backup:
velero backup get                    # cluster-migration should appear
velero restore create --from-backup cluster-migration

# Verify restored resources:
kubectl get all -n production
kubectl get pvc -n production        # check PVCs bound

# Partial migration — skip cluster-level resources:
velero restore create --from-backup cluster-migration   --exclude-resources nodes,events,namespaces,storageclasses,persistentvolumes

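# Object-storage-only migration (no EBS/disk snapshots; use for cross-region or cross-provider moves).
# Requires the node-agent (install with --use-node-agent):
velero backup create migration --snapshot-volumes=false --default-volumes-to-fs-backup
# --default-volumes-to-fs-backup copies PV contents with the restic/kopia file backup: slower, but works cross-region/provider.
# --snapshot-volumes=false on its own only skips snapshots; it does not back up PV data.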
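
After a migration restore, the "check PVCs bound" step can be automated. A minimal sketch, assuming the default kubectl table output where the second column is the PVC status; unbound_pvcs is a hypothetical helper:

```shell
# unbound_pvcs NAMESPACE
# Hypothetical helper: prints the name of every PVC in NAMESPACE whose
# STATUS column is not "Bound". Prints nothing when all PVCs are bound.
unbound_pvcs() {
  local ns="$1"
  kubectl get pvc -n "$ns" --no-headers | awk '$2 != "Bound" { print $1 }'
}
```

Usage: run unbound_pvcs production after the restore; any names it prints are claims that still need attention (for example a missing storage class on the destination cluster).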

5. Volume Backup with Restic/Kopia

Filesystem-level PV backup when snapshot isn’t available
# Node-agent (formerly restic) — filesystem backup when cloud snapshots unavailable
# e.g.: on-prem, cross-cloud migration, hostPath volumes

# Enable node-agent during install:
velero install ... --use-node-agent

# Opt-in per pod (list the pod's volume names, as in spec.volumes, not the PVC names):
kubectl annotate pod my-pod   backup.velero.io/backup-volumes=my-pvc-volume,another-volume
# OR opt-out specific volumes (when file backup is the default for all volumes):
kubectl annotate pod my-pod   backup.velero.io/backup-volumes-excludes=cache-volume

# Check node-agent pods (one per node):
kubectl get pods -n velero -l name=node-agent

# File-level (restic/kopia) backups are slower than native snapshots but:
# - Work on any storage backend (not just cloud EBS/GPD)
# - Enable cross-cloud migration (AWS → GCP)
# - Work for hostPath and local storage

# Monitor backup with PV contents:
velero backup describe my-backup --details   # shows PodVolumeBackup objects
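
Annotations added with kubectl annotate pod are lost when a controller recreates the pod. For pods owned by a Deployment, one option is to put the opt-in annotation on the pod template instead. A sketch under that assumption; both function names are hypothetical, and patching the template triggers a rolling restart of the Deployment:

```shell
# fs_backup_patch VOLUMES
# Hypothetical helper: prints a JSON merge patch that sets the
# backup.velero.io/backup-volumes annotation on a pod template.
# VOLUMES is a comma-separated list of pod volume names.
fs_backup_patch() {
  local volumes="$1"
  printf '{"spec":{"template":{"metadata":{"annotations":{"backup.velero.io/backup-volumes":"%s"}}}}}\n' "$volumes"
}

# annotate_deployment_for_fs_backup NAMESPACE DEPLOYMENT VOLUMES
# Applies the patch so every pod the Deployment creates carries the annotation.
annotate_deployment_for_fs_backup() {
  local ns="$1" deploy="$2" volumes="$3"
  kubectl patch deployment "$deploy" -n "$ns" --type merge \
    -p "$(fs_backup_patch "$volumes")"
}
```

Usage: annotate_deployment_for_fs_backup production my-app my-pvc-volume,another-volume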

6. Debugging & Operations

Common failures and operational checks
# Check overall Velero health:
velero backup-location get           # must show "Available"
velero snapshot-location get         # for PV snapshots

# If backup-location shows "Unavailable":
kubectl logs -n velero deploy/velero | grep -i error
# Common causes: IAM permissions, bucket doesn't exist, wrong region

# Backup stuck in "InProgress":
velero backup describe my-backup     # shows phase + errors
kubectl get backups.velero.io my-backup -n velero -o yaml  # raw status
kubectl logs -n velero deploy/velero --tail=100

# Restore missing resources:
velero restore describe my-restore --details
# Look for: Errors, Warnings, Phase
# "Resource already exists" warnings = OK, existing resources skipped

# Delete old backups manually:
velero backup delete my-backup       # also removes from object storage
velero backup delete --all           # DANGEROUS — removes everything

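# Upgrade Velero (read the release's upgrade notes first):
# 1. Update the CRDs:
velero install --crds-only --dry-run -o yaml | kubectl apply -f -
# 2. Update the server deployment (and the node-agent daemonset, if used):
kubectl set image deployment/velero velero=velero/velero:v1.13.0 -n velero
kubectl set image daemonset/node-agent node-agent=velero/velero:v1.13.0 -n velero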

# Metrics (Velero exposes Prometheus metrics):
kubectl port-forward -n velero deploy/velero 8085
curl http://localhost:8085/metrics | grep velero_backup
# Key metrics: velero_backup_total, velero_backup_success_total, velero_backup_failure_total
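
Velero's counters can appear once per label set, so a quick check usually needs a total across lines. A minimal sketch that sums one metric from Prometheus text output on stdin; metric_sum is a hypothetical helper, and it assumes the metric value is the last field on each sample line:

```shell
# metric_sum NAME
# Hypothetical helper: reads Prometheus text format on stdin and prints
# the sum of all samples of metric NAME across label sets (0 if none).
metric_sum() {
  local name="$1"
  awk -v name="$name" '
    # Match "name" or "name{labels}" in the first field.
    # Comment lines (# HELP, # TYPE) start with "#" and never match.
    $1 ~ ("^" name "([{]|$)") { sum += $NF }
    END { print sum + 0 }
  '
}
```

Usage, with the port-forward above running: curl -s http://localhost:8085/metrics | metric_sum velero_backup_failure_total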
