
Kubernetes Storage Reference


PersistentVolumes, PersistentVolumeClaims, StorageClasses, StatefulSet volumes, CSI drivers, and the reclaim policy trap that deletes your data.

PersistentVolume and PersistentVolumeClaim — the basics
# PersistentVolume (PV) — the actual storage resource (cluster-level)
# Usually created by a StorageClass (dynamic provisioning) — not manually
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce       # see access modes table below
  persistentVolumeReclaimPolicy: Retain  # or Delete — see below
  storageClassName: gp3   # matches PVC's storageClassName
  csi:                    # modern CSI driver format
    driver: ebs.csi.aws.com
    volumeHandle: vol-0abc123def456789
    fsType: ext4

---
# PersistentVolumeClaim (PVC) — a request for storage (namespace-level)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: gp3      # must match a StorageClass
  # volumeName: my-pv       # uncomment to bind to a specific PV

---
# Mount the PVC in a pod
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-data   # must be in same namespace
  containers:
    - name: postgres
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
Access Mode       | Short | Meaning                        | Typical use
ReadWriteOnce     | RWO   | One node at a time, read+write | Databases, single-replica apps
ReadOnlyMany      | ROX   | Many nodes, read only          | Shared config, static assets
ReadWriteMany     | RWX   | Many nodes, read+write         | Shared data, NFS, EFS
ReadWriteOncePod  | RWOP  | One pod only (K8s 1.22+)       | Strict single-writer guarantee

RWO is per-node, not per-pod — multiple pods on the same node can use an RWO volume. Use RWOP (ReadWriteOncePod) if you need true single-pod access.
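
For the strict single-writer case, the claim looks the same except for the access mode. A sketch (the PVC name is illustrative; RWOP requires a CSI driver and went GA in K8s 1.29):

```yaml
# PVC with ReadWriteOncePod — only one pod in the whole cluster can mount it
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: leader-data          # illustrative name
spec:
  accessModes:
    - ReadWriteOncePod       # CSI drivers only; GA in K8s 1.29
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp3
```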

StorageClass — dynamic provisioning
# StorageClass — defines how PVs are dynamically created
# PVCs with storageClassName matching a StorageClass get auto-provisioned

# AWS EBS gp3 (recommended over gp2)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # default for PVCs
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer   # provision in same AZ as pod
reclaimPolicy: Delete                     # or Retain
allowVolumeExpansion: true
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
  kmsKeyId: arn:aws:kms:us-east-1:123456789:key/my-key

---
# GKE persistent disk
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-rwo
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  type: pd-ssd       # or pd-standard, pd-balanced, pd-extreme

---
# Azure managed disk
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium
provisioner: disk.csi.azure.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  skuName: Premium_LRS   # or Standard_LRS, UltraSSD_LRS

---
# NFS / shared storage (ReadWriteMany)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef
  directoryPerms: "700"

WaitForFirstConsumer is critical on multi-AZ clusters. Without it, PVs are provisioned immediately in a random AZ. If your pod then schedules to a different AZ, it can’t mount the volume. Always use WaitForFirstConsumer for block storage on cloud.
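
That check can be automated. A minimal sketch, assuming StorageClass manifests have already been loaded into plain dicts (e.g. via PyYAML or the API) — the class names here are illustrative:

```python
# Flag StorageClasses that provision immediately (risky on multi-AZ clusters).
# Note: volumeBindingMode defaults to "Immediate" when the field is omitted.
def immediate_binding_classes(storage_classes):
    risky = []
    for sc in storage_classes:
        mode = sc.get("volumeBindingMode", "Immediate")  # API default
        if mode != "WaitForFirstConsumer":
            risky.append(sc["metadata"]["name"])
    return risky

# Example manifests (illustrative)
classes = [
    {"metadata": {"name": "gp3"}, "volumeBindingMode": "WaitForFirstConsumer"},
    {"metadata": {"name": "old-default"}},  # no mode set → Immediate
]
print(immediate_binding_classes(classes))  # → ['old-default']
```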

Reclaim policy — how to not accidentally delete your data
# reclaimPolicy controls what happens to a PV when its PVC is deleted

# Delete (default for dynamically provisioned)
# → PV is deleted AND the underlying storage is DELETED
# → Data is GONE. Use for ephemeral/development workloads.
reclaimPolicy: Delete

# Retain
# → PV is kept (status: Released) but NOT reusable by a new PVC automatically
# → Underlying storage (EBS volume, GCE persistent disk) is KEPT
# → Manual cleanup required
reclaimPolicy: Retain

# How to reuse a Retained PV:
# 1. kubectl delete pv my-pv        (deletes the K8s object, NOT the storage)
# 2. Create a new PV pointing to the same underlying storage
# 3. Create a PVC with volumeName: my-new-pv
# Or, without recreating the PV, clear its claimRef so it becomes Available:
# kubectl patch pv my-pv -p '{"spec":{"claimRef":null}}'

# Check PV status
kubectl get pv
# Bound     — in use by a PVC
# Available — not yet bound (freshly created)
# Released  — PVC was deleted, Retain policy, waiting for admin
# Failed    — dynamic provisioning failed

# Change reclaim policy on an existing PV
kubectl patch pv my-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# Production recommendation:
# Databases: Retain (never auto-delete database volumes)
# Ephemeral workloads: Delete (automatic cleanup)
# Set per StorageClass or override per PV after creation

The most common data-loss incident in Kubernetes: someone deletes a StatefulSet’s PVC (or kubectl delete namespace) not knowing the StorageClass has reclaimPolicy: Delete. Always check reclaimPolicy before deleting PVCs in production. Use Retain for all databases.
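
That pre-delete check can be scripted. A sketch, assuming PV objects have been fetched as dicts (e.g. from `kubectl get pv -o json`) — the field paths match the PV API, the sample data is illustrative:

```python
# List Bound PVs whose reclaim policy would destroy data if their PVC is deleted.
def pvs_at_risk(pvs):
    at_risk = []
    for pv in pvs:
        policy = pv["spec"].get("persistentVolumeReclaimPolicy", "Retain")
        phase = pv.get("status", {}).get("phase")
        if policy == "Delete" and phase == "Bound":
            claim = pv["spec"].get("claimRef", {})
            at_risk.append((claim.get("namespace"), claim.get("name")))
    return at_risk

# Example PV objects (illustrative)
pvs = [
    {"spec": {"persistentVolumeReclaimPolicy": "Delete",
              "claimRef": {"namespace": "production", "name": "postgres-data"}},
     "status": {"phase": "Bound"}},
    {"spec": {"persistentVolumeReclaimPolicy": "Retain",
              "claimRef": {"namespace": "production", "name": "archive"}},
     "status": {"phase": "Bound"}},
]
print(pvs_at_risk(pvs))  # → [('production', 'postgres-data')]
```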

StatefulSet volumes — stable per-pod storage
# StatefulSet — each pod gets its own PVC automatically
# Pod names: postgres-0, postgres-1, postgres-2
# PVC names: data-postgres-0, data-postgres-1, data-postgres-2
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres-headless    # headless service for DNS (required)
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
            - name: config
              mountPath: /etc/postgresql/conf.d
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
  volumeClaimTemplates:             # creates one PVC per pod
    - metadata:
        name: data
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: gp3
        resources:
          requests:
            storage: 50Gi

# Lifecycle notes:
# kubectl delete statefulset postgres               → pods deleted, PVCs KEPT
# kubectl delete statefulset postgres --cascade=orphan → keep pods too
# kubectl delete pvc data-postgres-0               → THIS deletes the data (if Delete policy)
# Scaling down (3→2) leaves PVC for pod-2 in place
# Scaling up again (2→3) reattaches the same PVC to the new pod-2
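
The per-pod PVC naming rule (`<template-name>-<statefulset-name>-<ordinal>`) can be sketched, which is handy when planning scale-ups or pre-creating PVCs:

```python
# PVCs created from a StatefulSet's volumeClaimTemplates follow
# <template-name>-<statefulset-name>-<ordinal>, ordinals starting at 0.
def statefulset_pvc_names(template_name, sts_name, replicas):
    return [f"{template_name}-{sts_name}-{i}" for i in range(replicas)]

print(statefulset_pvc_names("data", "postgres", 3))
# → ['data-postgres-0', 'data-postgres-1', 'data-postgres-2']
```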

Volume expansion — resizing PVCs
# Prerequisites:
# 1. StorageClass must have allowVolumeExpansion: true
# 2. CSI driver must support volume expansion
# 3. You can only expand — never shrink

# Check if expansion is allowed
kubectl get storageclass gp3 -o yaml | grep allowVolumeExpansion

# Expand a PVC — just edit the spec.resources.requests.storage
kubectl patch pvc postgres-data -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
# or kubectl edit pvc postgres-data

# Watch the expansion
kubectl get pvc postgres-data -w
# Condition FileSystemResizePending means: volume expanded but FS not yet expanded
# FS expansion happens when pod is restarted (for most drivers)

# Expansion without pod restart (CSI online expansion)
# Supported by: EBS CSI, GCE PD CSI, Azure Disk CSI (K8s 1.15+)
# If expansion is stuck in FileSystemResizePending state:
kubectl delete pod postgres-0   # triggers FS resize on pod restart

# Expand all PVCs of a StatefulSet
kubectl get pvc -l app=postgres -o name | xargs -I{} kubectl patch {} \
  -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'

# Check resize conditions
kubectl describe pvc postgres-data | grep -A5 Conditions
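
Since shrink requests are rejected by the API, it helps to validate a new size before patching. A sketch of a binary-suffix quantity parser (covers Ki/Mi/Gi/Ti only, not the full Kubernetes quantity grammar):

```python
# Parse a Kubernetes binary-suffix quantity ("20Gi") into bytes and
# confirm a requested resize is an expansion, not a shrink.
SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def to_bytes(qty):
    for suffix, factor in SUFFIXES.items():
        if qty.endswith(suffix):
            return int(qty[:-len(suffix)]) * factor
    return int(qty)  # plain bytes, no suffix

def is_valid_resize(current, requested):
    return to_bytes(requested) > to_bytes(current)

print(is_valid_resize("50Gi", "100Gi"))  # → True
print(is_valid_resize("50Gi", "20Gi"))   # → False (shrink — would be rejected)
```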

Volume types — emptyDir, configMap, secret, projected
# emptyDir — ephemeral scratch space (wiped when pod is deleted)
volumes:
  - name: cache
    emptyDir:
      sizeLimit: 1Gi        # optional limit
      medium: Memory         # use RAM instead of disk (tmpfs)

# Use cases for emptyDir:
# - scratch space for processing
# - sharing files between containers in a pod (sidecars)
# - tmpfs for secrets that shouldn't touch disk

# configMap — mount config files as a volume
volumes:
  - name: nginx-config
    configMap:
      name: nginx-conf
      items:                        # optional: mount specific keys
        - key: nginx.conf
          path: nginx.conf
          mode: 0444                # optional: file permissions

# secret — mount secrets as files (prefer over env vars — not in process env)
volumes:
  - name: tls-certs
    secret:
      secretName: my-tls-certs
      defaultMode: 0400             # read-only, owner only
      optional: false               # fail if secret doesn't exist

# projected — merge multiple sources into one mount
volumes:
  - name: projected-volume
    projected:
      sources:
        - configMap:
            name: app-config
        - secret:
            name: db-creds
        - serviceAccountToken:     # custom SA token (audience, expiry)
            audience: vault
            expirationSeconds: 3600
            path: vault-token

# hostPath — mount node filesystem (avoid — breaks pod portability)
volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock   # only for DinD, logging agents, etc.
      type: Socket                  # File, Directory, Socket, CharDevice
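
One of the emptyDir use cases above — sharing files between containers in a pod — looks like this (a sketch; image names are illustrative):

```yaml
# Sidecar pattern: app writes logs, shipper reads them via a shared emptyDir
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}
  containers:
    - name: app
      image: my-app:1.0            # illustrative image
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app
    - name: log-shipper
      image: fluent-bit:latest     # illustrative image
      volumeMounts:
        - name: shared-logs
          mountPath: /logs
          readOnly: true           # shipper only reads
```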

Prefer secret volumes over environment variables for sensitive data. Env vars are exposed via kubectl describe pod, process listings, and often end up in crash dumps. Files are more controlled.

kubectl storage commands
# List all PVCs across namespaces
kubectl get pvc -A
kubectl get pvc -n production
kubectl get pvc -l app=postgres    # filter by label

# Get PVC status
kubectl describe pvc postgres-data
# Look for: Conditions (FileSystemResizePending), Events (provisioning errors)

# List PVs
kubectl get pv
kubectl get pv -o wide             # shows claim, access modes, reclaim policy
kubectl describe pv pvc-abc123     # PV created by dynamic provisioning

# List StorageClasses
kubectl get storageclass
kubectl describe storageclass gp3

# Find which pod is using a PVC
kubectl get pods -A -o json | python3 -c "
import json,sys
d=json.load(sys.stdin)
for p in d['items']:
  for v in p['spec'].get('volumes', []):
    if v.get('persistentVolumeClaim', {}).get('claimName') == 'postgres-data':
      ns = p['metadata']['namespace']
      name = p['metadata']['name']
      print(f'{ns}/{name}')
"

# Check PVC usage inside pod
kubectl exec -it postgres-0 -- df -h /var/lib/postgresql/data

# Force delete a stuck PVC (finalizers)
kubectl patch pvc stuck-pvc -p '{"metadata":{"finalizers":null}}'

# Backup data from a PVC using a temporary pod
# (tar czf - already gzips; a TTY (-t) would corrupt the binary stream)
kubectl run backup --image=alpine --restart=Never --rm -i \
  --overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"postgres-data"}}],"containers":[{"name":"backup","image":"alpine","command":["tar","czf","-","/data"],"volumeMounts":[{"name":"data","mountPath":"/data"}]}]}}' \
  > backup.tar.gz

Track Kubernetes EOL dates, version history, and upgrade paths at ReleaseRun Kubernetes Releases — free, live data.

🔍 Free tool: K8s YAML Security Linter — check your storage-related K8s manifests — PVCs, StorageClasses — for 12 security misconfigurations.
