YAML Reference

Syntax, types, multi-line strings, anchors, and the gotchas that break your Kubernetes manifests, GitHub Actions, and Ansible playbooks.

Basic syntax — scalars, mappings, sequences
# YAML document starts with --- (optional but good practice)
# Comments use #

---
# Mapping (key: value pairs — equivalent to a dict/object)
name: Alice
age: 30
active: true
score: 9.5
nothing: null      # or ~

# Sequence (list — equivalent to an array)
fruits:
  - apple
  - banana
  - cherry

# Nested mapping
address:
  street: 123 Main St
  city: London
  country: UK

# Mapping with sequence values
servers:
  - name: web1
    ip: 10.0.1.1
    tags:
      - web
      - prod
  - name: db1
    ip: 10.0.1.2

# Inline (flow) style — compact, single line
colors: [red, green, blue]
person: {name: Bob, age: 25}

# Quoted strings (when you need to preserve special chars)
message: "Hello: World"     # colon in value needs quotes
path: "C:\\Users\\alice"    # backslash
version: "1.0"              # prevent "1.0" from becoming float 1.0
boolean_str: "true"         # prevent true from becoming boolean
Scalar types — strings, numbers, booleans, null
# Strings — unquoted (most of the time)
name: Alice Smith
url: https://example.com    # colons in URLs are fine without quotes

# Strings that need quoting
tricky: "yes"       # without quotes: boolean true in YAML 1.1 parsers
port: "8080"        # without quotes: parsed as integer 8080
colon: "key: val"   # contains colon+space
hash: "a #b"        # space+hash would start a comment (a#b alone is safe)
newline: "line1\nline2"

# Numbers
integer: 42
negative: -7
float: 3.14
scientific: 1.5e3    # 1500.0 in YAML 1.2; PyYAML (1.1) needs 1.5e+3, else it loads a string
hex: 0xFF            # 255
octal: 0o77          # 63 (YAML 1.2) — be careful: 077 parsed as octal in YAML 1.1

# Booleans — YAML 1.1 is broad, YAML 1.2 is strict
# YAML 1.2 booleans (JSON-compatible): true / false / True / False
# YAML 1.1 also treated as bool: yes/no/on/off (and capitalised variants)
# Safe practice: always quote yes/no/on/off in config files

enabled: true
debug: false
# Risky (YAML 1.1 parsers): 
# Sweden: SE        # fine
# Norway: no        # parsed as false in YAML 1.1 — the "Norway problem"

# Null
nothing: null
also_null: ~
empty_key:

The Norway problem: Unquoted NO (Norway's country code), like yes, on, off and their capitalised variants, is parsed as a boolean by YAML 1.1 parsers such as Python's PyYAML and Ruby's Psych. Always quote such values: "NO", "yes".
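A rough pre-flight check can flag values that a YAML 1.1 parser would turn into booleans. The sketch below is illustrative only: `YAML11_BOOL_WORDS` and `needs_quotes` are hypothetical names, and the word list assumes the spellings PyYAML resolves as booleans (other parsers may also accept single-letter y/n).

```python
# Hypothetical helper: the word list assumes the boolean spellings that
# PyYAML resolves under YAML 1.1; adjust it for your parser.
YAML11_BOOL_WORDS = {
    "yes", "Yes", "YES", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}


def needs_quotes(value: str) -> bool:
    """Return True if an unquoted scalar would load as a boolean."""
    return value in YAML11_BOOL_WORDS


country_codes = ["SE", "NO", "DK", "FI"]
risky = [code for code in country_codes if needs_quotes(code)]
print(risky)  # ['NO']
```

Running a check like this over generated values (country codes, feature flags, answers to prompts) before templating them into YAML is cheaper than debugging a silent `false` later.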

Multi-line strings — literal and folded blocks
# Literal block scalar: | — preserves newlines exactly
script: |
  #!/bin/bash
  set -euo pipefail
  echo "Starting deploy..."
  ./deploy.sh --env prod

# Each line becomes a real newline in the parsed value.
# Trailing newline is preserved. Leading indentation is stripped.

# Folded block scalar: > — folds newlines into spaces
description: >
  This is a long description that
  wraps across multiple lines but
  will be joined into one paragraph.

# Result: "This is a long description that wraps across multiple lines but will be joined into one paragraph.\n"

# Block chomping (what happens to trailing newlines):
# |  / >   clip (default): keep exactly one final newline
# |- / >-  strip: remove the final newline
# |+ / >+  keep: preserve all trailing newlines

script_no_newline: |-
  echo "no trailing newline"

# Multi-line strings inline (not recommended — hard to read)
message: "line one\nline two\nline three"

# Indented literal block (works fine, just needs consistent indentation)
config: |
  [server]
  host = 0.0.0.0
  port = 8080
  
  [database]
  url = postgres://localhost/mydb

Rule: Use | for shell scripts, config files, SQL, code — anything where newlines matter. Use > for long prose descriptions where line wrapping is cosmetic.
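The three chomping modes are easiest to compare side by side. The following is a minimal Python emulation, not a YAML parser: `apply_chomping` is a hypothetical helper that assumes it receives the block's text with its trailing newlines intact and only mimics the clip, strip, and keep behaviour.

```python
def apply_chomping(raw: str, indicator: str = "") -> str:
    """Emulate YAML block chomping on text that ends in newlines.

    indicator: "" (clip), "-" (strip) or "+" (keep).
    """
    body = raw.rstrip("\n")
    trailing = raw[len(body):]
    if indicator == "-":
        return body                      # strip: no final newline
    if indicator == "+":
        return body + trailing           # keep: all trailing newlines
    if indicator == "":
        # clip: exactly one final newline, if there is any content
        return body + ("\n" if trailing and body else "")
    raise ValueError(f"unknown chomping indicator: {indicator!r}")


raw = 'echo "done"\n\n\n'
print(repr(apply_chomping(raw)))       # 'echo "done"\n'
print(repr(apply_chomping(raw, "-")))  # 'echo "done"'
print(repr(apply_chomping(raw, "+")))  # 'echo "done"\n\n\n'
```

The strip form matters most when the value is compared or concatenated, for example a token or a one-line command where a stray newline changes the result.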

Anchors and aliases — reuse without repeating
# Define an anchor with &name
# Reference it with *name
# Merge a mapping with <<: *name

# Basic anchor
defaults: &defaults
  timeout: 30
  retries: 3
  log_level: info

production:
  <<: *defaults           # merge all keys from defaults
  log_level: warn         # override one key
  endpoint: https://prod.example.com

staging:
  <<: *defaults
  endpoint: https://staging.example.com

# Sequence anchor
common_tags: &tags
  - team:engineering
  - managed-by:ansible

service_a:
  tags: *tags

service_b:
  tags:
    - *tags               # nests the whole list as ONE item; sequences cannot be merged
    - extra:tag

# Practical: Docker Compose
x-common-env: &common-env
  NODE_ENV: production
  LOG_LEVEL: info
  DB_HOST: postgres

services:
  web:
    environment:
      <<: *common-env
      PORT: "3000"
  worker:
    environment:
      <<: *common-env
      CONCURRENCY: "4"

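Note: Anchors and aliases are scoped to a single YAML document. They do not carry across --- document boundaries or across files. The <<: merge key is a YAML 1.1 extension that most parsers still support; it merges mappings only and cannot concatenate sequences. Keys written explicitly in a mapping override merged ones.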
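In Python terms, the merge key behaves like a shallow dictionary merge in which explicit keys win. The sketch below mirrors the defaults/production/staging example with plain dicts; `merge_key` is a hypothetical helper that emulates the semantics and is not how any particular parser implements them.

```python
defaults = {"timeout": 30, "retries": 3, "log_level": "info"}


def merge_key(base: dict, explicit: dict) -> dict:
    """Emulate `<<: *base`: shallow merge, explicit keys override."""
    return {**base, **explicit}


production = merge_key(
    defaults,
    {"log_level": "warn", "endpoint": "https://prod.example.com"},
)
staging = merge_key(defaults, {"endpoint": "https://staging.example.com"})
print(production["log_level"], staging["log_level"])  # warn info

# Aliasing a sequence inside another sequence nests it;
# nothing is concatenated.
tags = ["team:engineering", "managed-by:ansible"]
service_b_tags = [tags, "extra:tag"]
print(len(service_b_tags))  # 2
```

Because the merge is shallow, a nested mapping under an overridden key is replaced wholesale rather than combined.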

Kubernetes YAML patterns
# Multiple documents in one file (separated by ---)
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  PORT: "8080"          # quoted — otherwise parsed as integer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:1.4.2
          env:
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: LOG_LEVEL
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          ports:
            - containerPort: 8080

# Common gotchas in K8s YAML:
# - String fields that look like numbers must be quoted (env values, ConfigMap data,
#   annotations): value: "8080". Integer fields such as containerPort stay unquoted.
# - Boolean strings must be quoted: value: "true"
# - Indentation must be spaces, never tabs
# - A Deployment's spec.selector is immutable; to change it, delete and recreate the Deployment
GitHub Actions YAML patterns
on:
  push:
    branches: [main, "release/*"]    # quote globs; required when a pattern starts with * or !
  pull_request:
    branches: [main]
  schedule:
    - cron: "0 9 * * 1"             # always quote cron expressions

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      NODE_ENV: production
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: "20"         # quoted so versions stay strings (an unquoted 3.10 would load as 3.1)

      - name: Install and test
        run: |                       # literal block — preserves newlines
          npm ci
          npm test

      - name: Build
        run: npm run build
        if: github.ref == 'refs/heads/main'

      - name: Set output
        id: version
        run: echo "tag=$(git describe --tags)" >> "$GITHUB_OUTPUT"

      - name: Use output
        run: echo "Version is ${{ steps.version.outputs.tag }}"

# Matrix strategy (this block nests under jobs.<job_id>)
strategy:
  matrix:
    node: [18, 20, 22]
    os: [ubuntu-latest, windows-latest]
  fail-fast: false

# Gotcha: YAML special chars around expressions
# Quote the whole value when the line contains ": " or " #"
run: 'echo "Ref: ${{ github.ref }}"'    # unquoted, the "Ref: " would be a parse error
Common gotchas and parser differences
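| Value | YAML 1.1 (PyYAML default) | YAML 1.2 (core schema) | Safe fix |
| --- | --- | --- | --- |
| yes / no | true / false | string | "yes" |
| on / off | true / false | string | "on" |
| 077 | 63 (octal) | 77 (decimal integer) | "077" |
| 1.0 | float 1.0 | float 1.0 | "1.0" if string needed |
| 1e3 | string "1e3" in PyYAML (a float needs 1.0e+3) | float 1000.0 | "1e3" if string needed |
| null | None | None | fine as-is |
| ~ | None | None (string "~" in JSON-schema-only parsers) | use null |
| 2024-01-15 | date object | string | "2024-01-15" |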
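
The 1.0 row is why version strings get mangled: once a value is resolved as a float, its original text is gone. The sketch below uses Python's `float()` as a stand-in for a YAML float resolver (an assumption; real parsers apply their own pattern first), and `survives_float` is a hypothetical helper name.

```python
def survives_float(text: str) -> bool:
    """True if text reads back unchanged after a round trip through float."""
    try:
        return str(float(text)) == text
    except ValueError:
        return True  # not a number, so it stays a string


for version in ["1.0", "1.10", "3.10", "1.4.2"]:
    print(version, survives_float(version))
# 1.0 True
# 1.10 False
# 3.10 False
# 1.4.2 True
```

A True result only means the text is preserved; the value may still load as a float rather than a string, so quote anything that is meant to be a version label.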

Tab characters are not valid YAML indentation. The spec forbids them, so a conforming parser rejects a tab-indented line (tabs inside quoted values are fine). Configure your editor to insert spaces in YAML files.
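
Tab indentation can be caught before a parser ever sees the file. This is a minimal sketch using only the standard library; `find_tab_indents` is a hypothetical helper that reports lines whose leading whitespace contains a tab and ignores tabs elsewhere on the line.

```python
def find_tab_indents(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything after the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


sample = "items:\n  - one\n\t- two\nnote: \"a\tb\"\n"
print(find_tab_indents(sample))  # [3]
```

Line 4 of the sample contains a tab inside a quoted value, which is legal, so only the tab-indented line 3 is reported.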

# Other common mistakes

# BAD: missing space after colon
name:Alice        # read as the plain string "name:Alice", or a parse error inside a larger mapping

# GOOD
name: Alice

# BAD: inconsistent indentation
items:
  - one
   - two           # different indent = parse error

# BAD: special char in unquoted string
message: Hello: World    # colon+space splits into key:value
message: "Hello: World"  # GOOD

# BAD: version number becoming float
version: 1.10    # parsed as 1.1 (float)
version: "1.10"  # GOOD — preserves string "1.10"

# BAD: empty value vs null
key:             # null (None)
key: ""          # empty string — different thing!

# GOOD: validate YAML before applying (safe_load_all also handles multi-document files)
python3 -c "import sys,yaml; list(yaml.safe_load_all(sys.stdin)); print('OK')" < manifest.yml
yamllint manifest.yml
kubectl apply --dry-run=client -f manifest.yml
YAML in Python — PyYAML and ruamel.yaml
import yaml

# Load YAML (safe_load prevents arbitrary object creation)
with open("config.yml") as f:
    config = yaml.safe_load(f)     # always use safe_load, not load()

# Load multiple documents
with open("k8s.yml") as f:
    docs = list(yaml.safe_load_all(f))

# Parse from string
data = yaml.safe_load("""
name: Alice
tags:
  - admin
  - user
""")

# Dump Python object to YAML
output = yaml.dump(data, default_flow_style=False, sort_keys=False)

# Dump to file
with open("output.yml", "w") as f:
    yaml.dump(data, f, default_flow_style=False)

# ruamel.yaml — preserves comments and formatting
from ruamel.yaml import YAML
yaml_parser = YAML()
yaml_parser.preserve_quotes = True

with open("config.yml") as f:
    config = yaml_parser.load(f)

config["new_key"] = "value"

with open("config.yml", "w") as f:
    yaml_parser.dump(config, f)     # preserves existing comments

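PyYAML gotcha: yaml.load() with an unsafe loader (yaml.Loader or yaml.UnsafeLoader) can construct arbitrary Python objects and so run arbitrary code. Calling it without a Loader has been deprecated since PyYAML 5.1 and raises a TypeError in PyYAML 6.0 and later. Use yaml.safe_load() for any input you do not fully control.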

Validation and tooling
# yamllint — lint YAML files (install: pip install yamllint)
yamllint config.yml
yamllint -d relaxed site.yml          # less strict rules
yamllint -d "{extends: default, rules: {line-length: {max: 120}}}" *.yml

# .yamllint config
---
extends: default
rules:
  line-length:
    max: 120
  truthy:
    allowed-values: ['true', 'false']     # ban yes/no/on/off (values must be quoted strings)
  comments:
    min-spaces-from-content: 1

# Python quick validation
python3 -c "import yaml,sys; yaml.safe_load(sys.stdin)" < config.yml

# yq — jq for YAML (install: brew install yq / snap install yq)
yq '.metadata.name' deployment.yml
yq '.spec.replicas = 5' deployment.yml      # prints the result; add -i to update the file in place
yq -o=json deployment.yml                   # convert to JSON
yq eval-all 'select(.kind == "Deployment")' *.yml   # filter docs

# kubectl dry run (best K8s YAML validation)
kubectl apply --dry-run=client  -f manifest.yml
kubectl apply --dry-run=server  -f manifest.yml   # server-side: validates against live API

# Diff two YAML files (semantic diff)
diff <(yq -o=json a.yml) <(yq -o=json b.yml)