Container Image Scanning in 2026: Clair vs Trivy vs Grype
Test first. If you run production traffic, pick one scanner this week and wire it into CI with a rollback plan.
should you care? my verdict
Yes. This matters.
In my experience, teams skip scanning until the first “why did we ship that OpenSSL CVE” incident. Then they overcorrect and block every build, and on-call eats the blast radius when releases stall. Do not do that. Start cautious. Gate the worst stuff, measure noise, then tighten.
Verdict: default to Trivy for most teams, use Grype if you want an SBOM-first pipeline, use Clair if Quay already runs your registry and you can operate a service.
After you deploy: watch CI job duration, scan failure rate, and “new Critical/High findings per day.” If those spike, you will feel it on-call.
- Pick Trivy: you want one binary, fast CI gates, and optional Kubernetes continuous scanning.
- Pick Grype: you want clean SBOM artifacts (usually from Syft) and focused vuln output with a simple fail threshold.
- Pick Clair: you already run Quay and you want server-side scanning plus re-checks when new CVEs land.
| Feature | Clair | Trivy | Grype |
|---|---|---|---|
| Type | Service (daemon) | CLI + library | CLI |
| Scan Scope | Container images | Images, filesystems, repos, K8s, IaC | Container images, filesystems |
| Vuln Sources | OS advisories | OS + language + NVD | OS + language (Grype DB) |
| Language Deps | Limited | npm, pip, gem, go, cargo, Maven, etc. | npm, pip, gem, go, cargo, Maven, etc. |
| SBOM Support | No | Generates CycloneDX + SPDX | Consumes Syft SBOMs |
| Speed (typical) | Seconds (pre-indexed) | 10-30s first scan, cached after | 5-15s (fast cold start) |
| CI Integration | API-based | Native (GitHub Actions, GitLab CI) | Native (GitHub Actions) |
| IaC Scanning | No | Yes (Terraform, CloudFormation, Helm) | No |
| K8s Scanning | No | Yes (cluster-wide) | No |
| Deployment | Server + PostgreSQL | Single binary | Single binary |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 |
| Best For | Registry-level scanning, Quay integration | All-in-one scanning, CI pipelines | Fast vuln-only scanning, Syft pairing |
should you care? what problem image scanning actually solves
Yes. Obvious, but people still miss it.
You deploy a blob that bundles app code, OS packages, and runtime libs. You did not write most of it. When a CVE hits a base layer, it hits every service that inherits it. I have watched a “harmless base image bump” turn into a weekend because the org had no inventory and no quick way to answer, “which images run this package version right now?”
Scanning does two things well. It tells you what you shipped. It maps known component versions to known CVEs. That is it. It will not spot a zero-day in your own code. It will not save you from bad runtime config.
On-call rule: treat scanners like smoke detectors. You still need sprinklers. That means admission policy and runtime alerts.
should you care? how scanners work (and where they lie)
Yes. Because false confidence hurts.
Most scanners unpack layers, read package databases, parse lockfiles, then match versions against vuln feeds. The thing nobody mentions is reachability. A scanner can flag an npm dependency that sits in a lockfile but never loads in your runtime path. Your mileage may vary depending on how your build system vendors deps and what the scanner can actually detect inside the image.
- Layer and package detection: Debian dpkg status, RPM DBs, Alpine APK DBs. If your image strips metadata aggressively, detection gets worse.
- Language deps: lockfiles help. Vendored deps, compiled artifacts, and custom packaging can confuse tools.
- Matching logic: different databases, different interpretation of distro backports, different results. Expect disagreements.
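If you want to see what a scanner actually reads, peek at the package databases yourself. A quick sketch, assuming a local Docker daemon and using python:3.12-slim as a stand-in image:
# Export the filesystem without running the container, then look for the package DBs scanners parse.
docker create --name pkg-peek python:3.12-slim
docker export pkg-peek | tar -tf - | grep -E 'var/lib/dpkg/status|lib/apk/db/installed|var/lib/rpm'
docker rm pkg-peek
If none of those paths show up (distroless, scratch, aggressive stripping), expect OS-level findings to be thin.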
Monitoring note: track “scanner disagreement rate” if you run two tools. When it jumps, a feed update or parsing change probably landed.
should you care? clair (service scanner)
Maybe. Depends on your registry setup.
Clair fits when you already run Quay and you want scanning to happen where images land. Clair runs as a service. It wants PostgreSQL. It wants care and feeding. That is fine if you already operate stateful services. It is annoying if you want a quick CI gate.
Evidence: Clair splits indexing from matching, and it can re-evaluate previously indexed images when vulnerability data updates. That matters for long-lived images. You do not need to re-run CI to learn you got a new CVE.
Caveats: operational overhead is real. Postgres backups, migrations, disk growth, API auth, and registry auth plumbing. If your team already struggles to keep Prometheus healthy, do not add Clair casually.
- Use Clair when: Quay runs your world, you want centralized scanning, and you want “tell me when old images become risky.”
- Skip Clair when: you want a CLI in dev laptops and CI with minimal moving parts.
After you deploy: alert on indexer/matcher queue lag and API error rates. A silent Clair is worse than no Clair.
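A quick way to notice a silent Clair, assuming the Docker-based setup from the quick-start at the bottom of this post (container names match that example):
docker ps --filter name=clair --format '{{.Names}}\t{{.Status}}'
docker logs --since 1h clair 2>&1 | grep -iE 'error|fail' | tail -n 20
That is a stopgap, not monitoring. Wire the real signals into whatever already pages you.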
should you care? trivy (wide scope scanner)
Yes. For most teams, this is the boring choice. Good.
Trivy runs as a CLI and can also run in-cluster via the Trivy Operator. It scans images, filesystems, IaC, and more. That breadth can help, or it can create noise. In my experience, you want to start narrow. Scan images. Gate only Critical at first. Then add misconfig and secret scanning when you can staff the findings.
Evidence: Trivy makes CI integration easy and supports SBOM output formats teams actually store. It also tends to have broad ecosystem coverage.
Caveats: database downloads, caching, and rate limits bite in locked-down networks. Also, “scan everything” defaults can spam your CI logs and your Slack.
- Good first gate: fail CI on Critical only (example after this list), then review noise for 2 weeks before failing on High.
- Tuning you will need: disable scanners you are not ready to action, and standardize ignore rules with expiry dates.
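A minimal first gate, assuming your CI tags the image as myapp:$GIT_SHA (tag and variable are illustrative):
# Print High and Critical for visibility, but only fail the job on fixable Criticals.
trivy image --exit-code 0 --severity HIGH,CRITICAL myapp:$GIT_SHA
trivy image --exit-code 1 --severity CRITICAL --ignore-unfixed myapp:$GIT_SHA
--ignore-unfixed keeps the gate from failing on CVEs that have no patched package yet. Drop it once your backlog is under control.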
After you deploy: check your build queue time, scan duration p95, and “findings per PR.” If PRs stall, developers will route around the process.
should you care? grype (focused vuln scanner)
Yes, if you like clean pipelines.
Grype stays focused on vulnerabilities. It pairs well with Syft, which generates SBOMs. That split sounds academic until you try to audit. Then it helps. You store the SBOM as an artifact. You rescan it later. You do not need to pull images again just to answer “did today’s feed change our risk?”
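The audit loop looks roughly like this (tool installs assumed, image name is a placeholder):
# Build time: generate the SBOM once and keep it as a build artifact.
syft registry.example.com/myapp:1.4.2 -o cyclonedx-json > myapp-1.4.2.sbom.json
# Any time later: re-check the stored SBOM against the current vuln DB. No image pull, no rebuild.
grype sbom:myapp-1.4.2.sbom.json --fail-on high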
Evidence: the --fail-on behavior makes CI gating predictable. It also keeps output tighter than “kitchen sink” scanners.
Caveats: you will need other tools for IaC, secrets, and cluster scanning. That can be fine. Tool sprawl can also be a tax.
- Good for: regulated-ish shops that need audit trails and artifact retention.
- Not enough alone: if you want one tool to scan everything from Terraform to live clusters.
After you deploy: page on “scanner failed to update DB” and “scan job timed out.” Those are quiet gaps until the day you need them.
should you care? ci gating that will not wreck your release flow
Yes. This is where teams shoot themselves.
Some folks fail every build on High findings from day one. I do not, but I get it. If you run a clean baseline and you can patch fast, go strict. Most teams start with a backlog of High issues they cannot clear in a sprint. Blanket gating just trains people to ignore the scanner or jam exceptions everywhere.
- Start cautious: fail on Critical only. Track the backlog trend weekly.
- Make exceptions painful: require an expiry date and an owner. Otherwise exceptions become permanent.
- Run a second scanner selectively: rescan nightly or on main branch only, so PR latency stays sane (sketch below).
If you cannot measure scan duration and failure rate, you should not gate deployments on it.
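A minimal nightly comparison job, assuming both scanners plus jq are installed and your main branch publishes a known tag (everything below is illustrative):
#!/usr/bin/env bash
set -euo pipefail
IMAGE="registry.example.com/myapp:main"
# Scan the same image with both tools; JSON goes to files so logs stay on stderr.
trivy image --quiet --format json "$IMAGE" > trivy.json
grype -o json "$IMAGE" > grype.json
# Crude disagreement signal: total finding counts. Alert when the gap jumps, then check the feeds.
TRIVY_COUNT=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy.json)
GRYPE_COUNT=$(jq '.matches | length' grype.json)
echo "trivy=$TRIVY_COUNT grype=$GRYPE_COUNT diff=$((TRIVY_COUNT - GRYPE_COUNT))"
Counting findings is blunt, but it still catches the "one tool's feed broke overnight" failure mode, which is the one that bites.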
should you care? sboms, signing, and admission policy
Yes. This closes the loop.
Scanning tells you what is inside the image. Signing tells you who produced it. Admission policy keeps random images out of prod. In most cases, you want all three, but you can stage it.
Evidence: storing SBOMs lets you rescan without rebuilding. Signing plus admission reduces “someone pushed straight to registry” problems.
Caveats: policy engines can block production for dumb reasons. Test twice. Roll out in audit mode first. Watch your admission webhook latency, because that can slow deployments.
- Week 1: generate SBOMs and store them as build artifacts.
- Week 2: sign images that pass the gate. Verify signatures in a staging cluster first.
- Week 3: turn on admission policy in audit mode. Then enforce for one namespace. Then expand.
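A rough sketch of weeks 1 and 2, assuming Syft and cosign are installed and you sign with a key pair (image names and key paths are placeholders; keyless signing also works):
# Week 1: SBOM as a retained build artifact.
syft registry.example.com/myapp:1.4.2 -o spdx-json > myapp-1.4.2.spdx.json
# Week 2: sign what passed the gate, then verify from the staging side before any enforcement.
# Prefer signing by digest in a real pipeline; newer cosign versions also want --yes in CI.
cosign sign --key cosign.key registry.example.com/myapp:1.4.2
cosign verify --key cosign.pub registry.example.com/myapp:1.4.2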
should you care? runtime monitoring (because scanners miss things)
Yes. Scanners do not see runtime behavior.
I have seen clean scans and ugly incidents. Reverse shells. Weird outbound traffic. A container spawning a shell at 3 a.m. Scanning never caught that. Runtime tools that watch syscalls and network activity can. Falco, Tetragon, KubeArmor. Pick one that your team can actually operate.
Evidence: runtime alerts catch abuse patterns that CVE matching cannot.
Caveats: runtime tools can generate alert floods. Start with a small rule set. Page only on high-confidence signals. Send the rest to a dashboard.
- After you deploy: watch process exec alerts, unexpected outbound connections, and privilege escalation signals.
- SRE reality: if every alert pages, everyone disables paging.
should you care? a cautious rollout plan
Yes. Do not “big bang” this.
Run the scanner in report-only mode for two weeks. Baseline your images. Decide your thresholds. Then flip the gate. If you already run production traffic, treat the scanner and its database updates like production dependencies. Because they are.
- Acceptance criteria: scan job p95 under your CI timeout, DB updates succeed daily, false positive rate stays tolerable.
- Rollback plan: if scan latency spikes or DB updates fail, drop back to warn-only and open an incident.
- Monitoring checklist: scan duration, scan failures, DB update failures, new Critical findings per day, exceptions created per week.
There’s probably a better way to test this, but I usually start by scanning three real production images and comparing outputs across tools.
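Something like this, with the image names swapped for yours. The point is to see both the findings and the wall-clock cost before you gate anything:
for img in registry.example.com/api:prod registry.example.com/worker:prod registry.example.com/web:prod; do
  echo "== $img"
  time trivy image --quiet --severity HIGH,CRITICAL "$img" > "trivy-${img##*/}.txt"
  time grype -o table "$img" > "grype-${img##*/}.txt"
done
# Skim the output files side by side: which CVEs only one tool reports, and how long each scan took.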
Quick-start commands for each scanner
Try all three on the same image and compare results. Takes about 5 minutes.
# === TRIVY ===
# Install (macOS):
brew install trivy
# Install (Linux):
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
# Scan an image:
trivy image python:3.12-slim
# Filter by severity:
trivy image --severity HIGH,CRITICAL python:3.12-slim
# Generate SBOM (CycloneDX):
trivy image --format cyclonedx --output sbom.json python:3.12-slim
# Scan a Kubernetes cluster:
trivy k8s --report summary cluster
# === GRYPE ===
# Install:
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin
# Scan an image:
grype python:3.12-slim
# Fail CI on Critical/High:
grype python:3.12-slim --fail-on high
# Scan from SBOM (pair with Syft):
syft python:3.12-slim -o cyclonedx-json > sbom.json
grype sbom:sbom.json
# === CLAIR ===
# Clair runs as a service — quick Docker setup.
# You still need a config.yaml whose connstring points at the clair-db Postgres instance.
docker network create clair-net
docker run -d --name clair-db --network clair-net \
  -e POSTGRES_PASSWORD=clair \
  postgres:15-alpine
docker run -d --name clair --network clair-net \
  -e CLAIR_MODE=combo \
  -e CLAIR_CONF=/config/config.yaml \
  -v "$(pwd)/config.yaml:/config/config.yaml:ro" \
  -p 6060:6060 \
  quay.io/projectquay/clair:latest
# Submit a manifest for scanning via clairctl:
clairctl report python:3.12-slim
# GitHub Actions CI workflow — scan on every push
# .github/workflows/scan.yml
name: Container Scan
on: push
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Trivy scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: 1  # Fail the build on findings
      - name: Grype scan (comparison)
        uses: anchore/scan-action@v4
        with:
          image: myapp:${{ github.sha }}
          fail-build: true
          severity-cutoff: high
Trivy docs: aquasecurity.github.io/trivy. Grype: github.com/anchore/grype. Clair: quay.github.io/clair. Scan your existing images and SBOMs with our SBOM Analyzer or check your container base image with the Dockerfile Linter.
Related Reading
- Container Escape Vulnerabilities: The CVEs That Shaped Docker and Kubernetes Security — The attacks that scanning is designed to catch
- Python 3.12 vs 3.13 vs 3.14 Comparison — Upgrading your Python base image? Know what changed between versions first
- Docker vs Kubernetes in Production (2026) — The deployment architecture that determines your scanning strategy
- IaC Security in 2026: Terraform, Checkov, and Cloud Drift Detection — Shift security left before images are even built
- Kubernetes Upgrade Checklist — The runbook for when your cluster version changes, not just your images