
Kubernetes Image Pre-Pulling: Cut Pod Startup from Minutes to Seconds

GeekOnCloud · 6 min read

Every second your pod spends pulling images is a second your application isn't serving traffic. On a fresh Kubernetes node, I've watched a simple Java application take 4+ minutes to start—not because the JVM was slow, but because it was downloading a 1.2GB image over a congested network link. In autoscaling scenarios, this latency compounds: your cluster scales up to handle load, but new nodes sit idle pulling images while your existing pods drown.

Image pre-pulling isn't glamorous, but it's one of the highest-leverage optimizations you can make for cluster responsiveness. Let's dig into the mechanics and build a production-ready solution.

Why Image Pulls Are Your Scaling Bottleneck

When Kubernetes schedules a pod onto a node, the kubelet checks whether the required image exists locally. If not, it pulls from the registry, and the container cannot start until the pull finishes. Mutable tags make this worse: when imagePullPolicy is omitted, it defaults to Always for :latest (or untagged) images, so the kubelet contacts the registry on every container start to resolve the tag's digest, and downloads any layers that changed, even when a copy of the image is already on the node.
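The defaulting rule is easy to get wrong, so here is a minimal sketch of it as a shell function. The name default_pull_policy is a hypothetical helper for illustration, not part of kubectl; it only mirrors the documented behaviour for a pod spec that leaves imagePullPolicy unset.

```shell
#!/bin/sh
# Sketch: which pull policy Kubernetes assigns when imagePullPolicy is omitted.
# default_pull_policy is a hypothetical helper, not a kubectl feature.
default_pull_policy() {
  ref="$1"
  case "$ref" in
    *@sha256:*) echo "IfNotPresent"; return ;;  # pinned by digest
  esac
  last="${ref##*/}"   # drop the registry host, which may itself contain a port
  case "$last" in
    *:latest) echo "Always" ;;        # explicit :latest tag
    *:*)      echo "IfNotPresent" ;;  # any other tag
    *)        echo "Always" ;;        # no tag, treated like :latest
  esac
}

default_pull_policy "your-registry.io/api-service:v2.4.1"  # prints IfNotPresent
default_pull_policy "nginx"                                # prints Always
```

Only IfNotPresent lets a pre-pulled image start without a round trip to the registry, which is why the tag matters as much as the pull itself.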

Here's what a typical pull timeline looks like for a moderately-sized application image:

  • DNS resolution to registry: 50-200ms
  • TLS handshake: 100-300ms
  • Authentication token exchange: 200-500ms
  • Layer download (500MB compressed): 40-90 seconds on a 100Mbps link (40 seconds is the line-rate floor)
  • Layer decompression and extraction: 10-30 seconds
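The download step dominates that timeline, and its lower bound is plain arithmetic: megabytes times eight, divided by the link speed in megabits per second. A small sketch follows; transfer_seconds is a hypothetical helper, and it assumes the pull gets the whole link with no protocol overhead, so real pulls will be slower.

```shell
#!/bin/sh
# transfer_seconds SIZE_MB LINK_MBPS
# Ideal download time at full line rate, rounded up to whole seconds.
transfer_seconds() {
  # 1 byte = 8 bits, so SIZE_MB * 8 gives megabits to move
  echo $(( ($1 * 8 + $2 - 1) / $2 ))
}

transfer_seconds 500 100    # 500 MB image on a 100 Mbps link: prints 40
transfer_seconds 1200 100   # the 1.2 GB Java image from the intro: prints 96
```

Plugging in your own image sizes and link speeds gives a quick sanity check on whether pull time can explain the startup latency you are seeing.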

A node joining your cluster might need to pull 5-10 images for system pods (CNI, CSI drivers, monitoring agents) before your workloads even begin pulling. I've measured cold-start times of 8+ minutes on nodes running a typical observability stack.

The math gets ugly during scale events. If your HPA triggers and Cluster Autoscaler spins up 10 nodes simultaneously, they all hit your registry concurrently. Registry rate limits kick in. Network bandwidth saturates. What should be a 2-minute scale event becomes a 15-minute outage.

DaemonSets: The Simple Pre-Pull Pattern

The most straightforward pre-pull strategy uses a DaemonSet that runs on every node and references your application images. The trick is making these containers do essentially nothing after pulling.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-prepuller
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: image-prepuller
  template:
    metadata:
      labels:
        app: image-prepuller
    spec:
      # High priority so the pre-puller is not preempted or evicted ahead of your workloads
      priorityClassName: system-node-critical
      initContainers:
      # Add one initContainer per image you want to pre-pull
      - name: prepull-api
        image: your-registry.io/api-service:v2.4.1
        command: ["/bin/true"]
        resources:
          requests:
            cpu: 1m
            memory: 1Mi
      - name: prepull-worker
        image: your-registry.io/worker:v1.8.0
        command: ["/bin/true"]
        resources:
          requests:
            cpu: 1m
            memory: 1Mi
      - name: prepull-nginx
        image: nginx:1.25-alpine
        command: ["/bin/true"]
        resources:
          requests:
            cpu: 1m
            memory: 1Mi
      containers:
      # Main container just sleeps forever
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: 1m
            memory: 1Mi
      tolerations:
      # Ensure this runs on all nodes, including tainted ones
      - operator: Exists

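This works because initContainers run sequentially: each must complete before the next starts, and the main container starts only after all of them succeed. Each init container pulls its image, executes /bin/true (which exits immediately with success), and moves on. This assumes every image ships /bin/true; distroless and scratch-based images do not, so for those, override the command with a binary that does exist in the image and exits cleanly. The pause container keeps the DaemonSet "running" with minimal resource consumption.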

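The key detail: pin your image tags explicitly. Using :latest defeats the purpose: the tag can move after the pre-pull, and under the resulting Always pull policy the kubelet goes back to the registry on every container start instead of trusting the cached copy.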

Kube-fledged: Purpose-Built Image Caching

If you need more sophisticated control, kube-fledged is a Kubernetes operator specifically designed for image pre-pulling and caching.

Install it via Helm:

helm repo add kubefledged https://senthilrch.github.io/kubefledged-charts/
helm install kube-fledged kubefledged/kube-fledged \
  --namespace kube-fledged \
  --create-namespace \
  --set controller.hostNetwork=true

# Verify the operator is running
kubectl get pods -n kube-fledged

Then define an ImageCache resource:

apiVersion: kubefledged.io/v1alpha2
kind: ImageCache
metadata:
  name: production-images
spec:
  cacheSpec:
  - images:
    - your-registry.io/api-service:v2.4.1
    - your-registry.io/worker:v1.8.0
    - your-registry.io/ml-model:v3.0.0
    # Target specific node groups via selector
    nodeSelector:
      node-pool: production
  - images:
    - your-registry.io/batch-processor:v1.2.0
    nodeSelector:
      node-pool: batch-workers
  imagePullSecrets:
  - name: registry-credentials

Kube-fledged gives you declarative control over which images land on which nodes, automatic refresh when you update the ImageCache spec, and status reporting on cache state. It's particularly useful when different node pools run different workloads.

Node Lifecycle Hooks: Pull Before Ready

For clusters running on cloud providers with managed node groups, you can hook into the node provisioning lifecycle to pre-pull images before the node reports Ready.

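On AWS EKS with managed node groups, use a launch template with a custom bootstrap script. The form below, where the user data calls bootstrap.sh itself, applies when the launch template specifies the AMI; without a custom AMI, EKS merges its own bootstrap into your user data: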

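#!/bin/bash
# No -x here: tracing would write the registry password to the boot log
set -euo pipefail

# ctr needs fully qualified references (docker.io/library/nginx, not nginx)
PRIVATE_IMAGES=(
  "your-registry.io/api-service:v2.4.1"
  "your-registry.io/worker:v1.8.0"
)
PUBLIC_IMAGES=(
  "docker.io/library/nginx:1.25-alpine"
)

# ctr talks to containerd directly, so make sure it is running
systemctl start containerd

# Fetch a token for the private registry (this form assumes it is ECR)
ECR_PASSWORD="$(aws ecr get-login-password --region us-west-2 || true)"

# A failed pull should not block the node; the kubelet will pull on demand
for img in "${PRIVATE_IMAGES[@]}"; do
  echo "Pre-pulling: $img"
  ctr -n k8s.io images pull --user "AWS:${ECR_PASSWORD}" "$img" || true
done
for img in "${PUBLIC_IMAGES[@]}"; do
  echo "Pre-pulling: $img"
  ctr -n k8s.io images pull "$img" || true
done

echo "Pre-pull complete"

# Standard EKS bootstrap goes LAST: the kubelet starts here, so the node
# registers and can turn Ready only after the pulls above have finished
/etc/eks/bootstrap.sh your-cluster-name \
  --kubelet-extra-args '--node-labels=node.kubernetes.io/lifecycle=normal'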

This approach has a significant advantage: the node doesn't join the schedulable pool until images are local. Your workloads get immediate starts with zero pull latency.

The downside is maintenance—you're baking image lists into infrastructure configuration. Every time you ship a new version, you need to update the launch template. Consider generating this script dynamically from your deployment manifests.
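One way to take the manual step out is to render the image array from a list of references at build time. The sketch below is a starting point under assumptions: render_images_array is a hypothetical helper, and the production namespace in the commented kubectl command is a placeholder for wherever your Deployments live.

```shell
#!/bin/sh
# render_images_array: hypothetical helper. Reads image references on stdin
# (one per line), de-duplicates them, and prints a bash array literal that
# can be templated into the launch template's user data.
render_images_array() {
  echo "IMAGES=("
  sort -u | sed -e '/^$/d' -e 's/.*/  "&"/'
  echo ")"
}

# Demo with a fixed list. In practice the input might come from something like:
#   kubectl get deployments -n production \
#     -o jsonpath='{range .items[*]}{range .spec.template.spec.containers[*]}{.image}{"\n"}{end}{end}'
printf '%s\n' \
  "your-registry.io/worker:v1.8.0" \
  "your-registry.io/api-service:v2.4.1" \
  "your-registry.io/worker:v1.8.0" | render_images_array
```

Running this in the pipeline that publishes the launch template keeps the node-level list from drifting behind what is actually deployed.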

Measuring the Impact

Before implementing pre-pulling, establish baselines. Here's a quick way to measure actual pull times on your cluster:

# Create a test pod; for a true cold pull it must land on a node
# that has never pulled this image
kubectl run pull-test --image=your-registry.io/api-service:v2.4.1 \
  --restart=Never

# Watch the event stream for pull timing
kubectl get events --field-selector involvedObject.name=pull-test -w

# Or parse the exact durations from events
kubectl get events --field-selector involvedObject.name=pull-test \
  -o jsonpath='{range .items[*]}{.reason}: {.message}{"\n"}{end}'
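To turn those events into numbers, you can pick the duration out of the kubelet's "Successfully pulled image" message. This is a sketch, not a stable interface: pull_duration is a hypothetical helper, the sample message is illustrative rather than captured from a real cluster, and the message wording can vary between Kubernetes versions.

```shell
#!/bin/sh
# pull_duration: hypothetical helper. Reads event messages on stdin and prints
# "<image> <duration>" for every "Successfully pulled image" line it finds.
pull_duration() {
  sed -nE 's/.*Successfully pulled image "([^"]+)" in ([^ ]+).*/\1 \2/p'
}

# Illustrative message in the kubelet's format (not real cluster output)
echo 'Pulled: Successfully pulled image "your-registry.io/api-service:v2.4.1" in 1m12.4s (1m12.4s including waiting)' \
  | pull_duration
```

Piping the jsonpath output from the previous command into pull_duration gives one line per pulled image, which is easy to record before and after you roll out pre-pulling.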

On one cluster I optimized, cold-start time dropped from 247 seconds to 31 seconds after implementing DaemonSet-based pre-pulling. The autoscaler finally responded to traffic spikes as they happened instead of lagging 5 minutes behind them.

Track these metrics in production:

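  • container_start_time_seconds, a cAdvisor metric exposed on the kubelet's /metrics/cadvisor endpoint (if you're scraping it); it is a Unix timestamp, so compare it with the pod's creation time to get startup latency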
  • Custom metrics from your application's readiness probes
  • Cluster Autoscaler's scale-up latency metrics

What to Pre-Pull (and What to Skip)

Don't pre-pull everything. Target images that:

  1. Are large (>200MB compressed) — these dominate pull time
  2. Run on every node — system DaemonSets, logging agents, service mesh sidecars
  3. Are latency-sensitive — your main application pods, anything in the critical path
  4. Are used during scale events — the workloads your HPA manages

Skip images that are small, rarely used, or change frequently. The diminishing returns aren't worth the maintenance burden.

One pattern I like: generate your pre-pull manifests from your production deployments. A simple script that extracts image references from your Helm values or Kustomize bases keeps the pre-pull list synchronized with reality.
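A minimal sketch of that extraction step, assuming you can render your manifests to plain YAML first. extract_images is a hypothetical helper; it is line-based, so it only handles the common single-line `image: ref` form (optionally double-quoted) and would miss images assembled by templating at runtime.

```shell
#!/bin/sh
# extract_images: hypothetical helper. Reads rendered YAML on stdin and prints
# each distinct value of an `image:` key, one per line.
extract_images() {
  sed -nE 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*"?([^"[:space:]#]+)"?.*$/\2/p' \
    | sort -u
}

# Demo on an inline fragment; imagePullPolicy lines are ignored
extract_images <<'EOF'
spec:
  containers:
  - name: api
    image: your-registry.io/api-service:v2.4.1
    imagePullPolicy: IfNotPresent
  - image: "nginx:1.25-alpine"
    name: web
EOF
```

In a pipeline this would sit behind whatever renders your manifests, for example `helm template ./chart | extract_images` or `kustomize build overlays/production | extract_images`, where both paths are placeholders.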

Your Next Move

Start with the DaemonSet approach—deploy it to staging, measure your pod start times, then roll to production. Once you've validated the impact, consider whether kube-fledged's extra features justify its operational overhead.

If you're running autoscaling workloads on AWS, the launch template hook gives you the cleanest separation between infrastructure readiness and workload scheduling. Just remember to automate the image list generation, or you'll be chasing version updates forever.

Written by GeekOnCloud

DevOps & Infrastructure engineer at geekoncloud.com
