Rolling deployment (RollingUpdate strategy) is Kubernetes’ default approach to updating applications without downtime. It incrementally replaces old Pods with new versions, carefully balancing availability, resource consumption, and rollout speed.

Operational Mechanism

Gradual Transition - The Deployment controller creates new Pods with the updated configuration while simultaneously terminating old Pods. Rather than a sudden cutover, the transition happens in waves, maintaining a mix of old and new versions during the rollout.

ReplicaSet Coordination - The Deployment creates a new ReplicaSet for the updated Pod template while scaling down the old ReplicaSet. Both ReplicaSets exist during the rollout, managing their respective Pod versions.

Service Integration - Services seamlessly handle the version mix. As new Pods become ready, they’re automatically added to the Service’s endpoint pool. As old Pods terminate, they’re gracefully removed. Clients connecting through the Service experience continuous availability.
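As a minimal sketch (the app: my-app label and port 8080 are illustrative, not from the original), the Service selects Pods by label only, so Pods from both the old and new ReplicaSet match it and receive traffic for as long as they are ready:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app          # matches Pods from both the old and the new ReplicaSet
  ports:
  - port: 80
    targetPort: 8080     # assumed container port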

Control Parameters

Rolling deployments balance three competing concerns: availability, resource efficiency, and rollout speed. Three parameters control this balance:

maxSurge - Maximum number of Pods that can exist above the desired replica count during updates. Can be an absolute number (e.g., 2) or percentage (e.g., 25%).

Setting maxSurge to 1 with 4 replicas means at most 5 Pods will exist during rollout. Higher values speed up deployments but require more resources. Setting to 0 requires maxUnavailable > 0.

maxUnavailable - Maximum number of Pods that can be unavailable during updates. Can be an absolute number or percentage.

Setting maxUnavailable to 1 with 4 replicas means at least 3 Pods must remain available during rollout. Lower values maintain more capacity but slow deployments. Setting to 0 requires maxSurge > 0.

minReadySeconds - How long a Pod must be ready (passing readiness probes) before being considered available and before the Deployment proceeds to the next Pod.

This crucial parameter prevents rapid rollout of broken versions. If a new Pod passes initial readiness checks but crashes 30 seconds later, a minReadySeconds value of 60 would catch this failure before proceeding.
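The example configurations below show only the update-strategy stanza. For orientation, here is a minimal full Deployment sketch (name, labels, and image are illustrative) with all three parameters in place; note that minReadySeconds sits directly under spec, next to strategy, rather than inside rollingUpdate:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4
  minReadySeconds: 60            # spec-level field: Pods must stay ready this long
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                # at most 5 Pods exist during the rollout
      maxUnavailable: 1          # at least 3 Pods remain available
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:1.1        # changing the image triggers a new rollout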

Example Configurations

Conservative (Maintain Capacity)

minReadySeconds: 30        # spec-level field, a sibling of strategy
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0

Ensures full capacity is maintained. At most one extra Pod during rollout. New Pods must be stable for 30 seconds before continuing.

Aggressive (Fast Rollout)

minReadySeconds: 10        # spec-level field, a sibling of strategy
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 50%
    maxUnavailable: 25%

Rapid updates accepting some capacity reduction. For 4 replicas: up to 6 Pods during rollout, minimum 3 available. Quick stability check.

Resource-Constrained

minReadySeconds: 60        # spec-level field, a sibling of strategy
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 1

No extra Pods created. Updates one at a time. Careful stability verification. Appropriate when cluster capacity is tight.

Graceful Termination

Rolling deployment effectiveness depends on proper Pod shutdown:

SIGTERM Handling - When Kubernetes decides to terminate a Pod, it sends SIGTERM to allow graceful shutdown. Applications must handle this signal: finish in-flight requests, close connections cleanly, flush buffers. This is a core requirement of the Managed Lifecycle pattern.

Grace Period - After SIGTERM, Kubernetes waits (default 30 seconds) before sending SIGKILL. Applications have this window to shut down gracefully. Configure terminationGracePeriodSeconds if your application needs more time.

Service Deregistration - When a Pod enters the Terminating state, Kubernetes removes it from Service endpoints so that new connections stop arriving. Endpoint removal and SIGTERM delivery happen asynchronously, however, so a Pod may still receive requests briefly after SIGTERM. In-flight requests need graceful completion, and preStop hooks can provide additional connection-draining time.
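A Pod template fragment combining these settings might look like the following (values are illustrative); the preStop hook runs before SIGTERM is delivered, and its duration counts against the grace period:

spec:
  terminationGracePeriodSeconds: 60       # total window before SIGKILL (default 30)
  containers:
  - name: my-app
    image: my-app:1.1
    lifecycle:
      preStop:
        exec:
          command: ["sh", "-c", "sleep 10"]   # pause so endpoint removal can propagate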

Applications that ignore SIGTERM risk dropped requests, corrupted data, and degraded user experience during deployments. This connects to declarative deployment prerequisites - containers must be lifecycle-aware.

Health Check Dependencies

Rolling deployments rely heavily on accurate health reporting:

Readiness Probes - Determine when new Pods can receive traffic. Until a Pod passes readiness checks, the Deployment doesn’t consider it available and doesn’t proceed with the rollout.

Incorrect readiness probes have severe consequences: too permissive (reporting ready before truly ready) sends traffic to broken Pods; too strict (never reporting ready) stalls rollouts indefinitely.

Liveness Probes - Detect Pods that should be restarted. During rollouts, if new Pods consistently fail liveness checks, this signals a problematic update. Kubernetes does not roll back on its own; reverting requires kubectl rollout undo or external automation that watches these signals.

minReadySeconds Integration - Even after passing readiness checks, Pods must remain ready for minReadySeconds before being considered stable. This catches delayed failures like memory leaks or gradual resource exhaustion.
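As a sketch (the endpoint paths and timings are assumptions about the application), the corresponding probes in the Pod template might look like this, with minReadySeconds adding a stability window on top of readiness:

containers:
- name: my-app
  image: my-app:1.1
  readinessProbe:
    httpGet:
      path: /healthz/ready       # assumed readiness endpoint
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 5
    failureThreshold: 3
  livenessProbe:
    httpGet:
      path: /healthz/live        # assumed liveness endpoint
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 10
    failureThreshold: 3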

Trade-offs and Considerations

Version Coexistence - Old and new versions run simultaneously during rollouts. Applications must tolerate this: database schemas must be backward-compatible, APIs must handle version differences, state must be sharable.

This requirement makes rolling deployment unsuitable for some scenarios. Applications with incompatible versions between releases need recreate deployment or more sophisticated blue-green strategies.

Resource Overhead - Surge capacity (maxSurge > 0) requires temporary additional cluster resources. For large deployments or resource-constrained clusters, this overhead may be significant. Plan resource allocation accordingly.

Rollout Duration - Conservative settings (low maxUnavailable, high minReadySeconds) ensure safety but extend deployment time. For applications with many replicas, rollouts can take considerable time.

Partial Rollouts - If a rollout fails partway through (insufficient resources, failing health checks), you end up with a mix of versions. The Deployment controller will maintain this partial state until the issue is resolved.

Advanced Patterns

Rolling deployment provides the foundation, but more sophisticated strategies build on it:

Canary Analysis - Canary deployments use rolling deployment mechanisms but add metric-based gates. Update a small percentage of Pods (using maxSurge/maxUnavailable to control the count), analyze metrics, then proceed or roll back.
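One way to approximate this with plain Kubernetes objects, as a sketch rather than a recipe, is a small separate canary Deployment whose Pods carry the Service-selector label (app: my-app, as assumed earlier) plus a distinguishing track label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1                    # e.g. 1 canary beside 9 stable replicas, roughly 10% of traffic
  selector:
    matchLabels:
      app: my-app
      track: canary
  template:
    metadata:
      labels:
        app: my-app              # matched by the Service selector, so it receives traffic
        track: canary
    spec:
      containers:
      - name: my-app
        image: my-app:2.0        # candidate version under evaluation

If metrics look healthy, the stable Deployment is updated through a normal rolling update and the canary is removed; if not, deleting the canary Deployment reverts instantly.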

Progressive Delivery - Tools like Flagger automate gradual rollout with traffic shifting. Start with 5% of traffic, monitor success metrics, then gradually increase to 100% or roll back automatically.

Blue-Green Hybrid - Some organizations use rolling deployments for minor updates (patch versions) but blue-green deployment for major releases with schema changes.
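On the blue-green side of such a hybrid, one common sketch (names are illustrative) is two parallel Deployments, my-app-blue and my-app-green, with a Service whose selector is flipped once the green version has been verified:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue          # change to "green" to cut all traffic over at once
  ports:
  - port: 80
    targetPort: 8080

Unlike a rolling update, the cutover is instantaneous, and reverting means flipping the selector back.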