Sidekiq is a background job processing framework for Ruby that uses threads to handle many jobs concurrently within a single process. Unlike traditional worker frameworks that spawn a separate process per job, Sidekiq leverages Ruby's concurrency mechanisms to achieve high throughput with minimal memory overhead.
Core Design Philosophy
Sidekiq’s architecture centers on three fundamental principles: thread-based concurrency for efficiency, Redis as a job queue and coordination layer, and middleware-based extensibility. The framework can process tens of thousands of jobs per second on a single server by avoiding the memory overhead of process-based concurrency while working within Ruby’s Global Interpreter Lock constraints.
The system follows a client-server model where clients push jobs to Redis queues, and server processes pull jobs for execution. This separation enables horizontal scaling—you can run multiple Sidekiq processes across different machines, all coordinating through a single Redis instance.
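Horizontal scaling falls out of configuration: every process simply points at the same Redis. A minimal sketch (the `REDIS_URL` environment variable and fallback URL are assumptions about the deployment, not Sidekiq requirements):

```ruby
# Shared Redis configuration, run in every app and worker process.
# configure_client covers processes that only enqueue jobs;
# configure_server covers Sidekiq worker processes.
redis_config = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }

Sidekiq.configure_client do |config|
  config.redis = redis_config
end

Sidekiq.configure_server do |config|
  config.redis = redis_config
end
```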
Architecture Components
```mermaid
graph TD
    A[Client Application] -->|push job| B[Sidekiq::Client]
    B -->|JSON payload| C[Redis]
    C -->|BRPOP| D[Launcher]
    D -->|manages| E[Manager]
    E -->|coordinates| F1[Processor 1]
    E -->|coordinates| F2[Processor 2]
    E -->|coordinates| F3[Processor N]
    F1 -->|executes| G[Job Instance]
    F2 -->|executes| G
    F3 -->|executes| G
    H[Poller] -->|checks schedule| C
    H -->|enqueues ready jobs| C
    D -->|heartbeat| C
    style D fill:#e1f5ff
    style E fill:#ffe1e1
    style C fill:#fff4e1
```
Sidekiq::Client
The Client component handles job enqueueing with thread-safety guarantees. When you call `MyJob.perform_async(args)`, the Client serializes the job to JSON, generates a unique JID (job ID), and pushes it to a Redis list representing the target queue.
The Client uses pipelined Redis commands to minimize network latency. The `push_bulk` method can enqueue thousands of jobs in batches, dramatically reducing Redis round trips. For scheduled jobs, the Client adds them to a Redis sorted set where the score is the execution timestamp.
```ruby
# Jobs are stored as JSON with metadata
{
  "class" => "HardJob",
  "args" => [1, 2, 3],
  "queue" => "critical",
  "jid" => "abc123",
  "retry" => 5,
  "created_at" => 1697472000.0
}
```
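To make the payload concrete, here is a toy version of what the Client assembles before pushing to Redis. This is illustrative code, not Sidekiq's implementation; `build_payload` is a hypothetical helper:

```ruby
require "json"
require "securerandom"

# Build a Sidekiq-style job payload (field set mirrors the hash above).
def build_payload(klass, args, queue: "default", retry_limit: 25)
  {
    "class" => klass,
    "args" => args,
    "queue" => queue,
    "jid" => SecureRandom.hex(12),   # 24-char hex job ID, like Sidekiq's JIDs
    "retry" => retry_limit,
    "created_at" => Time.now.to_f
  }
end

payload = build_payload("HardJob", [1, 2, 3], queue: "critical")
json = JSON.generate(payload)   # this string is what gets pushed to queue:critical
```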
Sidekiq::Launcher
The Launcher orchestrates the entire server process lifecycle. It initializes Capsules (logical groupings of queues), starts the Poller for scheduled jobs, and manages the heartbeat thread that reports process health to Redis every 10 seconds.
The heartbeat includes critical metrics: memory usage (RSS), number of busy threads, Redis round-trip time, and current job state. This data powers the Web UI’s real-time monitoring and enables automatic cleanup of dead processes.
Sidekiq::Manager
Each Capsule has a Manager that controls a pool of Processor threads. The Manager handles four key lifecycle events: startup (spinning up processors), graceful shutdown (stopping idle processors), hard shutdown (force-killing threads past deadline), and processor death (replacing failed threads).
When a Processor dies unexpectedly, the Manager immediately creates a replacement to maintain the configured concurrency level. This self-healing property ensures that transient errors don’t degrade throughput. The Manager uses a mutex-protected Set to track active processors, providing thread safety for lifecycle operations.
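The self-healing pattern can be sketched in a few lines of plain Ruby. `ToyManager` is a hypothetical class, not Sidekiq's `Sidekiq::Manager`, but it shows the mutex-protected Set and the replace-on-death behavior described above:

```ruby
require "set"

# Illustrative self-healing manager: a mutex-protected Set tracks live
# processors, and the death callback immediately spawns a replacement
# to hold the configured concurrency constant.
class ToyManager
  attr_reader :processors

  def initialize(concurrency)
    @lock = Mutex.new
    @processors = Set.new
    concurrency.times { spawn_processor }
  end

  def spawn_processor
    @lock.synchronize { @processors << Object.new }
  end

  # Invoked when a processor thread dies unexpectedly
  def processor_died(processor)
    @lock.synchronize { @processors.delete(processor) }
    spawn_processor   # self-healing: replace the dead processor
  end
end

manager = ToyManager.new(5)
dead = manager.processors.first
manager.processor_died(dead)
manager.processors.size   # => 5, concurrency maintained
```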
Sidekiq::Processor
The Processor is the workhorse—a single thread that fetches and executes jobs in a loop. Each Processor maintains its own job execution context, including middleware chains, retry logic, and logging. The architecture follows a simple pattern:
```mermaid
sequenceDiagram
    participant P as Processor
    participant F as Fetcher
    participant R as Redis
    participant J as Job Class
    loop until shutdown
        P->>F: retrieve_work
        F->>R: BRPOP queue:default 2
        R-->>F: job JSON
        F-->>P: UnitOfWork
        P->>P: parse JSON
        P->>P: run middleware
        P->>J: perform(*args)
        J-->>P: result
        P->>P: acknowledge
    end
```
The Processor uses `BRPOP` (blocking right pop) to fetch jobs atomically. This Redis command blocks for up to 2 seconds, creating an efficient polling mechanism without busy-waiting. When shutdown is initiated, the Processor checks its `@done` flag after each job completes, enabling graceful draining.
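A toy version of the fetch-execute loop, with an in-memory `Queue` standing in for Redis (in real Sidekiq the pop is a blocking `BRPOP` with a 2-second timeout, and `perform` runs inside a middleware chain):

```ruby
require "json"

# Illustrative fetch-execute loop, not Sidekiq's Processor class.
# :shutdown stands in for BRPOP returning nothing during shutdown.
class ToyProcessor
  def initialize(queue)
    @queue = queue
    @done = false
    @results = []
  end

  # Graceful shutdown: @done is only checked between jobs, so the
  # job in flight is allowed to finish (draining).
  def terminate
    @done = true
  end

  def run
    until @done
      raw = @queue.pop               # real Sidekiq: BRPOP queue:default 2
      break if raw == :shutdown
      job = JSON.parse(raw)          # parse the JSON payload
      @results << job["args"].sum    # stand-in for middleware + perform(*args)
    end
    @results
  end
end

q = Queue.new
q << JSON.generate("class" => "AddJob", "args" => [1, 2])
q << JSON.generate("class" => "AddJob", "args" => [3, 4])
q << :shutdown
results = ToyProcessor.new(q).run   # => [3, 7]
```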
Sidekiq::Fetch
The Fetch component implements queue prioritization strategies. In `:strict` mode, queues are polled in order, so the critical queue is always checked before the default queue. In weighted random mode (the default), queues are shuffled on each fetch to prevent starvation while respecting relative priorities.
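The two orderings can be sketched directly. Listing a queue once per weight unit and calling `shuffle.uniq` on each fetch is Sidekiq's documented weighted behavior, though `fetch_order` itself is a hypothetical helper:

```ruby
# Sketch of the two fetch-ordering strategies.
def fetch_order(queues, strict:)
  strict ? queues.uniq : queues.shuffle.uniq
end

weighted = ["critical", "critical", "critical", "default"]   # critical has weight 3

fetch_order(weighted, strict: true)    # => ["critical", "default"], always
fetch_order(weighted, strict: false)   # ["critical", "default"] ~75% of the time
```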
The `bulk_requeue` method handles job recovery during hard shutdown. When the timeout expires with jobs still running, Sidekiq pushes unfinished jobs back to Redis before killing threads. This upholds Sidekiq’s “at-least-once” execution guarantee—better to run a job twice than lose it entirely.
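The practical consequence of at-least-once delivery is that job code should be idempotent. A toy illustration (in production the dedupe set would live in Redis rather than in memory; `Ledger` is hypothetical):

```ruby
require "set"

# A processed-ID set guards the side effect against duplicate execution.
class Ledger
  attr_reader :total

  def initialize
    @charged = Set.new
    @total = 0
  end

  def charge(order_id, amount)
    return @total unless @charged.add?(order_id)   # add? returns nil on duplicates
    @total += amount
  end
end

ledger = Ledger.new
ledger.charge("order-1", 100)
ledger.charge("order-1", 100)   # duplicate delivery: ignored
ledger.total   # => 100
```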
Sidekiq::Scheduled::Poller
The Poller thread periodically checks two Redis sorted sets: `schedule` (for future jobs) and `retry` (for failed jobs awaiting retry). Jobs in these sets have scores representing their execution timestamp. The Poller uses a Lua script to atomically pop and enqueue jobs whose time has come:
```lua
local key, now = KEYS[1], ARGV[1]
local jobs = redis.call("zrange", key, "-inf", now, "byscore", "limit", 0, 1)
if jobs[1] then
  redis.call("zrem", key, jobs[1])
  return jobs[1]
end
```
The polling interval self-adjusts based on cluster size. With 30 Sidekiq processes, each polls every 450 seconds on average, so the cluster as a whole still checks roughly every 15 seconds. This prevents the thundering herd problem while ensuring jobs are enqueued promptly, demonstrating distributed-systems coordination without requiring a leader election protocol.
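The scaling rule is simple arithmetic. The sketch below mirrors the behavior described above; the 15-second base and the exact jitter formula are illustrative rather than copied from Sidekiq's source:

```ruby
# Self-adjusting scheduled-set poll interval (sketch).
CLUSTER_POLL_TARGET = 15.0   # seconds between checks, cluster-wide

def scaled_poll_interval(process_count)
  CLUSTER_POLL_TARGET * process_count
end

# Each process sleeps a jittered amount around its scaled interval, so N
# processes collectively still poll about every 15 seconds without a leader.
def random_poll_interval(process_count)
  avg = scaled_poll_interval(process_count)
  avg * rand + avg / 2.0   # uniform in [0.5x, 1.5x) of the average
end

scaled_poll_interval(30)   # => 450.0, matching the figure above
```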
Job Lifecycle
A job transitions through several states from creation to completion:
- Created: Client validates arguments and generates JID
- Enqueued: Job JSON pushed to Redis queue or schedule set
- Fetched: Processor pops job from queue
- Executing: Job instance created, middleware runs, `perform` called
- Completed: Job finishes successfully, acknowledged to Redis
- Failed: Exception raised, retry logic determines next action
When a job fails, Sidekiq uses exponential backoff with jitter. The first retry happens after ~15 seconds, then ~34 seconds, scaling up to hours for later attempts. After exhausting retries (25 by default), jobs move to the Dead set where they’re retained for 6 months before automatic cleanup.
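The delay schedule can be approximated with Sidekiq's documented default formula, `(count ** 4) + 15` seconds plus a jitter that grows with the attempt number; treat the jitter details as version-dependent:

```ruby
# Approximation of Sidekiq's default retry delay.
def retry_delay(count, jitter: 0)
  (count ** 4) + 15 + jitter
end

retry_delay(0)    # => 15      (first retry after ~15s, plus jitter)
retry_delay(1)    # => 16      (jitter pushes this toward ~34s)
retry_delay(10)   # => 10_015  (~2.8 hours)
retry_delay(24)   # => 331_791 (~3.8 days, the 25th and final attempt)
```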
Concurrency Model
Sidekiq’s thread-based concurrency achieves high throughput by targeting I/O-bound workloads within Ruby’s GIL constraints. See Sidekiq Concurrency Model for details on thread-local routing, GIL interaction patterns, and tuning strategies.
Redis Data Structures
Sidekiq leverages Redis’s rich data structures—Lists for queues, Sorted Sets for scheduled jobs, Hashes for process metadata, Sets for registries, and Strings for counters. See Sidekiq Redis Data Model for how these primitives eliminate the need for separate scheduling infrastructure.
Middleware System
Sidekiq uses middleware chains inspired by Rack to implement cross-cutting concerns. See Sidekiq Middleware System for configuration patterns, exception handling, resource management, and thread safety considerations.
Reliability Features
Sidekiq Pro and Enterprise tiers provide mechanisms for reliable job processing in distributed systems. See Sidekiq Reliability Features for details on Unique Jobs, Batches, Rate Limiting (concurrent/window/leaky bucket), Reliability Shards, Atomic Scheduling, and transaction-aware clients.
Capsules
Introduced in Sidekiq 7, Capsules allow a single process to operate multiple independent queue pools for multi-tenancy or workload isolation. See Sidekiq Capsules for configuration patterns, thread-local routing, resource management, and migration strategies.
Performance Characteristics
Sidekiq’s architecture delivers impressive performance through careful optimization of the hot path. See Sidekiq Performance Analysis for throughput benchmarks, ActiveJob overhead analysis, memory footprint, latency characteristics, network impact, YJIT gains, concurrency tuning, and optimization strategies.
Web UI and Observability
The Web UI provides real-time visibility into queue depths, job history, retry/scheduled/dead sets, and per-process metrics. It’s a Rack application that can be mounted in Rails or run standalone, querying Redis directly for current state.
The heartbeat mechanism enables live process monitoring. Each process updates its heartbeat hash every 10 seconds with:
```ruby
{
  "hostname" => "worker-1",
  "pid" => 12345,
  "concurrency" => 10,
  "busy" => 3,
  "beat" => 1697472000.0,
  "rtt_us" => 850,
  "rss" => 125000
}
```
If a heartbeat expires (>60 seconds old), the Web UI marks that process as dead and allows manual cleanup. This demonstrates distributed systems failure detection using timeout-based liveness checks.
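The liveness check itself is one comparison. A sketch (the `dead?` helper is hypothetical; the 60-second threshold is from the text above):

```ruby
# Timeout-based failure detection, as the Web UI applies it.
HEARTBEAT_TTL = 60   # seconds

def dead?(heartbeat, now: Time.now.to_f)
  now - heartbeat["beat"] > HEARTBEAT_TTL
end

now = 1_697_472_100.0
dead?({ "beat" => 1_697_472_090.0 }, now: now)   # => false, beat 10s ago
dead?({ "beat" => 1_697_472_000.0 }, now: now)   # => true, beat 100s ago
```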
Integration with Rails
Sidekiq integrates deeply with Rails through several touchpoints:
- ActiveJob Adapter: Enables `MyJob.perform_later` to enqueue via Sidekiq
- Rails Reloader: Wraps job execution for code reloading in development
- CurrentAttributes: Preserves request context across async boundary
- Logger Integration: Uses Rails logger and supports tagged logging
The `transaction_aware_client` ensures jobs aren’t enqueued until the current database transaction commits. This prevents race conditions where a job runs before the data it needs is actually persisted.
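Enabling this behaves roughly as follows. `Sidekiq.transactional_push!` requires the `after_commit_everywhere` gem, and `Order`/`ChargeJob` are hypothetical; check the Sidekiq documentation for your version before relying on this sketch:

```ruby
# Gemfile: gem "after_commit_everywhere"
Sidekiq.transactional_push!   # make perform_async transaction-aware globally

ActiveRecord::Base.transaction do
  order = Order.create!(total: 100)   # hypothetical model
  ChargeJob.perform_async(order.id)   # deferred until the transaction COMMITs
end
```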
Comparison with Alternatives
Sidekiq’s thread-based architecture contrasts sharply with process-based alternatives:
Resque (process-per-job): More isolated but 10-20x higher memory usage. No job de-duplication or rate limiting.
Delayed::Job (database-backed): No external dependencies but polling creates database load. Poor performance at scale.
GoodJob (Postgres-based): Elegant for Rails apps already using Postgres. Native advisory locks for reliability but lower throughput than Redis.
Sidekiq’s sweet spot is high-throughput, I/O-bound workloads where thread concurrency provides massive efficiency gains. For CPU-bound jobs, parallelism via multiple processes or Ractors may be more appropriate.
Sidekiq demonstrates that careful architectural choices—threads over processes, Redis over database, middleware for extensibility—can deliver order-of-magnitude performance improvements while maintaining code simplicity and operational reliability.