Data Structures as Infrastructure
Jobs become their own timers when stored as sorted set members with Unix timestamps as scores—eliminating the need for separate scheduling infrastructure by aligning the problem space directly with Redis primitives.
Sidekiq leverages Redis’s rich data structures for Distributed Systems coordination, demonstrating Mechanical & Architectural Sympathy by aligning the data model with Redis’s native capabilities. Each structure serves a specific purpose in the job processing lifecycle.
Lists: Job Queues
Redis Lists implement FIFO queues using LPUSH (left push) and BRPOP (blocking right pop). Jobs are pushed to the left and popped from the right, creating natural queue ordering.
queue:default -> ["job3", "job2", "job1"]
                    ↑                ↓
                  LPUSH            BRPOP
The BRPOP command blocks for up to 2 seconds, creating an efficient polling mechanism without busy-waiting. This eliminates the need for aggressive polling: the server wakes up immediately when a job arrives, or after 2 seconds to check for shutdown signals.
Multiple queues can be prioritized by passing multiple queue names to BRPOP:
# Strict priority: critical always before default
redis.brpop('queue:critical', 'queue:default', timeout: 2)
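The push/pop mechanics can be sketched with a plain Ruby Array standing in for the Redis list — no Redis server required; `lpush` and `rpop` here are illustrative lambdas, not the redis-rb client:

```ruby
# Sketch: a Redis list as a FIFO queue, modeled with a Ruby Array.
# LPUSH prepends (index 0 = left/head); (B)RPOP removes from the end
# (right/tail), so the first job pushed is the first job popped.
queue = []

lpush = ->(job) { queue.unshift(job) }  # stands in for LPUSH queue:default <job>
rpop  = -> { queue.pop }                # stands in for (B)RPOP queue:default

lpush.call("job1")
lpush.call("job2")
lpush.call("job3")

queue      # => ["job3", "job2", "job1"]  (newest at the left)
rpop.call  # => "job1"  (oldest job comes out first: FIFO)
```

The only thing the sketch cannot show is the blocking behavior — BRPOP parks the connection server-side until a job arrives or the timeout elapses.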
Sorted Sets: Time-Indexed Jobs
Scheduled and retry jobs use Redis Sorted Sets where the score is the Unix timestamp for execution. This elegant design makes jobs self-scheduling—no separate timer infrastructure needed.
schedule -> {
  "job1" => 1697472600.0,  # score = Unix timestamp of the scheduled run
  "job2" => 1697476200.0,  # one hour later
}
The Poller uses a Lua script built on ZRANGEBYSCORE semantics (ZRANGE ... BYSCORE in Redis 6.2+) to atomically find and remove jobs whose time has come:
-- "now" (the current Unix time) is passed in as ARGV[1]
local jobs = redis.call("zrange", "schedule", "-inf", ARGV[1], "byscore", "limit", 0, 1)
if jobs[1] then
  redis.call("zrem", "schedule", jobs[1])
  return jobs[1]
end
This pattern eliminates the classic “cron scheduler” problem—you don’t need a separate process monitoring a database table. The data structure IS the scheduler.
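The polling logic can be sketched in Ruby against an in-memory stand-in for the sorted set (a Hash of member => score); `pop_due` is illustrative — the real Poller does this atomically inside Redis:

```ruby
# Sketch: the poller against an in-memory "schedule" sorted set.
schedule = {
  "job1" => 1697472600.0,
  "job2" => 1697476200.0,
}

# Pop every job whose score (run time) is <= now, mimicking
# ZRANGEBYSCORE schedule -inf <now> followed by ZREM per member.
def pop_due(schedule, now)
  due = schedule.select { |_job, ts| ts <= now }
                .keys
                .sort_by { |job| schedule[job] }  # oldest first, like score order
  due.each { |job| schedule.delete(job) }
  due
end

pop_due(schedule, 1697472600.0)  # => ["job1"]  (job2 is not due yet)
```

Because the schedule is score-ordered, "what is due?" is a range query, not a table scan — that is the whole trick.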
Hashes: Process Metadata
Each Sidekiq process stores metadata in a hash keyed by its identity:
processes:hostname:pid:uuid -> {
  "hostname"    => "worker-1",
  "pid"         => 12345,
  "concurrency" => 10,
  "busy"        => 3,
  "beat"        => 1697472000.0,
  "rtt_us"      => 850,    # Redis round-trip time, microseconds
  "rss"         => 125000  # Memory usage
}
In-flight jobs are tracked in a separate hash:
processes:hostname:pid:uuid:work -> {
  "abc123" => {             # job ID
    "queue"   => "default",
    "payload" => {...},
    "run_at"  => 1697472000.0
  }
}
This enables the Web UI to show real-time process health and job execution. If a heartbeat expires (>60 seconds old), the process is considered dead and can be cleaned up.
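The liveness rule can be sketched as a simple comparison against the `beat` field — `alive?` is an illustrative helper (the 60-second threshold matches the rule described above):

```ruby
# Sketch: a process whose last heartbeat ("beat") is more than
# 60 seconds old is treated as dead and eligible for cleanup.
HEARTBEAT_TTL = 60

def alive?(process_hash, now)
  now - process_hash["beat"] <= HEARTBEAT_TTL
end

now     = 1697472100.0
healthy = { "hostname" => "worker-1", "beat" => now - 5 }
stale   = { "hostname" => "worker-2", "beat" => now - 120 }

alive?(healthy, now)  # => true
alive?(stale, now)    # => false
```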
Sets: Registry and Indexing
Redis Sets track active queues and processes without duplication:
queues -> {"default", "critical", "mailers"}
processes -> {"hostname:pid:uuid", ...}
The queues set enables dynamic queue discovery—workers can automatically find all queues without configuration. The processes set provides fast membership testing for liveness checks.
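The registry behavior maps directly onto Ruby's Set — a sketch, with `register` standing in for an SADD issued at enqueue time:

```ruby
require "set"

# Sketch: the "queues" registry set. SADD is idempotent, so every
# enqueue can register its queue without creating duplicates.
queues = Set.new

register = ->(queue_name) { queues.add(queue_name) }  # SADD queues <name>

register.call("default")
register.call("critical")
register.call("default")   # duplicate: sets deduplicate automatically

queues.to_a.sort           # => ["critical", "default"]
queues.include?("mailers") # => false  (SISMEMBER-style membership test)
```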
Strings: Counters and Statistics
Simple counters use Redis Strings with atomic increment operations:
stat:processed -> "1504923"
stat:failed -> "127"
stat:failed:2025-10-15 -> "23" # Daily granularity
Daily statistics enable the Web UI’s historical graphs without complex time-series infrastructure. Keys naturally expire using TTL, preventing unbounded growth:
redis.setex("stat:processed:#{date}", 90 * 24 * 60 * 60, count)  # 90-day TTL
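The key naming and TTL arithmetic can be shown in plain Ruby — `daily_stat_key` is an illustrative helper, not Sidekiq's API:

```ruby
require "date"

# Sketch: one key per stat per day, expired after 90 days so
# historical graphs are bounded without a time-series database.
NINETY_DAYS = 90 * 24 * 60 * 60  # TTL in seconds, as SETEX expects

def daily_stat_key(stat, date)
  "stat:#{stat}:#{date.strftime('%Y-%m-%d')}"
end

key = daily_stat_key("failed", Date.new(2025, 10, 15))
key          # => "stat:failed:2025-10-15"
NINETY_DAYS  # => 7776000
```

INCR on such a key is atomic server-side, so many worker processes can bump the same counter without coordination.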
Dead Set: Long-Term Job Storage
Failed jobs that exhaust retries move to a sorted set (not a separate database!):
dead -> {
  "job_payload" => score  # score = timestamp of death
}
The dead set is capped at 10,000 jobs and retains entries for 6 months. This prevents unbounded memory growth while providing forensic data for debugging. The sorted set structure enables efficient pagination and time-based cleanup.
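Both caps — age and size — are range deletions on the sorted set (ZREMRANGEBYSCORE and ZREMRANGEBYRANK in Redis). A sketch against an in-memory Hash of payload => score, with illustrative constants matching the limits above:

```ruby
# Sketch: trimming the dead set by age, then by size.
MAX_DEAD   = 10_000
SIX_MONTHS = 6 * 30 * 24 * 60 * 60  # approximate retention window, seconds

def trim_dead(dead, now)
  # Time-based cleanup: drop entries older than the retention window.
  dead.reject! { |_payload, died_at| died_at < now - SIX_MONTHS }
  # Size cap: drop the oldest entries beyond the limit.
  if dead.size > MAX_DEAD
    oldest = dead.sort_by { |_payload, ts| ts }.first(dead.size - MAX_DEAD)
    oldest.each { |payload, _ts| dead.delete(payload) }
  end
  dead
end

now  = 20_000_000.0
dead = { "old_job" => now - SIX_MONTHS - 1, "recent_job" => now - 60 }
trim_dead(dead, now).keys  # => ["recent_job"]
```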
Memory Management
Sidekiq’s data model avoids memory bloat through several strategies:
- TTL on ephemeral data: Process heartbeats expire after 60 seconds
- Capped collections: Dead set limited to 10,000 jobs
- Efficient encoding: Job payloads use MessagePack (smaller than JSON)
- No job history: Completed jobs are immediately deleted, not archived
This contrasts with database-backed queues that accumulate job records. Redis’s in-memory nature demands careful memory management—every byte counts at scale.
Cluster Compatibility
Redis Cluster adds a constraint: multi-key commands and Lua scripts only work on keys in the same hash slot. Sorted sets used for rate limiting must therefore share a slot with their associated keys, achieved through hash tags:
# Force same slot for related keys
redis.zadd("{user:123}:ratelimit", score, member)
redis.get("{user:123}:data") # Same slot due to {user:123}
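A sketch of why the hash tag works: Redis Cluster hashes only the substring inside the first `{...}` (when non-empty) with CRC16 (XMODEM variant) mod 16384, per the cluster specification. `key_slot` below is a simplified illustration, not a client API, and ignores edge cases like empty `{}`:

```ruby
# CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection.
# This is the variant the Redis Cluster spec defines for key slots.
def crc16(data)
  crc = 0
  data.each_byte do |b|
    crc ^= (b << 8)
    8.times do
      crc = if (crc & 0x8000) != 0
              ((crc << 1) ^ 0x1021) & 0xFFFF
            else
              (crc << 1) & 0xFFFF
            end
    end
  end
  crc
end

def key_slot(key)
  tag = key[/\{([^}]+)\}/, 1]  # content of the first non-empty {...}, if any
  crc16(tag || key) % 16384
end

# Both keys hash only "user:123", so they land in the same slot:
key_slot("{user:123}:ratelimit") == key_slot("{user:123}:data")  # => true
```

Without the tag, the two keys would hash independently and could land on different cluster nodes, breaking any multi-key operation that touches both.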
See Sidekiq Architecture for how this data model supports the overall system design.