Sidekiq provides several mechanisms for reliable job processing through its Pro and Enterprise tiers. These features address the fundamental challenge in distributed systems: ensuring jobs execute despite network failures, process crashes, and concurrency conflicts.
Unique Jobs (Pro/Enterprise)
Unique Jobs prevent duplicate job enqueueing using Redis locks. A job with unique_for: 10.minutes
won’t be enqueued again if a job with identical arguments is already pending or executing.
class ImportJob
  include Sidekiq::Job

  sidekiq_options unique_for: 10.minutes

  def perform(user_id)
    # Only one import per user every 10 minutes
  end
end
The uniqueness lock uses a Redis key combining the job class and arguments:
unique:ImportJob:user_id:123 -> expires in 600 seconds
Three uniqueness policies control when locks are created and released:
- until_executing: Lock from enqueue to job start
- until_executed: Lock from enqueue to job completion (default)
- until_timeout: Lock for a fixed duration regardless of execution
The until_executing policy enables “at-most-once” semantics—once a job starts, duplicates can be enqueued. This suits idempotent jobs where re-execution is safe but double-execution wastes resources.
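Conceptually, the lock is a set-if-absent with a TTL. The in-memory sketch below (a hypothetical `UniqueLock` class, not Sidekiq's actual implementation, which uses Redis) illustrates the `until_executed` policy: acquire on enqueue, release on completion, expire after the TTL.

```ruby
# In-memory sketch of a uniqueness lock with TTL semantics.
# The real implementation stores the lock in Redis with an expiry.
class UniqueLock
  def initialize
    @locks = {} # key -> expiry time
  end

  # Returns true if the lock was acquired (job may be enqueued),
  # false if an unexpired lock already exists (duplicate suppressed).
  def acquire(job_class, args, ttl, now: Time.now)
    key = "unique:#{job_class}:#{args.join(':')}"
    expiry = @locks[key]
    return false if expiry && expiry > now # still locked
    @locks[key] = now + ttl
    true
  end

  # until_executed: release the lock when the job finishes.
  def release(job_class, args)
    @locks.delete("unique:#{job_class}:#{args.join(':')}")
  end
end

lock = UniqueLock.new
lock.acquire("ImportJob", [123], 600) # first enqueue wins
lock.acquire("ImportJob", [123], 600) # duplicate suppressed
```

Different arguments produce a different key, so `ImportJob` for another user is unaffected.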
Batches (Pro)
Batches group related jobs and trigger callbacks when all jobs complete. This solves the fan-out/fan-in problem where you need to wait for N async jobs before continuing.
batch = Sidekiq::Batch.new
batch.on(:success, NotificationJob)
batch.on(:complete, CleanupJob) # Fires even if jobs failed

batch.jobs do
  1000.times do |i|
    ProcessRecordJob.perform_async(i)
  end
end
Batch state is tracked in Redis using counters and sets:
batch:BID:pending -> 1000 # Remaining jobs
batch:BID:total -> 1000 # Initial count
batch:BID:failures -> Set of failed JIDs
batch:BID:complete -> timestamp
When the pending counter reaches zero, Sidekiq enqueues the success callback. The complete callback always fires, regardless of failures. This enables cleanup logic that must run even when things fail.
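The bookkeeping can be sketched in memory (a hypothetical `BatchState` class; real Sidekiq Pro keeps these counters in Redis and enqueues the callbacks as jobs):

```ruby
# Sketch of batch accounting: decrement pending per finished job,
# fire :complete when pending hits zero, :success only if nothing failed.
class BatchState
  attr_reader :callbacks_fired

  def initialize(total)
    @pending = total
    @failures = [] # failed JIDs
    @callbacks_fired = []
  end

  # Called once per finished job.
  def job_finished(jid, success:)
    @failures << jid unless success
    @pending -= 1
    return unless @pending.zero?
    @callbacks_fired << :complete # always fires
    @callbacks_fired << :success if @failures.empty?
  end
end
```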
Nested batches enable complex workflows:
outer_batch = Sidekiq::Batch.new
outer_batch.on(:success, SendEmailJob)

outer_batch.jobs do
  5.times do |i|
    inner_batch = Sidekiq::Batch.new
    inner_batch.jobs do
      10.times { ProcessChunkJob.perform_async(i) }
    end
  end
end
The outer batch waits for all inner batches to complete, creating a tree of dependencies.
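One way to model that tree is to count each child batch as a pending unit of its parent, bubbling completion upward. This is a simplified sketch (hypothetical `NestedBatch` class), not Sidekiq Pro's actual data model:

```ruby
# Nested-batch accounting sketch: a parent treats each child batch as
# one pending unit and completes only when all children have completed.
class NestedBatch
  attr_reader :complete

  def initialize(parent = nil)
    @parent = parent
    @pending = 0
    @complete = false
    parent.add_child if parent
  end

  def add_child
    @pending += 1
  end

  def add_jobs(n)
    @pending += n
  end

  def job_finished
    @pending -= 1
    finish if @pending.zero?
  end

  private

  def finish
    @complete = true
    @parent.job_finished if @parent # bubble completion up the tree
  end
end
```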
Rate Limiting (Enterprise)
Rate limiters control job throughput using various algorithms. This prevents overwhelming downstream services or staying within API rate limits.
Concurrent limiter: Restricts how many jobs can execute simultaneously:
limiter = Sidekiq::Limiter.concurrent("api-service", 5)

limiter.within_limit do
  # Max 5 jobs call this service at once
  external_api.call
end
Implementation uses a Redis set to track active job IDs:
limiter:concurrent:api-service -> Set of active JIDs (max 5)
When 5 jobs are already holding slots, further attempts raise Sidekiq::Limiter::OverLimit, and the job is automatically rescheduled after a backoff.
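A minimal in-memory sketch of the idea (hypothetical `ConcurrentLimiter` and `OverLimit` classes; the real limiter tracks active JIDs in a Redis set and handles the reschedule for you):

```ruby
# Concurrent limiter sketch: a bounded set of active job IDs.
class OverLimit < StandardError; end

class ConcurrentLimiter
  def initialize(name, size)
    @name = name
    @size = size
    @active = {} # jid -> true, stands in for the Redis set
    @mutex = Mutex.new
  end

  def within_limit(jid)
    @mutex.synchronize do
      raise OverLimit, "#{@name}: #{@size} jobs active" if @active.size >= @size
      @active[jid] = true # claim a slot
    end
    yield
  ensure
    @mutex.synchronize { @active.delete(jid) } # release the slot
  end
end
```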
Window limiter: Allows N operations per time window:
limiter = Sidekiq::Limiter.window("shopify-api", 100, :minute)

limiter.within_limit do
  # Max 100 calls per minute
  shopify_api.create_order
end
Uses a sorted set where scores are timestamps:
limiter:window:shopify-api -> Sorted set of timestamps
Expired timestamps are pruned, and new attempts check if the set size < limit.
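The prune-then-check logic can be sketched with a plain array of timestamps standing in for the sorted set (hypothetical `WindowLimiter` class):

```ruby
# Sliding-window limiter sketch. In Redis, pruning is a
# ZREMRANGEBYSCORE and the size check is a ZCARD.
class WindowLimiter
  def initialize(limit, window_seconds)
    @limit = limit
    @window = window_seconds
    @timestamps = []
  end

  def allow?(now = Time.now.to_f)
    # Drop timestamps that have fallen out of the window.
    @timestamps.reject! { |t| t <= now - @window }
    return false if @timestamps.size >= @limit
    @timestamps << now
    true
  end
end
```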
Leaky bucket: Smooths burst traffic by allowing X requests upfront, then throttling to X per period:
limiter = Sidekiq::Limiter.leaky("external-api", 60, :minute)
This allows bursts up to 60 requests, then throttles to 60/minute sustained. Perfect for APIs that allow bursts but have sustained rate limits.
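The classic leaky-bucket algorithm behind this can be sketched as a level that drains at a steady rate (hypothetical `LeakyBucket` class; Sidekiq Enterprise's internals may differ):

```ruby
# Leaky-bucket sketch: the bucket drains at capacity/period per second;
# a request fits only while the level is below capacity. This permits
# an initial burst up to `capacity`, then a steady sustained rate.
class LeakyBucket
  def initialize(capacity, period_seconds)
    @capacity = capacity
    @leak_rate = capacity.to_f / period_seconds # drained per second
    @level = 0.0
    @last = 0.0
  end

  def allow?(now)
    # Drain for the time elapsed since the last check.
    @level = [@level - (now - @last) * @leak_rate, 0.0].max
    @last = now
    return false if @level >= @capacity
    @level += 1.0
    true
  end
end
```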
Reliable Fetch (Pro)
Reliable fetch (super_fetch) provides “at-least-once” delivery by moving in-progress jobs into a private working set in Redis. If a process crashes, another process can recover and re-enqueue the lost jobs.
When super_fetch pulls a job, it:
1. Atomically moves the job from the queue to a working set
2. Executes the job
3. Removes the job from the working set on success
If step 2 fails (process crash), the job remains in the working set. A recovery process periodically scans for stale jobs (>5 minutes in working set) and re-enqueues them.
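The fetch/ack/recover cycle can be sketched in memory (a hypothetical `ReliableQueue` class; real super_fetch moves the job atomically within Redis):

```ruby
# Sketch of the reliable-fetch dance: an array stands in for the queue,
# a hash of fetch times stands in for the working set.
class ReliableQueue
  STALE_AFTER = 300 # seconds a job may sit in the working set

  def initialize
    @queue = []
    @working = {} # job -> time it was fetched
  end

  def push(job)
    @queue.push(job)
  end

  # Step 1: atomically move the job from queue to working set.
  def fetch(now)
    job = @queue.shift
    @working[job] = now if job
    job
  end

  # Step 3: acknowledge success by removing from the working set.
  def ack(job)
    @working.delete(job)
  end

  # Recovery: re-enqueue jobs stuck in the working set too long.
  def recover_stale(now)
    stale = @working.select { |_, t| now - t > STALE_AFTER }.keys
    stale.each { |job| @working.delete(job); @queue.push(job) }
    stale
  end
end
```

A crashed worker never calls `ack`, so its job eventually crosses the staleness threshold and gets recovered.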
This contrasts with standard Sidekiq, where a job in flight during a hard crash is simply lost. The trade-off is additional Redis operations per job—acceptable for critical workflows but overkill for cheap, easily re-run jobs.
Atomic Scheduling
The Atomic Scheduler (Pro 7.2+) eliminates race conditions in the scheduled job poller. Multiple Sidekiq processes poll the scheduled set, but only one should enqueue each job.
The scheduler uses a Lua script for atomic check-and-set:
local job = redis.call("zrange", "schedule", "-inf", now, "byscore", "limit", 0, 1)
if job[1] then
  local removed = redis.call("zrem", "schedule", job[1])
  if removed == 1 then
    redis.call("lpush", queue, job[1])
    return job[1]
  end
end
The zrem returns 0 if another process already removed the job, preventing duplicate enqueuing. This atomic operation replaces the previous check-then-remove pattern that had a race window.
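The guard works because a delete reports whether it actually removed something, so only one of N concurrent pollers wins. A small Ruby simulation of that property (an in-memory set with a mutex stands in for Redis):

```ruby
# Four "pollers" race to claim the same scheduled job; the delete
# returns truthy only for the single thread that removed it, so
# exactly one enqueue happens -- the same guarantee zrem gives.
require "set"

schedule = Set.new(["job-a"])
mutex = Mutex.new
enqueued = []

pollers = 4.times.map do
  Thread.new do
    mutex.synchronize do
      # Set#delete? returns nil unless this call removed the member.
      enqueued << "job-a" if schedule.delete?("job-a")
    end
  end
end
pollers.each(&:join)
```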
Transaction-Aware Client
The transaction-aware client ensures jobs aren’t enqueued until the current database transaction commits:
User.transaction do
  user = User.create!(email: "test@example.com")
  WelcomeEmailJob.perform_async(user.id)
  # Email job only enqueues if transaction commits
end
Without this, the job could run before the transaction commits, failing to find the user record. The client hooks into ActiveRecord’s transaction callbacks to defer Redis operations until after commit.
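The deferral itself is simple to sketch: buffer enqueue calls during the transaction and flush them only on commit. This toy `DeferredClient` is hypothetical; the real client wires into ActiveRecord's callbacks rather than exposing commit/rollback methods:

```ruby
# Sketch of a transaction-aware enqueue buffer.
class DeferredClient
  def initialize(queue)
    @queue = queue # stands in for Redis
    @buffer = []
  end

  def perform_async(job, *args)
    @buffer << [job, args] # held back, invisible to workers
  end

  def commit
    @buffer.each { |job, args| @queue << [job, args] }
    @buffer.clear
  end

  def rollback
    @buffer.clear # a rolled-back transaction enqueues nothing
  end
end
```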
See Sidekiq Architecture for how these reliability features integrate with the core job processing flow.