Incremental garbage collection breaks the marking phase into smaller, interleaved steps rather than running it all at once. This minimizes pause times by distributing GC work across program execution, addressing the latency problem in traditional stop-the-world collectors.
Ruby 2.2 introduced incremental GC, with Ruby 3 refining the implementation. The core insight: since Ruby has a stop-the-world GC where no Ruby code runs during collection, splitting the marking phase significantly reduces application stutter.
The Tri-Color Marking Algorithm
Incremental marking relies on a tri-color abstraction that categorizes objects into three states:
- White: Objects not yet visited by the GC. These are potential garbage unless proven reachable.
- Gray: Objects visited by the GC but whose references haven’t been fully traced yet. These form the work queue.
- Black: Objects fully processed—visited with all their references traced. These are definitely live.
The algorithm maintains the invariant: black objects never point directly to white objects. This ensures correctness when marking pauses and resumes between execution slices.
Marking Process
- Initial Phase: All objects start white. Root references (stack, globals) are colored gray.
- Incremental Steps: During each marking slice:
- Take gray objects from the work queue
- Trace their references, coloring reachable white objects gray
- Color the now-fully-traced object black
- Termination: When no gray objects remain, all reachable objects are black. Remaining white objects are garbage.
sequenceDiagram participant App as Ruby Application participant GC as Incremental GC participant Heap as Object Heap Note over Heap: All objects WHITE GC->>Heap: Mark roots GRAY Note over Heap: Roots: stack, globals loop Incremental Marking Cycles App->>GC: Pause (short) GC->>Heap: Process gray objects Note over Heap: WHITE → GRAY (discovered)<br/>GRAY → BLACK (processed) GC->>App: Resume App->>App: Execute Ruby code opt Reference Assignment App->>Heap: black_obj.ref = white_obj Note over Heap: Write barrier triggered! Heap->>GC: Re-color BLACK → GRAY end end Note over GC: No gray objects remain GC->>Heap: Scan unprotected objects GC->>Heap: Sweep WHITE objects Note over Heap: WHITE objects = garbage GC->>App: Collection complete
This resembles backpropagation’s gradient flow—information (liveness) propagates through the object graph incrementally, each step building on previous work.
Write Barriers: Maintaining Correctness
The tri-color invariant breaks when the program modifies object references during a marking pause. A black object might gain a reference to a white object, making that white object incorrectly appear as garbage.
Write barriers solve this by intercepting reference assignments. When a black object obtains a reference to a white object:
- The write barrier detects the invariant violation
- The black object is re-colored gray
- The GC will revisit this object in the next marking slice
This forces the collector to trace the new reference, ensuring correctness. The cost: every reference assignment must check if it violates the tri-color invariant.
Write Barrier Unprotected Objects
Some objects skip write barrier protection for performance—typically objects frequently modified or with special semantics. These “write barrier unprotected” objects are scanned completely at the end of incremental marking to catch any missed references.
This optimization recognizes that some objects change too frequently for per-write checks to be worthwhile. The tradeoff between throughput and latency appears again—skip barriers for performance, but pay with final scanning cost.
Implementation in Ruby
Ruby’s incremental GC operates on RVALUEs (Ruby values), the fundamental object representation. The marking phase chunks into increments bounded by time or object count, pausing to let Ruby code execute.
Key implementation details:
- Interleaved Execution: Mark a batch of objects, yield to Ruby execution, repeat
- Stop-the-World Constraint: No Ruby code runs during each marking slice, but slices are short
- Pause Time Reduction: Instead of one 100ms pause, distribute into ten 10ms pauses
- Memory Overhead: Tri-color metadata and write barrier checks add cost
This design balances Ruby’s concurrency model where the GVL controls execution. Incremental GC respects the GVL while reducing how long it’s held during collection.
Generational and Incremental: Complementary Techniques
Ruby 3 combines generational GC with incremental marking:
- Generational GC exploits the weak generational hypothesis (most objects die young) by running minor collections on young objects more frequently than major collections on all objects
- Incremental GC splits the marking phase of these collections into smaller steps
Minor GC cycles mark only young objects incrementally—fast and frequent. Major GC cycles mark both generations incrementally—slower but comprehensive. This hybrid approach optimizes for both throughput and latency.
The interaction creates interesting dynamics. Young objects often die before incremental marking completes, reducing work. Old objects benefit most from incremental marking since scanning long-lived object graphs causes the longest pauses.
Performance Implications
Incremental GC trades throughput for reduced latency:
Benefits:
- Shorter, more predictable pause times
- Better responsiveness for interactive applications
- Reduced tail latencies in web services
- Enables high-concurrency patterns by minimizing GC interruptions
Costs:
- Write barrier overhead on every reference assignment
- More total GC work due to tracking and re-scanning
- Complexity in implementation and debugging
- Memory overhead for tri-color metadata
For GVL-bound workloads, shorter GC pauses mean threads spend less time waiting for GC to release the lock. This improves throughput paradoxically—less efficient GC produces better application performance by reducing queueing delays.
Compaction Integration
Ruby 2.7 introduced opt-in compaction (GC.compact) to defragment the heap. Ruby 3 enables automatic compaction alongside incremental GC when GC.auto_compact = true.
Compaction moves objects to eliminate fragmentation, but requires updating all references. The tri-color marking already tracks object reachability, providing the information needed for safe object movement. This integration leverages incremental marking’s bookkeeping for multiple purposes.
The Immix algorithm offers similar defragmentation through opportunistic evacuation. Ruby’s approach differs by separating concerns: incremental marking handles liveness, compaction handles placement.