source

Incremental garbage collection breaks the marking phase into smaller, interleaved steps rather than running it all at once. This minimizes pause times by distributing GC work across program execution, addressing the latency problem in traditional stop-the-world collectors.

Ruby 2.2 introduced incremental GC, with Ruby 3 refining the implementation. The core insight: since Ruby has a stop-the-world GC where no Ruby code runs during collection, splitting the marking phase significantly reduces application stutter.

The Tri-Color Marking Algorithm

Incremental marking relies on a tri-color abstraction that categorizes objects into three states:

  • White: Objects not yet visited by the GC. These are potential garbage unless proven reachable.
  • Gray: Objects visited by the GC but whose references haven’t been fully traced yet. These form the work queue.
  • Black: Objects fully processed—visited with all their references traced. These are definitely live.

The algorithm maintains the invariant: black objects never point directly to white objects. This ensures correctness when marking pauses and resumes between execution slices.

Marking Process

  1. Initial Phase: All objects start white. Root references (stack, globals) are colored gray.
  2. Incremental Steps: During each marking slice:
    • Take gray objects from the work queue
    • Trace their references, coloring reachable white objects gray
    • Color the now-fully-traced object black
  3. Termination: When no gray objects remain, all reachable objects are black. Remaining white objects are garbage.
sequenceDiagram
    participant App as Ruby Application
    participant GC as Incremental GC
    participant Heap as Object Heap

    Note over Heap: All objects WHITE

    GC->>Heap: Mark roots GRAY
    Note over Heap: Roots: stack, globals

    loop Incremental Marking Cycles
        App->>GC: Pause (short)
        GC->>Heap: Process gray objects
        Note over Heap: WHITE → GRAY (discovered)<br/>GRAY → BLACK (processed)
        GC->>App: Resume

        App->>App: Execute Ruby code

        opt Reference Assignment
            App->>Heap: black_obj.ref = white_obj
            Note over Heap: Write barrier triggered!
            Heap->>GC: Re-color BLACK → GRAY
        end
    end

    Note over GC: No gray objects remain

    GC->>Heap: Scan unprotected objects
    GC->>Heap: Sweep WHITE objects
    Note over Heap: WHITE objects = garbage

    GC->>App: Collection complete

This resembles backpropagation’s gradient flow—information (liveness) propagates through the object graph incrementally, each step building on previous work.

Write Barriers: Maintaining Correctness

The tri-color invariant breaks when the program modifies object references during a marking pause. A black object might gain a reference to a white object, making that white object incorrectly appear as garbage.

Write barriers solve this by intercepting reference assignments. When a black object obtains a reference to a white object:

  1. The write barrier detects the invariant violation
  2. The black object is re-colored gray
  3. The GC will revisit this object in the next marking slice

This forces the collector to trace the new reference, ensuring correctness. The cost: every reference assignment must check if it violates the tri-color invariant.

Write Barrier Unprotected Objects

Some objects skip write barrier protection for performance—typically objects frequently modified or with special semantics. These “write barrier unprotected” objects are scanned completely at the end of incremental marking to catch any missed references.

This optimization recognizes that some objects change too frequently for per-write checks to be worthwhile. The tradeoff between throughput and latency appears again—skip barriers for performance, but pay with final scanning cost.

Implementation in Ruby

Ruby’s incremental GC operates on RVALUEs (Ruby values), the fundamental object representation. The marking phase chunks into increments bounded by time or object count, pausing to let Ruby code execute.

Key implementation details:

  • Interleaved Execution: Mark a batch of objects, yield to Ruby execution, repeat
  • Stop-the-World Constraint: No Ruby code runs during each marking slice, but slices are short
  • Pause Time Reduction: Instead of one 100ms pause, distribute into ten 10ms pauses
  • Memory Overhead: Tri-color metadata and write barrier checks add cost

This design balances Ruby’s concurrency model where the GVL controls execution. Incremental GC respects the GVL while reducing how long it’s held during collection.

Generational and Incremental: Complementary Techniques

Ruby 3 combines generational GC with incremental marking:

  • Generational GC exploits the weak generational hypothesis (most objects die young) by running minor collections on young objects more frequently than major collections on all objects
  • Incremental GC splits the marking phase of these collections into smaller steps

Minor GC cycles mark only young objects incrementally—fast and frequent. Major GC cycles mark both generations incrementally—slower but comprehensive. This hybrid approach optimizes for both throughput and latency.

The interaction creates interesting dynamics. Young objects often die before incremental marking completes, reducing work. Old objects benefit most from incremental marking since scanning long-lived object graphs causes the longest pauses.

Performance Implications

Incremental GC trades throughput for reduced latency:

Benefits:

  • Shorter, more predictable pause times
  • Better responsiveness for interactive applications
  • Reduced tail latencies in web services
  • Enables high-concurrency patterns by minimizing GC interruptions

Costs:

  • Write barrier overhead on every reference assignment
  • More total GC work due to tracking and re-scanning
  • Complexity in implementation and debugging
  • Memory overhead for tri-color metadata

For GVL-bound workloads, shorter GC pauses mean threads spend less time waiting for GC to release the lock. This improves throughput paradoxically—less efficient GC produces better application performance by reducing queueing delays.

Compaction Integration

Ruby 2.7 introduced opt-in compaction (GC.compact) to defragment the heap. Ruby 3 enables automatic compaction alongside incremental GC when GC.auto_compact = true.

Compaction moves objects to eliminate fragmentation, but requires updating all references. The tri-color marking already tracks object reachability, providing the information needed for safe object movement. This integration leverages incremental marking’s bookkeeping for multiple purposes.

The Immix algorithm offers similar defragmentation through opportunistic evacuation. Ruby’s approach differs by separating concerns: incremental marking handles liveness, compaction handles placement.