source

A write barrier is a piece of code executed whenever the program modifies an object reference. It maintains critical invariants for incremental and generational garbage collectors, ensuring correctness when collection is interleaved with program execution.

The name suggests protection—like memory barriers in concurrent programming—but write barriers protect the garbage collector’s assumptions rather than memory ordering.

The Problem: Concurrent Modification

Without write barriers, incremental and generational GC would be unsound:

Incremental GC Problem: During tri-color marking, a black object (fully traced) gains a reference to a white object (not yet traced). The collector assumes black objects point only to black or gray objects. The white object appears unreachable and gets collected incorrectly.

graph LR
    A[Black Object<br/>Fully Traced]
    B[White Object<br/>Not Yet Traced]
    C[GC Assumptions<br/>Broken!]

    A -.->|"Program assigns<br/>new reference"| B
    B -.->|"Appears unreachable"| C

    style A fill:#000,stroke:#333,color:#fff
    style B fill:#fff,stroke:#333,color:#000
    style C fill:#f88,stroke:#333,color:#000

Generational GC Problem: An old generation object gains a reference to a young generation object between minor collections. Minor GC only scans young objects, so it misses this reference and incorrectly collects the young object.

graph TB
    subgraph "Old Generation"
    O1[Old Object]
    end

    subgraph "Young Generation"
    Y1[Young Object A]
    Y2[Young Object B]
    end

    O1 -.->|"New reference<br/>between minor GCs"| Y2
    Y1 --> Y2

    style O1 fill:#8cf,stroke:#333,color:#000
    style Y1 fill:#fc8,stroke:#333,color:#000
    style Y2 fill:#fc8,stroke:#333,color:#000

Both scenarios break the collector’s invariants about what needs scanning. Write barriers detect these invariant violations.

How Write Barriers Work

The barrier intercepts every reference assignment: obj.field = other_obj

For Incremental GC (Tri-Color)

When assigning a reference, check the colors:

def write_barrier_tricolor(obj, field, new_ref)
  if obj.color == BLACK && new_ref.color == WHITE
    obj.color = GRAY  # Re-add to work queue
  end
  obj[field] = new_ref  # Perform actual write
end

This maintains the invariant: black objects never point directly to white objects. By re-graying the black object, we ensure the new reference will be traced.

graph TB
    subgraph "Before Write Barrier"
    A1[Black Object]
    B1[White Object]
    end

    subgraph "After Write Barrier"
    A2[Gray Object<br/>Re-added to queue]
    B2[White Object<br/>Will be traced]
    A2 --> B2
    end

    A1 -.->|"Assignment triggers<br/>write barrier"| A2

    style A1 fill:#000,stroke:#333,color:#fff
    style B1 fill:#fff,stroke:#333,color:#000
    style A2 fill:#888,stroke:#333,color:#fff
    style B2 fill:#fff,stroke:#333,color:#000

For Generational GC

Track cross-generational references:

def write_barrier_generational(obj, field, new_ref)
  if obj.generation == OLD && new_ref.generation == YOUNG
    remember_set.add(obj)  # Track this old object
  end
  obj[field] = new_ref
end

The “remembered set” contains old objects that reference young objects. Minor GC scans the remembered set to find young objects reachable from old objects, avoiding full heap scans.

graph TB
    subgraph "Remembered Set"
    RS[Old Objects<br/>with Young Refs]
    O1[Old Object A]
    O2[Old Object B]
    RS -.-> O1
    RS -.-> O2
    end

    subgraph "Young Generation"
    Y1[Young Object]
    Y2[Young Object]
    Y3[Young Object]
    end

    O1 --> Y1
    O2 --> Y2
    Y1 --> Y3

    GC[Minor GC] -.->|"Scans only"| RS
    GC -.->|"Plus entire"| Y1

    style RS fill:#fc8,stroke:#333,color:#000
    style O1 fill:#8cf,stroke:#333,color:#000
    style O2 fill:#8cf,stroke:#333,color:#000
    style Y1 fill:#fc8,stroke:#333,color:#000
    style Y2 fill:#fc8,stroke:#333,color:#000
    style Y3 fill:#fc8,stroke:#333,color:#000
    style GC fill:#f8f,stroke:#333,color:#000

Performance Costs

Write barriers impose overhead on every reference assignment—potentially the most frequent operation in managed languages. The cost manifests in several ways:

Direct Costs:

  • Conditional checks before each write
  • Potential set operations (remembered set insertion)
  • Cache pressure from barrier bookkeeping

Indirect Costs:

  • Compiler optimization constraints (writes have side effects)
  • Increased code size (barrier inlined at every write site)
  • Branch mispredictions if color/generation checks are unpredictable

For high-concurrency Ruby applications, write barrier overhead multiplies across thousands of fiber switches and object manipulations. The tradeoff: accept per-write costs to enable cheaper collection.

Write Barrier Unprotected Objects

Some objects bypass write barriers for performance. These “write barrier unprotected” objects include:

  • Frequently modified objects: Hash tables, arrays with intensive updates
  • Objects with special semantics: Thread-local storage, VM internals
  • Performance-critical paths: Hot loops where barrier overhead dominates

The collector compensates by scanning all unprotected objects at the end of incremental marking or during minor GC. This works because typically few objects are unprotected—the set remains small enough for final scanning.

This optimization embodies pragmatic performance decisions: violate the abstraction (skip barriers) where profiling proves it worthwhile, but maintain correctness through conservative fallbacks.

Card Marking: Space-Efficient Barriers

For generational GC, maintaining a per-object remembered set grows large. Card marking optimizes space by tracking dirty regions rather than individual objects:

  1. Divide heap into fixed-size cards (e.g., 512 bytes)
  2. Write barrier marks the card containing the modified object as “dirty”
  3. Minor GC scans all dirty cards to find cross-generational references

This trades precision for space: we scan entire cards even if only one object changed, but the card table stays compact. A 1GB heap with 512-byte cards needs only 2MB for the card table.

graph TB
    subgraph "Heap Memory"
    C1["Card 0<br/>(512 bytes)"]
    C2["Card 1<br/>(512 bytes)"]
    C3["Card 2<br/>(512 bytes)"]
    C4["Card 3<br/>(512 bytes)"]
    end

    subgraph "Card Table"
    T1["0: Clean"]
    T2["1: Dirty ✓"]
    T3["2: Clean"]
    T4["3: Dirty ✓"]
    end

    C1 -.-> T1
    C2 -.-> T2
    C3 -.-> T3
    C4 -.-> T4

    WB[Write Barrier] -->|"Marks card dirty<br/>on modification"| T2

    style T2 fill:#f88,stroke:#333,color:#000
    style T4 fill:#f88,stroke:#333,color:#000
    style T1 fill:#8f8,stroke:#333,color:#000
    style T3 fill:#8f8,stroke:#333,color:#000
    style WB fill:#88f,stroke:#333,color:#fff

Card marking resembles architectural sympathy—align data structures with hardware (cache lines, pages) to improve efficiency.

Implementation Strategies

Write barriers can be implemented at different levels:

Compiler-inserted: The compiler injects barrier checks around reference assignments. This enables optimization: elide redundant barriers, combine multiple checks, specialize for object types.

Runtime-check: Every reference write calls a runtime function. Simpler but slower—function call overhead per write.

Hardware-assisted: Some platforms use memory protection or hardware watchpoints. Rare due to portability and performance challenges.

Ruby uses compiler-inserted barriers in its bytecode compiler and C-level object manipulation, balancing portability with performance.

Interaction with Other GC Features

Write barriers enable sophisticated GC strategies by maintaining invariants:

Concurrent GC: Barriers synchronize collector and mutator (application) threads without full synchronization. The mutator helps the collector by maintaining reachability information through barriers.

Compaction: Moving objects requires updating all references. Barriers track which objects contain references, enabling efficient reference fixup during compaction.

Reference Counting: Barriers increment/decrement reference counts on assignments. Every reference modification updates counts, enabling immediate reclamation but struggling with cycles.

The fundamental tradeoffs appear again: write barriers are a form of incremental bookkeeping—spend small costs frequently to avoid large costs later.

Connection to Memory Barriers

The terminology echoes memory barriers in concurrent programming, but serves different purposes:

Memory Barriers (fences): Order memory operations across threads, ensuring visibility and preventing reordering. Protect against race conditions and inconsistencies.

Write Barriers (GC): Inform the garbage collector about reference modifications. Protect against incorrect collection.

Both involve “barrier” semantics—executing extra logic when certain operations occur. Both trade performance for correctness. The conceptual similarity reflects a deeper pattern: complex systems require invariant maintenance, and barriers checkpoint that maintenance at critical operations.

Optimizing Write Barrier Overhead

Modern VMs employ several techniques to reduce barrier costs:

Static Analysis: Prove some writes don’t need barriers. If both objects are young, no generational barrier needed. If we know no GC runs during a scope, skip barriers entirely.

Specialization: Generate different code paths for common cases. Objects with only primitive fields skip reference barriers.

Batching: Accumulate dirty information and process in batches rather than immediately. Trade freshness for throughput.

Hardware Support: Leverage CPU features like conditional moves to reduce branch costs in barrier checks.

These optimizations mirror JIT compilation strategies—profile to find hot paths, specialize for common cases, trade generality for speed.