A write barrier is a piece of code executed whenever the program modifies an object reference. It maintains critical invariants for incremental and generational garbage collectors, ensuring correctness when collection is interleaved with program execution.
The name suggests protection—like memory barriers in concurrent programming—but write barriers protect the garbage collector’s assumptions rather than memory ordering.
The Problem: Concurrent Modification
Without write barriers, incremental and generational GC would be unsound:
Incremental GC Problem: During tri-color marking, a black object (fully traced) gains a reference to a white object (not yet traced). The collector assumes black objects point only to black or gray objects. The white object appears unreachable and gets collected incorrectly.
graph LR A[Black Object<br/>Fully Traced] B[White Object<br/>Not Yet Traced] C[GC Assumptions<br/>Broken!] A -.->|"Program assigns<br/>new reference"| B B -.->|"Appears unreachable"| C style A fill:#000,stroke:#333,color:#fff style B fill:#fff,stroke:#333,color:#000 style C fill:#f88,stroke:#333,color:#000
Generational GC Problem: An old generation object gains a reference to a young generation object between minor collections. Minor GC only scans young objects, so it misses this reference and incorrectly collects the young object.
graph TB subgraph "Old Generation" O1[Old Object] end subgraph "Young Generation" Y1[Young Object A] Y2[Young Object B] end O1 -.->|"New reference<br/>between minor GCs"| Y2 Y1 --> Y2 style O1 fill:#8cf,stroke:#333,color:#000 style Y1 fill:#fc8,stroke:#333,color:#000 style Y2 fill:#fc8,stroke:#333,color:#000
Both scenarios break the collector’s invariants about what needs scanning. Write barriers detect these invariant violations.
How Write Barriers Work
The barrier intercepts every reference assignment: obj.field = other_obj
For Incremental GC (Tri-Color)
When assigning a reference, check the colors:
def write_barrier_tricolor(obj, field, new_ref)
if obj.color == BLACK && new_ref.color == WHITE
obj.color = GRAY # Re-add to work queue
end
obj[field] = new_ref # Perform actual write
end
This maintains the invariant: black objects never point directly to white objects. By re-graying the black object, we ensure the new reference will be traced.
graph TB subgraph "Before Write Barrier" A1[Black Object] B1[White Object] end subgraph "After Write Barrier" A2[Gray Object<br/>Re-added to queue] B2[White Object<br/>Will be traced] A2 --> B2 end A1 -.->|"Assignment triggers<br/>write barrier"| A2 style A1 fill:#000,stroke:#333,color:#fff style B1 fill:#fff,stroke:#333,color:#000 style A2 fill:#888,stroke:#333,color:#fff style B2 fill:#fff,stroke:#333,color:#000
For Generational GC
Track cross-generational references:
def write_barrier_generational(obj, field, new_ref)
if obj.generation == OLD && new_ref.generation == YOUNG
remember_set.add(obj) # Track this old object
end
obj[field] = new_ref
end
The “remembered set” contains old objects that reference young objects. Minor GC scans the remembered set to find young objects reachable from old objects, avoiding full heap scans.
graph TB subgraph "Remembered Set" RS[Old Objects<br/>with Young Refs] O1[Old Object A] O2[Old Object B] RS -.-> O1 RS -.-> O2 end subgraph "Young Generation" Y1[Young Object] Y2[Young Object] Y3[Young Object] end O1 --> Y1 O2 --> Y2 Y1 --> Y3 GC[Minor GC] -.->|"Scans only"| RS GC -.->|"Plus entire"| Y1 style RS fill:#fc8,stroke:#333,color:#000 style O1 fill:#8cf,stroke:#333,color:#000 style O2 fill:#8cf,stroke:#333,color:#000 style Y1 fill:#fc8,stroke:#333,color:#000 style Y2 fill:#fc8,stroke:#333,color:#000 style Y3 fill:#fc8,stroke:#333,color:#000 style GC fill:#f8f,stroke:#333,color:#000
Performance Costs
Write barriers impose overhead on every reference assignment—potentially the most frequent operation in managed languages. The cost manifests in several ways:
Direct Costs:
- Conditional checks before each write
- Potential set operations (remembered set insertion)
- Cache pressure from barrier bookkeeping
Indirect Costs:
- Compiler optimization constraints (writes have side effects)
- Increased code size (barrier inlined at every write site)
- Branch mispredictions if color/generation checks are unpredictable
For high-concurrency Ruby applications, write barrier overhead multiplies across thousands of fiber switches and object manipulations. The tradeoff: accept per-write costs to enable cheaper collection.
Write Barrier Unprotected Objects
Some objects bypass write barriers for performance. These “write barrier unprotected” objects include:
- Frequently modified objects: Hash tables, arrays with intensive updates
- Objects with special semantics: Thread-local storage, VM internals
- Performance-critical paths: Hot loops where barrier overhead dominates
The collector compensates by scanning all unprotected objects at the end of incremental marking or during minor GC. This works because typically few objects are unprotected—the set remains small enough for final scanning.
This optimization embodies pragmatic performance decisions: violate the abstraction (skip barriers) where profiling proves it worthwhile, but maintain correctness through conservative fallbacks.
Card Marking: Space-Efficient Barriers
For generational GC, maintaining a per-object remembered set grows large. Card marking optimizes space by tracking dirty regions rather than individual objects:
- Divide heap into fixed-size cards (e.g., 512 bytes)
- Write barrier marks the card containing the modified object as “dirty”
- Minor GC scans all dirty cards to find cross-generational references
This trades precision for space: we scan entire cards even if only one object changed, but the card table stays compact. A 1GB heap with 512-byte cards needs only 2MB for the card table.
graph TB subgraph "Heap Memory" C1["Card 0<br/>(512 bytes)"] C2["Card 1<br/>(512 bytes)"] C3["Card 2<br/>(512 bytes)"] C4["Card 3<br/>(512 bytes)"] end subgraph "Card Table" T1["0: Clean"] T2["1: Dirty ✓"] T3["2: Clean"] T4["3: Dirty ✓"] end C1 -.-> T1 C2 -.-> T2 C3 -.-> T3 C4 -.-> T4 WB[Write Barrier] -->|"Marks card dirty<br/>on modification"| T2 style T2 fill:#f88,stroke:#333,color:#000 style T4 fill:#f88,stroke:#333,color:#000 style T1 fill:#8f8,stroke:#333,color:#000 style T3 fill:#8f8,stroke:#333,color:#000 style WB fill:#88f,stroke:#333,color:#fff
Card marking resembles architectural sympathy—align data structures with hardware (cache lines, pages) to improve efficiency.
Implementation Strategies
Write barriers can be implemented at different levels:
Compiler-inserted: The compiler injects barrier checks around reference assignments. This enables optimization: elide redundant barriers, combine multiple checks, specialize for object types.
Runtime-check: Every reference write calls a runtime function. Simpler but slower—function call overhead per write.
Hardware-assisted: Some platforms use memory protection or hardware watchpoints. Rare due to portability and performance challenges.
Ruby uses compiler-inserted barriers in its bytecode compiler and C-level object manipulation, balancing portability with performance.
Interaction with Other GC Features
Write barriers enable sophisticated GC strategies by maintaining invariants:
Concurrent GC: Barriers synchronize collector and mutator (application) threads without full synchronization. The mutator helps the collector by maintaining reachability information through barriers.
Compaction: Moving objects requires updating all references. Barriers track which objects contain references, enabling efficient reference fixup during compaction.
Reference Counting: Barriers increment/decrement reference counts on assignments. Every reference modification updates counts, enabling immediate reclamation but struggling with cycles.
The fundamental tradeoffs appear again: write barriers are a form of incremental bookkeeping—spend small costs frequently to avoid large costs later.
Connection to Memory Barriers
The terminology echoes memory barriers in concurrent programming, but serves different purposes:
Memory Barriers (fences): Order memory operations across threads, ensuring visibility and preventing reordering. Protect against race conditions and inconsistencies.
Write Barriers (GC): Inform the garbage collector about reference modifications. Protect against incorrect collection.
Both involve “barrier” semantics—executing extra logic when certain operations occur. Both trade performance for correctness. The conceptual similarity reflects a deeper pattern: complex systems require invariant maintenance, and barriers checkpoint that maintenance at critical operations.
Optimizing Write Barrier Overhead
Modern VMs employ several techniques to reduce barrier costs:
Static Analysis: Prove some writes don’t need barriers. If both objects are young, no generational barrier needed. If we know no GC runs during a scope, skip barriers entirely.
Specialization: Generate different code paths for common cases. Objects with only primitive fields skip reference barriers.
Batching: Accumulate dirty information and process in batches rather than immediately. Trade freshness for throughput.
Hardware Support: Leverage CPU features like conditional moves to reduce branch costs in barrier checks.
These optimizations mirror JIT compilation strategies—profile to find hot paths, specialize for common cases, trade generality for speed.