
YJIT (Yet Another Ruby JIT) is Ruby's modern JIT compiler, which translates YARV bytecode into native machine code for improved performance. Understanding how Ruby actually executes JIT-compiled code reveals the sophisticated interplay between interpretation and compilation at runtime.

The Dual Execution Model

Ruby maintains two execution paths for every method:

  1. Interpretation path: Execute YARV bytecode through the VM
  2. JIT path: Execute native machine code directly on the CPU

The magic lies in how Ruby decides which path to take and seamlessly switches between them.

Method Execution Decision Flow:
┌─────────────────────┐
│  Method called      │
└──────────┬──────────┘
           │
           ▼
    ┌──────────────┐
    │ Check ISEQ   │
    │ jit_entry    │
    └──────┬───────┘
           │
      ┌────┴────┐
      │         │
   NULL?    Points to code?
      │         │
      ▼         ▼
┌───────────┐ ┌──────────────┐
│ Interpret │ │Execute native│
│ bytecode  │ │machine code  │
└───────────┘ └──────────────┘

ISEQ: The Central Data Structure

Every method in Ruby has an instruction sequence (ISEQ) - a compiled representation containing:

  • Bytecode instructions: The YARV bytecode representing the method logic
  • Call counter: Tracks how many times the method has been called
  • JIT entry point: A pointer that’s either NULL (not compiled) or points to native machine code

Think of the ISEQ as a method’s “execution record” that evolves as the program runs:

ISEQ Structure:
┌─────────────────────────────────┐
│ ISEQ (Instruction Sequence)     │
├─────────────────────────────────┤
│ bytecode: [putself, opt_send,   │ ← Original YARV instructions
│            leave]               │
├─────────────────────────────────┤
│ call_counter: 25                │ ← Incremented each call
├─────────────────────────────────┤
│ jit_entry: NULL                 │ ← Initially NULL
│            or                   │
│            0x7f8e2c004000       │ ← After JIT: points to native code
└─────────────────────────────────┘

The instruction sequence acts as a bridge between Ruby’s high-level semantics and low-level machine execution.
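You can inspect a method's ISEQ from Ruby itself via the CRuby-specific RubyVM::InstructionSequence API (the exact bytecode shown varies by Ruby version):

```ruby
# Inspecting a method's instruction sequence (CRuby-only API)
def greet(name)
  "Hello, #{name}"
end

iseq = RubyVM::InstructionSequence.of(method(:greet))
puts iseq.disasm  # dumps the YARV instructions for greet
```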

The Two-Phase Compilation Strategy

YJIT uses a carefully orchestrated two-phase approach that balances compilation overhead with optimization potential:

Phase 1: Profiling (25 calls)

Once a method reaches the profiling threshold of 25 calls, YJIT begins profiling:

  • Observes actual runtime types and values
  • Tracks which code paths execute most frequently
  • Collects data for type assumptions
  • Does NOT generate machine code yet

This profiling phase is crucial because it allows YJIT to make informed decisions about what to optimize. Compiling too early means missing optimization opportunities; compiling too late wastes interpreter cycles.

Phase 2: Compilation (30 calls)

After 30 total calls, YJIT compiles the method into native machine code:

  1. Analyzes profiled type information
  2. Generates optimized machine code based on observed patterns
  3. Inserts type guards to verify assumptions
  4. Updates the jit_entry pointer to the new native code
  5. Future calls execute the compiled version directly

graph LR
    A[Method defined] --> B[Calls: 0-24<br/>Interpret]
    B --> C[Call 25<br/>Start profiling]
    C --> D[Calls: 25-29<br/>Profile + Interpret]
    D --> E[Call 30<br/>Compile to native]
    E --> F[Calls: 30+<br/>Execute native code]
    F --> G{Type guard<br/>fails?}
    G -->|Yes| H[Deoptimize]
    G -->|No| F
    H --> B

This adaptive compilation strategy is a key pattern in modern JIT compilers - gather data first, optimize second.
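You can observe the result of this strategy from inside a running process. A minimal check, assuming a CRuby build with YJIT compiled in (RubyVM::YJIT.enabled? and RubyVM::YJIT.runtime_stats are real APIs; the script degrades gracefully when YJIT is off):

```ruby
# Querying YJIT's state at runtime
if defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
  stats = RubyVM::YJIT.runtime_stats
  puts "compiled ISEQs: #{stats[:compiled_iseq_count]}"
else
  puts "YJIT not enabled (start Ruby with --yjit)"
end
```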

Execution Dispatch: The Hot Path

When Ruby calls a method, it follows this sequence:

// Simplified execution logic
if (iseq->jit_entry != NULL) {
    // Fast path: Jump directly to native code
    result = iseq->jit_entry(args);
} else {
    // Slow path: Interpret bytecode
    result = vm_exec_core(iseq);
 
    // Maybe increment counter and trigger compilation
    if (++iseq->call_counter == YJIT_COMPILE_THRESHOLD) {
        yjit_compile(iseq);
    }
}

The jit_entry field acts as a dispatch switch. This simple pointer check determines whether the CPU executes:

  • Interpreted bytecode (hundreds of C instructions per Ruby operation)
  • Native machine code (dozens of CPU instructions per Ruby operation)

This dispatch mechanism is remarkably efficient - a single NULL check determines the execution path.
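The same dispatch idea can be modeled in plain Ruby (purely illustrative; the names mirror the simplified C above, not CRuby's actual structs):

```ruby
# Toy ISEQ with counter-based "compilation" (illustration only)
class ToyISeq
  THRESHOLD = 30

  attr_reader :call_counter

  def initialize(&bytecode)
    @bytecode = bytecode  # stands in for interpreted execution
    @jit_entry = nil      # NULL until "compiled"
    @call_counter = 0
  end

  def call(*args)
    return @jit_entry.call(*args) if @jit_entry  # fast path: pointer is set

    @call_counter += 1
    @jit_entry = @bytecode if @call_counter >= THRESHOLD  # "compile"
    @bytecode.call(*args)  # slow path: interpret
  end
end

double = ToyISeq.new { |x| x * 2 }
31.times { double.call(21) }
p double.call(21)  # => 42, served through the "jit_entry" fast path
```

After the 30th call the counter stops advancing because execution no longer reaches the slow path, which is exactly the behavior of the jit_entry dispatch switch.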

Type Guards and Speculative Optimization

YJIT’s performance comes from making assumptions about types. But Ruby is dynamically typed - types can change at runtime. How does YJIT handle this?

Enter type guards - runtime checks that verify assumptions:

def calculate(x)
  x + 10  # YJIT assumes x is Integer
end

YJIT generates code like:

; Type guard - verify x is Integer
test    rdi, 0x1          ; Check if Fixnum (LSB = 1)
jz      side_exit         ; If not, take side exit
 
; Fast path - native integer addition
lea     rax, [rdi + 20]   ; x + 10 (shifted for Fixnum encoding)
ret
 
side_exit:
; Side exit - restore interpreter state and fall back
; 1. Restore stack pointer, instruction pointer
; 2. Clear jit_entry pointer
jmp     vm_exec_core      ; Return to interpreter

The type guards with side exits enable YJIT to:

  • Generate fast specialized code for common cases
  • Take pre-defined exit routes when assumptions break (side exits)
  • Restore interpreter state seamlessly
  • Maintain Ruby’s dynamic semantics

A side exit is the mechanism that transitions from compiled code back to the interpreter when a guard fails. It restores VM state (stack, instruction pointer) before returning control to the bytecode interpreter.

This is the core pattern of speculative optimization: optimize for the common case, handle edge cases correctly via side exits.
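The guard-and-side-exit pattern can be sketched in plain Ruby (a toy model for illustration only; YJIT implements guards in generated machine code, not with exceptions):

```ruby
# Toy model: a specialized fast path protected by a type guard
def fast_calculate(x)
  raise TypeError, "guard failed" unless x.is_a?(Integer)  # type guard
  x + 10                                                   # specialized path
end

def calculate(x)
  fast_calculate(x)
rescue TypeError
  x + 10  # "side exit": generic fallback preserving Ruby semantics
end

p calculate(5)    # => 15, fast path
p calculate(2.5)  # => 12.5, guard fails, generic fallback
```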

De-optimization: When JIT Code Becomes Invalid

JIT-compiled code can become invalid and must be de-optimized (thrown away). Several triggers cause de-optimization:

1. Type Assumption Violations

When a type guard fails, YJIT must abandon the compiled code:

def process(x)
  x * 2
end
 
process(5)      # Call 30: Compiled assuming Integer
process(5)      # Fast: Type guard passes
process(3.14)   # Type guard fails! De-optimize

After de-optimization, the method returns to interpretation and may be re-compiled with new type information.

2. TracePoint Activation

Ruby’s TracePoint API allows observing program execution. When activated, YJIT must de-optimize to ensure the VM properly triggers events:

# JIT compiled method
def work
  computation
end
 
# Activate tracing - all JIT code de-optimized!
TracePoint.trace(:call) do |tp|
  puts "Called: #{tp.method_id}"
end

YARV's event hooks fire from the interpreter loop, so JIT-compiled code is incompatible with active tracing.
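Note that tracing only forces de-optimization while it is active; the block form of TracePoint#enable scopes this (standard API, minimal sketch):

```ruby
# Tracing is scoped: only traced calls must run through the interpreter
def work = 42

calls = []
tp = TracePoint.new(:call) { |t| calls << t.method_id }

tp.enable { work }  # traced call: JIT code is invalidated for its duration
work                # untraced call: eligible for (re)compilation again

p calls  # the traced Ruby-level calls that were observed
```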

3. Method Redefinition

Ruby allows redefining methods at runtime. When a method is redefined, all JIT code referencing it becomes invalid:

class Calculator
  def add(x, y)
    x + y
  end
end
 
# Method gets JIT compiled...
 
# Redefinition invalidates JIT code
class Calculator
  def add(x, y)
    x + y + 1  # Changed behavior
  end
end

YJIT tracks method dependencies and invalidates affected compiled code when definitions change.

4. Ractor Usage

Ractors (Ruby’s parallel execution model) introduce concurrency concerns. YJIT currently de-optimizes when Ractors are used to maintain thread safety guarantees.

See JIT de-optimization triggers for detailed exploration of each scenario.

Memory Management for JIT Code

YJIT’s compiled code lives in executable memory pages allocated from the OS. This introduces interesting memory management challenges:

  • Code cache: YJIT maintains a pool of executable memory
  • Code garbage collection: Unused compiled code must be reclaimed
  • Code patching: Updating jump targets when invalidating code

Unlike regular Ruby objects managed by the garbage collector, native code requires special handling. YJIT must:

  1. Track which compiled code is still reachable
  2. Unlink invalidated code from dispatch paths
  3. Reclaim executable memory for new compilations
  4. Handle concurrent execution during updates

The code cache acts as a fixed-size buffer. When full, YJIT must evict older compiled code to make room for new compilations - a tradeoff between compilation overhead and memory usage.

Performance Characteristics

Understanding YJIT’s mechanics reveals its performance profile:

When YJIT Excels

Hot loops: Methods called repeatedly benefit from compilation overhead being amortized across many calls.

# Excellent YJIT candidate
def fibonacci(n)
  return n if n <= 1
  fibonacci(n-1) + fibonacci(n-2)
end
 
fibonacci(30)  # Method called thousands of times

Type-stable code: Code where types remain consistent enables aggressive optimization.

# Type-stable - YJIT can optimize well
def process_numbers(array)
  array.map { |x| x * 2 }  # Always integers
end

Arithmetic and method calls: Native code for these operations is significantly faster than interpretation.
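Since Ruby 3.3, YJIT can even be enabled mid-process with RubyVM::YJIT.enable, which makes before/after comparisons possible in one script. A rough sketch, assuming a YJIT-capable build (it degrades to a plain benchmark otherwise; absolute numbers depend on hardware):

```ruby
require "benchmark"

# Hot, type-stable integer arithmetic: a good YJIT candidate
def sum_squares(n)
  total = 0
  i = 1
  while i <= n
    total += i * i
    i += 1
  end
  total
end

before = Benchmark.realtime { 500.times { sum_squares(10_000) } }

# Ruby 3.3+: turn YJIT on at runtime if this build supports it
if defined?(RubyVM::YJIT) && RubyVM::YJIT.respond_to?(:enable) && !RubyVM::YJIT.enabled?
  RubyVM::YJIT.enable
end

after = Benchmark.realtime { 500.times { sum_squares(10_000) } }
puts format("before: %.3fs  after: %.3fs", before, after)
```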

When YJIT Struggles

Rarely-called methods: Methods called fewer than 30 times never reach compilation, so any profiling work done after call 25 is wasted.

Highly polymorphic code: Frequent type changes trigger de-optimization, wasting compilation effort.

# Polymorphic - poor YJIT candidate
def process(x)
  x.process  # x changes type constantly
end
 
process(42)                # Integer
process("hello")           # String - de-optimize
process(CustomObject.new)  # De-optimize again

TracePoint-heavy code: Debugging/profiling tools force de-optimization, eliminating JIT benefits.

Cold code paths: Branches that rarely execute may get compiled yet seldom run - wasted memory.

See JIT compilation threshold for more on YJIT’s compilation strategy tradeoffs.

Comparing YJIT to Other Ruby JITs

MJIT (Ruby 2.6-3.2)

  • Strategy: Method-based compilation to C, then native code via an external compiler
  • Threshold: 10,000 calls (much higher than YJIT)
  • Performance: Good speedup, but high compilation overhead
  • Status: Removed in Ruby 3.3 in favor of YJIT

YJIT (Ruby 3.1+)

  • Strategy: Direct bytecode → machine code compilation
  • Threshold: 30 calls (much lower than MJIT)
  • Performance: Lower compilation overhead, faster warmup
  • Current: The primary JIT shipped with CRuby since 3.3 (enabled with --yjit)

ZJIT (Experimental, upstreamed)

  • Strategy: Profiling-focused compilation
  • Threshold: 2 calls for compilation after profiling
  • Innovation: More aggressive profiling, faster decisions
  • Status: Experimental, merged upstream into CRuby's development branch

The evolution shows a clear pattern: lower compilation thresholds, more sophisticated profiling, faster warmup times. Each generation learns from the previous, balancing compilation overhead against optimization potential.

The Future: Deferred Compilation

An emerging pattern in JIT design is deferred compilation - delaying compilation decisions until more information is available:

  • Tiered compilation: Multiple optimization levels (interpreter → basic JIT → optimizing JIT)
  • Online feedback: Continuous profiling even after compilation
  • Adaptive re-compilation: Re-optimize based on new patterns
  • Speculative inlining: Inline methods based on call patterns

YJIT’s two-phase approach hints at this direction. Future versions may add more tiers, more sophisticated profiling, and more aggressive optimization.

Practical Implications

Understanding YJIT mechanics informs practical decisions:

1. Profile before optimizing: YJIT already optimizes hot paths. Manual optimization may be wasted effort.

2. Keep types stable: Consistent types enable better JIT optimization than Ruby-level tricks.

# Less JIT-friendly: branches on the input type
def calculate(x)
  x.is_a?(Integer) ? x * 2 : x.to_i * 2
end
 
# More JIT-friendly: normalizes to one type up front
def calculate(x)
  x.to_i * 2  # Always operates on an Integer
end

3. Be aware of de-optimization triggers: TracePoint, method redefinition, and Ractors impact YJIT.

4. Warmup matters: Applications benefit from warmup periods before peak performance.

5. Monitor compilation: YJIT stats (via --yjit-stats) reveal what’s being compiled and why.

Understanding how Ruby executes JIT code reveals the sophisticated runtime optimization happening beneath Ruby’s simple syntax. The interplay between interpretation, profiling, compilation, and de-optimization creates a dynamic execution model that adapts to program behavior.

Key Patterns Identified

This exploration reveals several recurring patterns in JIT compilation:

Adaptive compilation: Gather data first, optimize second (profiling → compilation)

Speculative optimization: Assume common case, verify with guards, handle failures gracefully

Lazy optimization: Only compile hot code, ignore cold paths

Graceful degradation: De-optimize when assumptions break, allow re-optimization

Dual execution paths: Maintain both interpreted and compiled paths for flexibility

These patterns appear across modern JIT compilers (JavaScript V8, Java HotSpot, .NET CLR) - representing battle-tested solutions to the fundamental challenge of optimizing dynamic languages at runtime.