An instruction sequence (ISEQ) is the core data structure in YARV that represents compiled Ruby code. Every method, block, and class body is compiled into its own ISEQ, serving as the blueprint for execution. The ISEQ is what the VM actually executes, not the Ruby source code.
What is an Instruction Sequence?
An instruction sequence is the compiled form of Ruby code:
# Ruby source:
def add(a, b)
  a + b
end
# Compiled to instruction sequence:
# == disasm: #<ISeq:add>
# 0000 getlocal a ( 2)[Li]
# 0002 getlocal b
# 0004 opt_plus <calldata>
# 0006 leave
The bytecode compilation process transforms Ruby’s abstract syntax tree into this sequential instruction format that the YARV VM can execute efficiently.
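The same kind of listing can be produced from a running program. This is a minimal sketch, assuming CRuby (the RubyVM constant is implementation-specific). Exact instruction names and offsets vary by Ruby version, so the output may show specialized forms such as getlocal_WC_0 instead of getlocal:

```ruby
def add(a, b)
  a + b
end

# RubyVM::InstructionSequence.of returns the ISEQ behind a method or proc.
iseq = RubyVM::InstructionSequence.of(method(:add))

# #disasm renders the ISEQ as a human-readable listing.
listing = iseq.disasm
puts listing
```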
Structure Components
An ISEQ contains everything the VM needs to execute a piece of Ruby code:
ISEQ Components:
┌──────────────────────────────────────┐
│ Bytecode Instructions                │
│ ┌──────────────────────────────────┐ │
│ │ [putself, opt_send_without_block,│ │
│ │  leave]                          │ │
│ └──────────────────────────────────┘ │
├──────────────────────────────────────┤
│ Call Counter (for JIT)               │
│ ┌──────────────────────────────────┐ │
│ │ calls: 25                        │ │
│ └──────────────────────────────────┘ │
├──────────────────────────────────────┤
│ JIT Entry Point                      │
│ ┌──────────────────────────────────┐ │
│ │ NULL or 0x7f8e2c004000           │ │
│ └──────────────────────────────────┘ │
├──────────────────────────────────────┤
│ Metadata                             │
│ - Local variable table               │
│ - Argument information               │
│ - Source location info               │
│ - Catch table (exception handling)   │
│ - Constant pool                      │
└──────────────────────────────────────┘
Bytecode Instructions
The heart of an ISEQ is its array of YARV stack instructions. These instructions represent the compiled Ruby code with operands embedded inline:
# Ruby:
x = 42
# Instruction sequence:
# 0000 putobject 42 ← '42' is an operand
# 0002 setlocal x
The operand 42 is known at compile time and built into the ISEQ. At runtime, the VM reads it from the ISEQ. This differs from stack values, which are computed at runtime.
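One way to see operands sitting next to their instructions is RubyVM::InstructionSequence#to_a, which returns a plain-array form of the ISEQ. The sketch below assumes CRuby and assumes that the last element of that array is the instruction body, a list that mixes line numbers, labels, and one array per instruction:

```ruby
iseq = RubyVM::InstructionSequence.compile("x = 42")

# Assumption: the final element of #to_a is the instruction body.
body = iseq.to_a.last

# Keep only the entries that are instructions (arrays of opcode + operands).
instructions = body.grep(Array)
instructions.each { |insn| p insn }

# The literal 42 travels with its instruction as an operand.
has_inline_operand = instructions.include?([:putobject, 42])
puts "putobject carries 42 inline: #{has_inline_operand}"
```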
Local Variable Table
The ISEQ maintains a local variable table mapping variable names to stack positions:
def calculate(x, y)
  sum = x + y
  product = x * y
  sum + product
end
# Local table:
# 0: x (parameter)
# 1: y (parameter)
# 2: sum (local variable)
# 3: product (local variable)
This table enables the VM to:
- Access variables by offset (fast)
- Allocate the right amount of stack space
- Provide debugging information
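The local table of a real method can be read back in a similar way. This sketch assumes CRuby and assumes that element 10 of #to_a holds the local names. That matches current CRuby releases, but it is an internal format and not a stable API:

```ruby
def calculate(x, y)
  sum = x + y
  product = x * y
  sum + product
end

iseq = RubyVM::InstructionSequence.of(method(:calculate))

# Assumption: index 10 of the array form is the list of local names,
# parameters first, then other locals in order of appearance.
locals = iseq.to_a[10]
locals.each_with_index { |name, index| puts "#{index}: #{name}" }
```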
Argument Information
The ISEQ stores detailed argument metadata for method dispatch and validation:
def process(req, opt = 1, *rest, key:, key_opt: 2, **kwrest, &block)
  # ...
end
# Argument info:
# - Required positional: 1 (req)
# - Optional positional: 1 (opt)
# - Rest parameter: yes (*rest)
# - Required keyword: 1 (key:)
# - Optional keyword: 1 (key_opt:)
# - Keyword rest: yes (**kwrest)
# - Block parameter: yes (&block)
This information guides argument validation at call sites, stack frame setup, and method dispatch optimization.
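Ruby exposes a view of this metadata through Method#parameters. A short sketch using a method with the same signature as above (the name process is only for illustration):

```ruby
def process(req, opt = 1, *rest, key:, key_opt: 2, **kwrest, &block)
end

# Each entry is [kind, name]; the kinds correspond to the categories listed above.
params = method(:process).parameters
params.each { |kind, name| puts "#{kind.to_s.ljust(8)} #{name}" }
```

Here :req and :opt are positional, :keyreq and :key are required and optional keywords, and :rest, :keyrest, and :block are the three collector parameters.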
Constant Pool
The constant pool stores compile-time known values:
def example
  puts "Hello", 42, :symbol
end
# Constant pool:
# - "Hello" (string literal)
# - 42 (integer literal)
# - :symbol (symbol literal)
# - :puts (method name)
#
# Instructions reference pool by index:
# putstring @0 ← "Hello"
# putobject @1 ← 42
# putobject @2 ← :symbol
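This keeps bytecode compact: each literal object exists once and instructions refer to it. In CRuby the reference is the operand itself, a pointer-sized value stored inline in the instruction array, so the @0 and @1 indices above are illustrative rather than literal disassembly output. Nested ISEQs are referenced from their parent in the same way.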
Source Location Information
For debugging and error reporting, ISEQs maintain source location data:
# Associates each instruction with source location
{
  instruction_index: 0,
  source_file: "example.rb",
  line_number: 10,
  column: 5
}
This enables meaningful stack traces, debugger breakpoints, coverage analysis, and TracePoint events.
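The effect is easy to observe by compiling a snippet under a chosen file name and starting line. In this sketch the file name example.rb and the starting line 10 are arbitrary values picked for illustration, and CRuby is assumed:

```ruby
source = "x = 1\nraise 'boom'\n"

# Arguments: source, file name, real path (nil here), first line number.
iseq = RubyVM::InstructionSequence.compile(source, "example.rb", nil, 10)

puts iseq.path          # file name recorded in the ISEQ
puts iseq.first_lineno  # line number of the first source line

location = nil
begin
  iseq.eval
rescue RuntimeError => e
  # The backtrace is built from the location data stored in the ISEQ.
  location = e.backtrace_locations.first
  puts "#{location.path}:#{location.lineno}"
end
```

The raise sits on the second line of the snippet, so the backtrace reports the line after the configured starting line.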
Catch Table (Exception Handling)
The catch table maps instruction ranges to exception handlers:
def risky
  dangerous_operation
rescue StandardError => e
  handle_error(e)
ensure
  cleanup
end
# Catch table:
# [
# { type: :rescue, range: 0..5, target: 6 },
# { type: :ensure, range: 0..8, target: 9 }
# ]
When an exception occurs, the VM checks the current instruction index against the catch table, finds the matching handler, and jumps to the handler’s target instruction.
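The real catch table appears in disassembly output. A minimal sketch, assuming CRuby; the method is only compiled, never called, so the undefined helper names are harmless. The exact columns and offsets depend on the Ruby version:

```ruby
def risky
  dangerous_operation
rescue StandardError => e
  handle_error(e)
ensure
  cleanup
end

listing = RubyVM::InstructionSequence.of(method(:risky)).disasm

# Each catch table entry is printed on a line containing "catch type:".
catch_lines = listing.lines.grep(/catch type:/)
puts catch_lines
```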
JIT Integration
Two fields on the ISEQ tie the interpreter to the JIT compiler: a call counter that decides when to compile, and an entry pointer that decides how to execute.
Call Counter: The JIT Trigger
The call counter tracks how many times this ISEQ has been executed. This simple integer drives the JIT compilation decision:
struct rb_iseq_constant_body {
    // ... other fields
    unsigned int call_counter;
    // ...
};
The YJIT execution mechanics use this counter to implement the two-phase compilation strategy:
- 25 calls: Begin profiling
- 30 calls: Compile to native code
Each time the ISEQ executes, the VM increments this counter and checks if compilation should trigger.
JIT Entry Point: The Execution Switch
The jit_entry field is a function pointer that’s either NULL (not compiled) or points to native machine code:
struct rb_iseq_constant_body {
    // ... other fields
    void *jit_entry; // NULL or native code address
    // ...
};
This single field transforms how Ruby executes code:
Execution Decision:
    ┌─────────────────┐
    │ iseq->jit_entry │
    └────────┬────────┘
             │
        ┌────┴─────┐
        │          │
      NULL?     Address?
        │          │
        ▼          ▼
   ┌─────────┐ ┌──────────────┐
   │Interpret│ │Jump to native│
   │bytecode │ │machine code  │
   └─────────┘ └──────────────┘
The YJIT execution mechanics leverage this pointer to seamlessly switch between interpreted and compiled execution.
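The interplay between the counter and the entry pointer can be modelled in a few lines of Ruby. This is a toy model only: real ISEQs are C structs inside CRuby, and ToyIseq, PROFILE_THRESHOLD, and COMPILE_THRESHOLD are hypothetical names that borrow the thresholds quoted above. "Compiling" here just wraps the same block in a lambda.

```ruby
class ToyIseq
  PROFILE_THRESHOLD = 25  # hypothetical constant, mirrors "25 calls"
  COMPILE_THRESHOLD = 30  # hypothetical constant, mirrors "30 calls"

  attr_reader :call_counter, :jit_entry, :profiling

  def initialize(&body)
    @body = body
    @call_counter = 0
    @jit_entry = nil       # nil plays the role of NULL
    @profiling = false
  end

  def call(*args)
    # Fast path: an entry point exists, so skip the interpreter entirely.
    return @jit_entry.call(*args) if @jit_entry

    @call_counter += 1
    @profiling = true if @call_counter >= PROFILE_THRESHOLD
    @jit_entry = compile if @call_counter >= COMPILE_THRESHOLD
    @body.call(*args)      # stands in for "interpret bytecode"
  end

  private

  # Stand-in for native code generation.
  def compile
    body = @body
    ->(*args) { body.call(*args) }
  end
end

add = ToyIseq.new { |a, b| a + b }
29.times { add.call(1, 2) }
compiled_before = !add.jit_entry.nil?
add.call(1, 2)
compiled_after = !add.jit_entry.nil?

puts "compiled after 29 calls: #{compiled_before}"
puts "compiled after 30 calls: #{compiled_after}"
```

Once the entry point is set, later calls bypass the counter, which is why the model stops counting at the compile threshold.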
ISEQ as JIT Bridge
The ISEQ serves as the coordination point between interpretation and compilation:
ISEQ as JIT Bridge:
  ┌─────────────────────┐
  │  Ruby Source Code   │
  └──────────┬──────────┘
             ▼
      ┌──────────────┐
      │    Parse     │
      └──────┬───────┘
             ▼
      ┌──────────────┐
      │     ISEQ     │ ◄─── Central structure
      └──────┬───────┘
             │
       ┌─────┴───────┐
       ▼             ▼
 ┌───────────┐ ┌──────────┐
 │Interpreter│ │ YJIT     │
 │reads      │ │generates │
 │bytecode   │ │native    │
 │           │ │code      │
 └───────────┘ └─────┬────┘
                     │
                     ▼
              ┌──────────────┐
              │jit_entry ptr │
              │updated       │
              └──────────────┘
The ISEQ provides:
- Bytecode for interpretation
- Profiling data for optimization decisions
- Storage for the jit_entry pointer
- Metadata for code generation
ISEQ Lifecycle
An ISEQ progresses through several states:
- Creation: Compiled from Ruby source during parse/load
- Interpretation: Executed by VM, call counter increments
- Profiling: YJIT observes types and patterns (25+ calls)
- Compilation: Native code generated, jit_entry populated (30+ calls)
- Execution: Direct jump to native code
- De-optimization: Native code invalidated, jit_entry cleared
- Garbage Collection: ISEQ freed when no longer referenced
The ISEQ persists across this lifecycle, acting as the stable reference point as execution strategies change.
Compilation Process
Bytecode compilation transforms Ruby AST to instruction sequences:
Ruby Source → Parser → AST → Compiler → Instruction Sequence → YARV VM
Not every AST node produces an ISEQ; only nodes that open a new scope do:
- Method definitions → method iseq
- Block expressions → block iseq
- Class/module bodies → class iseq
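Once compiled, an ISEQ is reused for every call to its method or block; the body is not recompiled per invocation. String eval is the exception, because each call parses and compiles its argument again:
# Each call compiles the string from scratch:
eval("1 + 2") # Parse → Compile → Execute
# Same string, second call:
eval("1 + 2") # Parse → Compile → Execute again
require is fast for already-loaded files for a different reason: the path is already recorded in $LOADED_FEATURES, so the file is not read or compiled again.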
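Compiled ISEQs can also be saved and restored explicitly, which is the mechanism bytecode caches build on. A minimal sketch, assuming CRuby; the binary format is tied to the Ruby version that produced it, so it is not portable between versions:

```ruby
iseq = RubyVM::InstructionSequence.compile("1 + 2")

# Serialize the compiled ISEQ to a binary String.
binary = iseq.to_binary

# Rebuild an ISEQ from the binary without parsing or compiling the source again.
restored = RubyVM::InstructionSequence.load_from_binary(binary)
result = restored.eval
puts result
```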
Hierarchy and Nesting
Ruby programs form a tree of instruction sequences mirroring the code structure:
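class MyClass             # Class body ISEQ (child of <main>)
  def method1             # Method ISEQ (child of the class body)
    [1, 2].map do |x|     # Block ISEQ (child of method1)
      x * 2
    end
  end
  def method2             # Method ISEQ (child of the class body)
    if condition          # No new ISEQ: `if` compiles to branch instructions
      3.times do          # Block ISEQ (child of method2)
        # code here
      end
    end
  end
end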
Each ISEQ:
- Has a parent ISEQ reference
- Contains child ISEQs for nested code (stored in constant pool)
- Maintains its own execution context
This hierarchy enables lexical scope resolution, closure variable capture, and proper frame management.
The tree structure looks like:
<main>
└── <class:MyClass>
    ├── method1
    │   └── block in method1
    └── method2
        └── block in method2
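A tree like this can be printed for real code by walking child ISEQs. The sketch below assumes CRuby 2.6 or later, where RubyVM::InstructionSequence#each_child is available. The helper name walk is made up for this example, and because the snippet is compiled from a string, the root is labelled <compiled> rather than <main>:

```ruby
source = <<~RUBY
  class MyClass
    def method1
      [1, 2].map { |x| x * 2 }
    end
  end
RUBY

# Hypothetical helper: collects one indented label per ISEQ, depth first.
def walk(iseq, depth = 0, out = [])
  out << "#{'  ' * depth}#{iseq.label}"
  iseq.each_child { |child| walk(child, depth + 1, out) }
  out
end

labels = walk(RubyVM::InstructionSequence.compile(source))
puts labels
```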
The Program Counter and ISEQs
A frame executes one instruction sequence at a time, with the program counter pointing to the current position:
def example
  a = 1 # ← PC at instruction 0
  b = 2 # ← PC at instruction 2
  a + b # ← PC at instruction 4
end
# Frame for 'example':
# iseq: <ISeq:example>
# PC: 4 (currently executing the instructions for 'a + b')
When a call site invokes a method, a new frame is created with a different ISEQ and its own PC.
Performance Characteristics
Understanding ISEQ structure clarifies performance characteristics:
Compact Representation
ISEQs are designed for efficient memory usage:
Memory Layout:
┌──────────────────────────┐
│ ISEQ Header (~200 bytes) │ ← Frequently accessed
├──────────────────────────┤
│ Instruction Array        │ ← Hot path during execution
├──────────────────────────┤
│ Operand Pool             │ ← Referenced by instructions
├──────────────────────────┤
│ Metadata (cold data)     │ ← Rarely accessed during execution
└──────────────────────────┘
This layout reflects mechanical sympathy:
- Hot data (instructions, operands) packed together for cache locality
- Cold data (debug info, source locations) separated
- Minimal indirection for common operations
- Variable-width encoding
- Shared constant pools
- Operands embedded directly
Execution Efficiency
Sequential access: The program counter increments predictably, making execution cache-friendly.
Instruction count matters: Each instruction has dispatch overhead, even if small, so fewer instructions generally means faster interpreted execution.
JIT threshold is per-ISEQ: Each method/block has its own counter, so small frequently-called methods JIT compile quickly.
Metadata is cheap: Source location and debug info don’t impact hot path performance - they’re only accessed on errors/debugging.
ISEQ size affects memory: Large methods create large ISEQs. Consider breaking up massive methods.
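A rough way to compare instruction counts is to count the instruction entries in the array form of an ISEQ. This is a sketch, assuming CRuby and assuming the last element of #to_a is the instruction body. The helper instruction_count is made up for this example and only counts the top-level ISEQ, not nested methods or blocks:

```ruby
# Hypothetical helper: number of instructions in the top-level ISEQ of a snippet.
def instruction_count(source)
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  # Instructions are arrays; line numbers and labels are not.
  body.count { |entry| entry.is_a?(Array) }
end

short_count = instruction_count("x = 1")
long_count = instruction_count("x = 1; y = 2; z = x + y")

puts "short snippet: #{short_count} instructions"
puts "long snippet:  #{long_count} instructions"
```

Counts like these are only a proxy for cost, since individual instructions differ widely in how much work they do.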
Viewing and Inspecting ISEQs
Ruby provides tools to inspect ISEQs:
# Using RubyVM::InstructionSequence
code = "1 + 2"
iseq = RubyVM::InstructionSequence.compile(code)
puts iseq.disasm
# Output:
# == disasm: #<ISeq:<compiled>>
# 0000 putobject 1
# 0002 putobject 2
# 0004 opt_plus <calldata>
# 0006 leave
# Using the --dump=insns flag (run from the shell, not from Ruby):
ruby --dump=insns -e '1 + 2'
This is essential for understanding how YARV executes your code.
Key Insights
- Compiled Form: ISEQs are Ruby’s compiled bytecode representation
- Hierarchical: Nested scopes create nested ISEQs stored in constant pools
- Self-Contained: Each ISEQ includes instructions, constants, metadata, and JIT integration
- Execution Unit: Frames execute one ISEQ at a time via the program counter
- JIT Coordination: The ISEQ bridges interpretation and compilation via call counter and jit_entry pointer
- Optimization Target: JIT and other optimizations work on ISEQs
- Inspectable: Ruby provides tools to examine ISEQs
- Cached: ISEQs are compiled once and reused
- Performance-Oriented: Memory layout optimized for hot/cold data separation
Understanding instruction sequences is key to grasping how YARV translates Ruby source code into executable bytecode, manages program execution, and coordinates the transition from interpretation to JIT-compiled code.