The fundamental operations in YARV revolve around manipulating the stack data structure. This note explores the primary stack manipulation instructions that push values onto the stack - the building blocks of Ruby program execution.
Instruction Overview
Most YARV instructions follow the same pattern:
- Read values from the stack (if needed)
- Perform an operation
- Push result(s) back onto the stack
The “push” family of instructions is the simplest: these instructions only push values and never read from the stack first.
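The pattern above can be sketched as a toy stack machine in plain Ruby. This is an illustration only, not how YARV works internally (YARV is written in C). The TinyVM class and its add instruction are hypothetical names invented for this sketch.

```ruby
# Toy stack machine illustrating the read / operate / push pattern.
# TinyVM and :add are hypothetical; real YARV is implemented in C.
class TinyVM
  attr_reader :stack

  def initialize
    @stack = []
  end

  def run(instructions)
    instructions.each do |name, operand|
      case name
      when :putnil    then @stack.push(nil)      # push only, nothing is read
      when :putobject then @stack.push(operand)  # push only, nothing is read
      when :add
        right = @stack.pop                       # 1. read values from the stack
        left  = @stack.pop
        @stack.push(left + right)                # 2. operate, 3. push the result
      else
        raise ArgumentError, "unknown instruction: #{name}"
      end
    end
    @stack
  end
end

vm = TinyVM.new
p vm.run([[:putobject, 1], [:putobject, 2], [:add]])  # prints [3]
```

The two put* branches never pop, which is what sets the push family apart from an instruction like add.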
putnil - Pushing Nil
The putnil instruction is one of the simplest YARV instructions. It pushes Ruby’s nil value onto the stack.
Pseudocode:
# instruction: putnil
stack.push(nil)
Example:
$ ruby --dump=insns -e 'nil'
# => putnil
Stack visualization:
Before putnil:     After putnil:
┌─────────────┐    ┌─────────────┐
│   (empty)   │    │     nil     │ ← top
└─────────────┘    └─────────────┘
Characteristics:
- No operands (simplest instruction)
- Always pushes the exact same value
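- Used when nil is explicitly referenced in code, and wherever the compiler needs an implicit nil (for example, as the value of an empty method body)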
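The same listing can be produced from inside Ruby with RubyVM::InstructionSequence. A minimal sketch, assuming CRuby (other implementations such as JRuby do not provide this class). The layout of the disassembly differs between Ruby versions, so the example only looks for the instruction name.

```ruby
# Compile a snippet and look for putnil in its disassembly (CRuby only).
iseq = RubyVM::InstructionSequence.compile("nil")
listing = iseq.disasm

puts listing
puts "uses putnil: #{listing.include?('putnil')}"
```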
putobject - Pushing Known Objects
The putobject instruction pushes compile-time known values onto the stack. This includes:
- Booleans (true, false)
- Numbers (integers, floats)
- Symbols (:symbol)
- Frozen strings with the frozen_string_literal: true pragma
Pseudocode:
# instruction: putobject object
stack.push(object)
Examples:
# Boolean
$ ruby --dump=insns -e 'true'
# => putobject true
# Number
$ ruby --dump=insns -e '42'
# => putobject 42 # general case
$ ruby --dump=insns -e '1'
# => putobject_INT2FIX_1_ # optimized variant for 1
# Symbol
$ ruby --dump=insns -e ':hello'
# => putobject :hello
Stack visualization:
putobject 42:
┌─────────────┐
│     42      │ ← top
└─────────────┘
Multiple objects:
┌─────────────┐
│   :symbol   │ ← top (most recent)
├─────────────┤
│     42      │
├─────────────┤
│    true     │ (oldest)
└─────────────┘
Optimized Variants
YARV includes specialized instructions for common values:
| Instruction | Purpose | Value |
|---|---|---|
| putobject_INT2FIX_0_ | Push integer 0 | 0 |
| putobject_INT2FIX_1_ | Push integer 1 | 1 |
These variants encode the value in the opcode itself, so they take no operand. That reduces bytecode size and saves an operand read for two of the most frequently used values.
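To see which variant the compiler picks for a given literal, the disassembly can be searched for the putobject family. A small sketch, assuming CRuby; push_instruction_for is a hypothetical helper written for this note, not part of any library.

```ruby
# Report which putobject variant CRuby emits for an integer literal.
# push_instruction_for is a hypothetical helper for this sketch.
def push_instruction_for(literal)
  listing = RubyVM::InstructionSequence.compile(literal).disasm
  # The first match is the instruction that pushes the literal;
  # \w* also captures a suffix such as _INT2FIX_1_ when present.
  listing[/putobject\w*/]
end

%w[0 1 42].each do |literal|
  puts "#{literal.ljust(2)} -> #{push_instruction_for(literal)}"
end
```

Only 0 and 1 should get a specialized name; every other value falls back to the general putobject with an operand.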
putstring - Pushing Unfrozen Strings
Unlike putobject, the putstring instruction handles mutable (unfrozen) strings. Critically, it duplicates the string during the push operation.
Pseudocode:
# instruction: putstring string
stack.push(string.dup)
Why duplication?
Ruby strings are mutable by default. If the VM pushed the same string instance every time a literal was evaluated, a mutation made through one reference would show up in every later evaluation of that literal:
# Without duplication (incorrect):
str = "hello"
str << " world" # Would modify the shared literal, so the next "hello" would read "hello world"
# With duplication (correct):
str = "hello" # Gets a new copy
str << " world" # Modifies only this copy
Example:
$ ruby --dump=insns -e '"hello"'
# => putstring "hello"
# (Ruby 3.4 and later print putchilledstring here instead)
Stack visualization:
putstring "hello":
┌─────────────────┐
│ "hello" (copy)  │ ← top
└─────────────────┘
Each execution creates a new string instance
Performance consideration:
String duplication has overhead. This is why frozen strings (using putobject) are faster - they can be reused without copying.
See string mutability in Ruby for deeper exploration.
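The effect of the copy is observable from ordinary Ruby code. The sketch below assumes CRuby and a file without the frozen_string_literal pragma. Because the pragma has to sit at the top of a file, the frozen case uses String#freeze on the literal instead; CRuby compiles that call specially, but the visible result (one shared frozen object) matches what the pragma gives.

```ruby
# Each call re-evaluates the literal, so each call gets its own String.
def greeting
  "hello"
end

# Calling freeze directly on a literal lets CRuby reuse one shared object.
def frozen_greeting
  "hello".freeze
end

first = greeting
second = greeting
first << " world"   # mutates only the copy held in `first`

puts first                                    # hello world
puts second                                   # hello
puts first.equal?(second)                     # false: two distinct objects
puts frozen_greeting.equal?(frozen_greeting)  # true on CRuby: one shared object
```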
duparray - Duplicating Arrays
The duparray instruction duplicates and pushes array literals containing compile-time known values.
Pseudocode:
# instruction: duparray array
stack.push(array.dup)
Example:
$ ruby --dump=insns -e '[1, 2, 3]'
# => duparray [1, 2, 3]
Stack visualization:
duparray [1, 2, 3]:
┌──────────────────┐
│ [1, 2, 3] (dup)  │ ← top
└──────────────────┘
Why duplication?
Like strings, arrays are mutable. Each evaluation of an array literal should produce a distinct array object:
arr = [1, 2, 3]
arr << 4 # Doesn't affect later evaluations of the [1, 2, 3] literal
Limitation:
Only works for arrays with compile-time known elements. Dynamic arrays require different instructions:
[1, 2, 3] # duparray
[1, x, 3] # newarray (different instruction)
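Both points (the choice between duparray and newarray, and the fresh copy per evaluation) can be checked from Ruby. A sketch assuming CRuby; array_instruction and numbers are hypothetical helpers written for this note.

```ruby
# Report whether CRuby compiles an array literal with duparray or newarray.
# array_instruction is a hypothetical helper for this sketch.
def array_instruction(source)
  listing = RubyVM::InstructionSequence.compile(source).disasm
  listing[/\b(duparray|newarray)\b/]
end

puts array_instruction("[1, 2, 3]")         # every element known at compile time
puts array_instruction("x = 2; [1, x, 3]")  # x is only known at run time

# Each evaluation of the literal yields a distinct Array.
def numbers
  [1, 2, 3]
end

a = numbers
a << 4
p a        # [1, 2, 3, 4]
p numbers  # [1, 2, 3]
```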
duphash - Duplicating Hashes
The duphash instruction mirrors duparray but for hash literals.
Pseudocode:
# instruction: duphash hash
stack.push(hash.dup)
Example:
$ ruby --dump=insns -e '{a: 1, b: 2}'
# => duphash {a: 1, b: 2}
Stack visualization:
duphash {a: 1, b: 2}:
┌─────────────────────┐
│  {a:1, b:2} (dup)   │ ← top
└─────────────────────┘
Characteristics:
- Duplicates the hash on each execution
- Keys and values must be compile-time known
- Maintains hash insertion order (Ruby 1.9+)
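These characteristics can be observed directly. The sketch below assumes a recent CRuby; defaults is a hypothetical method name chosen for the example.

```ruby
# Each evaluation of the hash literal yields a distinct Hash.
def defaults
  { a: 1, b: 2 }
end

first = defaults
first[:c] = 3            # mutate one copy only

p first                   # {a: 1, b: 2, c: 3} (inspect format varies by version)
p defaults.keys           # [:a, :b], insertion order kept, no :c
p first.equal?(defaults)  # false: a new Hash per evaluation

# The literal itself is compiled to duphash on recent CRuby versions.
uses_duphash = RubyVM::InstructionSequence.compile("{a: 1, b: 2}").disasm.include?("duphash")
puts "uses duphash: #{uses_duphash}"
```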
Instruction Design Patterns
These instructions reveal important YARV design principles:
1. Immutable vs Mutable
- Immutable objects (numbers, symbols, true, false): putobject, with nil handled by putnil
- Mutable objects: putstring for strings, duparray and duphash for array and hash literals
2. Optimization Opportunities
- Specialized instructions for common cases
- Reduced operand encoding for frequent values
- Frozen string literals avoid duplication overhead
3. Stack as Communication
- Instructions communicate through the stack data structure
- Subsequent instructions consume values left by previous ones
- The stack top always contains the most recent result
Instruction Operands
Instructions can have operands - additional data that parameterizes the operation:
putobject 42
^         ^
|         |
|         +-- Operand (the object to push)
+-- Instruction
Operands are stored in the bytecode immediately after the instruction they belong to and are read when that instruction executes.
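The instruction/operand split is visible in the array form of a compiled sequence, where each instruction appears as its name followed by its operands. A sketch assuming CRuby; instructions_for is a hypothetical helper, and the layout of the array returned by to_a is an implementation detail that may change between versions.

```ruby
# List each instruction with its operands (CRuby only).
# instructions_for is a hypothetical helper written for this sketch.
def instructions_for(source)
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  # The body mixes line numbers, labels and event markers with
  # instructions; only the instructions are Arrays.
  body.grep(Array)
end

instructions_for("42").each do |name, *operands|
  puts "#{name}: #{operands.inspect}"
end
```

An instruction such as putnil shows up with an empty operand list, while putobject carries the object to push.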