
YARV is a stack machine: its instructions work by manipulating a value stack. This note covers the primary instructions that push values onto that stack, which are the building blocks of Ruby program execution.

Instruction Overview

Most YARV instructions follow the same pattern:

  1. Read values from the stack (if needed)
  2. Perform an operation
  3. Push result(s) back onto the stack

The “push” family of instructions is the simplest: these instructions only push values and never read from the stack first.
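
The three steps can be sketched with a toy stack machine. This is only an illustration of the pattern, not how CRuby implements YARV; the ToyVM class and its add instruction are hypothetical names invented for the example.

```ruby
# Toy model of a stack machine (hypothetical; not CRuby's implementation).
class ToyVM
  attr_reader :stack

  def initialize
    @stack = []
  end

  # Push-only instruction: it skips step 1 and reads nothing from the stack.
  def putobject(value)
    @stack.push(value)
  end

  # Hypothetical instruction that follows all three steps.
  def add
    right = @stack.pop       # 1. read values from the stack
    left  = @stack.pop
    result = left + right    # 2. perform the operation
    @stack.push(result)      # 3. push the result back
  end
end

vm = ToyVM.new
vm.putobject(40)
vm.putobject(2)
vm.add
p vm.stack
```

After the three calls the stack holds a single value, the sum of the two pushed operands.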

putnil - Pushing Nil

The putnil instruction is the simplest YARV instruction. It pushes Ruby’s nil value onto the stack.

Pseudocode:

# instruction: putnil
stack.push(nil)

Example:

$ ruby --dump=insns -e 'nil'
# => putnil

Stack visualization:

Before putnil:          After putnil:
┌─────────────┐        ┌─────────────┐
│   (empty)   │        │     nil     │ ← top
└─────────────┘        └─────────────┘

Characteristics:

  • No operands (simplest instruction)
  • Always pushes the exact same value
  • Used wherever an expression evaluates to nil, whether nil is written explicitly or implied (for example, by an empty method body)
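
The same check can be made from inside Ruby. This is a minimal sketch that assumes CRuby, since RubyVM::InstructionSequence is not available on other implementations, and the layout of the disassembly text varies between versions.

```ruby
# CRuby only: compile a snippet and inspect its disassembly.
iseq = RubyVM::InstructionSequence.compile("nil")
disasm = iseq.disasm

puts disasm                               # full listing; layout depends on the Ruby version
has_putnil = disasm.include?("putnil")    # the listing should mention putnil
result = iseq.eval                        # running the compiled code yields nil
```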

putobject - Pushing Known Objects

The putobject instruction pushes compile-time known values onto the stack. This includes:

  • Booleans (true, false)
  • Numbers (integers, floats)
  • Symbols (:symbol)
  • Frozen strings with the frozen_string_literal: true pragma

Pseudocode:

# instruction: putobject object
stack.push(object)

Examples:

# Boolean
$ ruby --dump=insns -e 'true'
# => putobject true
 
# Number
$ ruby --dump=insns -e '42'
# => putobject 42             # general case
$ ruby --dump=insns -e '1'
# => putobject_INT2FIX_1_     # specialized variant for 1
 
# Symbol
$ ruby --dump=insns -e ':hello'
# => putobject :hello

Stack visualization:

putobject 42:
┌─────────────┐
│     42      │ ← top
└─────────────┘

Multiple objects:
┌─────────────┐
│   :symbol   │ ← top (most recent)
├─────────────┤
│     42      │
├─────────────┤
│    true     │ (oldest)
└─────────────┘

Optimized Variants

YARV includes specialized instructions for common values:

Instruction             Purpose          Value
putobject_INT2FIX_0_    Push integer 0   0
putobject_INT2FIX_1_    Push integer 1   1

Because the value is implied by the instruction itself, these variants need no operand. That reduces bytecode size and avoids an operand fetch for two very frequently used values.
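
To see which literals get the general instruction and which get a specialized one, the disassembly can be searched from Ruby. This sketch assumes CRuby with its default compile options; the disasm helper is a name introduced here for illustration.

```ruby
# CRuby only. Helper introduced for this example.
def disasm(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# General case: the plain putobject instruction followed by its operand.
general = ["true", ":hello", "42"].all? do |src|
  disasm(src).match?(/\bputobject\s/)
end

# Specialized cases: the value is baked into the instruction name.
zero_specialized = disasm("0").include?("putobject_INT2FIX_0_")
one_specialized  = disasm("1").include?("putobject_INT2FIX_1_")

puts disasm("1")
```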

putstring - Pushing Unfrozen Strings

Unlike putobject, the putstring instruction handles mutable (unfrozen) strings. Critically, it duplicates the string during the push operation.

Pseudocode:

# instruction: putstring string
stack.push(string.dup)

Why duplication?

Ruby strings are mutable by default. If the VM pushed the same string instance every time a literal was evaluated, a mutation made after one evaluation would be visible to every later one:

def greeting
  "hello"            # putstring runs on every call
end

str = greeting       # Gets a new copy
str << " world"      # Modifies only this copy
greeting             # => "hello" (without duplication: "hello world")

Example:

$ ruby --dump=insns -e '"hello"'
# => putstring "hello"
# (Ruby 3.4 and later print putchilledstring for this literal)

Stack visualization:

putstring "hello":
┌─────────────────┐
│ "hello" (copy)  │ ← top
└─────────────────┘

Each execution creates a new string instance.

Performance consideration:

String duplication has overhead: every execution allocates a new String object. This is one reason frozen string literals (pushed with putobject) are faster, since the same object can be reused without copying.

See string mutability in Ruby for deeper exploration.
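
The effect of the duplication can be observed directly. This sketch assumes CRuby and a file without the frozen_string_literal pragma; the greeting method is a name made up for the example.

```ruby
# Assumes this file has no frozen_string_literal pragma.
def greeting
  "hello"
end

first  = greeting
second = greeting
first << " world"                   # mutate one copy

distinct  = !first.equal?(second)   # two calls produced two String objects
untouched = second == "hello"       # the other copy is unaffected

# With the pragma, the same literal compiles to putobject instead (CRuby only).
frozen_src    = "# frozen_string_literal: true\n\"hello\""
frozen_disasm = RubyVM::InstructionSequence.compile(frozen_src).disasm
uses_putobject = frozen_disasm.match?(/\bputobject\s/)
```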

duparray - Duplicating Arrays

The duparray instruction duplicates and pushes array literals containing compile-time known values.

Pseudocode:

# instruction: duparray array
stack.push(array.dup)

Example:

$ ruby --dump=insns -e '[1, 2, 3]'
# => duparray [1, 2, 3]

Stack visualization:

duparray [1, 2, 3]:
┌──────────────────┐
│  [1, 2, 3] (dup) │ ← top
└──────────────────┘

Why duplication?

Like strings, arrays are mutable. Each array literal in the source code should produce a distinct array object:

arr = [1, 2, 3]
arr << 4           # Doesn't affect other [1,2,3] literals

Limitation:

duparray applies only when every element is known at compile time. Array literals with dynamic elements are built with a different instruction:

[1, 2, 3]        # duparray
[1, x, 3]        # newarray (different instruction)
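
Both points, the per-execution copy and the switch to newarray for dynamic elements, can be checked with a short script. This is a sketch assuming CRuby; disasm and numbers are helper names introduced for the example.

```ruby
# CRuby only. Helper introduced for this example.
def disasm(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

static_uses_duparray  = disasm("[1, 2, 3]").include?("duparray")
dynamic_uses_newarray = disasm("x = 2; [1, x, 3]").include?("newarray")

def numbers
  [1, 2, 3]      # literal evaluated on every call
end

a = numbers
a << 4           # mutate the first copy
fresh = numbers  # a new copy; the append above is not visible here
```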

duphash - Duplicating Hashes

The duphash instruction mirrors duparray but for hash literals.

Pseudocode:

# instruction: duphash hash
stack.push(hash.dup)

Example:

$ ruby --dump=insns -e '{a: 1, b: 2}'
# => duphash {a: 1, b: 2}

Stack visualization:

duphash {a: 1, b: 2}:
┌─────────────────────┐
│ {a:1, b:2} (dup)    │ ← top
└─────────────────────┘

Characteristics:

  • Duplicates the hash on each execution
  • Keys and values must be compile-time known
  • Maintains hash insertion order (Ruby 1.9+)
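
A short check of these characteristics, as a sketch that assumes a CRuby version recent enough to have the duphash instruction; the settings method is a hypothetical name used only here.

```ruby
def settings
  { a: 1, b: 2 }   # literal evaluated on every call
end

first = settings
first[:c] = 3      # mutate the first copy
second = settings  # a new copy without :c

order = second.keys  # insertion order of the literal is preserved

# CRuby only: the literal is compiled to a duphash instruction.
listing = RubyVM::InstructionSequence.compile("h = {a: 1, b: 2}").disasm
uses_duphash = listing.include?("duphash")
```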

Instruction Design Patterns

These instructions reveal important YARV design principles:

1. Immutable vs Mutable

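  • Immutable objects (numbers, symbols, true, false, frozen strings): putobject, with nil handled by its own putnil
  • Mutable objects (strings, arrays, hashes): putstring, duparray, and duphash, each of which pushes a copy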

2. Optimization Opportunities

  • Specialized instructions for common cases
  • Reduced operand encoding for frequent values
  • Frozen string literals avoid duplication overhead

3. Stack as Communication

  • Instructions communicate through the stack data structure
  • Subsequent instructions consume values left by previous ones
  • The stack top always contains the most recent result
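
The hand-off through the stack shows up in the disassembly of a simple addition: the operands are pushed first, and the instruction that consumes them follows. This sketch assumes CRuby, where the addition is compiled to an opt_plus instruction in the versions this was written against.

```ruby
# CRuby only.
iseq = RubyVM::InstructionSequence.compile("1 + 2")
listing = iseq.disasm
puts listing

push_at = listing.index("putobject")  # first push of an operand
plus_at = listing.index("opt_plus")   # instruction that pops both operands

result = iseq.eval                    # run the compiled code
```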

Instruction Operands

Instructions can have operands, which are additional data that parameterize the operation:

putobject 42
  ^         ^
  |         |
  |         +-- Operand (the object to push)
  +-- Instruction

Operands are stored in the instruction sequence alongside the instruction and are read by the VM when the instruction executes.
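
The instruction and operand pairs can also be listed as data rather than text. This sketch assumes CRuby and relies on RubyVM::InstructionSequence#to_a, whose last element is the instruction body; the exact surrounding entries (line numbers, event markers) vary by version, so they are filtered out.

```ruby
# CRuby only.
body  = RubyVM::InstructionSequence.compile("42").to_a.last
insns = body.grep(Array)   # keep instructions; drop line numbers and event markers

insns.each do |name, *operands|
  puts "#{name}: #{operands.inspect}"
end
```

Each instruction appears as an array whose first element is the instruction name and whose remaining elements are its operands, so an instruction without operands is a one-element array.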