YARV Stack Instructions

The fundamental operations in YARV revolve around manipulating the stack data structure. This note explores the primary stack manipulation instructions that push values onto the stack - the building blocks of Ruby program execution.

Instruction Overview

Most YARV instructions follow the same pattern:

Read values from the stack (if needed)
Perform an operation
Push result(s) back onto the stack

The “push” family of instructions is the simplest - these instructions only push values and never read from the stack first.
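The read / operate / push pattern can be sketched with a toy stack machine. This is an illustrative model only: ToyVM and its methods are hypothetical names, and the real instructions are implemented in C inside CRuby.

```ruby
# Toy model only: ToyVM is a hypothetical class, not part of YARV.
class ToyVM
  attr_reader :stack

  def initialize
    @stack = []
  end

  # Push instructions: read nothing, push one value.
  def putnil
    @stack.push(nil)
  end

  def putobject(object)
    @stack.push(object)
  end

  # A consuming instruction: reads two values, operates, pushes the result.
  def plus
    right = @stack.pop
    left = @stack.pop
    @stack.push(left + right)
  end
end

vm = ToyVM.new
vm.putobject(40)
vm.putobject(2)
vm.plus
p vm.stack
```

The two putobject calls only push; plus is the first step that reads from the stack before pushing its result.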

putnil - Pushing Nil

The putnil instruction is the simplest YARV instruction. It pushes Ruby’s nil value onto the stack.

Pseudocode:

# instruction: putnil
stack.push(nil)

Example:

$ ruby --dump=insns -e 'nil'
# => putnil

Stack visualization:

Before putnil:          After putnil:
┌─────────────┐        ┌─────────────┐
│   (empty)   │        │     nil     │ ← top
└─────────────┘        └─────────────┘

Characteristics:

No operands (simplest instruction)
Always pushes the exact same value
Emitted when nil appears explicitly in code; the compiler also uses it for implicit nil values, such as the result of an empty method body
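The same check can be made from inside a Ruby program. The sketch below assumes CRuby (MRI), since RubyVM::InstructionSequence is specific to that implementation, and the exact disassembly text varies between Ruby versions.

```ruby
# CRuby-specific API; other Ruby implementations may not provide it.
iseq = RubyVM::InstructionSequence.compile("nil")
puts iseq.disasm

# An explicit nil literal should show up as a putnil instruction.
has_putnil = iseq.disasm.include?("putnil")
p has_putnil
```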

putobject - Pushing Known Objects

The putobject instruction pushes compile-time known values onto the stack. This includes:

Booleans (true, false)
Numbers (integers, floats)
Symbols (:symbol)
String literals frozen by the frozen_string_literal: true pragma

Pseudocode:

# instruction: putobject object
stack.push(object)

Examples:

# Boolean
$ ruby --dump=insns -e 'true'
# => putobject true
 
# Number
$ ruby --dump=insns -e '42'
# => putobject 42            # general case
$ ruby --dump=insns -e '1'
# => putobject_INT2FIX_1_    # specialized instruction for 1
 
# Symbol
$ ruby --dump=insns -e ':hello'
# => putobject :hello

Stack visualization:

putobject 42:
┌─────────────┐
│     42      │ ← top
└─────────────┘

Multiple objects:
┌─────────────┐
│   :symbol   │ ← top (most recent)
├─────────────┤
│     42      │
├─────────────┤
│    true     │ (oldest)
└─────────────┘

Optimized Variants

YARV includes specialized instructions for common values:

Instruction	Purpose	Value
`putobject_INT2FIX_0_`	Push integer 0	`0`
`putobject_INT2FIX_1_`	Push integer 1	`1`

These variants bake the value into the instruction itself, so they carry no operand. That makes the bytecode one word shorter and saves an operand fetch for two very common values.
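One way to see the difference is to compare the compiled form of a few integer literals. This is a sketch assuming CRuby; instructions_for is a hypothetical helper, and the layout of InstructionSequence#to_a (body as the last element, instructions as nested arrays) is an implementation detail that may change between versions.

```ruby
# Hypothetical helper: returns instruction entries as [name, *operands].
# Relies on the CRuby-specific #to_a layout, where the last element is the body.
def instructions_for(source)
  RubyVM::InstructionSequence.compile(source).to_a.last.grep(Array)
end

zero      = instructions_for("0").first
one       = instructions_for("1").first
forty_two = instructions_for("42").first

p zero       # specialized variant, no operand expected
p one        # specialized variant, no operand expected
p forty_two  # general putobject with 42 as its operand
```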

putstring - Pushing Unfrozen Strings

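Unlike putobject, the putstring instruction handles mutable (unfrozen) string literals. Critically, it pushes a fresh copy of the string each time it executes. (Ruby 3.4 and later emit putchilledstring for plain literals instead; it likewise pushes a new string on each execution.)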

Pseudocode:

# instruction: putstring string
stack.push(string.dup)

Why duplication?

Ruby strings are mutable by default. If the VM pushed the same string instance every time, mutations would affect all references:

def greeting
  "hello"            # the literal is pushed again on every call
end

a = greeting
a << " world"        # mutates only the copy held in a

# With duplication (actual behavior):
greeting             # => "hello"
# Without duplication, the literal itself would have been mutated:
# greeting           # => "hello world"

Example:

$ ruby --dump=insns -e '"hello"'
# => putstring "hello"

Stack visualization:

putstring "hello":
┌─────────────────┐
│ "hello" (copy)  │ ← top
└─────────────────┘

Each execution creates a new string instance

Performance consideration:

Duplication allocates a new String object on every execution. This is why frozen string literals (pushed with putobject) are faster - the same object is reused without copying.

See string mutability in Ruby for deeper exploration.
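The difference in object identity can be observed directly. The sketch below evaluates two source strings, one of which begins with the frozen_string_literal magic comment; the lambda names are illustrative only, and CRuby is assumed.

```ruby
# Lambda whose body is a plain (mutable) string literal.
mutable_literal = eval(<<~RUBY)
  -> { "hello" }
RUBY

# Same body, but the evaluated source starts with the magic comment.
frozen_literal = eval(<<~RUBY)
  # frozen_string_literal: true
  -> { "hello" }
RUBY

first = mutable_literal.call
first << " world"               # mutates only this copy
second = mutable_literal.call   # a fresh "hello"

# equal? compares object identity, not contents.
same_mutable = mutable_literal.call.equal?(mutable_literal.call)
same_frozen  = frozen_literal.call.equal?(frozen_literal.call)

p [first, second, same_mutable, same_frozen]
```

Every call of the mutable lambda yields a distinct object, while the frozen one hands back the same object each time.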

duparray - Duplicating Arrays

The duparray instruction duplicates and pushes array literals containing compile-time known values.

Pseudocode:

# instruction: duparray array
stack.push(array.dup)

Example:

$ ruby --dump=insns -e '[1, 2, 3]'
# => duparray [1, 2, 3]

Stack visualization:

duparray [1, 2, 3]:
┌──────────────────┐
│  [1, 2, 3] (dup) │ ← top
└──────────────────┘

Why duplication?

Like strings, arrays are mutable. Each array literal in the source code should produce a distinct array object:

arr = [1, 2, 3]
arr << 4           # Doesn't affect other [1,2,3] literals

Limitation:

duparray only works for arrays whose elements are all known at compile time. Arrays with dynamic elements are built at runtime by a different instruction:

[1, 2, 3]        # duparray
[1, x, 3]        # newarray (different instruction)
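Both the per-execution copy and the static/dynamic split can be checked from Ruby. This is a sketch assuming CRuby; literal_array is a hypothetical method name, and instruction names in the disassembly may differ in other versions.

```ruby
def literal_array
  [1, 2, 3]   # every element is known at compile time
end

first = literal_array
first << 4                # changes only this copy
second = literal_array    # still [1, 2, 3]

# Compare the disassembly of a static and a dynamic array literal.
static_disasm  = RubyVM::InstructionSequence.compile("[1, 2, 3]").disasm
dynamic_disasm = RubyVM::InstructionSequence.compile("x = 2; [1, x, 3]").disasm

p first
p second
p static_disasm.include?("duparray")
p dynamic_disasm.include?("newarray")
```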

duphash - Duplicating Hashes

The duphash instruction mirrors duparray but for hash literals.

Pseudocode:

# instruction: duphash hash
stack.push(hash.dup)

Example:

$ ruby --dump=insns -e '{a: 1, b: 2}'
# => duphash {a: 1, b: 2}

Stack visualization:

duphash {a: 1, b: 2}:
┌─────────────────────┐
│ {a:1, b:2} (dup)    │ ← top
└─────────────────────┘

Characteristics:

Duplicates the hash on each execution
Keys and values must be compile-time known
Maintains hash insertion order (Ruby 1.9+)
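These characteristics can be confirmed with a short experiment. The sketch assumes CRuby 2.6 or later (earlier versions have no duphash instruction); literal_hash is a hypothetical method name.

```ruby
def literal_hash
  { a: 1, b: 2 }   # keys and values are known at compile time
end

first = literal_hash
first[:c] = 3             # changes only this copy
second = literal_hash     # a fresh {a: 1, b: 2}

disasm = RubyVM::InstructionSequence.compile("h = {a: 1, b: 2}").disasm

p first
p second.keys             # insertion order is preserved
p disasm.include?("duphash")
```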

Instruction Design Patterns

These instructions reveal important YARV design principles:

1. Immutable vs Mutable

Immutable values (numbers, symbols, true, false): putobject, with the dedicated putnil for nil
Mutable literals (strings, arrays, hashes): copying instructions (putstring, duparray, duphash)

2. Optimization Opportunities

Specialized instructions for common cases
Reduced operand encoding for frequent values
Frozen string literals avoid duplication overhead

3. Stack as Communication

Instructions communicate through the stack data structure
Subsequent instructions consume values left by previous ones
The stack top always contains the most recent result
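A small expression shows this hand-off between instructions. The sketch assumes CRuby; on the versions I am aware of, the two operands are pushed first and an addition instruction then consumes them, but the exact instruction names are version-dependent.

```ruby
iseq = RubyVM::InstructionSequence.compile("40 + 2")
puts iseq.disasm

# Instruction names in execution order (CRuby-specific #to_a layout).
names = iseq.to_a.last.grep(Array).map(&:first)
p names

# Running the sequence returns whatever is on top of the stack at the end.
result = iseq.eval
p result
```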

Instruction Operands

Instructions can have operands - additional data that parameterizes the operation:

putobject 42
  ^         ^
  |         |
  |         +-- Operand (the object to push)
  +-- Instruction

Operands are encoded in the bytecode and decoded during execution.
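The split between instruction and operand is visible in the array form of a compiled sequence. This is a sketch assuming CRuby, relying on the implementation-specific layout of InstructionSequence#to_a, where each instruction appears as an array of its name followed by its operands.

```ruby
# First instruction of a snippet that pushes a symbol.
symbol_body = RubyVM::InstructionSequence.compile(":hello").to_a.last
name, *operands = symbol_body.grep(Array).first

# First instruction of a snippet that pushes nil (no operands expected).
nil_body = RubyVM::InstructionSequence.compile("nil").to_a.last
nil_name, *nil_operands = nil_body.grep(Array).first

p name          # instruction name
p operands      # its operands
p nil_name
p nil_operands
```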
