A stack-based virtual machine is a type of virtual machine architecture that uses a stack data structure as its primary mechanism for storing and manipulating values during program execution. This contrasts with register-based VMs that use named registers (like physical CPUs).
Core Concept
In a stack-based VM, all operations work with values on a stack:
- Operands are pushed onto the stack
- Operations pop operands from the stack
- Results are pushed back onto the stack
graph LR A[Push 5] --> B[Push 3] B --> C[Add] C --> D[Result: 8] style A fill:#e1f5ff style B fill:#e1f5ff style C fill:#fff4e1 style D fill:#e8f5e9
Example: Adding Two Numbers
Stack-based approach:
Instructions: Stack state:
------------- ------------
push 5 [5]
push 3 [5, 3]
add [8] # pop 5 and 3, push result
Register-based approach:
Instructions:
-------------
load r1, 5 # r1 = 5
load r2, 3 # r2 = 3
add r3, r1, r2 # r3 = r1 + r2
Visual Representation
Stack operations for: (2 + 3) * 4
Step 1: push 2
┌─────┐
│ 2 │ ← top
└─────┘
Step 2: push 3
┌─────┐
│ 3 │ ← top
├─────┤
│ 2 │
└─────┘
Step 3: add (pops 3 and 2, pushes 5)
┌─────┐
│ 5 │ ← top
└─────┘
Step 4: push 4
┌─────┐
│ 4 │ ← top
├─────┤
│ 5 │
└─────┘
Step 5: multiply (pops 4 and 5, pushes 20)
┌─────┐
│ 20 │ ← top (final result)
└─────┘
Advantages
1. Simplicity
- Fewer concepts to implement (just a stack)
- No register allocation needed
- Straightforward instruction encoding
2. Compact Bytecode
- Instructions don’t need to specify register names
add
instead ofadd r3, r1, r2
- Smaller bytecode files
3. Easy to Generate
- Compilers can generate code without register allocation
- Direct translation from abstract syntax tree expressions
- Natural fit for recursive descent compilation
4. JIT-Friendly
- Just-in-time compilation can optimize stack operations
- Easy to convert to register-based code during JIT
- Simple intermediate representation
5. Platform Independence
- No dependency on physical register count
- Same bytecode works on different architectures
- True virtualization
Disadvantages
1. More Instructions
- Separate push/pop for each value
- Register-based VMs can operate on values in place
- Higher instruction count for equivalent operations
2. Stack Manipulation Overhead
- Constant pushing and popping
- Memory traffic for temporary values
- Cache pressure from stack accesses
3. Less Direct Hardware Mapping
- Physical CPUs use registers
- Translation layer needed for execution
- Potentially slower without JIT
4. Limited Optimization
- Hard to optimize stack accesses
- Register-based code can reuse values in registers
- Value lifetime tracking is implicit
Examples in Practice
Stack-based VMs:
- YARV (Ruby)
- Java Virtual Machine (JVM)
- Python’s CPython VM
- WebAssembly (WASM)
- PostScript
Register-based VMs:
- Lua VM
- Dalvik (Android’s original VM)
- .NET CLR
Implementation Example
A simple stack-based VM in pseudocode:
class StackVM
def initialize
@stack = []
@program = []
@pc = 0 # Program counter
end
def push(value)
@stack.push(value)
end
def pop
@stack.pop
end
def add
b = pop
a = pop
push(a + b)
end
def multiply
b = pop
a = pop
push(a * b)
end
def run(program)
@program = program
@pc = 0
while @pc < @program.length
instruction = @program[@pc]
execute(instruction)
@pc += 1
end
@stack.last # Return top of stack
end
def execute(instruction)
case instruction.type
when :push
push(instruction.value)
when :add
add
when :multiply
multiply
end
end
end
# Calculate (2 + 3) * 4
vm = StackVM.new
result = vm.run([
{type: :push, value: 2},
{type: :push, value: 3},
{type: :add},
{type: :push, value: 4},
{type: :multiply}
])
# => 20
The Stack Contract
Stack-based VMs typically enforce a stack discipline:
- Functions must clean up their stack usage
- Stack height should be predictable
- Unbalanced stack operations cause errors
Example of stack contract:
def add_and_square(a, b)
# Stack on entry: [a, b]
add # Stack: [a+b]
dup # Stack: [a+b, a+b]
multiply # Stack: [(a+b)²]
# Stack on exit: [(a+b)²]
# Net effect: consumed 2 values, produced 1
end
See YARV for how Ruby’s VM implements this contract.
Stack vs Expression Evaluation
Stack-based VMs naturally mirror how mathematical expressions are evaluated:
Infix notation: 2 + 3 * 4
Postfix notation (RPN): 2 3 4 * +
Stack operations: Same as postfix!
push 2
push 3
push 4
multiply # 3 * 4 = 12
add # 2 + 12 = 14
This connection to reverse Polish notation (RPN) explains why stack-based VMs are intuitive for expression evaluation.
Memory Hierarchy
Stack-based VMs still use other memory structures:
┌──────────────────────┐
│ Stack (values) │ ← Primary operand storage
├──────────────────────┤
│ Call Stack │ ← Function calls, return addresses
├──────────────────────┤
│ Heap │ ← Object allocation
├──────────────────────┤
│ Constant Pool │ ← Compile-time constants
└──────────────────────┘
The “stack” in “stack-based VM” refers specifically to the operand stack, not the entire memory model.
Modern Hybrid Approaches
Many modern VMs combine approaches:
- JVM: Stack-based bytecode, register-based JIT
- YARV: Stack-based with JIT optimization
- WebAssembly: Stack-based with register hints
This gives the benefits of simple bytecode with the performance of register-based execution.