Inline caching is an optimization technique in virtual machines where method lookup results are cached at call sites. When a method is called repeatedly on objects of the same type, the cached lookup avoids expensive method resolution.

The Problem: Method Lookup Cost

In dynamic languages like Ruby, method dispatch involves:

  1. Get the receiver object
  2. Look up the receiver’s class
  3. Search the class’s method table
  4. Search parent classes if not found
  5. Handle method_missing if still not found

user.name    # Every call: lookup 'name' on user's class

This lookup happens on every method call, which is expensive.
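The chain walk in steps 2–4 can be observed from Ruby itself via reflection. A minimal sketch, using hypothetical User/Admin classes:

```ruby
# Reflection sketch: the same search order the VM walks internally.
class User
  def name
    "Ada"
  end
end

class Admin < User; end   # inherits name

admin = Admin.new

# The VM searches each module in the ancestor chain, in order:
p Admin.ancestors.first(3)    # => [Admin, User, Object]

# Method#owner shows where the search actually found `name`:
p admin.method(:name).owner   # => User
```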

The Solution: Cache at Call Sites

Inline caching stores lookup results at each call site:

# First call at this call site:
user.name
# 1. Lookup 'name' method on User class
# 2. Cache: "if receiver is User, method is at address X"
 
# Subsequent calls at same site:
user.name
# 1. Check: is receiver a User? Yes!
# 2. Jump directly to cached address X (skip lookup!)

The cache is “inline” because it’s stored right at the call site in the code.
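A toy Ruby model of a single call site makes the hit/miss logic concrete. CallSite here is a hypothetical illustration, not YARV's real data structure:

```ruby
# Toy monomorphic inline cache for one call site.
class CallSite
  attr_reader :lookups   # how many full lookups we performed

  def initialize(method_name)
    @method_name  = method_name
    @cached_class = nil
    @lookups      = 0
  end

  def call(receiver)
    # Miss: nothing cached yet, or the receiver's class changed.
    if @cached_class.nil? || !receiver.instance_of?(@cached_class)
      @lookups      += 1   # the expensive full lookup happens here
      @cached_class  = receiver.class
      @cached_method = @cached_class.instance_method(@method_name)
    end
    # Hit path: bind the cached UnboundMethod and invoke it directly.
    @cached_method.bind_call(receiver)
  end
end

class User
  def name
    "user"
  end
end

site = CallSite.new(:name)
user = User.new
3.times { site.call(user) }
p site.lookups   # => 1  (one lookup, then two cache hits)
```

Note that the cache is keyed on the receiver's class, not the receiver itself: a different User instance at the same site still hits the cache.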

How It Works in YARV

YARV uses inline caching for method calls:

def process_users(users)
  users.each do |user|
    user.name    # ← This call site caches User#name lookup
  end
end

First iteration:

Call site: user.name
  Cache empty → perform full lookup
  Found: User#name at address 0x1234
  Cache: [User class, 0x1234]

Second iteration:

Call site: user.name
  Check cache: receiver class == User? ✓
  Use cached address: 0x1234
  Skip expensive lookup!

This can be 2-10x faster than uncached dispatch.

Cache Invalidation

The cache must be invalidated when assumptions break:

# Build up cache
1000.times { user.name }  # Cache: User → User#name
 
# Cache invalidation!
class User
  def name
    "new implementation"  # Method redefined!
  end
end
 
user.name  # Cache invalid! Must re-lookup and re-cache

YARV tracks class modifications and invalidates affected caches.
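One way to model that tracking in plain Ruby is a per-class version counter bumped from the method_added hook. The serial here is a hypothetical stand-in for YARV's internal C-level counters:

```ruby
# Version-counter sketch of cache invalidation (hypothetical serial).
class User
  @serial = 0
  class << self
    attr_reader :serial

    def method_added(name)
      @serial += 1   # any (re)definition invalidates caches for this class
      super
    end
  end

  def name
    "old"            # defining this bumps serial to 1
  end
end

# A call site caches the method together with the serial it saw:
cached_serial = User.serial
cached_method = User.instance_method(:name)

class User
  def name
    "new implementation"   # redefinition bumps serial to 2
  end
end

# The serial check now fails, so the cached entry must be discarded:
p cached_serial == User.serial                      # => false
p User.instance_method(:name).bind_call(User.new)   # => "new implementation"
```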

Monomorphic vs Polymorphic Sites

Monomorphic (single type):

users = User.all
users.each { |u| u.name }  # Always User → easy to cache

Polymorphic (multiple types):

objects = [user, admin, guest]
objects.each { |o| o.name }  # User, Admin, Guest → harder to cache

VMs handle these differently:

  • Monomorphic cache: Single type, single cached method
  • Polymorphic inline cache (PIC): Multiple types, multiple cached methods
  • Megamorphic: Too many types, caching disabled
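The three regimes can be sketched in one toy class. PolymorphicCallSite and its entry limit are hypothetical; real VMs choose similarly small limits:

```ruby
# Toy polymorphic inline cache with a megamorphic fallback.
class PolymorphicCallSite
  MAX_ENTRIES = 2   # assumed limit, for illustration only

  attr_reader :entries

  def initialize(method_name)
    @method_name = method_name
    @entries     = {}      # class => cached UnboundMethod
    @megamorphic = false
  end

  def megamorphic?
    @megamorphic
  end

  def call(receiver)
    # Megamorphic: too many types seen, fall back to full dispatch.
    return receiver.public_send(@method_name) if @megamorphic

    klass  = receiver.class
    method = @entries[klass]
    if method.nil?
      if @entries.size >= MAX_ENTRIES
        @megamorphic = true              # give up on caching this site
        return receiver.public_send(@method_name)
      end
      method = klass.instance_method(@method_name)   # full lookup
      @entries[klass] = method
    end
    method.bind_call(receiver)
  end
end

class User;  def name; "user";  end; end
class Admin; def name; "admin"; end; end
class Guest; def name; "guest"; end; end

site = PolymorphicCallSite.new(:name)
p site.call(User.new)    # => "user"   (cached)
p site.call(Admin.new)   # => "admin"  (cached)
p site.call(Guest.new)   # => "guest"  (third type: site goes megamorphic)
p site.megamorphic?      # => true
```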

Inline Cache Structure

Conceptually, a YARV call site’s cache holds three pieces of state (shown here as Ruby-style pseudocode):

class InlineCache
  @receiver_class    # Expected class of receiver
  @method_entry      # Cached method to call
  @serial_number     # Class version (for invalidation)
end

On each call, the VM conceptually runs:

# Check cache
if receiver.class == cache.receiver_class &&
   receiver.class.serial == cache.serial_number
  # Cache hit! Use cached method
  call cache.method_entry
else
  # Cache miss! Perform lookup and update cache
  method = lookup_method(receiver, method_name)
  cache.receiver_class = receiver.class
  cache.method_entry = method
  cache.serial_number = receiver.class.serial
  call method
end

Performance Impact

Inline caching dramatically improves method call performance. The figures below are illustrative rather than measured:

Without caching:

1,000,000 method calls
= 1,000,000 method lookups
= ~500ms (lookup overhead)

With caching (monomorphic):

1,000,000 method calls
= 1 method lookup + 999,999 cache hits
= ~50ms (10x faster!)

With caching (polymorphic):

1,000,000 method calls
= 3 method lookups + 999,997 cache checks
= ~100ms (5x faster)
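The effect can be felt from Ruby land by comparing an explicit re-lookup on every call against one cached Method object. This approximates, rather than reproduces, YARV's internal cache:

```ruby
require "benchmark"

class User
  def name
    "user"
  end
end

user   = User.new
cached = user.method(:name)   # lookup performed once, result kept
n      = 200_000

relookup = Benchmark.realtime do
  # Explicitly repeat the class walk + lookup on every call.
  n.times { user.class.instance_method(:name).bind_call(user) }
end

hit = Benchmark.realtime do
  # Reuse the cached Method: no lookup, straight to the code.
  n.times { cached.call }
end

puts format("re-lookup each call: %.4fs", relookup)
puts format("cached dispatch:     %.4fs", hit)
# Exact times vary by machine; the cached path is consistently faster.
```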

This is why YARV and similar VMs rely heavily on inline caching.

Call Site Specialization

Inline caching enables call site specialization:

# Call site A:
user.name     # Cached for User#name
 
# Call site B (different location):
admin.name    # Cached separately for Admin#name

Each call site maintains its own cache, specializing to the types it sees.

JIT and Inline Caching

JIT compilers use inline cache data:

# Observed at call site:
1000.times { user.name }  # Always User
 
# YJIT decision:
# "This site is monomorphic for User#name"
# → Compile specialized native code assuming User

The JIT can generate optimized code based on cached type information.