Oblivious Computation and ORAM

Summary

Oblivious RAM (ORAM) is a general-purpose compiler that transforms any program into an “oblivious” version whose memory access patterns reveal nothing about the input data. Elaine Shi (CMU / Oblivious Labs) presented a framework at MEV-SBC 2025 for building practically efficient oblivious programs by combining: (1) information-flow analysis to avoid ORAM where unnecessary, (2) programming abstractions that enable oblivious algorithms without ORAM overhead, and (3) coalesced oblivious data structures that save a log factor over recursive ORAM. Applications include block building inside TEEs — where access patterns to EVM storage can reveal which contracts are being traded — and private key-value stores orders of magnitude faster than today’s alternatives.

Why “Oblivious” Matters for TEEs

A Trusted Execution Environment (TEE) encrypts memory contents — the host OS cannot read the data inside an enclave. But the host can observe:

Memory access patterns: which addresses are read or written, in what order
Cache line access patterns: which cache lines are evicted and loaded
Timing patterns: how long operations take

These side channels can reveal sensitive information even when the underlying data is encrypted. Classic example: a TEE building a block for a DEX swap accesses storage for TokenA and TokenB. The access pattern reveals that those tokens are being traded, even if the actual amounts are hidden.

Oblivious computation eliminates this: the access pattern is provably independent of the input (it looks identical regardless of which tokens or amounts are processed).

ORAM: The General Solution

ORAM (Oblivious RAM) transforms any program P that makes T memory accesses into a program P’ that:

Makes the same memory accesses as P logically
Has a physical access pattern that is statistically independent of the input
Overhead: O(log²T) per logical memory access

Path ORAM (Core Algorithm)

Data is stored in a binary tree of buckets. Each logical address maps to a random path in the tree.

On each access:

Read the entire path from root to leaf (all buckets on the path)
Find the requested item somewhere on the path
Return it to the root level (remapping with a fresh random path)
Evict older items down the tree to make room

Why this works: every access reads the same number of tree nodes (one full path = log N nodes). The path is randomized independently per access. An observer sees O(log N) uniform random locations accessed each time — no correlation to the item accessed.

Overhead: each logical access requires O(log N) physical accesses × O(log N) stash management = O(log² N) total.

Recursion Tree Optimization

The tree’s position map (which path holds which item) is itself stored in ORAM. This creates a recursion: level k stores the position map for level k+1.

Standard recursive ORAM has depth O(log N) levels → total overhead O(log² N).

Coalesced ORAM merges the recursion tree with the logical data structure (e.g., a binary search tree). This saves one log factor: O(log N) per access instead of O(log² N) for certain data structures.

Beyond ORAM: When Not to Use It

Key insight from Shi’s framework: ORAM is a brute-force solution. Most programs have structure that can be exploited to avoid ORAM entirely.

Three Optimization Strategies

1. Information-Flow Analysis

Not all memory accesses leak sensitive information. A variable is “information-leaking” only if its access pattern depends on secret data.

Example in block building:

Accessing the block gas limit (constant) → not information-leaking
Accessing the ERC-20 token address being traded → information-leaking

The compiler runs taint analysis to identify which accesses need ORAM protection. Accesses to public data (always-constant addresses, protocol constants) bypass ORAM.

2. Programming Abstractions

Certain computational patterns admit oblivious algorithms without ORAM:

Map-Reduce: sorting is the key oblivious primitive. An oblivious sort (e.g., AKS network, bitonic sort) has O(N log N) overhead and is access-pattern-independent. Map-Reduce built on oblivious sort avoids ORAM entirely.
Fork-Join parallelism: independent subtasks can be executed in parallel with fixed-size allocations. If each subtask uses a known, fixed memory layout, no ORAM is needed.
Oblivious queue/stack: bounded-size queues with fixed access patterns. Cheaper than ORAM for streaming computations.

3. Oblivious Data Structures (ODS)

For specific abstract data types, ODS implementations coalesce the ORAM recursion tree with the logical structure:

Oblivious Binary Search Tree: search, insert, delete in O(log N) vs. O(log² N) for naive ORAM on a BST
Oblivious Hash Map: O(1) amortized access (with ORAM overhead hidden in the constant)
Oblivious Priority Queue: useful for block building (sorting transactions by priority fee)

The key: the ODS “knows” which ORAM paths will be accessed for a given logical operation and can pre-fetch them in one pass.

Differential Obliviousness

For programs where full obliviousness is too expensive, differential obliviousness provides a relaxed guarantee:

Access pattern reveals at most ε bits of information about the input
Analogous to differential privacy but for access patterns
Allows trading off security strength for performance

Useful for: programs with high-variance output sizes (e.g., sorting N items where N itself is secret). A differentially oblivious sort reveals “approximately N” but not the exact value.

Roadmap: Practical Targets

From Oblivious Labs’ roadmap (2025-2026):

Fast oblivious KV store: target 10× faster than Signal’s private database, 100× faster than Meta’s equivalent. The bottleneck is ORAM overhead on read/write operations.
Oblivious STL (Standard Template Library): drop-in replacements for C++ containers (maps, vectors, sets) with oblivious access patterns. Enables “oblivious by default” programming.
Block builder in TEE: the key bottleneck is EVM state access during simulation. An oblivious EVM would process transactions in a fixed-access-pattern manner regardless of which contracts are called.

Application: Oblivious Block Building

Block builders inside TEEs (BuilderNet, Signal-Boost) simulate transactions to determine block ordering. The EVM’s memory access pattern during simulation reveals:

Which contract addresses are involved
Which storage slots are read/written
Indirectly: what assets are being traded and at what price

An oblivious EVM would use:

ORAM for EVM storage (most expensive; applies to hot state only)
Oblivious sort for transaction ordering by priority fee (O(N log N) with fixed access pattern)
Information-flow analysis to bypass ORAM for public EVM state (code, known constants)

Estimated overhead (from Shi’s group): 3-10× slowdown vs. a non-oblivious EVM. Acceptable for block building given the 12-second slot time.

ORAM vs. PIR

Property	ORAM	PIR
Model	General computation (reads + writes)	Read-only queries
Protects against	Server observing access pattern	Server learning what was queried
Trust model	Single server (ORAM); Multi-server (MPCORAM)	Single server or multi-server
Overhead	O(log² N)	Varies by scheme
Practical use	TEE memory access hiding	Light client queries

PIR is the right tool for read-only access; ORAM is needed for programs that read and write state.

Open Questions

❓ What is the concrete overhead of an oblivious EVM simulation for a typical Ethereum block (1000+ transactions, millions of storage accesses)?

❓ Can differential obliviousness provide sufficient practical security for block building while reducing overhead to <2×?

❓ How do oblivious data structures interact with Ethereum’s Merkle proof requirements? (Merkle tree traversal has a predictable structure that may be exploitable.)

Timeline

2025-08-08 — Oblivious computation compiler framework presented by Elaine Shi (CMU) at MEV-SBC 2025