Pointers
This page explains the mental model behind ethdebug/format pointer representations. For reference documentation on regions, expressions, and collections, see the Pointers reference.
Pointers are recipes for finding bytes
A pointer describes where data lives, not what it means. It's a recipe that a debugger can follow to locate bytes in EVM state.
Simple pointers specify static locations:
{
  "location": "storage",
  "slot": "0x0"
}
This says "the data is in storage slot 0." The debugger can resolve this directly against the EVM state.
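As a rough illustration of what "resolve this directly" could look like, here is a minimal Python sketch. The function name `resolve_static_storage` and the dictionary-based storage model are assumptions made for this example, not part of ethdebug/format or any debugger API.

```python
# Hypothetical state model: storage maps a slot number to a 32-byte word.
def resolve_static_storage(pointer: dict, storage: dict) -> bytes:
    """Return the 32-byte word that a static storage pointer refers to."""
    if pointer["location"] != "storage":
        raise ValueError("this sketch only handles storage pointers")
    slot = pointer["slot"]
    # In this sketch a slot is either a hex string ("0x0") or a plain number.
    slot_number = int(slot, 16) if isinstance(slot, str) else slot
    # Storage slots that were never written read as zero in the EVM.
    return storage.get(slot_number, bytes(32))


pointer = {"location": "storage", "slot": "0x0"}
storage = {0: (42).to_bytes(32, "big")}
value = resolve_static_storage(pointer, storage)
```

No expression evaluation is involved here: the slot is known up front, so a single lookup is enough.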
Complex pointers describe dynamic locations that depend on runtime values:
{
  "location": "storage",
  "slot": {
    "$keccak256": [{ "$read": "key-value" }, { "$wordsized": 3 }]
  }
}
This says "compute the slot by hashing a runtime value together with the number 3, padded to a 32-byte word." The debugger must evaluate this expression against the current machine state.
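The evaluation can be sketched in Python as follows. This assumes that `$keccak256` hashes the concatenation of its operands and that `$wordsized` produces a 32-byte big-endian word; the Expressions reference is the authority on both. Python's standard library has no Keccak-256 (`hashlib.sha3_256` uses different padding), so the hash is passed in as a parameter. The names `mapping_slot` and `stand_in_hash` are made up for this example.

```python
import hashlib
from typing import Callable


def wordsized(value: int) -> bytes:
    """Encode an integer as a 32-byte big-endian word."""
    return value.to_bytes(32, "big")


def mapping_slot(key: bytes, base_slot: int,
                 hash_fn: Callable[[bytes], bytes]) -> int:
    """Hash the key bytes followed by the word-sized base slot."""
    preimage = key + wordsized(base_slot)
    return int.from_bytes(hash_fn(preimage), "big")


# Stand-in hash for demonstration only: SHA-256 is NOT Keccak-256.
# A real debugger would supply a Keccak-256 implementation here.
def stand_in_hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


key = wordsized(0xABCD)  # the bytes that {"$read": "key-value"} would yield
slot = mapping_slot(key, 3, stand_in_hash)
```

The point of the sketch is the shape of the computation: the slot cannot be known until the key has been read from machine state.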
Pointers are self-contained — they include everything needed to resolve the location.
EVM data locations
The EVM stores data in several distinct locations. Understanding these is essential for working with pointers.
Storage
Storage is persistent data associated with a contract. It survives transaction boundaries and is where contracts store their state.
- Persistent — values remain until explicitly changed
- Slot-based — organized into 32-byte slots numbered from 0
- Contract-specific — each contract has its own storage space
Storage is where you find contract state variables, mapping contents, and dynamic array contents.
Memory
Memory is temporary data that exists only during execution. Each call starts with its own empty memory, which is discarded when that call ends.
- Temporary — cleared after each external call returns
- Byte-addressable — accessed by byte offset, not slots
- Linear — grows as needed from offset 0
Memory holds function arguments, return data being prepared, and temporary values.
Stack
The stack is where the EVM performs computations. It holds operands and intermediate results.
- 256-bit words — each stack item is 32 bytes
- Limited depth — maximum 1024 items
- LIFO — last in, first out access pattern
The stack contains function arguments (for internal calls), local variables, and intermediate computation results.
Calldata
Calldata is the read-only input data sent to a contract when called.
- Read-only — cannot be modified during execution
- Byte-addressable — accessed by byte offset
- Cheap to read — considerably cheaper than storage reads
Calldata contains the function selector (first 4 bytes) and ABI-encoded function arguments.
Returndata
Returndata is the output from the most recent external call.
- Read-only — set by called contract, read by caller
- Replaced on each call — each external call overwrites previous returndata
- Byte-addressable — accessed by byte offset
Code
Code refers to the contract's bytecode itself. Sometimes data is embedded in bytecode.
- Immutable — cannot change after deployment
- Byte-addressable — accessed by byte offset
Code is where you find immutable variables and embedded constants.
Transient storage
Transient storage (EIP-1153) is storage that persists within a transaction but is cleared afterward.
- Transaction-scoped — persists across calls within a transaction
- Cleared after transaction — does not persist to the next transaction
- Slot-based — like storage, organized into 32-byte slots
Summary
| Location | Persistence | Addressing | Primary use |
|---|---|---|---|
| Storage | Permanent | 32-byte slots | Contract state |
| Memory | Single call | Byte offset | Temporary data |
| Stack | Instruction-level | Position index | Computation |
| Calldata | Single call | Byte offset | Input parameters |
| Returndata | Until next call | Byte offset | Call results |
| Code | Permanent | Byte offset | Bytecode/immutables |
| Transient | Single transaction | 32-byte slots | Tx-scoped state |
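One way a tool might model these seven locations is sketched below. The `MachineState` class, its field types, and the `addressing` helper are assumptions made for this example; the format does not prescribe a state representation.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MachineState:
    """Hypothetical snapshot of the seven EVM data locations."""
    # Slot-addressed locations: slot number -> 32-byte word.
    storage: dict = field(default_factory=dict)
    transient: dict = field(default_factory=dict)
    # Stack modeled as a list of 32-byte words.
    stack: list = field(default_factory=list)
    # Byte-addressed locations modeled as flat byte strings.
    memory: bytes = b""
    calldata: bytes = b""
    returndata: bytes = b""
    code: bytes = b""


SLOT_ADDRESSED = {"storage", "transient", "stack"}
KNOWN_LOCATIONS = {f.name for f in fields(MachineState)}


def addressing(location: str) -> str:
    """Return how a location is addressed, following the summary table."""
    if location not in KNOWN_LOCATIONS:
        raise ValueError(f"unknown location: {location}")
    return "slot" if location in SLOT_ADDRESSED else "byte offset"
```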
A pointer is a region or a collection
The ethdebug/format/pointer schema is recursive: a pointer is either a region (a single contiguous byte range) or a collection (an aggregation of other pointers).
Regions
A region represents a single contiguous range of bytes at a specific location. Different locations use different region schemas:
Slice-based regions (memory, calldata, returndata, code) specify an offset and length:
{
  "location": "memory",
  "offset": "0x80",
  "length": 32
}
Segment-based regions (storage, stack, transient) specify a slot, with optional offset and length for packed values:
{
  "location": "storage",
  "slot": 5,
  "offset": 0,
  "length": 16
}
Regions can be named for reference elsewhere in the pointer:
{
  "name": "array-length",
  "location": "storage",
  "slot": 0
}
Collections
A collection aggregates multiple pointers. Six collection types exist for different purposes:
- group — combine pointers statically (e.g., struct members)
- list — generate a sequence of pointers (e.g., array elements)
- conditional — choose between pointers based on a runtime condition
- scope — define variables for use in nested pointers
- reference — refer to a previously defined template
- templates — define reusable pointer patterns with expected variables
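Because collections contain pointers, and pointers may themselves be collections, code that walks a pointer is naturally recursive. The sketch below flattens nested `group` collections into their regions. It is a simplification that assumes a region can be recognized by its `location` key, and it deliberately leaves the other five collection types unhandled; `flatten` is a name invented for this example.

```python
def flatten(pointer: dict) -> list:
    """Collect every region reachable through nested group collections."""
    if "location" in pointer:
        # A region: the base case of the recursion.
        return [pointer]
    if "group" in pointer:
        regions = []
        for child in pointer["group"]:
            regions.extend(flatten(child))
        return regions
    raise NotImplementedError("only regions and groups are sketched here")


struct_pointer = {
    "group": [
        {"name": "owner", "location": "storage", "slot": 0},
        {
            "group": [
                {"name": "balance", "location": "storage", "slot": 1},
                {"name": "nonce", "location": "storage", "slot": 2},
            ]
        },
    ]
}
names = [region["name"] for region in flatten(struct_pointer)]
```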
Expressions enable dynamic computation
Static offsets and slots aren't enough for real-world data. Array elements, mapping values, and many other locations depend on runtime values.
Expressions let pointers compute addresses dynamically:
{
  "location": "storage",
  "slot": {
    "$sum": [{ "$keccak256": [{ "$wordsized": 5 }] }, "element-index"]
  }
}
Expressions support:
- Arithmetic: $sum, $difference, $product, $quotient, $remainder
- Reading values: $read retrieves bytes from a named region
- Region properties: .offset, .length, .slot reference region fields
- Hashing: $keccak256 computes storage slots for dynamic data
- Data manipulation: $concat, $sized<N>, $wordsized
Variables in expressions come from list iteration (each) or scope definitions (define).
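To make the arithmetic operators and variable lookup concrete, here is a minimal evaluator sketch in Python. It covers only the five arithmetic operators, treats `0x`-prefixed strings as hex literals and any other string as a variable name, and does not model overflow or underflow; those details are assumptions of this example, and the Expressions reference defines the actual semantics.

```python
import math


def evaluate(expr, variables: dict) -> int:
    """Evaluate a small subset of pointer expressions to an integer."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        if expr.startswith("0x"):
            return int(expr, 16)
        # Any other string is looked up as a variable (from each or define).
        return variables[expr]
    # An operator object has exactly one key, such as {"$sum": [...]}.
    op, operands = next(iter(expr.items()))
    args = [evaluate(operand, variables) for operand in operands]
    if op == "$sum":
        return sum(args)
    if op == "$product":
        return math.prod(args)
    if op == "$difference":
        return args[0] - args[1]
    if op == "$quotient":
        return args[0] // args[1]
    if op == "$remainder":
        return args[0] % args[1]
    raise NotImplementedError(f"operator not sketched: {op}")


element_slot = evaluate({"$sum": [100, "element-index"]}, {"element-index": 3})
```

Here `100` stands in for a base slot that a real pointer would compute with `$keccak256`.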
Addressing schemes
Regions use one of two addressing schemes based on their location:
Slice-based addressing (memory, calldata, returndata, code):
- offset — byte position from the start
- length — number of bytes
Segment-based addressing (storage, stack, transient):
- slot — 256-bit slot number
- offset (optional) — byte offset within the slot for packed values
- length (optional) — number of bytes for packed values
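The two schemes lead to two slightly different read routines, sketched here in Python. The helper names are invented for this example. The segment reader assumes that `offset` counts bytes from the start (most significant end) of the 32-byte word and that a missing `length` means "the rest of the word"; both should be checked against the Regions reference.

```python
from typing import Optional


def read_slice(data: bytes, offset: int, length: int) -> bytes:
    """Slice-based read: bytes [offset, offset + length) of a byte string."""
    return data[offset:offset + length]


def read_segment(words: dict, slot: int, offset: int = 0,
                 length: Optional[int] = None) -> bytes:
    """Segment-based read: pick a 32-byte word, then optionally a part of it."""
    word = words.get(slot, bytes(32))  # unwritten slots read as zero
    if length is None:
        length = 32 - offset
    # Assumption: offset is measured from the start of the word.
    return word[offset:offset + length]


memory = bytes(range(200))
storage = {5: bytes(range(32))}

from_memory = read_slice(memory, 0x80, 32)
packed_value = read_segment(storage, 5, offset=0, length=16)
```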
Named regions enable composition
Giving a region a name lets you reference it elsewhere:
{
  "group": [
    {
      "name": "base-pointer",
      "location": "stack",
      "slot": 0
    },
    {
      "location": "memory",
      "offset": { "$read": "base-pointer" },
      "length": 32
    }
  ]
}
The second region's offset comes from reading the value in the first region. This pattern is essential for describing data whose location is stored in another location (like memory pointers on the stack).
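A resolver for this pattern could work roughly as in the Python sketch below. It handles only stack and memory regions, processes the group in order so that a named region is available before it is read, and assumes that stack slot 0 is the top of the stack. The function `resolve_group` and the list-based stack model are hypothetical.

```python
def resolve_group(group: list, stack: list, memory: bytes) -> list:
    """Resolve each region in order, remembering the bytes of named regions."""
    named = {}
    results = []
    for region in group:
        if region["location"] == "stack":
            # Assumption: slot 0 is the top of the stack (last list element).
            data = stack[-1 - region["slot"]]
        elif region["location"] == "memory":
            offset = region["offset"]
            if isinstance(offset, dict) and "$read" in offset:
                # Interpret the named region's bytes as a big-endian integer.
                offset = int.from_bytes(named[offset["$read"]], "big")
            data = memory[offset:offset + region["length"]]
        else:
            raise NotImplementedError("only stack and memory are sketched")
        if "name" in region:
            named[region["name"]] = data
        results.append(data)
    return results


group = [
    {"name": "base-pointer", "location": "stack", "slot": 0},
    {"location": "memory", "offset": {"$read": "base-pointer"}, "length": 32},
]
stack = [(0x80).to_bytes(32, "big")]      # the stack holds a memory address
memory = bytes(0x80) + b"\x11" * 32       # the data lives at offset 0x80
pointer_word, pointed_data = resolve_group(group, stack, memory)
```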
Next steps
- Regions — Reference for all region types
- Expressions — Reference for expression syntax
- Collections — Reference for collection types
- Pointer specification — Formal schema definitions
- Challenges — Why EVM data locations are complex enough to need this