
Pointers

This page explains the mental model behind ethdebug/format pointer representations. For reference documentation on regions, expressions, and collections, see the Pointers reference.

Pointers are recipes for finding bytes

A pointer describes where data lives, not what it means. It's a recipe that a debugger can follow to locate bytes in EVM state.

Simple pointers specify static locations:

{
  "location": "storage",
  "slot": "0x0"
}

This says "the data is in storage slot 0." The debugger can resolve this directly against the EVM state.

Complex pointers describe dynamic locations that depend on runtime values:

{
  "location": "storage",
  "slot": {
    "$keccak256": [{ "$read": "key-value" }, { "$wordsized": 3 }]
  }
}

This says "compute the slot by hashing a runtime value (read from the region named key-value) together with the number 3, padded to a full word." The debugger must evaluate this expression against the current machine state.
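
As a rough illustration of what the debugger evaluates here, the sketch below builds the byte string that gets hashed. It assumes that $keccak256 hashes the concatenation of its operands and that $wordsized left-pads a value to 32 bytes. The function names are hypothetical, and the hash itself is passed in as a parameter because Python's standard library has no Keccak-256 (hashlib.sha3_256 uses different padding).

```python
def wordsized(value: int) -> bytes:
    """Left-pad an unsigned integer to a 32-byte big-endian word."""
    return value.to_bytes(32, "big")


def mapping_slot_preimage(key_bytes: bytes, base_slot: int) -> bytes:
    """Concatenate the runtime key with the word-sized base slot."""
    return key_bytes + wordsized(base_slot)


def mapping_slot(key_bytes: bytes, base_slot: int, keccak256) -> int:
    """Hash the preimage with a caller-supplied Keccak-256 function."""
    digest = keccak256(mapping_slot_preimage(key_bytes, base_slot))
    return int.from_bytes(digest, "big")


# Example: a 32-byte key, standing in for the bytes read from "key-value".
key = (0xABCD).to_bytes(32, "big")
preimage = mapping_slot_preimage(key, 3)  # 64 bytes: key, then the padded 3
```

To get a real slot number, pass a Keccak-256 implementation from a library of your choice as the third argument to mapping_slot.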

Pointers are self-contained: each one includes everything a debugger needs, apart from the machine state itself, to resolve the location.

EVM data locations

The EVM stores data in several distinct locations. Understanding these is essential for working with pointers.

Storage

Storage is persistent data associated with a contract. It survives transaction boundaries and is where contracts store their state.

  • Persistent — values remain until explicitly changed
  • Slot-based — organized into 32-byte slots numbered from 0
  • Contract-specific — each contract has its own storage space

Storage is where you find contract state variables, mapping contents, and dynamic array contents.

Memory

Memory is temporary data that exists only during execution. Each call context starts with its own empty memory, which is discarded when that call ends.

  • Temporary — each call gets fresh memory, discarded when the call returns
  • Byte-addressable — accessed by byte offset, not slots
  • Linear — grows as needed from offset 0

Memory holds function arguments, return data being prepared, and temporary values.

Stack

The stack is where the EVM performs computations. It holds operands and intermediate results.

  • 256-bit words — each stack item is 32 bytes
  • Limited depth — maximum 1024 items
  • LIFO — last in, first out access pattern

The stack contains function arguments (for internal calls), local variables, and intermediate computation results.

Calldata

Calldata is the read-only input data sent to a contract when called.

  • Read-only — cannot be modified during execution
  • Byte-addressable — accessed by byte offset
  • Cheap to read — costs about as much as a memory read and far less than a storage read

Calldata contains the function selector (first 4 bytes) and ABI-encoded function arguments.
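
The layout described above can be sketched as follows. This is a minimal illustration that assumes every argument is a static 32-byte word (dynamic types such as bytes or arrays add offset and tail sections that it does not decode). The helper name split_calldata is hypothetical.

```python
def split_calldata(calldata: bytes) -> tuple[bytes, list[bytes]]:
    """Split calldata into the 4-byte selector and 32-byte argument words."""
    selector, body = calldata[:4], calldata[4:]
    words = [body[i:i + 32] for i in range(0, len(body), 32)]
    return selector, words


# Selector of transfer(address,uint256), followed by two argument words.
calldata = (
    bytes.fromhex("a9059cbb")
    + (0x1234).to_bytes(32, "big")  # recipient, left-padded to a word
    + (100).to_bytes(32, "big")     # amount
)
selector, words = split_calldata(calldata)
```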

Returndata

Returndata is the output from the most recent external call.

  • Read-only — set by called contract, read by caller
  • Replaced on each call — each external call overwrites previous returndata
  • Byte-addressable — accessed by byte offset

Code

Code refers to the contract's bytecode itself. Sometimes data is embedded in bytecode.

  • Immutable — cannot change after deployment
  • Byte-addressable — accessed by byte offset

Code is where you find immutable variables and embedded constants.

Transient storage

Transient storage (EIP-1153) is storage that persists within a transaction but is cleared afterward.

  • Transaction-scoped — persists across calls within a transaction
  • Cleared after transaction — does not persist to the next transaction
  • Slot-based — like storage, organized into 32-byte slots

Summary

| Location   | Persistence        | Addressing     | Primary use         |
|------------|--------------------|----------------|---------------------|
| Storage    | Permanent          | 32-byte slots  | Contract state      |
| Memory     | Single call        | Byte offset    | Temporary data      |
| Stack      | Instruction-level  | Position index | Computation         |
| Calldata   | Single call        | Byte offset    | Input parameters    |
| Returndata | Until next call    | Byte offset    | Call results        |
| Code       | Permanent          | Byte offset    | Bytecode/immutables |
| Transient  | Single transaction | 32-byte slots  | Tx-scoped state     |

A pointer is a region or a collection

The ethdebug/format/pointer schema is recursive: a pointer is either a region (a single contiguous byte range) or a collection (an aggregation of other pointers).

Regions

A region represents a single contiguous range of bytes at a specific location. Different locations use different region schemas:

Slice-based regions (memory, calldata, returndata, code) specify an offset and length:

{
  "location": "memory",
  "offset": "0x80",
  "length": 32
}

Segment-based regions (storage, stack, transient) specify a slot, with optional offset and length for packed values:

{
  "location": "storage",
  "slot": 5,
  "offset": 0,
  "length": 16
}
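
To make the packed case concrete, here is a small sketch of selecting bytes from one slot. It rests on two assumptions that should be checked against the Pointers reference: that offset counts bytes from the start (most significant end) of the 32-byte word, and that an omitted length means "the rest of the slot". The name read_segment is hypothetical.

```python
from typing import Optional

WORD_SIZE = 32  # bytes in one storage slot


def read_segment(slot_word: bytes, offset: int = 0,
                 length: Optional[int] = None) -> bytes:
    """Return the bytes of a slot selected by offset and length."""
    if len(slot_word) != WORD_SIZE:
        raise ValueError("a slot holds exactly 32 bytes")
    if length is None:
        length = WORD_SIZE - offset  # assumption: default to the rest of the slot
    if offset < 0 or length < 0 or offset + length > WORD_SIZE:
        raise ValueError("region does not fit inside the slot")
    return slot_word[offset:offset + length]


# Stand-in contents for slot 5; the region above selects its first 16 bytes.
slot_5 = bytes(range(32))
packed = read_segment(slot_5, offset=0, length=16)
```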

Regions can be named for reference elsewhere in the pointer:

{
  "name": "array-length",
  "location": "storage",
  "slot": 0
}

Collections

A collection aggregates multiple pointers. Six collection types exist for different purposes:

  • group — combine pointers statically (e.g., struct members)
  • list — generate a sequence of pointers (e.g., array elements)
  • conditional — choose between pointers based on a runtime condition
  • scope — define variables for use in nested pointers
  • reference — refer to a previously defined template
  • templates — define reusable pointer patterns with expected variables
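
The recursive shape (a pointer is a region or a collection of pointers) can be illustrated with a short traversal. This sketch handles only regions and the group collection, and the function name flatten is hypothetical. The other collection types need runtime evaluation and are left out.

```python
def flatten(pointer: dict) -> list[dict]:
    """Return every region reachable from a pointer, in order."""
    if "group" in pointer:
        regions = []
        for member in pointer["group"]:
            regions.extend(flatten(member))  # members are pointers themselves
        return regions
    if "location" in pointer:
        return [pointer]  # a region is a leaf
    raise NotImplementedError(f"unsupported pointer: {sorted(pointer)}")


# A group nested inside a group, e.g. a struct containing another struct.
struct_pointer = {
    "group": [
        {"location": "storage", "slot": 0},
        {
            "group": [
                {"location": "storage", "slot": 1},
                {"location": "memory", "offset": 0, "length": 32},
            ]
        },
    ]
}
regions = flatten(struct_pointer)
```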

Expressions enable dynamic computation

Static offsets and slots aren't enough for real-world data. Array elements, mapping values, and many other locations depend on runtime values.

Expressions let pointers compute addresses dynamically:

{
  "location": "storage",
  "slot": {
    "$sum": [{ "$keccak256": [{ "$wordsized": 5 }] }, "element-index"]
  }
}

Expressions support:

  • Arithmetic: $sum, $difference, $product, $quotient, $remainder
  • Reading values: $read retrieves bytes from a named region
  • Region properties: .offset, .length, and .slot look up the corresponding field of a named region
  • Hashing: $keccak256 computes storage slots for dynamic data
  • Data manipulation: $concat, $sized<N>, $wordsized

Variables in expressions come from list iteration (each) or scope definitions (define).
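
A toy evaluator for the arithmetic subset gives a feel for how these expressions nest. It is a sketch under stated assumptions, not the reference implementation: values are modelled as Python integers rather than byte strings, $difference is assumed to clamp at zero, and $read, $keccak256, and the other operations are not implemented. The names evaluate and OPERATORS are hypothetical.

```python
import math

# Arithmetic operators over a list of already-evaluated integer operands.
OPERATORS = {
    "$sum": lambda values: sum(values),
    "$product": lambda values: math.prod(values),
    "$difference": lambda values: max(values[0] - values[1], 0),  # assumption: clamps at zero
    "$quotient": lambda values: values[0] // values[1],
    "$remainder": lambda values: values[0] % values[1],
}


def evaluate(expr, variables):
    """Evaluate a small subset of pointer expressions to an integer."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        # Hex strings are literals; any other string is a variable name.
        return int(expr, 16) if expr.startswith("0x") else variables[expr]

    (operator, operands), = expr.items()  # an expression object has one key
    if operator == "$wordsized":
        # Padding to a word changes the byte width, not the numeric value.
        return evaluate(operands, variables)
    if operator not in OPERATORS:
        raise NotImplementedError(operator)
    values = [evaluate(operand, variables) for operand in operands]
    return OPERATORS[operator](values)


# A variable such as "element-index" would come from list iteration.
slot_expr = {"$sum": [{"$product": ["element-index", 2]}, "0x10"]}
slot = evaluate(slot_expr, {"element-index": 3})
```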

Addressing schemes

Regions use one of two addressing schemes based on their location:

Slice-based addressing (memory, calldata, returndata, code):

  • offset — byte position from the start
  • length — number of bytes

Segment-based addressing (storage, stack, transient):

  • slot — 256-bit slot number
  • offset (optional) — byte offset within the slot for packed values
  • length (optional) — number of bytes for packed values
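
The split between the two schemes can be written down as a small lookup. This sketch takes the required fields directly from the lists above (offset and length for slices, slot for segments); the actual schema may be more permissive, and the function names are hypothetical.

```python
SLICE_LOCATIONS = {"memory", "calldata", "returndata", "code"}
SEGMENT_LOCATIONS = {"storage", "stack", "transient"}


def addressing_scheme(location: str) -> str:
    """Return "slice" or "segment" for a data location."""
    if location in SLICE_LOCATIONS:
        return "slice"
    if location in SEGMENT_LOCATIONS:
        return "segment"
    raise ValueError(f"unknown location: {location}")


def missing_fields(region: dict) -> list[str]:
    """List the fields a region lacks for its addressing scheme."""
    scheme = addressing_scheme(region["location"])
    required = ("offset", "length") if scheme == "slice" else ("slot",)
    return [field for field in required if field not in region]


ok = missing_fields({"location": "storage", "slot": 5})
incomplete = missing_fields({"location": "calldata", "offset": 4})
```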

Named regions enable composition

Giving a region a name lets you reference it elsewhere:

{
  "group": [
    {
      "name": "base-pointer",
      "location": "stack",
      "slot": 0
    },
    {
      "location": "memory",
      "offset": { "$read": "base-pointer" },
      "length": 32
    }
  ]
}

The second region's offset comes from reading the value in the first region. This pattern is essential for describing data whose location is stored in another location (like memory pointers on the stack).
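
The resolution order can be sketched in a few lines. This is an illustrative model only: it supports just stack and memory regions, accepts offsets that are either integers or a $read of an earlier named region, and assumes stack slot 0 is the top of the stack (represented here as index 0 of a list). The name resolve_group is hypothetical.

```python
def resolve_group(pointer, stack, memory):
    """Resolve each region of a group in order, tracking named regions."""
    named = {}    # region name -> bytes already read from that region
    results = []
    for region in pointer["group"]:
        if region["location"] == "stack":
            data = stack[region["slot"]]  # assumption: index 0 is the stack top
        elif region["location"] == "memory":
            offset = region["offset"]
            if isinstance(offset, dict):
                # {"$read": name}: use the bytes of an earlier named region
                offset = int.from_bytes(named[offset["$read"]], "big")
            data = memory[offset:offset + region["length"]]
        else:
            raise NotImplementedError(region["location"])
        if "name" in region:
            named[region["name"]] = data
        results.append(data)
    return results


pointer = {
    "group": [
        {"name": "base-pointer", "location": "stack", "slot": 0},
        {"location": "memory", "offset": {"$read": "base-pointer"}, "length": 32},
    ]
}
stack = [(0x40).to_bytes(32, "big")]             # top of stack holds 0x40
memory = bytes(0x40) + (42).to_bytes(32, "big")  # the value sits at offset 0x40
base, value = resolve_group(pointer, stack, memory)
```

Because regions are resolved in order, a named region has to appear before any region that reads it in this sketch.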

Next steps