Skip to main content

Key concepts

Programs are associated with a contract's compiled bytecode

This bytecode might either be the call bytecode, executed when a contract account with this bytecode receives a message on-chain, or the create bytecode, executed as part of deploying the contract associated with the bytecode.

Reflecting this relationship, ethdebug/format/program records contain a reference to the concrete contract (i.e., not an abstract contract or interface), the environment the bytecode will be executed (call or create), and the compilation that yielded the contract and bytecode.

Programs contain instruction listings for debuggers to reference

Programs contain a list of ethdebug/format/program/instruction objects, where each instruction corresponds to one machine instruction in the associated bytecode.

These instructions are ordered sequentially, matching the order and corresponding one-to-one with the encoded binary machine instructions in the bytecode. Instructions specify the byte offset at which they appear in the bytecode; this offset is equivalent to program counter on non-EOF EVMs.

By indexing these instructions by their offset, ethdebug/format programs allow debuggers to lookup high-level information at any point during machine execution.

Instructions describe high-level context details

Each instruction object in a program contains crucial information about the high-level language state at that point in the bytecode execution. Instructions represent these details using the ethdebug/format/program/context schema, and these details may include:

  • Source code ranges associated with the instruction (i.e., "source mappings")
  • Variables known to be in scope following the instruction and where to find those variable's values in the machine state
  • Control flow information such as an instruction being associated with the process of calling from one function to another

This information serves as a compile-time guarantee about the high-level state of the world that exists following each instruction.

Contexts inform high-level language semantics during machine tracing

The context information provided for each instruction serves as a bridge between low-level EVM execution and high-level language constructs. Debuggers can use these strong compile-time guarantees to piece together a useful and consistent model of the high-level language code behind the running machine binary.

By following the state of machine execution, a debugger can use context information to stay apprised of the changing compile-time facts over the course of the trace. Each successively-encountered context serves as the source of an observed state transition in the debugger's high-level state model. This allows the debugger to maintain an ever-changing and coherent view of the high-level language runtime.

In essence, the information provided by objects in this schema serves as a means of reducing over state transitions, yielding a dynamic and accurate representation of the program's high-level state. This enables debugging tools to:

  1. Map the current execution point back to the original source code
  2. Reconstruct the state of variables at any given point
  3. Provide meaningful stack traces that reference function names and source locations
  4. Offer insights into control flow, such as entering or exiting functions, or iterating through loops
  5. Present data structures (like arrays or mappings) in a way that reflects their high-level representation, rather than their low-level storage

By leveraging these contexts, debugging tools can offer a more intuitive and developer-friendly experience when working with EVM bytecode, effectively translating between the machine-level execution and the high-level code that developers write and understand. This continuous mapping between low-level execution and high-level semantics allows developers to debug their smart contracts more effectively, working with familiar concepts and structures even as they delve into the intricacies of EVM operation.