Transform contexts
A transform context annotates an instruction with the compiler
transformations that produced it. The value is a list of short
identifiers; the list may repeat the same identifier when the
transformation has been applied multiple times—for example,
doubly-inlined code carries transform: ["inline", "inline"].
$schema: "https://json-schema.org/draft/2020-12/schema"
$id: "schema:ethdebug/format/program/context/transform"

title: ethdebug/format/program/context/transform
description: |
  Annotates an instruction with compiler transformations that
  produced it. The value is a list of short identifiers naming
  each transformation; the list may repeat an identifier when
  the same transformation has been applied more than once (e.g.,
  `["inline", "inline"]` for doubly-inlined code).

  A transform context is *additional* annotation — it does not
  replace semantic contexts. When the compiler inlines a
  function, the invoke/return contexts for the logical call
  should still be emitted at the call boundary so the debugger's
  source-level call stack remains coherent. The transform
  context tells debuggers **how** the call was realized.

  Combine a transform with other discriminator keys (`invoke`,
  `return`, `code`, etc.) by placing them side-by-side on the
  same context object — `gather` is only needed when two
  contexts would collide on the same key.

  Consumers that ignore transform contexts still get a sound
  source-level view from the invoke/return contexts alone.
  Consumers that understand transform contexts can offer
  optimization-aware presentations — e.g., rendering inlined
  code as a collapsible block, or reconciling tail-call-optimized
  back-edges with the logical call stack.

  The identifier set is extensible. v1 defines:

  - `"inline"` — the marked instruction is part of an inlined
    function body. Surrounding invoke/return contexts name the
    inlined callee.
  - `"tailcall"` — the marked instruction is a
    tail-call-optimized back-edge JUMP or continuation, where
    the call was realized as a direct jump (or reuse of the
    caller's frame) rather than a standard call/return sequence.
  - `"fold"` — the marked instruction carries the result of a
    compile-time constant fold. Typically a PUSH of the folded
    value, replacing a compute sequence that appeared in source.
  - `"coalesce"` — the marked instruction is part of a
    read-write merging sequence (e.g., SHL/OR sequences packing
    narrower fields into a wider word) that the user did not
    explicitly write; the compiler introduced it to combine
    adjacent source-level reads or writes.

  Debuggers unfamiliar with a given identifier should preserve
  it as an opaque label.

  Order in the array is not semantically significant — only the
  multiset of identifiers matters.

type: object
properties:
  transform:
    title: Applied transformations
    description: |
      List of transformation identifiers. Identifiers may
      repeat; order is not semantically significant.
    type: array
    items:
      type: string
      minLength: 1
    minItems: 1
required:
  - transform

examples:
  - transform: ["inline"]
  - transform: ["tailcall"]
  - transform: ["fold"]
  - transform: ["coalesce"]
  - transform: ["inline", "inline"]
  - transform: ["inline", "tailcall"]
  - transform: ["inline", "fold"]
  - transform: ["coalesce", "coalesce"]
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "schema:ethdebug/format/program/context/transform",
"title": "ethdebug/format/program/context/transform",
"description": "Annotates an instruction with compiler transformations that\nproduced it. The value is a list of short identifiers naming\neach transformation; the list may repeat an identifier when\nthe same transformation has been applied more than once (e.g.,\n`[\"inline\", \"inline\"]` for doubly-inlined code).\n\nA transform context is *additional* annotation — it does not\nreplace semantic contexts. When the compiler inlines a\nfunction, the invoke/return contexts for the logical call\nshould still be emitted at the call boundary so the debugger's\nsource-level call stack remains coherent. The transform\ncontext tells debuggers **how** the call was realized.\n\nCombine a transform with other discriminator keys (`invoke`,\n`return`, `code`, etc.) by placing them side-by-side on the\nsame context object — `gather` is only needed when two\ncontexts would collide on the same key.\n\nConsumers that ignore transform contexts still get a sound\nsource-level view from the invoke/return contexts alone.\nConsumers that understand transform contexts can offer\noptimization-aware presentations — e.g., rendering inlined\ncode as a collapsible block, or reconciling tail-call-optimized\nback-edges with the logical call stack.\n\nThe identifier set is extensible. v1 defines:\n\n- `\"inline\"` — the marked instruction is part of an inlined\n function body. Surrounding invoke/return contexts name the\n inlined callee.\n- `\"tailcall\"` — the marked instruction is a\n tail-call-optimized back-edge JUMP or continuation, where\n the call was realized as a direct jump (or reuse of the\n caller's frame) rather than a standard call/return sequence.\n- `\"fold\"` — the marked instruction carries the result of a\n compile-time constant fold. Typically a PUSH of the folded\n value, replacing a compute sequence that appeared in source.\n- `\"coalesce\"` — the marked instruction is part of a\n read-write merging sequence (e.g., SHL/OR sequences packing\n narrower fields into a wider word) that the user did not\n explicitly write; the compiler introduced it to combine\n adjacent source-level reads or writes.\n\nDebuggers unfamiliar with a given identifier should preserve\nit as an opaque label.\n\nOrder in the array is not semantically significant — only the\nmultiset of identifiers matters.\n",
"type": "object",
"properties": {
"transform": {
"title": "Applied transformations",
"description": "List of transformation identifiers. Identifiers may\nrepeat; order is not semantically significant.\n",
"type": "array",
"items": {
"type": "string",
"minLength": 1
},
"minItems": 1
}
},
"required": [
"transform"
],
"examples": [
{
"transform": [
"inline"
]
},
{
"transform": [
"tailcall"
]
},
{
"transform": [
"fold"
]
},
{
"transform": [
"coalesce"
]
},
{
"transform": [
"inline",
"inline"
]
},
{
"transform": [
"inline",
"tailcall"
]
},
{
"transform": [
"inline",
"fold"
]
},
{
"transform": [
"coalesce",
"coalesce"
]
}
]
}
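The structural constraints in the schema are small enough to check by hand. The following is a minimal sketch in Python, not part of the ethdebug tooling; the function name `is_valid_transform_context` is hypothetical, and the check covers only the constraints stated in the schema above (an object, a required `transform` array with at least one item, each item a non-empty string).

```python
def is_valid_transform_context(context):
    """Hypothetical helper: check the constraints stated by the schema."""
    # type: object
    if not isinstance(context, dict):
        return False
    # required: transform; type: array; minItems: 1
    transform = context.get("transform")
    if not isinstance(transform, list) or len(transform) < 1:
        return False
    # items: type string, minLength 1
    return all(isinstance(item, str) and len(item) >= 1 for item in transform)


# Sibling keys are allowed: the schema does not forbid other properties.
print(is_valid_transform_context({"transform": ["inline", "fold"], "code": {}}))
print(is_valid_transform_context({"transform": []}))
```

A production consumer would more likely hand the schema to a JSON Schema validator; the hand-written check only makes the individual constraints explicit.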
Role: additional annotation
A transform context does not replace semantic contexts. When the compiler inlines a function, the caller's debug info should still carry invoke/return contexts naming the inlined callee at the call boundary—so the debugger's logical call stack reflects the source-level structure. The transform context is additional information telling the debugger how the call was realized.
Consumers are free to ignore transform contexts entirely; the invoke/return contexts alone always give a sound source-level view. Consumers that understand transform contexts can offer optimization-aware presentations:
- Render inlined code as a collapsible block tied to the original callee's source location.
- Show which call sites were tail-call-optimized vs. realized as full call/return sequences.
- Explain apparent anomalies in the trace (e.g., a JUMP that carries an invoke context is a TCO back-edge).
v1 identifiers
Four identifiers are recognized in v1:
- "inline": the marked instruction is part of an inlined function body. Surrounding invoke/return contexts name the inlined callee; this marker tells the debugger the physical code does not correspond to a separate activation record.
- "tailcall": the marked instruction is a tail-call-optimized back-edge JUMP or continuation, where the call was realized without pushing/popping a full activation. A JUMP carrying a tailcall transform typically sits on a context that also carries both a return (from the previous iteration) and an invoke (of the new iteration).
- "fold": the marked instruction carries the result of a compile-time constant fold. Typically a PUSH of the folded value replacing a compute sequence (e.g., ADD over two known constants) that appeared in source. The instruction's surrounding code context, if present, points to the original expression.
- "coalesce": the marked instruction is part of a read-write merging sequence the compiler introduced to combine adjacent source-level reads or writes. Common examples include SHL/OR sequences that pack narrower fields into a single storage slot, or wider loads split into narrower field extractions. The user did not write these instructions directly; the coalesce marker lets a debugger present the sequence as one source-level operation rather than stepping through each byte-shuffling opcode.
The identifier set is extensible. Compilers may emit additional identifiers for optimizations not yet standardized; debuggers should preserve unfamiliar identifiers as opaque labels rather than rejecting them.
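One way a debugger might honor the "opaque label" rule is to separate the identifiers it understands from those it does not, keeping the latter intact for display. This is a sketch under assumptions: `split_identifiers` and `V1_IDENTIFIERS` are hypothetical names, and `"unroll"` is an invented identifier used only to stand in for a non-standard transformation.

```python
# The four identifiers defined in v1.
V1_IDENTIFIERS = frozenset({"inline", "tailcall", "fold", "coalesce"})


def split_identifiers(transform):
    """Hypothetical helper: separate known v1 identifiers from opaque ones.

    Unknown identifiers are kept, not dropped or rejected, so that a
    debugger can still show them as labels.
    """
    known = [name for name in transform if name in V1_IDENTIFIERS]
    opaque = [name for name in transform if name not in V1_IDENTIFIERS]
    return known, opaque


# "unroll" is a made-up identifier, not part of the format.
known, opaque = split_identifiers(["inline", "unroll", "inline"])
print(known)   # ['inline', 'inline']
print(opaque)  # ['unroll']
```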
Repetition and composition
Identifiers may repeat. A function inlined into another inlined
function produces transform: ["inline", "inline"]. A coalesce
sequence nested inside another coalesced region produces
transform: ["coalesce", "coalesce"].
Different transformations compose:
transform: ["inline", "tailcall"] marks an instruction inside
an inlined body that was itself a TCO back-edge in the callee;
transform: ["inline", "fold"] marks a constant-folded PUSH
sitting inside an inlined body.
Order in the array is not semantically significant—only the multiset of identifiers matters.
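Because only the multiset matters, a consumer comparing or summarizing transform lists can count identifiers rather than compare sequences. A small sketch using Python's `collections.Counter`; the helper names `same_transforms` and `repetitions` are hypothetical.

```python
from collections import Counter


def same_transforms(a, b):
    """True when two transform lists hold the same multiset of identifiers."""
    return Counter(a) == Counter(b)


def repetitions(transform, identifier):
    """How many times an identifier appears, e.g. 2 for doubly-inlined code."""
    return Counter(transform)[identifier]


print(same_transforms(["inline", "tailcall"], ["tailcall", "inline"]))  # True
print(same_transforms(["inline"], ["inline", "inline"]))                # False
print(repetitions(["inline", "fold", "inline"], "inline"))              # 2
```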
Composing with other contexts
A context object can carry several discriminator keys at once —
code, variables, invoke, return, transform, and so on
all live in the same object. A TCO back-edge JUMP, for example,
typically combines three facts as sibling keys on a single
context:
return:
  identifier: "fact"
  declaration: { ... }
invoke:
  jump: true
  identifier: "fact"
  target: { pointer: { location: code, offset: ... } }
transform: ["tailcall"]
The return and invoke state the source-level facts
(iteration N returned, iteration N+1 was invoked); the
transform explains how the compiler realized that pair as a
single JUMP.
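A consumer could recognize this pattern by looking for the three sibling keys on one context. The sketch below assumes the context has already been parsed into a Python dict; `is_tco_back_edge` is a hypothetical name, and since the text says a back-edge only typically carries all three keys, the check is a heuristic rather than a definition.

```python
def is_tco_back_edge(context):
    """Heuristic: a tailcall transform alongside both return and invoke."""
    return (
        "tailcall" in context.get("transform", [])
        and "invoke" in context
        and "return" in context
    )


# Field values are placeholders; only the presence of the keys is inspected.
back_edge = {
    "return": {"identifier": "fact"},
    "invoke": {"jump": True, "identifier": "fact"},
    "transform": ["tailcall"],
}
print(is_tco_back_edge(back_edge))                  # True
print(is_tco_back_edge({"transform": ["inline"]}))  # False
```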
Reach for gather only when
two contexts would collide on the same key — e.g., two
independent variables blocks or two frames from different
pipeline stages. When keys don't collide, the flat form is
preferred.
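The choice between the flat form and gather can be sketched as a key-collision check. This is an illustration only: `combine` is a hypothetical helper, and the shape used for the gather form (a `gather` key holding a list of contexts) is an assumption that should be checked against the gather context schema.

```python
def combine(contexts):
    """Merge contexts into one flat object, or fall back to gather.

    Assumption: a gather context is {"gather": [context, ...]}.
    """
    merged = {}
    for context in contexts:
        # Any shared key means the flat form would lose information.
        if merged.keys() & context.keys():
            return {"gather": list(contexts)}
        merged.update(context)
    return merged


# No collision: sibling keys on a single context.
print(combine([{"invoke": {"jump": True}}, {"transform": ["tailcall"]}]))
# Collision on "variables": the contexts are gathered instead.
print(combine([{"variables": ["a"]}, {"variables": ["b"]}]))
```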