
Making regions concrete

Converting a Pointer.Region, whose properties hold dynamic Pointer.Expression objects, into a Cursor.Region, whose expression properties have been replaced with concrete bytes (Data), involves two main steps:

Fixing stack-located regions' slot offset

Since stack pointers are expected to be declared at one time yet evaluated later, the relative offset that stack pointers use must be adjusted based on the initial stack length vs. the current stack length.

This behavior is encapsulated by the adjustStackLength function:

/**
 * Detect a stack region and modify its `slot` expression to include the
 * appropriate sum or difference based on the machine stack length change
 * since the Cursor was originally created
 */
export function adjustStackLength<R extends Pointer.Region>(
  region: R,
  stackLengthChange: bigint
): R {
  if (Pointer.Region.isStack(region)) {
    // Negate the bigint before converting to hex; writing
    // `-stackLengthChange.toString(16)` would negate the string instead.
    const slot: Pointer.Expression = stackLengthChange === 0n
      ? region.slot
      : stackLengthChange > 0n
        ? { $sum: [region.slot, `0x${stackLengthChange.toString(16)}`] }
        : { $difference: [region.slot, `0x${(-stackLengthChange).toString(16)}`] };

    return {
      ...region,
      slot
    };
  }

  return region;
}
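To make the adjustment concrete, here is a minimal self-contained sketch. The `Expression` and `StackRegion` types and the `adjustSlot` helper are simplified stand-ins invented for this illustration, not the types exported by @ethdebug/pointers; only the sum/difference branching is meant to match the function above.

```typescript
// Simplified stand-ins for the library's types (hypothetical, illustration only).
type Expression =
  | number
  | string
  | { $sum: Expression[] }
  | { $difference: [Expression, Expression] };

interface StackRegion {
  location: "stack";
  slot: Expression;
}

// Wrap the slot in a $sum or $difference depending on the sign of the change.
function adjustSlot(region: StackRegion, stackLengthChange: bigint): StackRegion {
  if (stackLengthChange === 0n) {
    return region; // nothing to adjust
  }

  const slot: Expression = stackLengthChange > 0n
    ? { $sum: [region.slot, `0x${stackLengthChange.toString(16)}`] }
    // Parenthesize the negation so it applies to the bigint, not the string.
    : { $difference: [region.slot, `0x${(-stackLengthChange).toString(16)}`] };

  return { ...region, slot };
}

// Declared when the value sat at the top of the stack (slot 0);
// two values have been pushed since.
const declared: StackRegion = { location: "stack", slot: 0 };
const afterPushes = adjustSlot(declared, 2n);
console.log(JSON.stringify(afterPushes));
// {"location":"stack","slot":{"$sum":[0,"0x2"]}}

// Declared at slot 12; the stack has shrunk by ten values since.
const deeper: StackRegion = { location: "stack", slot: 12 };
const afterPops = adjustSlot(deeper, -10n);
console.log(JSON.stringify(afterPops));
// {"location":"stack","slot":{"$difference":[12,"0xa"]}}
```

Note that the slot is not computed here: it is only wrapped in a new expression, which is evaluated later along with the region's other properties.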

Evaluating region property expressions

The more substantial aspect of making a region concrete, however, is the process by which this implementation evaluates each of the Pointer.Region's expression properties and converts them into their Data values.

This process would be very straightforward, except that pointer expressions may reference the region in which they are specified by use of the special region identifier "$this".

Fortunately, the schema does not allow any kind of circular reference, so a more robust implementation could pre-process a region's properties to detect cycles and determine the evaluation order for each property based on which property references which other property. That is, a robust implementation might take this pointer:

{
  "location": "memory",
  "offset": {
    "$sum": [
      "0x60",
      { ".length": "$this" }
    ]
  },
  "length": "$wordsize"
}

... and detect that it must evaluate length before evaluating offset.
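As a rough sketch of what such pre-processing might look like (this is not part of the reference implementation), the following collects each property's `$this` lookups and orders the properties with a depth-first topological sort. The `Expression` type and the `thisReferences` and `evaluationOrder` helpers are hypothetical simplifications; the sketch assumes property lookups always take the form `{ ".name": "$this" }` and ignores every other kind of reference.

```typescript
// Hypothetical, simplified expression type: literals, variables, arithmetic
// operations, and property lookups such as { ".length": "$this" }.
type Expression = number | string | Expression[] | { [key: string]: Expression };

type RegionProperties = { [property: string]: Expression };

// Collect the names of the $this properties that an expression reads.
function thisReferences(expression: Expression): string[] {
  if (typeof expression !== "object") {
    return []; // literals and variable names reference nothing
  }
  if (Array.isArray(expression)) {
    return expression.flatMap(thisReferences);
  }
  return Object.entries(expression).flatMap(([key, value]) =>
    key.startsWith(".") && value === "$this"
      ? [key.slice(1)]
      : thisReferences(value)
  );
}

// Depth-first topological sort; throws if the references form a cycle.
function evaluationOrder(properties: RegionProperties): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (property: string): void => {
    if (state.get(property) === "done") return;
    if (state.get(property) === "visiting") {
      throw new Error(`Circular reference detected: $this.${property}`);
    }
    state.set(property, "visiting");
    for (const dependency of thisReferences(properties[property])) {
      // Lookups of properties the region lacks are left for evaluation to report.
      if (dependency in properties) visit(dependency);
    }
    state.set(property, "done");
    order.push(property);
  };

  Object.keys(properties).forEach(visit);
  return order;
}

// Only the expression-valued properties of the pointer above are passed in.
const order = evaluationOrder({
  offset: { $sum: ["0x60", { ".length": "$this" }] },
  length: "$wordsize"
});
console.log(JSON.stringify(order)); // ["length","offset"]
```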

The @ethdebug/pointers reference implementation does not do any such smart thing. Instead, it pushes each of the three possible expression properties ("slot", "offset", and "length") into a queue, and then proceeds to evaluate properties from the queue one at a time.

When evaluating a particular property, if evaluate() fails because the expression reads a "$this" property that has no value yet, the implementation adds the property to the end of the queue to try again later and counts how many times that property has failed. Because there are at most 3 such properties, a chain of "$this" references is at most 2 long, so a property that fails again after already being re-queued 3 times can only be part of a circular reference.

/**
 * Evaluate all Pointer.Expression-value properties on a given region
 *
 * Due to the availability of `$this` as a builtin allowable by the schema,
 * this function evaluates each property as part of a queue. If a property's
 * expression fails to evaluate due to a missing reference, the property is
 * added to the end of the queue.
 *
 * Circular dependencies are detected naïvely by counting evaluation attempts
 * for each property (since the maximum length of a chain of $this references
 * within a single region is one less than the number of properties that
 * require evaluation). Exceeding this many attempts indicates circularity.
 */
export async function evaluateRegion<R extends Pointer.Region>(
  region: R,
  options: EvaluateOptions
): Promise<Cursor.Region<R>> {
  const evaluatedProperties: {
    [K in keyof R]?: Data
  } = {};
  const propertyAttempts: {
    [K in keyof R]?: number
  } = {};

  const partialRegion: Cursor.Region<R> = new Proxy(
    { ...region } as Cursor.Region<R>,
    {
      get(target, property) {
        if (property in evaluatedProperties) {
          return evaluatedProperties[property as keyof R];
        }
        throw new Error(`Property not evaluated yet: $this.${property.toString()}`);
      },
    }
  );

  const propertiesRequiringEvaluation = ["slot", "offset", "length"] as const;

  const expressionQueue: [keyof R, Pointer.Expression][] =
    propertiesRequiringEvaluation
      .filter(property => property in region)
      .map(
        property => [property, region[property as keyof R]]
      ) as [keyof R, Pointer.Expression][];

  while (expressionQueue.length > 0) {
    const [property, expression] = expressionQueue.shift()!;

    try {
      const data = await evaluate(expression, {
        ...options,
        regions: {
          ...options.regions,
          $this: partialRegion,
        },
      });

      evaluatedProperties[property as keyof R] = data;
    } catch (error) {
      if (
        error instanceof Error &&
        error.message.startsWith("Property not evaluated yet: $this.")
      ) {
        const attempts = propertyAttempts[property] || 0;
        // fields may reference each other, but the chain of references
        // should not exceed the number of fields minus 1
        if (attempts > propertiesRequiringEvaluation.length - 1) {
          throw new Error(`Circular reference detected: $this.${property.toString()}`);
        }

        propertyAttempts[property] = attempts + 1;
        expressionQueue.push([property, expression]);
      } else {
        throw error;
      }
    }
  }

  return {
    ...region,
    ...evaluatedProperties,
  } as Cursor.Region<R>;
}
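To watch the retry queue at work without the rest of the library, here is a toy model of the same loop. Everything in it is an assumption made for illustration: properties are plain functions of the partially evaluated region (the hypothetical `Thunk` type) rather than Pointer.Expression objects, evaluation is synchronous, and `evaluateAll` records a trace of each attempt.

```typescript
// Toy model (hypothetical): each property is a function of the partially
// evaluated region rather than a Pointer.Expression.
type Thunk = (self: Record<string, bigint>) => bigint;

const NOT_READY = "Property not evaluated yet: $this.";

function evaluateAll(properties: Record<string, Thunk>): {
  values: Record<string, bigint>;
  trace: string[];
} {
  const values: Record<string, bigint> = {};
  const attempts: Record<string, number> = {};
  const trace: string[] = [];

  // Reading a property that has no value yet throws, as partialRegion does.
  const self = new Proxy(values, {
    get(target, property) {
      if (typeof property === "string" && property in target) {
        return target[property];
      }
      throw new Error(`${NOT_READY}${String(property)}`);
    }
  });

  const names = Object.keys(properties);
  const queue = [...names];

  while (queue.length > 0) {
    const name = queue.shift()!;
    try {
      values[name] = properties[name](self);
      trace.push(`${name}: ok`);
    } catch (error) {
      if (!(error instanceof Error) || !error.message.startsWith(NOT_READY)) {
        throw error; // unrelated failure: do not retry
      }
      const failures = attempts[name] ?? 0;
      if (failures > names.length - 1) {
        throw new Error(`Circular reference detected: $this.${name}`);
      }
      attempts[name] = failures + 1;
      queue.push(name); // try again after the other properties
      trace.push(`${name}: retry`);
    }
  }

  return { values, trace };
}

const { values, trace } = evaluateAll({
  offset: self => 0x60n + self.length, // depends on length
  length: () => 32n
});
console.log(trace.join(", ")); // offset: retry, length: ok, offset: ok
console.log(values.offset); // 128n
```

Two properties that read each other never succeed, so the loop keeps re-queueing them until the attempt limit is exceeded and the circular-reference error is thrown.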