Evaluating pointer expressions

Expression evaluation is a bit more interesting than reading raw region data. Even so, it becomes relatively straightforward once variable and region references are pre-evaluated:

export interface EvaluateOptions {
  state: Machine.State;
  regions: {
    [identifier: string]: Cursor.Region;
  };
  variables: {
    [identifier: string]: Data;
  };
}

The main evaluate() function uses type guards to dispatch to the appropriate specific logic based on the kind of expression:

Source code of evaluate(expression: Pointer.Expression, options: EvaluateOptions)
export async function evaluate(
  expression: Pointer.Expression,
  options: EvaluateOptions
): Promise<Data> {
  if (Pointer.Expression.isLiteral(expression)) {
    return evaluateLiteral(expression);
  }

  if (Pointer.Expression.isConstant(expression)) {
    return evaluateConstant(expression);
  }

  if (Pointer.Expression.isVariable(expression)) {
    return evaluateVariable(expression, options);
  }

  if (Pointer.Expression.isArithmetic(expression)) {
    if (Pointer.Expression.Arithmetic.isSum(expression)) {
      return evaluateArithmeticSum(expression, options);
    }

    if (Pointer.Expression.Arithmetic.isDifference(expression)) {
      return evaluateArithmeticDifference(expression, options);
    }

    if (Pointer.Expression.Arithmetic.isProduct(expression)) {
      return evaluateArithmeticProduct(expression, options);
    }

    if (Pointer.Expression.Arithmetic.isQuotient(expression)) {
      return evaluateArithmeticQuotient(expression, options);
    }

    if (Pointer.Expression.Arithmetic.isRemainder(expression)) {
      return evaluateArithmeticRemainder(expression, options);
    }
  }

  if (Pointer.Expression.isKeccak256(expression)) {
    return evaluateKeccak256(expression, options);
  }

  if (Pointer.Expression.isResize(expression)) {
    return evaluateResize(expression, options);
  }

  if (Pointer.Expression.isLookup(expression)) {
    if (Pointer.Expression.Lookup.isOffset(expression)) {
      return evaluateLookup(".offset", expression, options);
    }

    if (Pointer.Expression.Lookup.isLength(expression)) {
      return evaluateLookup(".length", expression, options);
    }

    if (Pointer.Expression.Lookup.isSlot(expression)) {
      return evaluateLookup(".slot", expression, options);
    }
  }

  if (Pointer.Expression.isRead(expression)) {
    return evaluateRead(expression, options);
  }

  throw new Error(
    `Unexpected runtime failure to recognize kind of expression: ${
      JSON.stringify(expression)
    }`
  );
}
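To make the dispatch pattern concrete, here is a minimal, self-contained sketch of the same idea over a much smaller expression type. Everything in it (the Expression type, the three guards, evaluateSketch(), and the use of plain numbers instead of Data) is hypothetical and exists only for illustration; it is not part of the reference implementation.

```typescript
// Hypothetical, much-reduced expression type: a literal number, a variable
// name, or a sum of subexpressions. The real schema has many more kinds.
type Expression = number | string | { $sum: Expression[] };

// Type guards narrow the union so each branch sees one specific type.
const isLiteral = (e: Expression): e is number => typeof e === "number";
const isVariable = (e: Expression): e is string => typeof e === "string";
const isSum = (e: Expression): e is { $sum: Expression[] } =>
  typeof e === "object" && e !== null && "$sum" in e;

function evaluateSketch(
  expression: Expression,
  variables: { [identifier: string]: number }
): number {
  if (isLiteral(expression)) {
    return expression;
  }

  if (isVariable(expression)) {
    const value = variables[expression];
    if (typeof value === "undefined") {
      throw new Error(`Unknown variable with identifier ${expression}`);
    }
    return value;
  }

  if (isSum(expression)) {
    // Recurse on each operand, then combine the results.
    return expression.$sum
      .map((operand) => evaluateSketch(operand, variables))
      .reduce((sum, value) => sum + value, 0);
  }

  // Unreachable for well-typed input; guards against malformed JSON.
  throw new Error(`Unrecognized expression: ${JSON.stringify(expression)}`);
}

// 32 + 64 + (1 + 2)
const dispatched = evaluateSketch(
  { $sum: [32, "offset", { $sum: [1, 2] }] },
  { offset: 64 }
);
```

The final throw mirrors the one in evaluate(): the type-checker considers it unreachable, but input parsed from JSON at runtime may not match the declared types.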

Evaluating constants, literals, and variables

Evaluating constant expressions is quite straightforward:

async function evaluateConstant(
  constant: Pointer.Expression.Constant
): Promise<Data> {
  switch (constant) {
    case "$wordsize":
      // 0x20 is 32, the number of bytes in a word
      return Data.fromHex("0x20");
  }
}

Evaluating literals involves detecting whether the literal is a hex string or a number and converting it to bytes accordingly:

async function evaluateLiteral(
  literal: Pointer.Expression.Literal
): Promise<Data> {
  switch (typeof literal) {
    case "string":
      return Data.fromHex(literal);
    case "number":
      return Data.fromNumber(literal);
  }
}
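As a rough illustration of the two conversion paths, the sketch below uses hypothetical helpers (bytesFromHex() and bytesFromNumber()) that stand in for Data.fromHex() and Data.fromNumber() and return plain Uint8Array values. Their exact behavior, in particular how odd-length hex strings and zero are handled, is an assumption of this sketch and may differ from the Data class.

```typescript
// Hypothetical stand-in for Data.fromHex(): parse a "0x"-prefixed hex string
// into big-endian bytes.
function bytesFromHex(hex: string): Uint8Array {
  const digits = hex.startsWith("0x") ? hex.slice(2) : hex;
  // Left-pad odd-length strings so that every byte has two hex digits.
  const padded = digits.length % 2 === 0 ? digits : `0${digits}`;
  const bytes = new Uint8Array(padded.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(padded.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

// Hypothetical stand-in for Data.fromNumber(): use the shortest big-endian
// representation of a non-negative integer.
function bytesFromNumber(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new Error(`Expected a non-negative safe integer, got ${value}`);
  }
  return bytesFromHex(value.toString(16));
}

function evaluateLiteralSketch(literal: string | number): Uint8Array {
  switch (typeof literal) {
    case "string":
      return bytesFromHex(literal);
    case "number":
      return bytesFromNumber(literal);
  }
}

const fromHexLiteral = evaluateLiteralSketch("0x0100"); // bytes 0x01 0x00
const fromNumberLiteral = evaluateLiteralSketch(256);   // bytes 0x01 0x00
const explicitWidth = evaluateLiteralSketch("0x0001");  // two bytes
const minimalWidth = evaluateLiteralSketch(1);          // one byte
```

In this sketch a hex literal keeps its leading zeros and so carries an explicit width, while a number literal always gets the narrowest encoding. That difference matters later, because arithmetic results are padded to the width of the widest operand.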

Variable lookups, of course, require consulting the variables map passed in EvaluateOptions:

async function evaluateVariable(
  identifier: Pointer.Expression.Variable,
  { variables }: EvaluateOptions
): Promise<Data> {
  const data = variables[identifier];
  if (typeof data === "undefined") {
    throw new Error(`Unknown variable with identifier ${identifier}`);
  }

  return data;
}

Evaluating arithmetic operations

Doing arithmetic operations follows the logic one might expect: recurse on the operands of the expression and join the results appropriately. Note the slight differences in implementation for operations that accept any number of operands (sums, products), vs. operations that only accept two operands (differences, quotients, remainders).

Evaluating sums:

async function evaluateArithmeticSum(
  expression: Pointer.Expression.Arithmetic.Sum,
  options: EvaluateOptions
): Promise<Data> {
  const operands = await Promise.all(expression.$sum.map(
    async expression => await evaluate(expression, options)
  ));

  // the result is at least as wide as the widest operand
  const maxLength = operands
    .reduce((max, { length }) => length > max ? length : max, 0);

  const data = Data
    .fromUint(operands.reduce((sum, data) => sum + data.asUint(), 0n))
    .padUntilAtLeast(maxLength);

  return data;
}
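The width handling in the sum can be easy to miss, so here is a small sketch of it using plain byte arrays. The helpers toUint(), fromUint(), and padUntilAtLeast() are hypothetical stand-ins for the corresponding Data methods; the sketch assumes big-endian bytes with zero-padding on the left, and it uses plain numbers rather than bigint, so it is only suitable for small values.

```typescript
// Hypothetical byte helpers; the reference implementation uses its Data class
// and bigint values instead of plain numbers.
const toUint = (bytes: Uint8Array): number =>
  bytes.reduce((value, byte) => value * 256 + byte, 0);

function fromUint(value: number): Uint8Array {
  const bytes: number[] = [];
  for (let rest = value; rest > 0; rest = Math.floor(rest / 256)) {
    bytes.unshift(rest % 256);
  }
  return Uint8Array.from(bytes);
}

// Assumption: padding adds zero bytes on the left and never truncates.
function padUntilAtLeast(bytes: Uint8Array, length: number): Uint8Array {
  if (bytes.length >= length) {
    return bytes;
  }
  const padded = new Uint8Array(length);
  padded.set(bytes, length - bytes.length);
  return padded;
}

function sumSketch(operands: Uint8Array[]): Uint8Array {
  const maxLength = Math.max(0, ...operands.map(({ length }) => length));
  const total = operands.reduce((sum, operand) => sum + toUint(operand), 0);
  return padUntilAtLeast(fromUint(total), maxLength);
}

// 0x0001 + 0x02 = 3, padded back out to the two-byte width of the widest operand
const paddedSum = sumSketch([Uint8Array.from([0x00, 0x01]), Uint8Array.from([0x02])]);

// 0xff + 0x01 = 0x0100: the result grows past one byte rather than wrapping
const grownSum = sumSketch([Uint8Array.from([0xff]), Uint8Array.from([0x01])]);
```

Because the padding is "at least" rather than "exactly", a carry out of the widest operand widens the result instead of being discarded.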

Evaluating products:

async function evaluateArithmeticProduct(
  expression: Pointer.Expression.Arithmetic.Product,
  options: EvaluateOptions
): Promise<Data> {
  const operands = await Promise.all(expression.$product.map(
    async expression => await evaluate(expression, options)
  ));

  const maxLength = operands
    .reduce((max, { length }) => length > max ? length : max, 0);

  return Data
    .fromUint(operands.reduce((product, data) => product * data.asUint(), 1n))
    .padUntilAtLeast(maxLength);
}

Evaluating differences:

async function evaluateArithmeticDifference(
  expression: Pointer.Expression.Arithmetic.Difference,
  options: EvaluateOptions
): Promise<Data> {
  const [a, b] = await Promise.all(expression.$difference.map(
    async expression => await evaluate(expression, options)
  ));

  const maxLength = a.length > b.length ? a.length : b.length;

  // clamp at zero rather than producing a negative value
  const unpadded = a.asUint() > b.asUint()
    ? Data.fromUint(a.asUint() - b.asUint())
    : Data.fromNumber(0);

  const data = unpadded.padUntilAtLeast(maxLength);
  return data;
}

Note how this function stays within unsigned values by clamping the result at 0 whenever the second operand is greater than or equal to the first.
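For comparison, the following sketch contrasts this zero-bounded subtraction with 256-bit wrapping subtraction, which is how the EVM's own SUB opcode behaves. The function names are hypothetical, and the sketch works on bare bigint values rather than Data.

```typescript
// Unsigned subtraction that bottoms out at zero, mirroring the comparison in
// evaluateArithmeticDifference().
function saturatingDifference(a: bigint, b: bigint): bigint {
  return a > b ? a - b : 0n;
}

// For contrast only: subtraction that wraps modulo 2^256.
function wrappingDifference(a: bigint, b: bigint): bigint {
  return BigInt.asUintN(256, a - b);
}

const ordinary = saturatingDifference(0x40n, 0x20n); // 0x20n
const clamped = saturatingDifference(0x20n, 0x40n);  // 0n
const wrapped = wrappingDifference(0x20n, 0x40n);    // 2^256 - 0x20
```

A pointer whose difference expression underflows therefore evaluates to zero here, not to a very large wrapped value.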

Evaluating quotients:

async function evaluateArithmeticQuotient(
  expression: Pointer.Expression.Arithmetic.Quotient,
  options: EvaluateOptions
): Promise<Data> {
  const [a, b] = await Promise.all(expression.$quotient.map(
    async expression => await evaluate(expression, options)
  ));

  const maxLength = a.length > b.length ? a.length : b.length;

  const data = Data
    .fromUint(a.asUint() / b.asUint())
    .padUntilAtLeast(maxLength);

  return data;
}

(Quotients of course use integer division only.)

Evaluating remainders:

async function evaluateArithmeticRemainder(
  expression: Pointer.Expression.Arithmetic.Remainder,
  options: EvaluateOptions
): Promise<Data> {
  const [a, b] = await Promise.all(expression.$remainder.map(
    async expression => await evaluate(expression, options)
  ));

  const maxLength = a.length > b.length ? a.length : b.length;

  const data = Data
    .fromUint(a.asUint() % b.asUint())
    .padUntilAtLeast(maxLength);

  return data;
}
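Both the quotient and remainder evaluations rely on JavaScript's bigint operators, which truncate toward zero and throw a RangeError when the divisor is zero. The sketch below shows that behavior along with one possible guard; divideChecked() and its error message are hypothetical and not part of the reference implementation.

```typescript
// Plain bigint division, as used by the quotient evaluation above.
function divide(a: bigint, b: bigint): bigint {
  return a / b;
}

// Hypothetical guard that reports a zero divisor with a clearer message.
function divideChecked(
  a: bigint,
  b: bigint
): { quotient: bigint; remainder: bigint } {
  if (b === 0n) {
    throw new Error("Division by zero in pointer expression");
  }
  return { quotient: a / b, remainder: a % b };
}

const { quotient, remainder } = divideChecked(7n, 2n); // 3n and 1n

// Without a guard, a zero divisor surfaces as a RangeError.
let rawError: unknown;
try {
  divide(1n, 0n);
} catch (error) {
  rawError = error;
}
```

Whether a zero divisor should be treated as an error or handled some other way is a decision for the implementer; the code above only demonstrates what the bigint operators do on their own.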

Evaluating resize expressions

This schema provides the { "$sized<N>": <expression> } construct to allow explicitly resizing a subexpression. This implementation uses the Data.prototype.resizeTo() method to perform this operation.

async function evaluateResize(
  expression: Pointer.Expression.Resize,
  options: EvaluateOptions
): Promise<Data> {
  // a resize expression is an object with exactly one key
  const [[operation, subexpression]] = Object.entries(expression);

  const newLength = Pointer.Expression.Resize.isToNumber(expression)
    ? Number(operation.match(/^\$sized([1-9]+[0-9]*)$/)![1])
    : 32;

  return (await evaluate(subexpression, options)).resizeTo(newLength);
}
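The key parsing and the resize itself can be sketched in isolation as follows. parseSizedKey() reuses the regular expression from evaluateResize(); resizeTo() here is a hypothetical stand-in for Data.prototype.resizeTo() and assumes big-endian semantics, meaning it pads with zeros on the left and drops the most significant bytes when shrinking. That assumption is not confirmed by this page.

```typescript
// Extract N from a "$sized<N>" key; returns undefined for any other key.
function parseSizedKey(operation: string): number | undefined {
  const match = operation.match(/^\$sized([1-9]+[0-9]*)$/);
  return match ? Number(match[1]) : undefined;
}

// Hypothetical stand-in for Data.prototype.resizeTo().
// Assumption: left-pad with zeros when growing, keep the rightmost bytes
// when shrinking.
function resizeTo(bytes: Uint8Array, length: number): Uint8Array {
  const resized = new Uint8Array(length);
  if (bytes.length <= length) {
    resized.set(bytes, length - bytes.length);
  } else {
    resized.set(bytes.subarray(bytes.length - length));
  }
  return resized;
}

// Keys without an explicit number fall back to the 32-byte word size.
const truncated = resizeTo(
  Uint8Array.from([0x12, 0x34, 0x56]),
  parseSizedKey("$sized2") ?? 32
); // bytes 0x34 0x56 under the assumption above

const widened = resizeTo(
  Uint8Array.from([0x01]),
  parseSizedKey("$sized4") ?? 32
); // bytes 0x00 0x00 0x00 0x01
```

Note that the regular expression rejects a size of zero and sizes with leading zeros, so "$sized0" and "$sized01" do not parse as numbered resize keys.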

Evaluating keccak256 hashes

Many data types in storage are addressed by way of keccak256 hashing. This process is somewhat non-trivial because the byte width of each input and the order in which the inputs are concatenated must match compiler behavior exactly.

See Solidity's Layout of State Variables in Storage documentation for an example of how one high-level EVM language makes heavy use of hashing to allocate persistent data.

async function evaluateKeccak256(
  expression: Pointer.Expression.Keccak256,
  options: EvaluateOptions
): Promise<Data> {
  const operands = await Promise.all(expression.$keccak256.map(
    async expression => await evaluate(expression, options)
  ));

  // concatenate operands in order, preserving each operand's width
  const preimage = Data.zero().concat(...operands);
  const hash = Data.fromBytes(keccak256(preimage));

  return hash;
}
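To show why operand widths matter, the sketch below builds two preimages from the same values: one from unpadded single bytes and one with each operand padded to a 32-byte word, which is the form Solidity uses for a mapping with a value-type key (key followed by the mapping's slot). The helpers concat() and toWord() and the example values are hypothetical, and the hashing step is left out because keccak256 requires an external library.

```typescript
// Concatenate byte arrays in order.
function concat(...parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((length, part) => length + part.length, 0);
  const result = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    result.set(part, offset);
    offset += part.length;
  }
  return result;
}

// Left-pad a value of at most 32 bytes to a full 32-byte word.
function toWord(bytes: Uint8Array): Uint8Array {
  const word = new Uint8Array(32);
  word.set(bytes, 32 - bytes.length);
  return word;
}

const key = Uint8Array.from([0x05]);  // example mapping key
const slot = Uint8Array.from([0x01]); // example base slot of the mapping

// Two bytes: 0x05 0x01. Hashing this would not match the compiler's layout.
const unpaddedPreimage = concat(key, slot);

// Sixty-four bytes: the key as a word followed by the slot as a word.
const paddedPreimage = concat(toWord(key), toWord(slot));
```

In a pointer expression, this padding would be expressed by wrapping each operand of $keccak256 in a resize expression, so that evaluateKeccak256() itself only ever concatenates.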

Evaluating property lookups

Pointer expressions can compose values taken from the properties of other, named regions. This not only provides a convenient way to avoid duplication when writing pointer expressions, but is also necessary for types with particularly complex data allocations.

Currently, the specification defines lookup operations for three properties: offset, length, and slot. Runtime checks are required to prevent accessing properties that aren't available on the target region (e.g. memory regions do not contain a slot property).

Since all of these lookups function in the same way, this reference implementation needs only a single generic evaluateLookup<O extends Pointer.Expression.Lookup.Operation>() function:

async function evaluateLookup<O extends Pointer.Expression.Lookup.Operation>(
  operation: O,
  lookup: Pointer.Expression.Lookup.ForOperation<O>,
  options: EvaluateOptions
): Promise<Data> {
  const { regions } = options;

  const identifier = lookup[operation];
  const region = regions[identifier];
  if (!region) {
    throw new Error(`Region not found: ${identifier}`);
  }

  const property = Pointer.Expression.Lookup.propertyFrom(operation);

  // not every kind of region defines every property
  const data = region[property as keyof typeof region] as Data | undefined;

  if (typeof data === "undefined") {
    throw new Error(
      `Region named ${identifier} does not have ${property} needed by lookup`
    );
  }

  return data;
}

(The use of generic types here serves mostly to appease the type-checker; the minimal type safety it affords is insignificant compared to runtime data consistency concerns, which hopefully the implementation makes clear via its use of runtime definedness checks.)
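The runtime checks can be seen in action with a simplified sketch. The region shapes below, the use of plain numbers for property values, and the assumption that propertyFrom() simply drops the leading "." from the operation name are all hypothetical simplifications for illustration.

```typescript
// Hypothetical, simplified region shapes. Property values are plain numbers
// here; the reference implementation stores Data.
type MemoryRegion = { location: "memory"; offset: number; length: number };
type StorageRegion = { location: "storage"; slot: number };
type Region = MemoryRegion | StorageRegion;

type Operation = ".offset" | ".length" | ".slot";

function lookupSketch(
  operation: Operation,
  identifier: string,
  regions: { [identifier: string]: Region }
): number {
  const region = regions[identifier];
  if (!region) {
    throw new Error(`Region not found: ${identifier}`);
  }

  // Assumed behavior of propertyFrom(): drop the leading "." from the operation.
  const property = operation.slice(1);

  // The static types cannot tell which properties this region has, so the
  // value is checked at runtime.
  const value = (region as { [key: string]: unknown })[property];
  if (typeof value !== "number") {
    throw new Error(
      `Region named ${identifier} does not have ${property} needed by lookup`
    );
  }

  return value;
}

const exampleRegions: { [identifier: string]: Region } = {
  "array-start": { location: "memory", offset: 128, length: 32 },
  "balance-slot": { location: "storage", slot: 3 }
};

const memoryOffset = lookupSketch(".offset", "array-start", exampleRegions);
const storageSlot = lookupSketch(".slot", "balance-slot", exampleRegions);
// lookupSketch(".slot", "array-start", exampleRegions) would throw, since
// the memory region in this sketch has no slot property.
```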

Evaluating machine state reads

Finally, the last kind of expression defined by this specification is for reading raw data from the machine state. A Pointer.Expression.Read should evaluate to the raw bytes stored at runtime in the region identified by a particular name.

Thanks to evaluate()'s requirement that its input regions-by-name map contains only concrete Cursor.Region objects, and by leveraging the existing read() functionality, this function presents no surprises:

async function evaluateRead(
  expression: Pointer.Expression.Read,
  options: EvaluateOptions
): Promise<Data> {
  const { regions } = options;

  const identifier = expression.$read;
  const region = regions[identifier];
  if (!region) {
    throw new Error(`Region not found: ${identifier}`);
  }

  // read() obtains the machine state from options
  return await read(region, options);
}

Note on "$this" region lookups

Astute readers might notice that, until now, these docs have not mentioned how to implement support for expressions that reference the region in which they are defined, a mechanism the schema permits via the special region name identifier "$this".

Performing read operations against the "$this" region is meaningless since this schema does not afford any mechanism for defining regions recursively down to a base case (or similar composition). Thus, the only syntactic construct for self-referential reads resembles, e.g., defining a storage region whose slot is { $read: "$this" }. Evaluating this slot would require knowing the slot before knowing where to read, and knowing the slot requires knowing the machine value, ad nauseam.

Property lookup expressions, on the other hand, are completely acceptable—provided they do not include circular references of any cycle length.

Since the evaluate<.*>() functions here are written to accept only one expression at a time, this reference implementation relegates this concern to a higher-level module; proper use of evaluate() here requires its options.regions map to include a pre-evaluated (albeit partial) "$this" region.

The logic for creating "$this" regions and calling evaluate() correctly is described in the section pertaining to that area of the code. Be forewarned that this reference implementation takes a naïve trial-and-error approach for determining property evaluation order; implementations requiring a more robust strategy will need to do some amount of pre-processing.
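To give a feel for what a trial-and-error ordering strategy might look like, here is a self-contained sketch. It is not the reference implementation's code: the Expr type, evaluateExpr(), and resolveThis() are hypothetical, property values are plain numbers, and the only kind of reference modeled is a lookup of another property on "$this".

```typescript
// Hypothetical mini-expression: a number, a lookup of another property on
// "$this", or a sum of subexpressions.
type Expr = number | { lookup: string } | { sum: Expr[] };

function evaluateExpr(
  expr: Expr,
  partial: { [property: string]: number }
): number {
  if (typeof expr === "number") {
    return expr;
  }
  if ("lookup" in expr) {
    const value = partial[expr.lookup];
    if (typeof value === "undefined") {
      throw new Error(`"$this" does not yet have ${expr.lookup}`);
    }
    return value;
  }
  return expr.sum
    .map((operand) => evaluateExpr(operand, partial))
    .reduce((total, value) => total + value, 0);
}

// Repeatedly try every unresolved property against the partial region.
// A pass that resolves nothing means the rest can never be resolved.
function resolveThis(
  definitions: { [property: string]: Expr }
): { [property: string]: number } {
  const resolved: { [property: string]: number } = {};
  let pending = Object.keys(definitions);

  while (pending.length > 0) {
    const stillPending: string[] = [];
    for (const property of pending) {
      try {
        resolved[property] = evaluateExpr(definitions[property], resolved);
      } catch {
        // Any failure is treated as "not ready yet" and retried next pass.
        stillPending.push(property);
      }
    }
    if (stillPending.length === pending.length) {
      throw new Error(
        `Circular or unresolvable properties: ${stillPending.join(", ")}`
      );
    }
    pending = stillPending;
  }

  return resolved;
}

// "length" depends on "offset", so it only resolves on the second pass.
const resolvedThis = resolveThis({
  length: { sum: [{ lookup: "offset" }, 32] },
  offset: 64
});
```

This approach needs no dependency analysis, at the cost of re-evaluating failed expressions and of treating every error as a missing dependency. Computing an evaluation order up front, for example with a topological sort over property references, is the kind of pre-processing a more robust strategy would involve.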