Evaluating pointer expressions
Expression evaluation is somewhat more involved than reading raw region data, but it becomes relatively straightforward when variable and region references are pre-evaluated:
export interface EvaluateOptions {
state: Machine.State;
regions: {
[identifier: string]: Cursor.Region;
};
variables: {
[identifier: string]: Data;
};
}
The main evaluate() function uses type guards to dispatch to the appropriate specific logic based on the kind of expression:
Source code of evaluate(expression: Pointer.Expression, options: EvaluateOptions)
export async function evaluate(
expression: Pointer.Expression,
options: EvaluateOptions
): Promise<Data> {
if (Pointer.Expression.isLiteral(expression)) {
return evaluateLiteral(expression);
}
if (Pointer.Expression.isConstant(expression)) {
return evaluateConstant(expression);
}
if (Pointer.Expression.isVariable(expression)) {
return evaluateVariable(expression, options);
}
if (Pointer.Expression.isArithmetic(expression)) {
if (Pointer.Expression.Arithmetic.isSum(expression)) {
return evaluateArithmeticSum(expression, options);
}
if (Pointer.Expression.Arithmetic.isDifference(expression)) {
return evaluateArithmeticDifference(expression, options);
}
if (Pointer.Expression.Arithmetic.isProduct(expression)) {
return evaluateArithmeticProduct(expression, options);
}
if (Pointer.Expression.Arithmetic.isQuotient(expression)) {
return evaluateArithmeticQuotient(expression, options);
}
if (Pointer.Expression.Arithmetic.isRemainder(expression)) {
return evaluateArithmeticRemainder(expression, options);
}
}
if (Pointer.Expression.isKeccak256(expression)) {
return evaluateKeccak256(expression, options);
}
if (Pointer.Expression.isResize(expression)) {
return evaluateResize(expression, options);
}
if (Pointer.Expression.isLookup(expression)) {
if (Pointer.Expression.Lookup.isOffset(expression)) {
return evaluateLookup(".offset", expression, options);
}
if (Pointer.Expression.Lookup.isLength(expression)) {
return evaluateLookup(".length", expression, options);
}
if (Pointer.Expression.Lookup.isSlot(expression)) {
return evaluateLookup(".slot", expression, options);
}
}
if (Pointer.Expression.isRead(expression)) {
return evaluateRead(expression, options);
}
throw new Error(
`Unexpected runtime failure to recognize kind of expression: ${
JSON.stringify(expression)
}`
);
}
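To make the dispatch pattern concrete, here is a minimal, self-contained sketch. It is not the reference implementation: the reduced Expression type, the evaluateSketch name, the use of bigint values in place of Data, and the synchronous signature are all assumptions made for illustration.

```typescript
// Hypothetical, reduced expression type: the real Pointer.Expression has more kinds.
type Sum = { $sum: Expression[] };
type Product = { $product: Expression[] };
type Expression = number | string | Sum | Product;

// User-defined type guards: a `true` result narrows the expression's type.
const isSum = (e: Expression): e is Sum =>
  typeof e === "object" && "$sum" in e;
const isProduct = (e: Expression): e is Product =>
  typeof e === "object" && "$product" in e;

// Synchronous stand-in for evaluate(); values are bigints instead of Data.
function evaluateSketch(
  expression: Expression,
  variables: { [identifier: string]: bigint }
): bigint {
  if (typeof expression === "number") {
    return BigInt(expression); // numeric literal
  }
  if (typeof expression === "string") {
    if (/^0x[0-9a-fA-F]+$/.test(expression)) {
      return BigInt(expression); // hex string literal
    }
    if (expression === "$wordsize") {
      return 32n; // constant (0x20)
    }
    const value = variables[expression]; // anything else is a variable name
    if (typeof value === "undefined") {
      throw new Error(`Unknown variable with identifier ${expression}`);
    }
    return value;
  }
  if (isSum(expression)) {
    return expression.$sum.reduce<bigint>(
      (sum, operand) => sum + evaluateSketch(operand, variables),
      0n
    );
  }
  if (isProduct(expression)) {
    return expression.$product.reduce<bigint>(
      (product, operand) => product * evaluateSketch(operand, variables),
      1n
    );
  }
  throw new Error(`Unrecognized expression: ${JSON.stringify(expression)}`);
}

// 1 + 2 + (32 * 3)
const result = evaluateSketch(
  { $sum: [1, "0x02", { $product: ["$wordsize", "n"] }] },
  { n: 3n }
);
```

Each guard narrows the union so that the matching branch can access kind-specific fields such as $sum without casts, and the final throw catches any input that no guard recognizes.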
Evaluating constants, literals, and variables
Evaluating constant expressions is quite straightforward:
async function evaluateConstant(
constant: Pointer.Expression.Constant
): Promise<Data> {
switch (constant) {
case "$wordsize":
return Data.fromHex("0x20");
}
}
Evaluating literals involves detecting whether the literal is a hex string or a number and converting it to bytes accordingly:
async function evaluateLiteral(
literal: Pointer.Expression.Literal
): Promise<Data> {
switch (typeof literal) {
case "string":
return Data.fromHex(literal);
case "number":
return Data.fromNumber(literal);
}
}
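The two conversions can be sketched without the Data class. The helpers below (bytesFromHex, bytesFromNumber, literalToBytes) are hypothetical stand-ins, and their exact byte widths are assumptions rather than documented behavior of Data.fromHex() or Data.fromNumber().

```typescript
// Hypothetical helper standing in for Data.fromHex(): parses an optional "0x"
// prefix and keeps any leading zero bytes written in the literal.
function bytesFromHex(hex: string): Uint8Array {
  const digits = hex.replace(/^0x/, "");
  const even = digits.length % 2 === 0 ? digits : "0" + digits;
  const bytes = new Uint8Array(even.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(even.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

// Hypothetical helper standing in for Data.fromNumber(): uses the fewest bytes
// that can hold the value (an assumption; zero becomes a single 0x00 byte here).
function bytesFromNumber(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new Error(`Expected a non-negative safe integer, got ${value}`);
  }
  return bytesFromHex(value.toString(16));
}

// Dispatch on the JavaScript type of the literal, as evaluateLiteral() does.
function literalToBytes(literal: string | number): Uint8Array {
  return typeof literal === "string"
    ? bytesFromHex(literal)
    : bytesFromNumber(literal);
}

const fromHexLiteral = literalToBytes("0x0001"); // 2 bytes: 00 01
const fromNumberLiteral = literalToBytes(1); // 1 byte: 01
```

Under these assumptions a hex literal carries its written width while a number literal does not, which matters later because arithmetic results are padded to the width of the widest operand.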
Variable lookups, of course, require consulting the variables
map passed
in EvaluateOptions
:
async function evaluateVariable(
identifier: Pointer.Expression.Variable,
{ variables }: EvaluateOptions
): Promise<Data> {
const data = variables[identifier];
if (typeof data === "undefined") {
throw new Error(`Unknown variable with identifier ${identifier}`);
}
return data;
}
Evaluating arithmetic operations
Doing arithmetic operations follows the logic one might expect: recurse on the operands of the expression and join the results appropriately. Note the slight differences in implementation for operations that accept any number of operands (sums, products), vs. operations that only accept two operands (differences, quotients, remainders).
Evaluating sums:
async function evaluateArithmeticSum(
expression: Pointer.Expression.Arithmetic.Sum,
options: EvaluateOptions
): Promise<Data> {
const operands = await Promise.all(expression.$sum.map(
async expression => await evaluate(expression, options)
));
const maxLength = operands
.reduce((max, { length }) => length > max ? length : max, 0);
const data = Data
.fromUint(operands.reduce((sum, data) => sum + data.asUint(), 0n))
.padUntilAtLeast(maxLength);
return data;
}
Evaluating products:
async function evaluateArithmeticProduct(
expression: Pointer.Expression.Arithmetic.Product,
options: EvaluateOptions
): Promise<Data> {
const operands = await Promise.all(expression.$product.map(
async expression => await evaluate(expression, options)
));
const maxLength = operands
.reduce((max, { length }) => length > max ? length : max, 0);
return Data
.fromUint(operands.reduce((product, data) => product * data.asUint(), 1n))
.padUntilAtLeast(maxLength);
}
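The width handling shared by sums and products can be illustrated with plain byte arrays. In this sketch, asUint, fromUint, and padUntilAtLeast are hypothetical re-creations of the Data methods with the same names; big-endian byte order and left-padding with zeros are assumptions.

```typescript
// Interpret bytes as a big-endian unsigned integer (assumed byte order).
function asUint(bytes: Uint8Array): bigint {
  let value = 0n;
  for (let i = 0; i < bytes.length; i++) {
    value = value * 256n + BigInt(bytes[i]);
  }
  return value;
}

// Encode an unsigned integer using as few bytes as possible.
function fromUint(value: bigint): Uint8Array {
  const out: number[] = [];
  for (let rest = value; rest > 0n; rest /= 256n) {
    out.unshift(Number(rest % 256n));
  }
  return new Uint8Array(out);
}

// Left-pad with zeros up to `length`; never truncates longer inputs.
function padUntilAtLeast(bytes: Uint8Array, length: number): Uint8Array {
  if (bytes.length >= length) {
    return bytes;
  }
  const padded = new Uint8Array(length);
  padded.set(bytes, length - bytes.length);
  return padded;
}

// Same shape as evaluateArithmeticSum(), minus the recursive evaluation.
function sumSketch(operands: Uint8Array[]): Uint8Array {
  const maxLength = operands.reduce<number>(
    (max, { length }) => (length > max ? length : max),
    0
  );
  const total = operands.reduce<bigint>((sum, data) => sum + asUint(data), 0n);
  return padUntilAtLeast(fromUint(total), maxLength);
}

// A 4-byte operand plus a 1-byte operand keeps the 4-byte width.
const widened = sumSketch([new Uint8Array([0, 0, 0, 1]), new Uint8Array([2])]);
// A carry can make the result wider than every operand: 0xff + 0x01 = 0x0100.
const carried = sumSketch([new Uint8Array([0xff]), new Uint8Array([0x01])]);
```

The padding only sets a minimum width, so results that overflow the widest operand grow instead of wrapping.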
Evaluating differences:
async function evaluateArithmeticDifference(
expression: Pointer.Expression.Arithmetic.Difference,
options: EvaluateOptions
): Promise<Data> {
const [a, b] = await Promise.all(expression.$difference.map(
async expression => await evaluate(expression, options)
));
const maxLength = a.length > b.length ? a.length : b.length;
const unpadded = a.asUint() > b.asUint()
? Data.fromUint(a.asUint() - b.asUint())
: Data.fromNumber(0);
const data = unpadded.padUntilAtLeast(maxLength);
return data;
}
Note that this function stays within unsigned values by clamping the result at 0 whenever b is greater than or equal to a.
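As a small illustration of that clamping, the sketch below contrasts it with 256-bit wrapping subtraction, which is how the EVM's SUB opcode behaves. Both function names are hypothetical, and the wrapping variant is shown only for comparison; it is not part of this implementation.

```typescript
// Clamped (saturating) subtraction, matching the $difference logic above.
function saturatingDifference(a: bigint, b: bigint): bigint {
  return a > b ? a - b : 0n;
}

// For contrast only: subtraction modulo 2^256.
const WORD_MODULUS = 1n << 256n;
function wrappingDifference(a: bigint, b: bigint): bigint {
  return (((a - b) % WORD_MODULUS) + WORD_MODULUS) % WORD_MODULUS;
}

const clamped = saturatingDifference(3n, 5n); // 0n
const wrapped = wrappingDifference(3n, 5n); // 2^256 - 2
```

A pointer that subtracts a larger value from a smaller one therefore evaluates to zero here, not to a very large number.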
Evaluating quotients:
async function evaluateArithmeticQuotient(
expression: Pointer.Expression.Arithmetic.Quotient,
options: EvaluateOptions
): Promise<Data> {
const [a, b] = await Promise.all(expression.$quotient.map(
async expression => (await evaluate(expression, options))
));
const maxLength = a.length > b.length ? a.length : b.length;
const data = Data
.fromUint(a.asUint() / b.asUint())
.padUntilAtLeast(maxLength);
return data;
}
(Quotients of course use integer division only.)
Evaluating remainders:
async function evaluateArithmeticRemainder(
expression: Pointer.Expression.Arithmetic.Remainder,
options: EvaluateOptions
): Promise<Data> {
const [a, b] = await Promise.all(expression.$remainder.map(
async expression => await evaluate(expression, options)
));
const maxLength = a.length > b.length ? a.length : b.length;
const data = Data
.fromUint(a.asUint() % b.asUint())
.padUntilAtLeast(maxLength);
return data;
}
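One edge case worth keeping in mind: JavaScript bigint division and remainder throw a RangeError when the divisor is 0n, and the two functions above do not check for that, so the error would propagate unchanged. The following sketch shows one possible guard; the quotientAndRemainder name and the error message are hypothetical and not part of the reference implementation.

```typescript
// Hypothetical guard around bigint division and remainder.
function quotientAndRemainder(
  a: bigint,
  b: bigint
): { quotient: bigint; remainder: bigint } {
  if (b === 0n) {
    throw new Error("Pointer expression divides by zero");
  }
  // For unsigned operands, bigint division discards the fractional part.
  return { quotient: a / b, remainder: a % b };
}

const { quotient, remainder } = quotientAndRemainder(7n, 2n); // 3n and 1n
```

Whether a zero divisor should raise an error or evaluate to some defined value is a decision for the implementation; this sketch only makes the failure explicit.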
Evaluating resize expressions
This schema provides the { "$sized<N>": <expression> } construct to allow explicitly resizing a subexpression. This implementation uses the Data.prototype.resizeTo() method to perform this operation.
async function evaluateResize(
expression: Pointer.Expression.Resize,
options: EvaluateOptions
): Promise<Data> {
const [[operation, subexpression]] = Object.entries(expression);
const newLength = Pointer.Expression.Resize.isToNumber(expression)
? Number(operation.match(/^\$sized([1-9]+[0-9]*)$/)![1])
: 32;
return (await evaluate(subexpression, options)).resizeTo(newLength);
}
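The two steps involved, extracting N from the operation key and resizing the bytes, can be sketched separately. Both helpers below are hypothetical. In particular, the resizing behavior (left-padding with zeros when growing, keeping the rightmost bytes when shrinking) is an assumption about what Data.prototype.resizeTo() does, not something stated in this section.

```typescript
// Extract N from a "$sized<N>" key; returns undefined for any other key.
function parseSizedLength(operation: string): number | undefined {
  const match = operation.match(/^\$sized([1-9][0-9]*)$/);
  return match ? Number(match[1]) : undefined;
}

// Assumed resize semantics: treat bytes as a big-endian number, so growing
// adds zeros on the left and shrinking drops the leftmost bytes.
function resizeToSketch(bytes: Uint8Array, length: number): Uint8Array {
  const resized = new Uint8Array(length);
  if (bytes.length <= length) {
    resized.set(bytes, length - bytes.length);
  } else {
    resized.set(bytes.subarray(bytes.length - length));
  }
  return resized;
}

const newLength = parseSizedLength("$sized4"); // 4
const grown = resizeToSketch(new Uint8Array([0x12, 0x34]), 4); // 00 00 12 34
const shrunk = resizeToSketch(new Uint8Array([0x12, 0x34, 0x56]), 2); // 34 56
```

Unlike the padding used by arithmetic operations, a resize can discard data, so it is the tool to reach for when a result must have an exact width.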
Evaluating keccak256 hashes
Many data types in storage are addressed by way of keccak256 hashing. This process is somewhat non-trivial because the byte width of the inputs and the process for concatenating them must match compiler behavior exactly.
See Solidity's Layout of State Variables in Storage documentation for an example of how one high-level EVM language makes heavy use of hashing to allocate persistent data.
async function evaluateKeccak256(
expression: Pointer.Expression.Keccak256,
options: EvaluateOptions
): Promise<Data> {
const operands = await Promise.all(expression.$keccak256.map(
async expression => await evaluate(expression, options)
));
const preimage = Data.zero().concat(...operands);
const hash = Data.fromBytes(keccak256(preimage));
return hash;
}
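To see why operand widths matter, consider how Solidity locates a mapping value: it hashes the key, padded to 32 bytes for value types, concatenated with the 32-byte slot of the mapping. The sketch below builds only the preimage and leaves out the hash itself, since keccak256 requires a third-party library. The helper names are hypothetical.

```typescript
// Left-pad bytes with zeros to an exact width (inputs are assumed to fit).
function leftPad(bytes: Uint8Array, length: number): Uint8Array {
  const padded = new Uint8Array(length);
  padded.set(bytes, length - bytes.length);
  return padded;
}

// Concatenate byte arrays in order, as the preimage construction above does.
function concatBytes(...parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce<number>((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

// Preimage for a Solidity mapping entry with a value-type key: key, then slot.
function mappingPreimage(key: Uint8Array, slot: Uint8Array): Uint8Array {
  return concatBytes(leftPad(key, 32), leftPad(slot, 32));
}

const key = new Uint8Array([0x2a]);
const slot = new Uint8Array([0x01]);
const padded = mappingPreimage(key, slot); // 64 bytes
const unpadded = concatBytes(key, slot); // 2 bytes: hashes to a different value
```

Because evaluateKeccak256() concatenates operands exactly as evaluated, a pointer author would need to resize each operand explicitly to obtain the 64-byte form.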
Evaluating property lookups
Pointer expressions can compose values taken from the properties of other, named regions. This not only provides a convenient way to avoid duplication when writing pointer expressions, but is also necessary for types with particularly complex data allocations.
Currently, the specification defines lookup operations for three properties: offset, length, and slot. Runtime checks are required to prevent accessing properties that aren't available on the target region (e.g. memory regions do not contain a slot property).
Since all of these lookups function in the same way, this reference implementation needs only a single evaluateLookup<O extends "slot" | "offset" | "length"> function:
async function evaluateLookup<O extends Pointer.Expression.Lookup.Operation>(
operation: O,
lookup: Pointer.Expression.Lookup.ForOperation<O>,
options: EvaluateOptions
): Promise<Data> {
const { regions } = options;
const identifier = lookup[operation];
const region = regions[identifier];
if (!region) {
throw new Error(`Region not found: ${identifier}`);
}
const property = Pointer.Expression.Lookup.propertyFrom(operation);
const data = region[property as keyof typeof region] as Data | undefined;
if (typeof data === "undefined") {
throw new Error(
`Region named ${identifier} does not have ${property} needed by lookup`
);
}
return data;
}
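The runtime checks can be exercised with a reduced model of regions. The RegionSketch type, the lookupSketch function, and the sample region names below are hypothetical, and property values are bigints here where the implementation uses Data.

```typescript
type Operation = ".offset" | ".length" | ".slot";
type Property = "offset" | "length" | "slot";

// Hypothetical region shape: every lookup property is optional.
interface RegionSketch {
  location: string;
  offset?: bigint;
  length?: bigint;
  slot?: bigint;
}

// Strip the leading "." to get the property name.
function propertyFrom(operation: Operation): Property {
  return operation.slice(1) as Property;
}

function lookupSketch(
  regions: { [identifier: string]: RegionSketch },
  operation: Operation,
  identifier: string
): bigint {
  const region = regions[identifier];
  if (!region) {
    throw new Error(`Region not found: ${identifier}`);
  }
  const property = propertyFrom(operation);
  const value = region[property];
  // Compare against undefined, not falsiness: 0n is a valid slot or offset.
  if (typeof value === "undefined") {
    throw new Error(
      `Region named ${identifier} does not have ${property} needed by lookup`
    );
  }
  return value;
}

const regions: { [identifier: string]: RegionSketch } = {
  buffer: { location: "memory", offset: 0x80n, length: 0x20n },
  counter: { location: "storage", slot: 0n }
};

const bufferOffset = lookupSketch(regions, ".offset", "buffer"); // 128n
const counterSlot = lookupSketch(regions, ".slot", "counter"); // 0n
```

Calling lookupSketch(regions, ".slot", "buffer") throws, because the memory region in this model has no slot.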
(The use of generic types here serves mostly to appease the type-checker; the minimal type safety it affords is insignificant compared to runtime data consistency concerns, which hopefully the implementation makes clear via its use of runtime definedness checks.)
Evaluating machine state reads
Finally, the last kind of expression defined by this specification is for reading raw data from the machine state. A Pointer.Expression.Read should evaluate to the raw bytes stored at runtime in the region identified by a particular name.
Thanks to evaluate()'s requirement that its input regions-by-name map contains only concrete Cursor.Region objects, and by leveraging the existing read() functionality, this function presents no surprises:
async function evaluateRead(
expression: Pointer.Expression.Read,
options: EvaluateOptions
): Promise<Data> {
const { regions } = options;
const identifier = expression.$read;
const region = regions[identifier];
if (!region) {
throw new Error(`Region not found: ${identifier}`);
}
return await read(region, options);
}
Note on "$this" region lookups
Astute readers might notice that these docs have not yet mentioned how to implement support for expressions that reference the region in which they are defined, a mechanism the schema permits via the special region name identifier "$this".
Performing read operations against the "$this" region is meaningless, since this schema does not afford any mechanism for defining regions recursively down to a base case (or similar composition). Thus, the only syntactic construct for self-referential reads resembles, e.g., defining a storage region whose slot is { $read: "$this" }. Evaluating this slot would require knowing the slot before knowing where to read, and knowing the slot requires knowing the machine value, ad nauseam.
Property lookup expressions, on the other hand, are completely acceptable, provided they do not include circular references of any cycle length.
Since the evaluate<.*>() functions here are written to accept only one expression at a time, this reference implementation relegates this concern to a higher-level module; proper use of evaluate() here requires its options.regions map to include a pre-evaluated (albeit partial) "$this" region.
The logic for creating "$this" regions and calling evaluate() correctly is described in the section pertaining to that area of the code. Be forewarned that this reference implementation takes a naïve trial-and-error approach for determining property evaluation order; implementations requiring a more robust strategy will need to do some amount of pre-processing.
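For intuition only, here is one way a trial-and-error ordering could look. This is a sketch under assumptions, not the logic used by the reference implementation: the PropertyDefinition type, the need() helper, and resolveProperties() are all hypothetical, and each property is modeled as a function of the partially resolved "$this" region.

```typescript
type Resolved = { [property: string]: bigint };
type PropertyDefinition = (self: Resolved) => bigint;

// Read an already-resolved property of "$this", or fail if it isn't ready yet.
function need(self: Resolved, property: string): bigint {
  const value = self[property];
  if (typeof value === "undefined") {
    throw new Error(`Property ${property} is not yet available`);
  }
  return value;
}

// Repeatedly attempt every pending property; stop once a full pass makes no
// progress, which indicates a circular (or otherwise unresolvable) reference.
function resolveProperties(
  definitions: { [property: string]: PropertyDefinition }
): Resolved {
  const resolved: Resolved = {};
  let pending = Object.keys(definitions);
  while (pending.length > 0) {
    const stillPending: string[] = [];
    for (const property of pending) {
      try {
        resolved[property] = definitions[property](resolved);
      } catch {
        // Naive: any failure is treated as "try again on a later pass".
        stillPending.push(property);
      }
    }
    if (stillPending.length === pending.length) {
      throw new Error(`Unresolvable properties: ${stillPending.join(", ")}`);
    }
    pending = stillPending;
  }
  return resolved;
}

// "length" depends on "offset", yet is listed first; a second pass resolves it.
const thisRegion = resolveProperties({
  length: self => need(self, "offset") * 2n,
  offset: () => 0x40n
});
```

A pre-processing approach would instead inspect each expression for "$this" lookups up front and order the properties by dependency, reporting cycles before any evaluation happens.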