Pre-requisites

Basics of Lean theorem proving and understanding of the EStateM Monad. See TODO: Bhavik Mehta's refresher for Lean basics.

With the state definitions out of the way, let's get to an actual theorem statement. The theorem below says that Jolt's sequence of instructions for ADDW leaves the final Sail state component of the Jolt state machine in the same way as the ADDW instruction defined by Sail. In formal theorem proving language, this looks like the following.

theorem jolt_addw_eq_sail (rs2 rs1 rd : regidx) (hrd : rd ≠ regidx.Regidx 0)
    (js : SailJoltState) (hwf : WellFormed js) :
    projectResult ((jolt_addw rs2 rs1 rd).run js) =
    (execute_RTYPEW rs2 rs1 rd ropw.ADDW).run js.sail := by

Now let's take apart this definition.

The Inputs

The bits in between () on Line 87.

rs2 rs1 rd : regidx

Source and destination register indices — all of type regidx which is a wrapper for BitVec 5. These are 5-bit indices into the general-purpose register file. rs1 and rs2 are the two source registers whose values will be added, and rd is the destination register where the result is written.

hrd : rd ≠ regidx.Regidx 0

A proof that the destination register is not x0. In RISC-V, x0 is hardwired to zero — writes to it are discarded. This hypothesis is needed because the Jolt sequence relies on reading back the value written to rd, and if we write to register x0, we always read back 0. To not handle this redundant case explicitly, we just don't allow jolt sequences to write to register 0.

js : SailJoltState

The initial Jolt machine state, containing both the Sail state (js.sail) and the virtual registers (js.vregs).

hwf : WellFormed js

WellFormed is defined as:

def WellFormed (js : SailJoltState) : Prop :=
  ∀ r : regidx, ∃ v, rX_bits r js.sail = .ok v js.sail

It says: for every register index r, reading that register via rX_bits succeeds — it returns .ok v js.sail for some value v, and does not modify the state. Without this, rX_bits could return .error, and the proof would not go through.1 Simply put, this is saying that we are able to read the values of the registers without anything going wrong. Why might it go wrong? See Appendix: Why rX_bits can fail for a detailed explanation. The short answer is that readReg does a hash-map lookup that throws Error.Unreachable if the key is missing. So say you asked to read register x56, which does not exist — it will error. All that WellFormed is really saying is that if you write your instruction correctly, then every general-purpose register is present in the hash map, and reads will go through without issues.

1

Syntax Alert: Note that rX_bits r js.sail is direct function application — rX_bits r produces a SailM (BitVec 64) which is a term of the function type that models a step of the CPU i.e. SailState → Result ..., and js.sail is passed as the argument. This is definitionally equal to (rX_bits r).run js.sail. We can do this here because we are in a Prop, not inside a do block. Inside a do block, state threading is handled implicitly by bind.

The Theorem Statement

    projectResult ((jolt_addw rs2 rs1 rd).run js) =
    (execute_RTYPEW rs2 rs1 rd ropw.ADDW).run js.sail

The Right-Hand Side

The right-hand side just invokes functions exposed by the Sail transpilation. As we treat the transpilation as the ground truth, we will NEVER write custom code to define RISCV operations. The way Sail implements a step of the ADDW instruction is shown in the following code.

(execute_RTYPEW rs2 rs1 rd ropw.ADDW).run js.sail

execute_RTYPEW handles all R-type word (32-bit) instructions. It takes two source registers rs1 and rs2, a destination register rd, and an opcode op, and returns a monadic computation in SailM. The transpiled code:

def execute_RTYPEW (rs2 : regidx) (rs1 : regidx) (rd : regidx) (op : ropw) : SailM ExecutionResult := do
  let rs1_val ← do (pure (Sail.BitVec.extractLsb (← (rX_bits rs1)) 31 0))
  let rs2_val ← do (pure (Sail.BitVec.extractLsb (← (rX_bits rs2)) 31 0))
  let result : (BitVec 32) :=
    match op with
    | ADDW => (rs1_val + rs2_val)
    | SUBW => (rs1_val - rs2_val)
    | SLLW => (shift_bits_left rs1_val (Sail.BitVec.extractLsb rs2_val 4 0))
    | SRLW => (shift_bits_right rs1_val (Sail.BitVec.extractLsb rs2_val 4 0))
    | SRAW => (shift_bits_right_arith rs1_val (Sail.BitVec.extractLsb rs2_val 4 0))
  (wX_bits rd (sign_extend (m := 64) result))
  (pure RETIRE_SUCCESS)

Line 67347:

let rs1_val ← do
  pure (Sail.BitVec.extractLsb (← rX_bits rs1) 31 0)

← rX_bits rs1 reads the full 64-bit value from register rs1. Sail.BitVec.extractLsb ... 31 0 extracts bits 31 down to 0 — the lower 32 bits. So rs1_val is a BitVec 32.

Line 67348:

let rs2_val ← do
  pure (Sail.BitVec.extractLsb (← rX_bits rs2) 31 0)

Same for rs2: read the 64-bit value, extract the lower 32 bits.

Lines 67349–67355:

let result : (BitVec 32) :=
  match op with
  | ADDW => (rs1_val + rs2_val)
  | ...

For ropw.ADDW, the match selects rs1_val + rs2_val — a 32-bit addition of the two truncated values. The result type is BitVec 32.

Line 67356:

wX_bits rd (sign_extend (m := 64) result)

sign_extend (m := 64) sign-extends the 32-bit result back to 64 bits. wX_bits rd writes this value to the destination register rd.

Line 67357: pure RETIRE_SUCCESS — returns RETIRE_SUCCESS as the final value of the computation.

Summary

The do block composes four monadic operations, threading the SailState through sequentially:

  1. Read rs1 — read register rs1 and extract lower 32 bits. State unchanged.
  2. Read rs2 — read register rs2 and extract lower 32 bits. State unchanged.
  3. Write — add the two 32-bit values, sign-extend back to 64, and write to rd. State changes.
  4. Return — return RETIRE_SUCCESS. State unchanged.

Only step 3 mutates the state. Note that the addition is 32-bit — both operands are truncated before adding.

The Left-Hand Side

projectResult ((jolt_addw rs2 rs1 rd).run js)

This is Jolt's version of the same instruction. Jolt decomposes ADDW into two simpler operations: a 64-bit ADD followed by a virtual sign-extend of the lower 32 bits. Here is jolt_addw:

def jolt_addw (rs2 rs1 rd : regidx) : JoltMonad ExecutionResult := do
  let _ ← liftSail (execute_RTYPE rs2 rs1 rd rop.ADD)
  jolt_virtual_sign_extend_word rd
  pure RETIRE_SUCCESS

Note this returns JoltMonad ExecutionResult, not SailM ExecutionResult — it operates on SailJoltState, not SailState. Going line by line, the three steps are:

  1. liftSail (execute_RTYPE rs2 rs1 rd rop.ADD) — runs Sail's 64-bit ADD instruction, lifted into the Jolt monad via liftSail2. The relevant branch of execute_RTYPE:
2

See helpers for details on how liftSail works.

def execute_RTYPE (rs2 : regidx) (rs1 : regidx) (rd : regidx) (op : rop) : SailM ExecutionResult := do
  (wX_bits rd
    (← do
      match op with
      | ADD => (pure ((← (rX_bits rs1)) + (← (rX_bits rs2))))
      ...))
  (pure RETIRE_SUCCESS)

For rop.ADD, this reads both rs1 and rs2 as full 64-bit values, adds them, and writes the 64-bit sum to rd. No truncation happens here — this is a full-width addition.

  1. jolt_virtual_sign_extend_word rd — reads rd, extracts the lower 32 bits, sign-extends back to 64 bits, and writes the result back to rd:
def jolt_virtual_sign_extend_word (rd : regidx) : JoltMonad Unit := do
  let v ← liftSail (rX_bits rd)
  liftSail (wX_bits rd (sign_extend (m := 64) (Sail.BitVec.extractLsb v 31 0)))
  1. pure RETIRE_SUCCESS — returns the success value.

.run js runs this computation on the full SailJoltState (not just js.sail), producing an EStateM.Result (Error exception) SailJoltState ExecutionResult.

Then projectResult strips out the Jolt-specific state (the virtual registers), keeping only the SailState component.

What the theorem says

Both sides produce an EStateM.Result (Error exception) SailState ExecutionResult. The theorem asserts these are equal, which means all three components match:

  1. Same branch — if one returns .ok, so does the other; if one returns .error, so does the other.
  2. Same return value — the ExecutionResult (e.g. RETIRE_SUCCESS) is identical.
  3. Same new state — the resulting SailState is identical.

In other words, Jolt's decomposition of ADDW into ADD + sign-extend is faithful — after projecting away the virtual registers, it leaves the Sail state in exactly the same condition as running the native ADDW instruction directly.

Appendices

Appendix: Why rX_bits can fail

The call stack from rX_bits down to the point of failure:

rX_bits:

def rX_bits (app_0 : regidx) : SailM (BitVec 64) := do
  let .Regidx i := app_0
  (rX (Regno (BitVec.toNatInt i)))

which converts the 5-bit index to a Nat and calls rX. regno is just a Nat wrapper (inductive regno where | Regno (_ : Nat)):

def rX (app_0 : regno) : SailM (BitVec 64) := do
  let .Regno r := app_0
  let v ← (( do
    match r with
    | 0 => (pure zero_reg)
    | 1 => readReg x1
    | 2 => readReg x2
    -- ... cases 330 ...
    | _ => readReg x31 ) : SailM regtype )
  (pure (regval_from_reg v))

where regval_from_reg is just the identity:

def regval_from_reg (r : (BitVec 64)) : (BitVec 64) :=
  r

rX then calls readReg. Note the return type uses PreSailM rather than SailMreadReg is defined in the Sail library before the RISC-V-specific types exist, so it must use the generic version (see The Sail Monad).

Register, RegisterType, c, and ue in the signature below are all implicit. Lean infers their concrete values because rX returns SailM (RegisterType r), which fixes the monad to EStateM (Error exception) SailState (RegisterType r). This gives us ue = exception, c = trivialChoiceSource, and RegisterType = the RISC-V register type function described in the state post.

def readReg (r : Register) : PreSailM RegisterType c ue (RegisterType r) := do
  let .some s := (← get).regs.get? r
    | throw .Unreachable
  pure s
Lean syntax: do-notation pattern match

The syntax let .some s := expr | throw .Unreachable is Lean's do-notation pattern matching with a failure branch. It is equivalent to:

match expr with
| .some s => ... rest of do block ...
| .none => throw .Unreachable

The | line is the else branch — what to do if the pattern doesn't match.

Recall that on the .ok branch of EStateM.Result ε σ α, the value carried has type α. Here α = RegisterType r, which depends on which register r you pass in. For x1x31, RegisterType r resolves to BitVec 64 (see State).

Breaking down (← get).regs.get? r:

  • ← get retrieves the current SequentialState from the monad
  • .regs accesses the dependent hash map field
  • .get? r does a hash map lookup, returning Option (RegisterType r)

If the key is missing — i.e. .get? returns none — it throws Error.Unreachable. So WellFormed is really saying: every general-purpose register is present in the hash map. This is a reasonable assumption because we only ever start from states where the register file has been fully initialised.

EStateM vs Result

pure s returns an EStateM value, not a Result. Remember EStateM ε σ α is just σ → Result ε σ α — it's a function. So pure s constructs the function fun state => .ok s state. You only get a Result when you actually run the computation via .run, which applies that function to a state.

Appendix: Do-notation and bind

In Lean's do-notation, let x ← e desugars to e >>= fun x => ..., where e is some monadic expression — that is, e has type M α for some monad M. For example, e could be lookup "name" which has type Option Nat, or rX_bits rs1 which has type SailM (BitVec 64). If you're familiar with Rust, >>= (bind) is the same idea as and_then on Option — if the value is Some, unwrap it and pass it to the next function; if it's None, short-circuit.

To build intuition, let's use the Option monad. Consider a dictionary lookup that might fail:

def lookup (key : String) : Option Nat := ...

lookup returns some v if the key exists, or none if it doesn't. Now suppose we want to look up two keys and add their values:

def addEntries (k1 k2 : String) : Option Nat := do
  let v1 ← lookup k1
  let v2 ← lookup k2
  pure (v1 + v2)

let v1 ← lookup k1 means: run lookup k1. If it returns some v, bind v1 := v and continue. If it returns none, short-circuit — the whole function returns none. So v1 is a plain Nat, not an Option Nat.

This desugars to:

def addEntries (k1 k2 : String) : Option Nat :=
  (lookup k1) >>= fun v1 =>
    (lookup k2) >>= fun v2 =>
      pure (v1 + v2)

>>= (bind) for Option is:

  • some v >>= f evaluates to f v
  • none >>= f evaluates to none

The same pattern applies to EStateM. Instead of some/none, we have .ok value newState/.error e newState. The extracts value from the .ok branch and threads the state through. If the computation errors, the whole do block short-circuits with that error.

For a gentler introduction to monads and bind, see this post. For the formal Lean syntax, see the official Lean reference on do-notation.