Instruction Ra

Concrete Examples with Real Data

Example 1: ADDI (Type 1 - Arithmetic)

From the trace:

Cycle 1: ADDI {
    operands: { rd: 2, rs1: 2, imm: 288 }
    register_state: {
        rd: (2147487744, 2147488032),  // before → after
        rs1: 2147487744
    }
}

What happens:

Step 1: Get inputs
   rs1 = 2147487744
   imm = 288

Step 2: Compute in field (addition is cheap!)
   result = 2147487744 + 288 = 2147488032

Step 3: Form lookup address
   address = result = 2147488032
           = 0x80001120

Step 4: Chunk into 16 bytes
   0x00 00 00 00 00 00 00 00 00 00 00 00 80 00 11 20
     └────────────────┬────────────────┘ └────┬────┘
                  b0 .. b11            b12 .. b15

InstructionRa values for cycle 1:

InstructionRa[0][1]  = 0    (0x00)
InstructionRa[1][1]  = 0    (0x00)
...
InstructionRa[11][1] = 0    (0x00)
InstructionRa[12][1] = 128  (0x80)
InstructionRa[13][1] = 0    (0x00)
InstructionRa[14][1] = 17   (0x11)
InstructionRa[15][1] = 32   (0x20)

Reconstructed address:

0×256¹⁵ + 0×256¹⁴ + ... + 128×256³ + 0×256² + 17×256¹ + 32×256⁰
= 2147488032 ✓

This equals the RESULT!
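
The chunking in Step 4 can be sketched in a few lines of Python. This only illustrates the arithmetic above and is not the prover's code: the helper name chunk_address is hypothetical, and wrapping the ADDI sum to 64 bits is an assumption about how the result is formed.

```python
def chunk_address(address: int, num_chunks: int = 16, chunk_bits: int = 8) -> list[int]:
    """Split a lookup address into chunks, index 0 being the most significant."""
    total_bits = num_chunks * chunk_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address does not fit in the chunked width")
    mask = (1 << chunk_bits) - 1
    return [
        (address >> (chunk_bits * (num_chunks - 1 - i))) & mask
        for i in range(num_chunks)
    ]


rs1, imm = 2147487744, 288
address = (rs1 + imm) % (1 << 64)  # assumption: the sum wraps to 64 bits
chunks = chunk_address(address)
print(chunks[12:])  # [128, 0, 17, 32]
```

Entry i of the returned list corresponds to InstructionRa[i][1] in the listing above.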

Example 2: MUL (Type 1 - Arithmetic)

From the trace:

Cycle 14: MUL {
    operands: { rd: 13, rs1: 34, rs2: 36 }
    register_state: {
        rd: (0, 144115188075855872),    // before → after
        rs1: 2,
        rs2: 72057594037927936
    }
}

What happens:

Step 1: Get inputs
   rs1 = 2
   rs2 = 72057594037927936 = 0x0100000000000000 (2⁵⁶)

Step 2: Compute (multiplication is done in field)
   result = 2 × 72057594037927936 = 144115188075855872
          = 0x0200000000000000 (2⁵⁷)

Step 3: Form lookup address
   address = result = 144115188075855872

Step 4: Chunk into bytes
   0x00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
                           └─┬─┘
                             b8

InstructionRa values for cycle 14:

InstructionRa[0][14]  = 0
InstructionRa[1][14]  = 0
...
InstructionRa[7][14]  = 0
InstructionRa[8][14]  = 2    (0x02) ← The product is here!
InstructionRa[9][14]  = 0
...
InstructionRa[15][14] = 0

Reconstructed address:

2 × 256⁷ = 2 × 72057594037927936 = 144115188075855872 ✓

This equals the RESULT of 2 × 72057594037927936!
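
Going the other way, the 16 chunk values determine the address. A minimal sketch, assuming the same chunk order as above (index 0 is the most significant byte); reconstruct_address is a hypothetical helper name.

```python
def reconstruct_address(chunks: list[int], chunk_bits: int = 8) -> int:
    """Rebuild the lookup address from chunks, most significant chunk first."""
    address = 0
    for value in chunks:
        # Shift what we have so far and append the next chunk at the bottom.
        address = (address << chunk_bits) | value
    return address


mul_chunks = [0] * 16
mul_chunks[8] = 2  # the only non-zero chunk in cycle 14

address = reconstruct_address(mul_chunks)
assert address == 2 * 256**7 == 144115188075855872
```

Chunk 8 is followed by seven lower chunks, which is why its weight is 256⁷.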

Example 3: ANDI (Type 2 - Bitwise)

From the trace:

Cycle 9: ANDI {
    operands: { rd: 33, rs1: 32, imm: 0xFFFFFFFFFFFFFFF8 }
    register_state: {
        rd: (0, 2147459072),
        rs1: 2147459072
    }
}

What happens:

Step 1: Get inputs
   rs1 = 2147459072  = 0x000000007FFFA000
   imm = -8 (as u64) = 0xFFFFFFFFFFFFFFF8

Step 2: CAN'T compute AND efficiently in field!
   Need to look up: (rs1, imm) → result

Step 3: Form lookup address by INTERLEAVING inputs
   Interleave bits of (imm, rs1)  [note: order is (y, x)]
   
   rs1 = 0x000000007FFFA000
       = 0000...0000 0111 1111 1111 1111 1010 0000 0000 0000
   
   imm = 0xFFFFFFFFFFFFFFF8
       = 1111...1111 1111 1111 1111 1111 1111 1111 1111 1000
   
   Interleaved (imm bits in even positions, rs1 bits in odd positions,
   counting from the least significant bit: imm₀ rs1₀ imm₁ rs1₁ ...):
       = 0x55555555555555557FFFFFFFDD555540

Step 4: Chunk this interleaved value

InstructionRa values for cycle 9:

Address: 0x55555555555555557FFFFFFFDD555540
         └────────────────────┬──────────────┘
         These are the interleaved input bits!

NOT the result (which would be 0x7FFFA000)
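
The interleaving in Step 3 can be reproduced with a short Python sketch. The function name interleave and its argument order are assumptions chosen to match the (y, x) note above: bit i of the first argument lands at position 2i, bit i of the second at position 2i + 1, counting from the least significant bit.

```python
def interleave(y: int, x: int, width: int = 64) -> int:
    """Interleave two width-bit values into one value of 2 * width bits."""
    out = 0
    for i in range(width):
        out |= ((y >> i) & 1) << (2 * i)      # y bits: even positions
        out |= ((x >> i) & 1) << (2 * i + 1)  # x bits: odd positions
    return out


rs1 = 0x000000007FFFA000
imm = 0xFFFFFFFFFFFFFFF8

address = interleave(imm, rs1)
assert address == 0x5555_5555_5555_5555_7FFF_FFFF_DD55_5540
assert rs1 & imm == 0x7FFFA000  # the AND result, which is not the address
```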

Example 4: BEQ (Type 2 - Comparison)

From the trace:

Cycle 7: BEQ {
    operands: { rs1: 10, rs2: 12, imm: 84 }
    register_state: {
        rs1: 18446744073709551615,  // -1 as unsigned (all 1s)
        rs2: 4
    }
}

What happens:

Step 1: Get inputs
   rs1 = 18446744073709551615 = 0xFFFFFFFFFFFFFFFF (all 1s)
   rs2 = 4                    = 0x0000000000000004

Step 2: CAN'T compute equality efficiently
   Need to check: does rs1 == rs2?

Step 3: Form lookup address by INTERLEAVING
   rs1 = 1111111111111111111111111111111111111111111111111111111111111111
   rs2 = 0000000000000000000000000000000000000000000000000000000000000100
   
   Interleave (rs2, rs1): rs2 bits land in even positions, rs1 bits in odd positions
   = 10101010 10101010 ... 10101010 10111010   (128 bits: 15 × 0xAA, then 0xBA)
   = 0xAA AA AA AA AA AA AA AA AA AA AA AA AA AA AA BA

Step 4: Chunk this interleaved value

InstructionRa values for cycle 7:

InstructionRa[0][7]  = 170  (0xAA = 10101010)
InstructionRa[1][7]  = 170  (0xAA)
InstructionRa[2][7]  = 170  (0xAA)
...
InstructionRa[14][7] = 170  (0xAA)
InstructionRa[15][7] = 186  (0xBA = 10111010)

Pattern: 0xAA AA AA AA AA AA AA AA AA AA AA AA AA AA AA BA
         └──────────────────────┬───────────────────────┘
         Interleaved input bits!

The lookup:

Address: 0xAA AA AA AA AA AA AA AA AA AA AA AA AA AA AA BA
         = interleave(4, 0xFFFFFFFFFFFFFFFF)
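
One way to picture what this address encodes is to undo the interleaving and evaluate the comparison on the recovered operands. The sketch below is an illustration of that mapping only, not a description of how the prover evaluates the table; interleave, uninterleave and beq_table_entry are hypothetical names.

```python
def interleave(y: int, x: int, width: int = 64) -> int:
    """Bit i of y goes to position 2i, bit i of x to position 2i + 1."""
    out = 0
    for i in range(width):
        out |= ((y >> i) & 1) << (2 * i)
        out |= ((x >> i) & 1) << (2 * i + 1)
    return out


def uninterleave(address: int, width: int = 64) -> tuple[int, int]:
    """Recover (y, x) from an interleaved address."""
    y = x = 0
    for i in range(width):
        y |= ((address >> (2 * i)) & 1) << i
        x |= ((address >> (2 * i + 1)) & 1) << i
    return y, x


def beq_table_entry(address: int) -> int:
    """Value an equality table would hold at this address: 1 if x == y, else 0."""
    y, x = uninterleave(address)
    return int(x == y)


address = interleave(4, 0xFFFFFFFFFFFFFFFF)   # (rs2, rs1) from cycle 7
chunks = list(address.to_bytes(16, "big"))    # 16 bytes, most significant first

assert chunks == [0xAA] * 15 + [0xBA]
assert beq_table_entry(address) == 0          # rs1 != rs2, so the branch is not taken
```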

Example 5: AND (Type 2 - Bitwise R-format)

From the trace:

Cycle 58: AND {
    operands: { rd: 37, rs1: 37, rs2: 36 }
    register_state: {
        rd: (1, 1),
        rs1: 1,
        rs2: 255
    }
}

What happens:

Inputs: rs1 = 1, rs2 = 255
Result: 1 & 255 = 1

InstructionRa stores:
   interleave(rs2, rs1) = interleave(255, 1)
   = 0x00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 57   (128 bits)

Visual: The Chunking Process

Regardless of type, the 128-bit address gets chunked into 16 bytes:

128-bit Lookup Address
┌──────────────────────────────────────────────────────────────┐
│ Either: RESULT (Type 1) or INTERLEAVED_INPUTS (Type 2)      │
└──────────────────────────────────────────────────────────────┘
                           │
                           │ Split into 16 chunks
                           ▼
    ┌────┬────┬────┬────┬────┬─   ─┬────┬────┐
    │ b0 │ b1 │ b2 │ b3 │ b4 │ ... │b14 │b15 │
    └────┴────┴────┴────┴────┴─   ─┴────┴────┘
     8bit 8bit 8bit 8bit 8bit       8bit 8bit

Each byte goes into one polynomial:
InstructionRa[0][cycle] = b0  (high byte)
InstructionRa[1][cycle] = b1
...
InstructionRa[15][cycle] = b15 (low byte)
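
Putting the two cases together, the per-cycle values can be sketched as follows. This is a simplified model of the description above under stated assumptions: the "type1"/"type2" labels, the function names, and the 64-bit mask on Type 1 results are all hypothetical choices made for illustration.

```python
MASK64 = (1 << 64) - 1


def interleave(y: int, x: int, width: int = 64) -> int:
    """Bit i of y goes to position 2i, bit i of x to position 2i + 1."""
    out = 0
    for i in range(width):
        out |= ((y >> i) & 1) << (2 * i)
        out |= ((x >> i) & 1) << (2 * i + 1)
    return out


def lookup_address(kind: str, *, result: int = 0, x: int = 0, y: int = 0) -> int:
    """Type 1: the address is the result. Type 2: the interleaved inputs."""
    if kind == "type1":
        return result & MASK64  # assumption: results are 64-bit values
    if kind == "type2":
        return interleave(y, x)
    raise ValueError(f"unknown instruction type: {kind}")


def instruction_ra_values(address: int) -> list[int]:
    """The 16 byte chunks for one cycle, index 0 being the high byte."""
    return list(address.to_bytes(16, "big"))


# Example 1 (ADDI, Type 1) and Example 5 (AND, Type 2) from above.
addi = instruction_ra_values(lookup_address("type1", result=2147487744 + 288))
and_ = instruction_ra_values(lookup_address("type2", x=1, y=255))

print(addi[12:])  # [128, 0, 17, 32]
print(and_[14:])  # [85, 87]
```

Both paths end in the same chunking step; only the way the 128-bit address is formed differs.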