Example #3: more arithmetic

The expected DUT output in the previous example was derived as `sum += (a *$8 b)` (in other words, `sum = sum + (a *$8 b)`). The multiply operator here is not just `*`; it is explicitly given as an unsigned, 8-bit multiply. This is an important point: Maia operators look like hardware function units, and coding complex arithmetic looks like hardware. You can, if necessary, use plain operators and plain (signed) `int` variables, and the language will then try to work out what you actually wanted, in much the same way as C, but this may not be the right thing to do.
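As a rough illustration of what a sized operator means, the Python sketch below models an unsigned multiply at a fixed width. It is not Maia code: `mul_unsigned` is a hypothetical helper, and it assumes that both operands and the result are truncated to the stated width, as a fixed-width hardware multiplier with no extended output would do.

```python
def mul_unsigned(a: int, b: int, width: int) -> int:
    """Model of an unsigned multiply carried out at `width` bits.

    Assumption: operands and result are all truncated to `width` bits.
    """
    mask = (1 << width) - 1
    return ((a & mask) * (b & mask)) & mask


# The same inputs give different answers at different widths,
# which is why the width is part of the operator.
print(mul_unsigned(3, 5, 8))     # 15: fits in 8 bits
print(mul_unsigned(200, 2, 8))   # 144: 400 truncated to 8 bits
print(mul_unsigned(200, 2, 16))  # 400: fits in 16 bits
```

The point of the sketch is only that the width belongs to the operation, not to the values being multiplied.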

In short: the language does not, in general, attempt to deduce your intent. Variables ('objects') do not have properties apart from their size. They are not signed, or unsigned, or floating-point, or 1's complement, or 2's complement: they are just data. The relationship between variables and operators is the same as the relationship between hardware memory and hardware function units: in other words, 'memory' simply stores arbitrary data patterns.
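The idea that signedness lives in the operator rather than in the data can be sketched in Python as follows. This is an illustrative model only, not Maia: the helper names `as_signed`, `lt_unsigned` and `lt_signed` are hypothetical, and 2's complement is assumed for the signed interpretation.

```python
def as_signed(bits: int, width: int) -> int:
    """Interpret a `width`-bit pattern as a 2's complement value."""
    sign = 1 << (width - 1)
    return (bits & (sign - 1)) - (bits & sign)


def lt_unsigned(a: int, b: int, width: int) -> bool:
    """'Less than' function unit that treats both patterns as unsigned."""
    mask = (1 << width) - 1
    return (a & mask) < (b & mask)


def lt_signed(a: int, b: int, width: int) -> bool:
    """'Less than' function unit that treats both patterns as signed."""
    return as_signed(a, width) < as_signed(b, width)


x, y = 0xFF, 0x01             # two stored 8-bit patterns, nothing more
print(lt_unsigned(x, y, 8))   # False: 255 < 1 does not hold
print(lt_signed(x, y, 8))     # True:  -1 < 1 holds
```

The stored pattern `0xFF` is the same in both calls; only the choice of comparison unit changes the answer.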

This is generally a good fit for modelling and verifying hardware, but does have its own issues: carrying out floating-point arithmetic is long-winded, for example. On the other hand, FP arithmetic is of little use in hardware verification (you can't, for example, verify a floating-point multiplier by comparing the FPU output against the result which is produced by the * operator in whatever CPU, language, and FPU setup you're using). Keeping data entirely separate from the operations on it is, in software engineering terms, pretty much the exact opposite of being 'Object Oriented'.

This example shows some more complex arithmetic. It calculates the 130-bit result of a single round of a Poly1305 (RFC 8439) message authenticator. I use this function as part of a unit test of my own VHDL Poly1305 implementation. The RTL is difficult, because of the requirement to keep track of the bit widths and limits of the operations, and the modulo reduction operation. The RTL result is compared against this Maia implementation:

```
bit130 round(bit128 r, bit128 block, int nbytes) {
static bit128 clamp  = 128'h0fff_fffc_0fff_fffc_0fff_fffc_0fff_ffff;
static bit130 primep = 130'h3_ffff_ffff_ffff_ffff_ffff_ffff_ffff_fffb;
static bit130 acc130 = 0;              // static variables maintain their
bit129 blk129;                         //   values between function calls
bit131 sum131;
bit256 mul256;

r &= clamp;                            // down to 124 bits

blk129 = byte_swap(block, nbytes);     // byte swap and expand by 1 bit
sum131 = acc130 +$131 blk129;          // 130+129 bits = 131 bits
mul256 = sum131 *$256 r;               // result known to fit in 255 bits
acc130 = mul256 % primep;              // modulo reduction
assert(acc130 < primep);
return acc130;
}
```
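For readers who want to cross-check the arithmetic outside Maia, the same round can be sketched in Python, whose integers are arbitrary-precision. This is a minimal model under stated assumptions, not a translation of the code above: the accumulator is passed in and returned rather than held in a static variable, the block is passed as a `bytes` object of 1 to 16 bytes, and the `byte_swap` helper (not shown in this section) is assumed to read the block as a little-endian number and append a single 1 bit above its top byte, as RFC 8439 specifies. The name `poly1305_round` is hypothetical.

```python
CLAMP = 0x0fff_fffc_0fff_fffc_0fff_fffc_0fff_ffff
PRIME = (1 << 130) - 5  # same value as the 130-bit primep constant


def poly1305_round(acc: int, r: int, block: bytes) -> int:
    """One Poly1305 round: ((acc + padded_block) * r) mod (2^130 - 5)."""
    assert 1 <= len(block) <= 16
    assert 0 <= acc < PRIME

    r &= CLAMP                                  # down to 124 bits

    # Assumed behaviour of byte_swap: little-endian value of the block
    # with a 1 bit appended just above the most significant byte.
    blk = int.from_bytes(block, "little") | (1 << (8 * len(block)))

    total = acc + blk                           # 130 + 129 bits -> 131 bits
    assert total.bit_length() <= 131
    product = total * r                         # 131 + 124 bits -> 255 bits
    assert product.bit_length() <= 255

    acc = product % PRIME                       # modulo reduction
    assert acc < PRIME
    return acc


# Usage: feed a message through in 16-byte blocks, starting from acc = 0.
r = int.from_bytes(bytes.fromhex("85d6be7857556d337f4452fe42d506a8"), "little")
msg = b"Cryptographic Forum Research Group"
acc = 0
for i in range(0, len(msg), 16):
    acc = poly1305_round(acc, r, msg[i:i + 16])
print(hex(acc))
```

The bit-length assertions play the role of the explicit operator widths in the Maia version: Python will not overflow, so the checks only document the limits that the RTL has to respect.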