 # Formal Verification of Smart Contracts

Hi!

I’ve been taking a look at the Formal Verification that Maker uses and trying to learn more about KEVM and the K language in general.

I’m trying to start (small) and take a look at a simple proof in mkr-authority’s bytes4 support, but I have a few questions regarding the code:

1. Rule for setRoot proof (magic number is bitwise not of 2^160) - how is this number calculated?

``````rule 115792089237316195423570985007226406215939081747436879206741300988257197096960 &Int X => 0
Lemmas for bytes4 support
syntax TypedArg ::= #bytes4( Int )

``````

I found one of the lemmas provided some more information regarding the calculation of the `bytes4` support in the comments - which is calculated from: `(2^32 - 1) * 2^224`. How does this derivation work?

``````syntax Int ::= "maxBytes4"
rule maxBytes4 => 115792089210356248756420345214020892766250353992003419616917011526809519390720 [macro] /* (2^32 - 1) * 2^224 */
``````

2. minUint256 value in lemmas

I noticed the `minUint256` gets used when checking the `#getValue` lemma. Is the value of a minimum uint256 a constant that’s made accessible by KEVM/K?

``````syntax Int ::= #getValue ( TypedArg ) [function]
rule #getValue(#bytes4( DATA )) => DATA
requires minUInt256 <=Int DATA andBool DATA <=Int maxBytes4
``````

Thanks!
Natalie

6 Likes

Hi Natalie, let me know if the answers below clear things up at all :

1. So for this rule:
``````rule 115792089237316195423570985007226406215939081747436879206741300988257197096960 &Int X => 0
``````

This tells `K` how to simplify a certain operation that occurs in the Solidity-generated bytecode, which is this bitwise “and” of the constant 1157…6960 and some EVM word `X`. So the constant originally comes out of the code. It happens, the magic number here is the bitwise not of 2^160 - 1 as a 256-bit word (yes, the description is missing a “- 1”, which is maybe the motivation for the question…good catch, I will correct that). So what the bytecode is trying to do is throw away the lower-order 160 bits of `X`. What the rule then says is that if `X` (represented in `K` as a plain integer) is already known to be in the range of an address (an Ethereum address is 160 bits), then this expression (`1157...6960 &Int X`) rewrites to `0` (i.e. no bits set). This might seem very obvious, but sometimes `K` needs help with obvious things :).

1. Now for this rule:
``````syntax Int ::= "maxBytes4"
rule maxBytes4 => 115792089210356248756420345214020892766250353992003419616917011526809519390720 [macro] /* (2^32 - 1) * 2^224 */
``````

It’s just defining a constant. As I mentioned above, in `K` we represent EVM words with plain integers, so this is supposed to be the largest valid value for an integer representing a `bytes4` data type. In Solidity, a `bytesN` data type stores its data in the highest-order N bytes of a word. So the maximum `K` integer that corresponds to a valid `bytes4` value is constructed by setting the highest-order 32 bits of a 256 bit word. The (2^32 - 1) * 2^224 expression just describes this mathematically–(2^32 - 1) is a word with the lowest-order 32 bits set; multiplying by 2^224 shifts these bits to the left so that highest-order 32 bits are set (224 + 32 = 256, so the 32nd bit moves to be the 256th bit, etc).

1. `minUInt256`, along with many other constants, is defined in `KEVM`:
https://github.com/dapphub/evm-semantics/blob/0bbee05fb3ce15e3313604ed3679e2c08d8e4c35/evm-types.md
(note that `klab` uses a fork of `evm-semantics`, the original `evm-semantics` repo is here)
2 Likes

`seth` is a great tool to help with this (you can install over at dapp.tools

``````[email protected]:~\$ seth --to-hex 115792089237316195423570985007226406215939081747436879206741300988257197096960
0xffffffffffffffffffffffff0000000000000000000000000000000000000000
[email protected]:~\$ seth --to-hex 115792089210356248756420345214020892766250353992003419616917011526809519390720
0xffffffff00000000000000000000000000000000000000000000000000000000
``````
3 Likes

Thank so much for the quick response @Kurt_Barry & @Mariano_Conti! Definitely clears up my questions about the lemmas. I’ll definitely check out `seth` as well!

I have a few additional questions regarding the proof for the `canCall` specs:

``````behaviour canCall-true of MkrAuthority
...
iff
VCallValue == 0
if
(src == Root) or
sig == #asWord(#padRightToWidth(32, #take(4, #abiCallData("burn", #uint256(0)))))) or
sig <= 115792089210356248756420345214020892766250353992003419616917011526809519390720
sig >= 0

returns 1
``````
1. It seems like the contents inside an `iff` section represent preconditions for a proof to start running. Does the `VCallValue==0` mean that the tx must not have been sent with Ether (ie: similar to a check for `tx.value==0`)?

2. Is it fair to say that the `(sig == #asWord(#padRightToWidth(32, #take(4, #abiCallData("burn", #address(0), #uint256(0)))))` essentially:

• `#abiCallData` - pads the arguments sent into the function
• `#take (N, WS)` - truncates and/or pads depending on the size of the data to N=4
• `padRightToWidth(N, WS)` - after truncating, ensures that the result of `#take(4..)` is 32 bytes.
• From the name, it sounds like its purpose is to add additional padding (if the lengths are not equal) - but the documentation seems to suggest that it’s a check of length equality. If the output of `#take()..` doesn’t match the expected length of `N`, does the `padRightToWidth` add additional zeroes for padding?
1 Like
1. Yes, `VCallValue == 0` states that 0 ETH was sent. `iff` does not quite behave how you conjecture–how `iff` conditions are used (simplifying slightly) is that `klab`, for each act, creates a `pass` spec where the `iff` conditions are true, and a `fail` spec where they are false. The `fail` spec of course asserts that the function reverts, whereas the `pass` spec asserts that it does not revert (along with other conditions defined in the spec like storage reads and rewrites, return values, etc).

2. Here’s what each piece does in the RHS expressions of the `sig` comparisons:

• `#abiCallData` : computes the Solidity call data as a `K` byte array (see definition here).
• `#take (N, WS)` : just takes the first `N` bytes (in the spec, this is 4, so it’s picking out the function selector); you can see how it’s defined here.
• `#padRightToWidth (N, WS)` : pads to the right with zero bytes, up to the specified width (`N`); defined here.
• `#asWord` : converts an array of bytes to an EVM word (so, in the `K` representation, just an integer between 0 and (2^256 - 1) inclusive), defined here.
2 Likes

Ah, very interesting. Yeah, I took a brief glance at the KEVM whitepaper and I noticed the distinction between the fail and the pass spec, but I didn’t realize that’s how it worked. Will definitely have to play around with it.

Got it - thanks so much for those links and the short description of the functions. Helps a lot!! I really appreciate the explanation - thank you @Kurt_Barry!

2 Likes

Hmm… I’m still sort of trying to understand the RHS. I’ve sort of psuedo-coded (pseudo-mathed?) it out, but having a bit of trouble seeing how this works:

Step 1: `#abiCallData`:
According to the formula in the EVM semantics link, `#abiCallData = F1 : F2 : F3 : F4 : T1 : ... : T32 : V1 : ... : V32`, where:

• `F1 : F2 : F3 : F4 = keccak("burn",#address(0),#uint256(0))`
• `T1 : ... : T32 = bytes(#address(0))`
• `V1 : ... V32 = bytes(#UINT256(0))`

`abiCallDataResult = [keccak("burn",#address(0),#uint256(0),bytes(#address(0)),bytes(#uint256(0))]`

Step 2: #take(4, abiCallDataResult):
From the description above and the description in the semantics, pulls out the function name which means the result should be:

• `#take(4, abiCallDataResult) = "burn"`

Though I’m a bit confused as to whether the result of `#take(4, abiCallDataResult)` is equal to the plaintext function selector (`'burn'`) or equivalent to the keccak of the burn function selector (`keccak256('burn')`)?

EDIT: After going through this flow a bit more, is `N=4` chosen because want to specifically check lemmas only on `bytes4`, which would result in the bytes of `addr(0)` and `uint256(0)` being dropped?

(Sidenote: For simplicity, I’m referring to this result as `takeResult` in future steps.)

Step 3: `padRightToWidth(32, takeResult)`
If the function pads 0’s to the right, up to 32 bytes, then would the result of this be:

``````padRightResult = #padRightToWidth(32, `takeResult`)
padRightResult = [takeResult] + [bytes(0)]*(offset to make this length 32 bytes)
``````

Why would the bytes array need to be padded with zeroes up to 32 bytes? Does this step need to be done to convert the array into an EVM word? On that note, is an EVM word essentially a `bytes32` var in Solidity?

Step 4: `#asWord`
The `padRightResult` is a bytes array, so the `#asWord` converts this to an EVM word. And then when `K` parses the word, it pulls it in as an integer value, which is checked on the bounds of (0, maxBytes4). Is this something that happens under the hood where if `K` reads in a word and the result of that is an integer?

Thanks! 