Postmortem: Bug in 2020/08/07 Executive Spell Contract

What Happened?

On the evening of August 8th, 2020, a community member indicated that the week’s executive spell, which had been successfully cast approximately 11 minutes earlier, appeared to set the WBTC debt ceiling from 40 million Dai to zero. This was quickly identified to be a discrepancy in the precision of the value set by the weekly executive smart contract. This was confirmed in the code and the Smart Contracts Domain Team was notified and responded shortly thereafter by preparing and reviewing a new executive spell that corrected the error. Due to the quick action of MKR holders, the new spell was voted on and passed the MKR threshold a little less than 12 hours later.

No user funds were at risk as a result of this bug.

Screenshot from 2020-08-17 20-21-10

How did it happen?

The Smart Contracts Domain Team develops its executive spells in response to governance actions, and typically prepares the weekly executive on Thursday afternoon after the weekly governance call, which is to be deployed the following morning. Team members take turns creating the spell in order to keep everyone up to speed with the process. The developer who takes the current week’s spell typically uses the previous week’s spell as a template. Since the spell structure evolves week-to-week, the template spell may look different than the last iteration the developer worked on, potentially adding some cognitive overhead in learning the most recent format. In this case, the developers working on the executive adapted the previous week’s spell for the variables that were expected to change for the week, including the WBTC-A debt ceiling, the global debt ceiling, and the ETH-A liquidation lot size. At the time of development, the weekly values that would be changing had not been confirmed by governance, so placeholder values were used and the pull request was marked with a note to review the numbers when they were confirmed the next morning. The placeholder values were not validated for precision at this time, and the WBTC-A debt ceiling was set as a WAD (18 decimal digits of precision) instead of a RAD (45 decimal digits of precision), which the Vat expects for this variable. This is an easy thing to miss because the Vat expects different variables to be in WAD (18), RAY (27), or RAD (45), and the only way to confirm which precision a given variable is in is to consult the vat code.

Screenshot from 2020-08-17 15-42-37

The numbers from governance were confirmed the following morning, but the team realized that we needed to mention the overflow issue that was being addressed in the spell and a forum post was prepared and the test repository was made public to the community. By the time the forum topic was posted, it was already time to deploy the spell. Several members of the Smart Contracts Domain Team reviewed the spell and approved the code, missing the precision error, and the contract was deployed as usual.

How did we fix it?

The problem was discovered very quickly after the spell was cast and a new spell containing only the correct variable was created, reviewed, and deployed by the team.

How do we prevent it?

This issue highlights that the Smart Contracts Domain Team must continue to improve its testing and automation, and remain diligent with audits. Additional tests have been added to the weekly executive test suite that check that all system variables fall within expected tolerances. We are also relying on some additional tooling for visualizing the on-chain effects of spells, and although these tools are currently in development and internal to the Foundation, we expect to make them public in the future. Steps will additionally be taken to prepare a canonical template for spellcrafting and testing, rather than re-using previous spells. In the longer term, the team hopes to automate or script the spell generation in a way that can be handed to the community to permit the safe creation of MCD executive spells.

6 Likes

Glad to hear work is being done to prevent this in the future. We are very lucky this occurred in a mostly harmless way. If this precision error was made on a collateral duty, it could have been devastating to the protocol. I trust the smart contract team will not take this lightly. This is also a good lesson to the community that third parties should be verifying executives have sane values. If you are a large MKR holder, this is in your best interest. The more eyes the better.

2 Likes

“Several members” is not quantifiable :wink:

By “reviewed the spell” it probably means they’ve read the spell, not actually written their own “pseudocode” and compared it. It’s very hard for most humans to review code, especially “boring code” as spells are. It’s a mental struggle against your brain constantly telling you to finish the review and “free itself”. I am not sure if “non-developer” part of the community understands that “remain diligent with audits” is not so trivial as it might sound. Unfortunately, the better the tooling/automation, the more trust is put in such systems -> less likely human audit catches potential error. Interesting problem.

Ideally, 2 (or more) independent dev teams would have completely independent tooling/processes and would not communicate with each other. Then their submitted spells would be compared. And i am not saying this is feasible, i guess i am just paranoid atm :grin:

2 Likes