July 17th, 2021 Replacement Executive

Governance Debrief

While a post-mortem will be coming out from @Protocol-Engineering, I wanted to take a moment to highlight what happened this weekend from the Governance perspective when we put up an urgent executive, FLAP Auction Adjustment and RWA-02 (NS-DROP) Debt Ceiling Increase - July 17, 2021, to replace the regular July 16th Executive Proposal.

I hope that this post will highlight what happened, the Governance response to it, and some areas for improvement should a similar situation occur in the future.

Executive Summary

  • An error was discovered in the July 16th Executive Proposal and reported to the community on the morning of July 17th.
  • Through internal coordination and limited public updates, a new spell was posted by Saturday evening UTC, making it possible to execute the original spell’s intent.
  • As the first person from GovAlpha to notice this report, I took point and made unilateral decisions on behalf of the Governance Facilitators.
  • The process was lacking some transparency, specifically around the clarification of the July 17th Executive Proposal being a replacement for the July 16th Exec.

What Happened

At approximately 9:18 am UTC on Saturday, July 17th @spin made a post in the #governance-and-risk chat citing some concerns with the Executive proposal that went live for voting about 15 hours earlier. Namely, that there was an element missing in the code necessary for MIP21 to allow the Asset Originator (in this case New Silver) to utilize the increased Debt Ceiling.

Within a few hours, the omission in the executive spell was confirmed, along with the ability to fix it. At approximately 13:20 UTC @brianmcmichael posted publicly in the #governance-and-risk chat that the team was aware of this problem and working on a fix, with details forthcoming.

At approximately 14:43 UTC, I gave an update in the chat channel that the July 16th Executive had been pushed down from the “active proposals” section of the Voting Portal, so that people who wanted the listed changes to take effect would not vote for a spell that could not fully enact them.

By 19:16 UTC the new Executive Proposal was available on the Voting Portal and I posted a link to the chat to let people know they could now vote on the spell that would enable New Silver to utilize the higher DC, should it pass.

Governance Response

This week, I was in charge of the Executive Proposal process for the @GovAlpha-Core-Unit. Additionally, I was the first of the team to notice the issue Lucas posted about when I woke up early Saturday morning (local time). I acted in full capacity as a responding Governance Facilitator and did not wait for confirmation from @LongForWisdom to take action. This is one of the stated benefits of having multiple Governance Facilitators, though it also left room for improvement which will be explored in the next headline section.

When I saw Lucas’ post on the morning of the 17th, I brought it to the attention of the mandated actors. From there a coordination effort took place primarily between myself and the @Protocol-Engineering team, where we ran a compressed version of our regular weekly coordination process once a new spell had been proposed. After confirming the contents of the spell (namely that the conditional July Budgets would not be included, as the July 2nd Executive had already passed), I edited the executive copy and submitted it to the PE team to have its contents hashed for the new spell.

At the time, I believed that I should change as little as possible within the copy, as this did not represent a new executive spell as much as a second attempt to put forth what we had already stated to the community. So in that spirit, I only changed the dates and removed the references to the July Distributions.

After the PE team had crafted the spell and run all their tests, I completed the rest of my checklist for the Governance Facilitator: adding the new spell address to the GitHub file, validating the copy for the Voting Portal front end, and finally adding the proposal to the “active proposals” list so it would be shown at the top of the Voting Portal.

I then posted in the G&R chat channel with the new link and thanked Lucas for pointing out the error. This effectively put up a new proposal with the same aims as the previous one a little over 24 hours later and within 10 hours of the problem being identified in chat.
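As a quick sanity check on those two figures, the timeline can be worked through in a few lines of Python. The timestamps are the approximate ones quoted above; the original go-live time is an assumption derived from "about 15 hours earlier" than 9:18 UTC, not a recorded value.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Approximate timestamps quoted in this post (all UTC).
issue_reported = datetime(2021, 7, 17, 9, 18, tzinfo=UTC)
replacement_live = datetime(2021, 7, 17, 19, 16, tzinfo=UTC)

# Assumption: "about 15 hours earlier" is treated as exactly 15 hours.
original_live = issue_reported - timedelta(hours=15)

# Time from the problem being raised in chat to the replacement going live.
report_to_replacement = replacement_live - issue_reported

# Time from the original proposal going live to the replacement going live.
original_to_replacement = replacement_live - original_live

print(report_to_replacement)    # 9:58:00
print(original_to_replacement)  # 1 day, 0:58:00
```

Under that assumption the replacement went up just under 10 hours after the report and roughly 25 hours after the original proposal.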

What Could Have Gone Better

While several successes could be highlighted from this experience, reflecting on what could be improved is of far more value. This section might be added to as we have time to do an internal GovAlpha review, but I wanted to make a post with my reflections ASAP to keep the community as informed as possible.

@equivrel posted some feedback about how the copy of the proposal should have indicated that it was a replacement proposal and that the older proposal should still appear with an erratum message. While the old proposal is still visible on the voting portal, after clicking “view more proposals”, Lev is correct that not enough transparency was given to this switch.

If I had it all to do over, “Replacement” would have been inserted into the new proposal title, and I would have started a forum thread right away to keep people updated who may not be plugged into Rocket Chat. These measures would have made it clearer to the community that the switch had occurred.

As mentioned above, I made a very conscious effort to change as little as possible in the executive copy. This was done both to get the new copy over to PE as quickly as possible and because it was my belief at the time that the replacement would be more legitimate if it was altered as little as possible. In retrospect, the addition of “Replacement Proposal” or some other clear indicator that this spell was meant to replace the previous day’s errant one would not have represented a material change to the proposal. Since I was already changing the copy, I should have taken the time to be as transparent as possible with it.

I believe this omission on the Governance side was due almost entirely to my lack of experience as a Governance Facilitator, and hope the community trusts that it was by no means intended to cover up the fact that this proposal was a replacement one. As highlighted earlier, I leaned into my role as Governance Facilitator and took action without waiting for validation from @LongForWisdom. While I think our ability as Governance Facilitators to take the reins in a situation is a strength of our decentralized organization, it is all but certain that consulting with Long would have yielded clearer communication to the DAO.

We will review my actions as a Core Unit and provide the community with any process changes, decisions, or further areas of improvement that arise from that conversation.

Currently, one voter with less than 2 MKR remains on the old proposal. While limited, this does imply that not everyone has been informed of the replacement. Whether or not a label and forum post would have resulted in all voters changing over to the replacement by now is not strictly relevant, as transparency about what is happening in Governance is the true measure of a successful DAO. I hope that this post closes that gap in transparency.

17 Likes

Thank you Payton for the write-up and explanation. I’m glad @spin caught it pretty early.

What would happen if, in the future, “Delegates” rush to vote and pass an Executive with a similar human error? I believe the 48 hour delay would help to take action.

All in all—this is definitely a good learning lesson. Thanks again!

2 Likes

This was pretty well managed, but does suggest a plan should be in place for large or time-sensitive mistakes. At least we got the benefit of learn-by-doing on something not catastrophic. So it’s probably a good experience for GovAlpha/PE to have had.

2 Likes

Thanks @prose11 for this write-up.

What would be the best documentation available today on how executive spells are (1) defined, (2) crafted, (3) reviewed, and finally (4) published on the voting portal? It will be interesting to learn in which one of these phases the error happened.

Especially given that all spells today still run with “root permissions” on the protocol, this process should be as transparent as possible, and be as inviting as possible to scrutiny.

Additionally we should work on a mechanism that restricts the permissions of the majority of the spells. Root-level spells must then never be rushed and always go through a number of formal reviews.

I know this particular error wouldn’t have been prevented by this mechanism because it’s not an error that breaks through the safety barrier of “safe vs unsafe operations.” But this is beside the point: the thing is that with the root permissions today, it could have been a critical mistake.

Ironically, as we’ll get more efficient in voting through proposals, we will indeed get more efficient in voting through errors as well.

You’re exactly right that the 48 hours delay would be a critical line of defense, but we have to strive to create a system that will literally never have to use this last resort mechanism.

This is the only way that we can create a protocol that will never critically fail. Else it is just a matter of time before it does, because the 48 hours may not be utilized to actually review the spell for critical errors.

—

Another security consideration here is that the correctness of the spells does not even matter if the governance publication mechanism can be hacked.

Is the authorization to publish executive proposals to the governance portal secured in a way that reflects the fact that billions of dollars depend on it?

What would a hacker have to do in order to get a malicious executive proposal published? Or, more subtly, change the address of the spell that is voted on? Through the usual administrative process? Through a backend or front-end hack?

Designing our processes correctly is as critical as writing bug-free smart contracts code.

9 Likes

I can only imagine the number of processes that will need to be reviewed and certified in order to avoid what you described above. From managing multisig wallets/how key actors manage their private keys, deploying the correct amount of funding to CUs, correct funding to RWA projects, due diligence on delegate behavior, the list goes on and on…

1 Like

So a lot of this happens on the PE side and maybe @Derek would be the best person to ask (though also they will have a post-mortem coming out where it might be detailed). On the Governance end, I’ll do my best to walk through this process.

I have put some of this info out there in my post Weekly Cadence for the GovAlpha Core Unit, but will elaborate here, as we utilize a template for this process more than true documentation.

The Core Units begin their coordination for the Weekly Spell on our Tuesday Mandated actors meeting where we go over what is ready to be added by Friday. This is often up to the PE team in terms of what they can safely get accomplished, but many Core Units play a role in having reports done in time and communicating expectations with outside parties. This helps us figure out what ideally should be in Friday’s spell as well as what will realistically be there.

We utilize a New Spell coordination sheet to keep track of the contents of these items as well as where the teams are in their process throughout the week. That combined with a New Spells chat channel allows us to communicate if anything changes from what was discussed at the Mandated actors meeting.

For your 2) and 3) points it would be best to have PE respond. On the Governance side, we wait for confirmation from the spell crafter that the executive is ready and another person from the PE team to confirm the tests have passed before going on to publishing. The PE team has an internal checklist that is utilized before they get to that point.

I can (hopefully) fully explain 4) publishing to the voting portal. At this point, the coordination efforts should be complete and there is a final checklist from the GovAlpha side that consists of:

  • Using the Voting Portal to check onchain effects
  • Adding the spell address to the file in our public GitHub repo
  • Validating the copy with the Voting Portal (essentially checking to make sure the portal will accept the file and display properly when submitted)
  • Changing the active proposals file to display the proposal on the front end.
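For readers who prefer to see the checklist as a sequence, here is a minimal sketch in Python. It is purely illustrative: the class, the step wording, and the rule that steps must be completed strictly in order are assumptions made for the example, not a description of any tooling GovAlpha actually runs.

```python
from dataclasses import dataclass, field


@dataclass
class PublishChecklist:
    """Hypothetical tracker for the four GovAlpha-side steps listed above."""

    # Step names paraphrase the bullets above; they are not official labels.
    steps: tuple = (
        "check onchain effects in the Voting Portal",
        "add spell address to the public GitHub repo",
        "validate the copy with the Voting Portal",
        "update the active proposals file",
    )
    done: list = field(default_factory=list)

    def complete(self, step: str) -> None:
        """Mark the next step as done; reject out-of-order or extra steps."""
        if len(self.done) == len(self.steps):
            raise ValueError("checklist already complete")
        expected = self.steps[len(self.done)]
        if step != expected:
            raise ValueError(f"expected {expected!r} next, got {step!r}")
        self.done.append(step)

    def is_complete(self) -> bool:
        """True only once every step has been completed in order."""
        return len(self.done) == len(self.steps)


checklist = PublishChecklist()
for step in checklist.steps:
    checklist.complete(step)
print(checklist.is_complete())  # True
```

The point of enforcing order in the sketch is that the proposal only becomes visible on the front end after the earlier checks have been ticked off.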

With crypto-native collateral, the first step of this process would have indicated that something was wrong with the DC raise, as we would not see a line increase included in the onchain effects. However, as Lucas pointed out in his message notifying us of the error, two components need to be changed in the MIP21 architecture for AOs to use their credit line. I happen to still have the tab open from the update, so here’s what the replacement spell looked like. Originally only the bottom part was present, which seemed to indicate things were all set on our side:
[image: screenshot of the replacement spell]

I don’t want to make a long post even longer, but it is worth clarifying that the answer to whether publication to the portal is secured in a way that reflects what depends on it is probably not. While it would be incredibly difficult to get a malicious proposal on the front end without anyone knowing (the easier path to a successful attack would seem to be one that avoids the front end, where anyone could see and question it), a change in our public GitHub repo would be all it takes to mess with the voting portal.

Perhaps it is worth exploring a signature component from an allow-list for putting items on the front end; I would be happy to chat about it with dUX. While I do think the current publication process is at more of a risk from a griefing attack than an outright hack, making the process more secure seems like the right thing to do.
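To make the allow-list idea a little more concrete, here is a rough sketch in Python of what such a check could look like. Everything in it is hypothetical: the signer names, the keys, and the function names are invented for illustration, and HMAC with per-signer shared secrets stands in for a real public-key signature scheme, which an actual design would almost certainly use instead.

```python
import hashlib
import hmac

# Hypothetical allow-list mapping signer name to a shared secret key.
# Assumption: a real system would hold public keys here, not secrets.
ALLOWED_SIGNERS = {
    "facilitator-a": b"demo-key-a",
    "facilitator-b": b"demo-key-b",
}


def sign_entry(signer: str, spell_address: str, copy_text: str) -> str:
    """Produce a tag binding the spell address to the executive copy."""
    key = ALLOWED_SIGNERS[signer]
    message = f"{spell_address.lower()}\n{copy_text}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_authorized(signer: str, spell_address: str,
                  copy_text: str, signature: str) -> bool:
    """Accept an entry only if an allow-listed signer produced the tag."""
    if signer not in ALLOWED_SIGNERS:
        return False
    expected = sign_entry(signer, spell_address, copy_text)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

With something along these lines, swapping the spell address in the repo without also producing a valid tag from an allow-listed signer would cause the front end to reject the entry, since the address is part of the signed message.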

Let me know if you feel this still needs a fuller answer. Essentially, the front end could be used to display a malicious spell if the attacker got access to the GitHub repo, but this is an unlikely line of attack IMO, as it broadcasts the intention behind the spell. So while that might be the desired outcome in a griefing venture, a truly malicious proposal would likely take place through the smart contract infrastructure rather than the front-end mechanics.

2 Likes