MIP40c3-SP34: Add Data Insights Core Unit Budget (DIN-001)



MIP40c3-SP#: 34
Author(s): Tomek Mierzwa (@tmierzwa)
Tags: core-unit, cu-din-001, budget
Status: Accepted
Date Applied: 2021-09-07
Date Ratified: 2021-10-25

Sentence Summary

This subproposal adds the budget for the Data Insights Core Unit. It consists of a regular monthly budget of 59,500.00 DAI from November 2021, plus a one-off amount of 97,500.00 DAI for the provision of data support between June and October 2021, and a one-off initial amount of 10,000 DAI for the incorporation and legal costs of an independent entity to house the Core Unit (more information provided below).



Paragraph Summary

We are proposing this budget to enable the Data Insights Core Unit to succeed in its mandate: to provide free and permissionless datasets with contextualized and enriched MCD Protocol data, and to continuously support and empower other members of the DAO and Community in the fields of data analytics and data science.

Core Unit ID

DIN-001

The following diagram shows the proposed roadmap for the Data Insights Core Unit for the next 6 months. This roadmap will be reviewed monthly and the backlog will be kept public.

Budget considerations

This budget secures:

  • a dedicated team of:
    • CU Facilitator and Product Manager (25%)
    • Two Data Engineers (2 x 100%)
    • One Front-End Designer and Developer (50%)
  • source data:
    • fixed cost of source on-chain and off-chain data purchased from Token Flow Insights SA
    • it covers all data sourcing and preprocessing costs (blockchain nodes, external APIs subscriptions, decoding, integration, quality assurance, etc.)
  • data storage and processing infrastructure (estimated and reviewed quarterly):
    • AWS infrastructure (including but not limited to: EC2, S3, ECS, SES)
    • Snowflake database subscription (including Reader Accounts)
  • operational costs
    • finance, accounting, legal and admin costs required to operate the CU

The distribution of budget across these components is shown below:

| Cost component | Amount |
| --- | --- |
| Team costs | DAI 31,500.00 |
| - CU Facilitator & Product Manager (25%) | DAI 3,500.00 |
| - Two Data Engineers (2 x 100%) | DAI 24,000.00 |
| - One Front-End Designer & Developer (50%) | DAI 4,000.00 |
| Data costs | DAI 17,500.00 |
| - Decoded source data costs | DAI 17,500.00 |
| Infrastructure costs | DAI 5,000.00 |
| - AWS infrastructure | DAI 2,000.00 |
| - Snowflake subscription | DAI 3,000.00 |
| Other costs (outsourced) | DAI 5,500.00 |
| - Finance, accounting | DAI 3,500.00 |
| - Legal, admin | DAI 2,000.00 |

The total budget requested for the first 6 months, starting 1 November 2021, is DAI 59,500.00 per month.
The activities and scope of the Core Unit will be revisited during Q1 2022 to see whether it should be expanded again to meet wider Core Unit (and other stakeholder) needs.
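The component breakdown above can be cross-checked with a short script (a verification sketch only, with the figures copied from the table; it is not part of the proposal):

```python
# Monthly budget components (DAI), copied from the table above.
team = {
    "CU Facilitator & Product Manager (25%)": 3_500,
    "Two Data Engineers (2 x 100%)": 24_000,
    "One Front-End Designer & Developer (50%)": 4_000,
}
data = {"Decoded source data costs": 17_500}
infrastructure = {"AWS infrastructure": 2_000, "Snowflake subscription": 3_000}
other = {"Finance, accounting": 3_500, "Legal, admin": 2_000}

# Category subtotals and the monthly total.
subtotals = {
    "Team costs": sum(team.values()),
    "Data costs": sum(data.values()),
    "Infrastructure costs": sum(infrastructure.values()),
    "Other costs (outsourced)": sum(other.values()),
}
total = sum(subtotals.values())

assert subtotals["Team costs"] == 31_500
assert total == 59_500  # the stated monthly budget
```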

Interim budget

Since leaving the Foundation at the end of May 2021 the Token Flow Insights team has continued to maintain and update existing datasets (e.g. vaults history, governance actions history, liquidations history, pricing history), GUIs and APIs (e.g. MCDState.info, MCDGov.info) that are being used by other Core Units (e.g. Risk CU, GovAlpha CU). We have also updated EthTx to support Goerli at the request of Protocol Engineering.

Token Flow has also responded to many ad-hoc queries and answered data-related requests from other CUs and community members. Currently, one of our data engineers is almost fully occupied by these activities in support of the DAO. This also generates substantial costs for maintaining the required AWS and Snowflake infrastructure.

Since June 2021 all of this has been provided to the DAO from our own budget in order to ensure that there was continuity for the Core Units that use our datasets between us leaving the Foundation and this Data Insights Core Unit proposal working its way through governance. We also understand that this proposal is likely to need at least two more months to be formally accepted by governance during which time Token Flow Insights will continue to maintain the datasets and infrastructure until the new DI Core Unit can take over much of the maintenance.

The monthly cost of maintaining existing products at a minimal level of support during June to October 2021 is DAI 19,500.00 and includes:

| Cost component | Amount |
| --- | --- |
| Team costs | DAI 12,750.00 |
| - CU Facilitator (10%) | DAI 1,500.00 |
| - One Data Engineer (80%) | DAI 10,000.00 |
| - One Product Manager (10%) | DAI 1,250.00 |
| Data costs | DAI 5,000.00 |
| - Decoded source data costs | DAI 5,000.00 |
| Infrastructure costs | DAI 1,750.00 |
| - AWS infrastructure | DAI 1,000.00 |
| - Snowflake subscription | DAI 750.00 |

We have investigated alternative routes to cover these costs with a few of the Core Units and the conclusion after these discussions was that we should add the amount to the budget proposal.

Therefore, we ask for an additional one-time payment of 5 months * DAI 19,500.00 = DAI 97,500.00 to cover Token Flow Insights' costs for June-October 2021. It would be added to the first monthly payment if this CU and budget proposal are accepted by governance. This payment would then be made by the new DI LLC entity to Token Flow Insights for the historic services provided prior to the formation of the distinct entity for the Core Unit.

The regular monthly budget would then commence from 1 November 2021.
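For clarity, the interim arithmetic works out as follows (a verification sketch using the figures above):

```python
# Interim reimbursement: five months (June-October 2021) of minimal
# support at DAI 19,500.00 per month.
MONTHLY_INTERIM = 19_500
MONTHS = 5
reimbursement = MONTHS * MONTHLY_INTERIM
assert reimbursement == 97_500  # the one-off amount requested

# The interim monthly figure itself is the sum of the table's categories.
interim_components = {
    "Team costs": 12_750,
    "Data costs": 5_000,
    "Infrastructure costs": 1_750,
}
assert sum(interim_components.values()) == MONTHLY_INTERIM
```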

Incorporation budget

In addition, a one-off incorporation / set-up budget of DAI 10,000 for establishing DI Co is requested in the first disbursement. This budget covers legal fees, incorporation fees, etc.

Budget implementation

This budget implementation is a “Simple Budget Implementation” under MIP40c4.

The first disbursement requested is DAI 464,500, comprising the following.

One-off Setup / Incorporation Budget / One-off Interim Budget for Data and Services Provided from June to October 2021

  • 107,500 DAI will be transferred to 0x7327Aed0Ddf75391098e8753512D8aEc8D740a1F on November 10, 2021
  • 107,500 DAI =
    • 10,000 DAI - one-off setup / incorporation budget for DI Co +
    • 97,500 DAI - one-off interim budget for data and services provided from June to October 2021

Monthly Transfers

  • 357,000 DAI will be streamed to 0x7327Aed0Ddf75391098e8753512D8aEc8D740a1F starting 2021-11-01 and ending 2022-04-30 at a rate of 59,500 DAI per month
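Putting the pieces together, the first-disbursement figure can be reproduced as follows (a verification sketch only):

```python
# First disbursement components (DAI).
incorporation = 10_000   # one-off setup / incorporation budget for DI Co
interim = 97_500         # one-off interim budget, June-October 2021
one_off_transfer = incorporation + interim
assert one_off_transfer == 107_500

# Six-month stream, 1 November 2021 through 30 April 2022.
monthly = 59_500
stream_total = 6 * monthly
assert stream_total == 357_000

# Total first disbursement.
assert one_off_transfer + stream_total == 464_500
```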

Other Considerations

  • For any budget request beyond April 2022, new MIP40c3 subproposals will be submitted

Sir, request for clarification–is that 82,750 DAI PER MONTH, or per 9-MONTHS?

And if I understand this correctly, you want to be REIMBURSED for costs pertaining to June-October 2021 expenses, correct?

Thank you in advance!


It’s per month: “This MIP adds a Data Insights Core Unit provided as a service from Token Flow Insights with a total monthly budget of 82,750.00 DAI.”


Thanks @blimpa for answering this :slight_smile:

Yes indeed, we are asking for:

  • monthly budget of 82,750.00 DAI / month for the first 9 months, which is 9 * 82,750.00 DAI = 744,750.00 DAI in total


  • one time reimbursement of Jun-Oct 2021 costs (5 months * 19,500.00 DAI = 97,500.00 DAI).

I will correct the budget subproposal to make it crystal clear.



Thank you for this proposal. It is interesting, I am sure you and your team are very good with numbers.

From my point of view there are two small issues.

  1. I would like to see the demand side. It is not difficult, it looks like this.

CU RWA: for this CU we in DIN do this…
CU Oracles: for this CU we deliver …
CU PE: …and this team cannot function without our…

etc etc. Should be a breeze since you came from the Foundation.

  2. I would like to direct your attention to makerburn.com, upper right quadrant, where you can see we are currently running an operational loss. The appetite for a 7-person number-crunching team costing DAI 82,000 per month and with no prospect of being a profit center might be limited. The interim team size, however, could possibly be more realistic.

Hi @Planet_X, many thanks for your questions.

Some time ago we created a list of areas of our cooperation with CUs (current and discussed).
Let me copy it here, and I’d love other Facilitators to step into this discussion if they have any comments.

Protocol Engineering CU


Done:

  • provided Goerli and Rinkeby support for EthTx

Discussed:

  • smart contracts analytics (calls frequency, gas consumption, reverts)
  • execution simulation for transactions not mined yet
  • state diffs and state reads reporting
  • L2 rollups/proofs decoding on L1
  • L2 transactions execution decoding
  • multichain liquidity tracking tool

Growth CU


Done:

  • ad-hoc queries for grouping vaults owners according to vault age / activity

Discussed:

  • support for grants campaign for most loyal / active vaults owners
  • live dashboard with the major protocol information for partners

Risk CU


  • MCDState dataset / GUI / API - detailed vaults history with time machine functionality
  • Liquidations 2.0 dataset / API - detailed auctions history
  • Pricing history for collaterals (market / Oracles)


  • new, better GUI for MCDState and Liquidations 2.0 dataset



  • ongoing support in data roles recruitment (test assignments / interviews)


  • ad-hoc provisioning / analysis of on-chain data

Governance & DUX CU


  • MCDGov dataset / GUI / API - detailed history of governance actions (executives/polls)
  • various ad-hoc queries, e.g. the number of new vault / users over time
  • revealing and hardening the polls results calculation algorithm
  • ongoing support for Snapshot off-chain voting integration


  • Delegate Contracts support in MCDGov
  • MKR token tracker with detailed insight about usage and ownership
  • restoring ‘on-chain effects’ analysis for spells with dedicated EthTx endpoint


  • switching from the old Spock infrastructure to TF Ethereum Data Warehouse

StarkNet Engineering CU

In progress:

  • Gas usage analytics for major protocol contracts to understand how fees reduction could bring value to MKR holders and protocol users
  • Detailed pricing history for collaterals



Real-World Finance CU

  • potential migration of financial reporting from Dune to TF datasets

The list above is a quick brain dump and is certainly not complete.

Please note that a very important part of our proposal is the continuous delivery of free protocol data to external consumers (community members, other protocols, partners, scientists, etc.).

A nice example is the research project at the University of Chicago, which aims to quantitatively “stress test” DAI and its robustness to large market events, collateral composition shifts, and so on. We’re providing the historical data on MKR and DAI for this study.

We believe that empowering external data users is very important for the protocol’s security, sustainability and transparency.

Regarding your second comment about limited appetite for 7-person number-crunching team…

  • our proposal assumes approx. 5 FTE (some roles are part time)
  • crunching numbers is much more important in hard times than in times of prosperity



Yes, this would be fantastic! We have a lot of data needs at DUX, and we have had some issues with our current ETL process. Our team is mostly front-end coders, not data scientists, so it’s helpful to have a core unit like Token Flow to collaborate with. The on-chain effects endpoint is one important component in the verifiability of the governance process. We are focused on finding ways to improve participation by presenting data to our users to help them make informed decisions when voting.


I can confirm that the Risk Core Unit utilizes vault activity and liquidations data provided by @tmierzwa 's team. This data is already highly integrated with our models. We think this CU makes sense because we can’t imagine decoding on-chain data ourselves on top of building complicated risk models and behaviour analytics. We’d probably need to employ an extra on-chain data specialist, but this would be costly and there are not many people who know MakerDAO's technical infrastructure well.

On the other hand, I can also say our particular CU needs only the data, not the dashboards themselves. But this may not be true for every CU, because I don’t know whether every single one of them has a development team such as ours.


Wanted to just comment from GovAlpha’s perspective.

The MCDGov dataset has been a useful reference tool. Most recently we’ve been using it to confirm delegate’s votes while compiling metrics. Without it, we would be directly examining transactions which would definitely take longer and be more error-prone.

There have been a few times I’ve asked for specific data / tables from @tmierzwa, both on GovAlpha’s behalf, or to organise things for others. Every time this happens, @tmierzwa is unfailingly polite, receptive and generally awesome to work with. He asks relevant questions, and always helps to clarify requirements before starting the work.

This is another item that has been useful, and that I believe will need further work in the future. Because the polling system result calculation happens off-chain, any front-end will need to implement their own version of the results algorithm. Without a detailed, public and correct reference implementation, this is not possible.

It’s also worth noting that I initially asked for a specification of the algorithm. @tmierzwa delivered that, and a reference implementation in Python, correctly determining that having a known-good implementation would make future implementations easier (something I should have realized myself).

I’ve been a little more hands-off with this element. @prose11 and @Elihu have been heading it up. So far as I’m aware, @tmierzwa and co have been present at several meetings, and helped to clarify the requirements around the snapshot ‘voting strategies’ (how snapshot counts votes).

This would be great to have in the future, in a fully featured form. It will help reduce the burden on governance and give voters further reassurance that a spell is doing what they expect it to do.

In the future, I’m confident that having governance data available and accessible will allow us to make more targeted and effective changes to the governance system with the aim of improving participation.

Up until this point, we have lacked easy access to participation metrics, meaning that even if we were to take some action to try to improve participation, we would have no idea as to its effectiveness. The Data Insights CU can provide this data, allowing us to judge the effectiveness of interventions in a way that is currently not available to us.

Pursuing these avenues without the Data Insights CU is still possible, but I suspect it would be more expensive, take longer, and ultimately be of lower quality.

Thus far, it has been easier to find someone that can effectively create a front-end or dashboard than it has been to find someone that can source good data. I’m not concerned about the front-end side for GovAlpha either, especially since the Data Insights CU has also delivered functional front end dashboards in several instances up to this point.

TL;DR: Up until now, the Data Insights CU has been fantastic to work with. Having their support available should allow us to take a more evidence-based approach to future changes to governance processes.


Thank you @LongForWisdom for your very kind words :slight_smile:

You mentioned me personally several times in your reply, but it would be unfair to ignore the other members of our team who did most of the work behind the scenes. Especially @piotr.klis, who manages the ETL pipelines and GUIs for the Maker Protocol datasets. Also, every other Token Flow engineer contributes to our products, and we believe that today the team is our greatest value.


Hello @tmierzwa – just a few questions here. Can you please provide color on how you intend to prioritize across the 7 core units that you have already supported? It sounds like you have been doing ad hoc services for all core units, which cost 19,500 DAI a month (hence the requested 97,500 DAI reimbursement to cover costs for June-October 2021), and now the expenses are increasing to 82,750 DAI per month. I was wondering if you can specify what more MakerDAO will get, and how you will avoid stretching your team too thin (overextending yourselves)?

Also, trying to get more clarity on your services: does your CU offer any actual novel products now, or are you writing them / bootstrapping analytics products with the requested budget? Do you sell data feeds, or just analytics products? If so, will you provide MakerDAO with those services as part of this onboarding proposal to become a Maker Core Unit?

Thank you in advance Tomasz!

On behalf of the PECU, I can confirm that we have worked closely and successfully with the proposed Data Insights Team. This has involved Goerli support, which was critical for moving our testnet infrastructure off of Kovan. Similarly, their data analysis has been important for understanding and interpreting price movements during volatile market conditions. Likewise, as liquidity moves from L1 to L2, we have been working closely to use their tooling to decode L2 transactions, make optimizations and track liquidity movements. In summary, the Data Insights Team has always been great to work with in obtaining the information that helps us make data-driven decisions.


Let me express my support for this proposed CU. Proven product and expertise.


Hello @ElProgreso, thank you for the questions. Breaking them out:

can you please provide color on how you intend to prioritize across the 7 core units that you have already supported?

We have good experience working with multiple stakeholders, and we use agile processes to manage our backlog of tasks and prioritize what will be worked on in current and upcoming sprints.

We would involve all Core Units in quarterly planning to prioritize our activities, and also invite the CUs to bi-weekly sprint review meetings where we discuss progress, go through any blockers and potential resource or prioritization conflicts, and also get feedback on increments developed during the sprints.

We also propose that we make our backlog and sprint boards public at least at a high level so that the wider community can see what we are working on, with appropriate labeling to show the CU origin of the request and the agreed priority.

I was wondering if you can specify what more MakerDAO will get, and how you will avoid stretching your team too thin (overextending yourselves)?

Since leaving the Foundation at the end of May, we have been doing the minimum necessary to “keep the lights on”: keeping MCDGov, MCDState and Liquidations live, fixing bugs, and responding to urgent ad-hoc requests such as adding Goerli support to EthTx for Protocol Engineering. In the list in my earlier response to @Planet_X here, these are basically the “done” tasks.

This has taken almost one data engineer full time, some of my time and some additional time from the team as well as running the infrastructure.

After approval, we would be able to expand the team to cover the items in the “Discussed” category in the list above, and the activities described more generally in our MIP39 submission. Our resource and budget estimate is based on a team sufficient to accomplish this, with some headroom to respond to the inevitable urgent requests, bugs and situations that need to be resolved.

In terms of not stretching ourselves too thin, growing the team and the agile and transparency processes described above should help us to manage this.

Does your CU offer any actual novel products now, or are you writing them / bootstrapping analytics products with the requested budget? Do you sell data feeds, or just analytics products? If so, will you provide MakerDAO with those services as part of this onboarding proposal to become a Maker Core Unit?

So far, we have been keeping existing products on life support rather than developing new ones.

Our main focus is on the data itself (and supporting documentation) but we have developed GUIs where we believe there is value for the community, and where there isn’t another team using the data to build a community focused GUI for that dataset. MCDState is a good example where Risk CU uses the data feed via API and builds their own models on top, but we built the MCDState GUI to allow the wider community to also have an easy way to get access to the analytics and visualizations that the data allows. We would expect that this pattern would continue in the future.

With the increased budget, as mentioned above we would expand the current products to the wider list of activities in MIP39 submission and the specific tasks already in discussion with other Core Units.


I second that, and would also add that for L2, having access to off-the-shelf visualizations and analytics around gas consumption and oracle prices will help leverage the potential that L2s have to offer to improve the user experience and the robustness of our price computation methodology.

Could you describe exactly what was discussed and a timeline for transitioning to TF datasets? I’m not sure if I’ve seen any data from this CU that RWF has been able to leverage and I’m concerned that it is notably absent from the roadmap.

Is the data provided to the other core units accessible anywhere to the DAO/public?

Also concerned with the language “Any bespoke, ad-hoc and specific (not shared publicly) data needs of other Core Units, external partners and community members are out of scope of Data Insights Core Unit mandate”

My assumption is that any data needs for our CU are out of scope and thus must be paid for on top of the requested budget? Please confirm/clarify.


As part of our broader effort to bring more transparency to the CU budget structure, we have documented the wallet setup of this CU and others.

Read more about it here: Introducing the CU Budget Transparency Map
