Building a Creator-Month Panel Dataset
N.B. I’m not sure this post will be interesting to anyone besides me (unless you also happen to be doing some very related analysis with Flipside!)
Today, I will be continuing my series of posts about an ongoing web3 academic research project. Building on my last post, I will today be building a creator-level panel dataset. The goal is to create a dataset with the following measure:
- $NewProjects_{it}$: the number of new NFT project created by creator $i$ with a first mint event in month $t$.
Project ownership will be defined as outlined in my previous post. The inclusion criteria for the panel will be as follows1:
- All creators with at least one NFT asset that was sold between
2021-01-01
and2022-01-01
(based on Flipside’sez_nft_sales
table). - All months from
2022-03
through2023-12
.
This straightforward in principle; however, the query logic starts to get a bit long, so let’s build it up in pieces. First, I will build a table called first_mint_events
with one record per project that ever has a mint event according to Flipside; I also join in some additional date info at this stage:
This yields a dataset of 274,197
distinct NFT projects. Next, I will link projects in this table to the addresses of their creators using the method outlined here; this will create a table I will call first_mint_events_w_creators
:
This yields a project-level table (still with 274,197
records), and 161,660
distinct creators2. There are no cases where an NFT contract is not linked to a creator address (which is good because someone had to deploy it). This table is important for me because it links projects with creators.
Next, I need to find the set of creators who meet my historical filtering logic – in particular, I only want to include creators with at least one NFT asset that was sold between 2021-01-01
and 2022-01-01
. For this, I combine Flipside’s ez_nft_sales
with my first_mint_events_w_creators
table, to create a project-level table called sales_history
with only projects (and their creators) that sold at least once during the desired period (I also make a creator-level version):
I find that there are 8697
distinct projects that appear during this transaction period and 6097
distinct creators. I do note that there are 159
NFT projects exclude by my INNER JOIN
here; these are projects for which I did not have an associated “mint” event in the database; I am not sure what explains the presence of projects that are sold at some point but never have a mint event. For now, I exclude them since they are a relatively small portion of projects.
Finally, we are ready to create our creator-month level panel. To do this, I CROSS JOIN
a set of month-level observations with the sales_history_creator_level
table, and then left join on a creator-month aggregation of first_mint_events_w_creators
and finally filter out creators who never have any new project creations during the entire 15-month period. If you want the whole query in all its CTE glory, here it is (I am using this as an excuse to try a collapsible code block):
Expand for the query!
The final table has 1098
distinct creators observed over 22
months for a total of 24,156
creator-month observations. This may not be the cleanest or most efficient way to build the dataset I am interested; however, what is most important to me is getting the logic correct. It would be nice if Flipside allowed analysts to build their own intermediate tables in at least some limited way; that would be easier than working with all these CTEs and would make it easier to confirm that intermediate logic steps are working properly. I have been told that there is some support for using LiveQuery to access the output of one query in another; however, I haven’t explored this (one reference here).
Anyhow, that’s all for me today. In my next post on this theme, I will try adding even more fields to this dataset, and maybe finally get to some analysis.
Footnotes
-
These may seem like arbitrary criteria; however, they are motivated by other aspects of my analysis plan which have not yet been documented on this site! ↩
-
Note, these numbers are slightly larger than my previous post because I am including a slightly longer date range. ↩