From 3446fd78e4682becfe2f0724f00b8c8395b748f6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Attila=20T=C3=B3th?= Date: Fri, 29 Oct 2021 01:45:08 +0200 Subject: [PATCH] Add nft starter kit tutorial (#553) * Add nft starter kit tutorial * Suggestions from review part 1 * Suggestions from review part 2 * Suggestions from review part 3 Co-authored-by: Lana Brindley * Suggestions from review part 4 Co-authored-by: Lana Brindley * Sample dataset note * Quotes fix * Descriptions above tables * Unspacing tables * Move description above table + links * Add links * Add links * Minor fix * Update EON NFT copy part 2 * Fix title * Remove duplacation * Wrap long lines at around 80 chars * Smart apostrophe * Remove HAVING clause * Fix links * Revert "Merge branch 'nft-starter-kit' of https://github.com/timescale/docs into nft-starter-kit" This reverts commit f4614b0983b120346e22394a323ce08c232d2f30, reversing changes made to 26d4918a71b4da4bef7ac8cabc5f306faa5ceb60. * where clause fix * Minor fix * Update starter kit components text * Revert "Revert "Merge branch 'nft-starter-kit' of https://github.com/timescale/docs into nft-starter-kit"" This reverts commit 84f2b3e60546da7be37a78ebb0cb15acebbf3873. * Update ingest script instructions * Fix file name * Minor tweaks * Ingest procedure minor update * fix link * Add link * Remove between in where clause Co-authored-by: Ryan Booz * Fix CSV file names * Faster query Co-authored-by: Ryan Booz Co-authored-by: Lana Brindley Co-authored-by: Jacob Prall Co-authored-by: Ryan Booz --- .../analyzing-nft-transactions.md | 603 ++++++++++++++++++ .../tutorials/analyze-nft-data/index.md | 64 ++ .../analyze-nft-data/nft-schema-ingestion.md | 295 +++++++++ .../tutorials/page-index/page-index.js | 24 + 4 files changed, 986 insertions(+) create mode 100644 timescaledb/tutorials/analyze-nft-data/analyzing-nft-transactions.md create mode 100644 timescaledb/tutorials/analyze-nft-data/index.md create mode 100644 timescaledb/tutorials/analyze-nft-data/nft-schema-ingestion.md diff --git a/timescaledb/tutorials/analyze-nft-data/analyzing-nft-transactions.md b/timescaledb/tutorials/analyze-nft-data/analyzing-nft-transactions.md new file mode 100644 index 000000000000..0fc143d5c461 --- /dev/null +++ b/timescaledb/tutorials/analyze-nft-data/analyzing-nft-transactions.md @@ -0,0 +1,603 @@ +# Analyzing NFT transactions +When you have successfully collected and ingested the data, it's time to analyze +it. For this analysis, we use data collected with our ingestion script that +contains only successful sale transactions that happened between +1 January 2021 to 12 October 2021 on the OpenSea marketplace, as reported by the +OpenSea API. + +For simplicity, this tutorial analyzes only those transactions that used `ETH` +as their payment symbol, but you can modify the script to include more +payment symbols in your analysis if you want to. + +All the queries in this section, plus some additional ones, are in our +[NFT Starter Kit on GitHub][nft-starter-kit] +in the [`queries.sql` file][queries]. + +We divide our analysis into two parts: simple queries and complex queries. But +first we will create something to speed up our queries: TimescaleDB continuous +aggregates. + + +All queries in this section only include data that's accessible from the +OpenSea API. + + + +## Speeding up queries with continuous aggregates + +TimescaleDB continuous aggregates speed up workloads that need to process large +amounts of data. They look like PostgreSQL materialized views, but have a +built-in refresh policy that makes sure that the data is up to date as new +data comes in. Additionally, the refresh procedure is careful to only refresh +data in the materialized view that actually needs to be changed, thereby +avoiding recomputation of data that did not change. This smart refresh procedure +massively improves the refresh performance of the materialized view and the +refresh policy ensures that the data is always up to date. + +[Continuous aggregates][cont-agg] are often used to speed up dashboards and +visualizations, summarizing data sampled at high frequency, and querying +downsampled data over long time periods. + +This tutorial creates two continuous aggregates to speed up queries on assets +and on collections. + +### Assets continuous aggregates +Create a new continuous aggregate called `assets_daily` that computes and stores +the following information about all assets for each day: `asset_id`, the collection +it belongs to, `daily average price`, `median price`, `sale volume`, `ETH volume`, +`open`, `high`, `low` and `close` prices: + +```sql +/* Asset continuous aggregates */ +CREATE MATERIALIZED VIEW assets_daily +WITH (timescaledb.continuous) AS +SELECT time_bucket('1 day', time) AS bucket, +asset_id, +collection_id, +mean(percentile_agg(total_price)) AS mean_price, +approx_percentile(0.5, percentile_agg(total_price)) AS median_price, +COUNT(*) AS volume, +SUM(total_price) AS volume_eth, +FIRST(total_price, time) AS open_price, +MAX(total_price) AS high_price, +MIN(total_price) AS low_price, +LAST(total_price, time) AS close_price +FROM nft_sales +WHERE payment_symbol = 'ETH' +GROUP BY bucket, asset_id, collection_id +``` + +Add a refresh policy to update the continuous aggregate daily with the latest data, +so that you can save +computation at query time: +```sql +SELECT add_continuous_aggregate_policy('assets_daily', + start_offset => INTERVAL '3 days', + end_offset => INTERVAL '1 day', + schedule_interval => INTERVAL '1 day'); +``` + +### Collections continuous aggregates +Create another continuous aggregate called `collections_daily` that computes and +stores the following information about all collections for each day, +including `daily average price`, `median price`, `sale volume`, `ETH volume`, +`the most expensive nft`, and `the highest price`: + +```sql +/* Collection continuous aggregates */ +CREATE MATERIALIZED VIEW collections_daily +WITH (timescaledb.continuous) AS +SELECT +collection_id, +time_bucket('1 day', time) AS bucket, +mean(percentile_agg(total_price)) AS mean_price, +approx_percentile(0.5, percentile_agg(total_price)) AS median_price, +COUNT(*) AS volume, +SUM(total_price) AS volume_eth, +LAST(asset_id, total_price) AS most_expensive_nft_id, +MAX(total_price) AS max_price +FROM nft_sales +GROUP BY bucket, collection_id; + +/* Refresh policy */ +SELECT add_continuous_aggregate_policy('collections_daily', + start_offset => INTERVAL '3 days', + end_offset => INTERVAL '1 day', + schedule_interval => INTERVAL '1 day'); +``` + +When you are asking questions where daily aggregations can help with the answer, +you can query the continuous aggregate, rather than the raw data in the `nft_sales` +hypertable. This helps speed up the result. + +## Simple queries +You can start your analysis by asking simple questions about NFT sales that +happened in 2021 and answering them using SQL queries. Use these queries +as a starting point for your own further analysis. You can modify each query +to analyze the time-period, asset, collection, or account that you are curious about! + +Where possible, we include dashboard examples from Superset to serve as +inspiration for creating your own dashboard which monitors and analyzes NFT +sales using free, open-source tools. You can find the code used to create each +graph in the [NFT Starter Kit Github repo][nft-starter-kit]. + +### Collections with the highest sales volume +Which collections have the highest volume of sales? Answering this is a great +starting point for finding collections with assets that are frequently traded, +which is important for buyers thinking about the resale value of their NFTs. If +you buy an NFT in one of the collections below, there is a good chance you'll +be able to find a buyer. In this query, you order the collections by total volume +of sales, but you could also order them by ETH volume instead: +```sql +/* Collections with the highest volume? */ +SELECT +slug, +SUM(volume) total_volume, +SUM(volume_eth) total_volume_eth +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +GROUP BY cagg.collection_id, slug +ORDER BY total_volume DESC; +``` +| slug | total_volume | total_volume_eth | +|--------------------|--------------|--------------------| +| sorare | 339776 | 35113.062124036835 | +| rarible | 87594 | 41663.18012651946 | +| art-blocks-factory | 45861 | 43607.73207320631 | +| axie | 43074 | 6692.242340266918 | +| cryptokitties | 41300 | 5560.907800845506 | +| parallelalpha | 36892 | 31212.686399159273 | +| art-blocks | 35976 | 199016.27793424827 | +| ape-gang | 25782 | 4663.009300672081 | +| 24px | 24872 | 3203.9084810874024 | +| pudgypenguins | 24165 | 35949.81731415086 | + +For this query, you take advantage of the pre-calculated data about collections +stored in the `collections_daily` continuous aggregate. You also perform an +`INNER JOIN` on the collections relational table to find the +collection name in human readable form, represented by the `slug`. + +Querying from continuous aggregates is faster and allows you to write shorter, +more readable queries. It is a pattern that you'll use again in this tutorial, +so look out for it! + +### Daily sales of a collection +How many sales took place each day for a certain collection? This query looks +at the daily volume of sales for NFTs in the `cryptokitties` collection. This +can help you find which days the NFT traders have been more active, and help you +spot patterns about which days of the week or month have higher or lower volume and why. + +You can modify this query to look at your favorite NFT collection, such as +`cryptopunks`, `lazy-lions`, or `afrodroids-by-owo`: +```sql +SELECT bucket, slug, volume +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +WHERE slug = 'cryptokitties' +ORDER BY bucket DESC; +``` + +bucket |slug |volume| +-------------------|-------------|------| +2021-10-12 02:00:00|cryptokitties| 48| +2021-10-11 02:00:00|cryptokitties| 61| +2021-10-10 02:00:00|cryptokitties| 84| +2021-10-09 02:00:00|cryptokitties| 73| +2021-10-08 02:00:00|cryptokitties| 56| +... + +Here's what this query would look like as a time-series chart in Apache Superset: + +![daily number of nft transactions](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/daily-number-of-nft-transactions.jpg) + +As a reminder, charts like this are pre-built and ready for you to use and +modify as part of the pre-built dashboards +in our [NFT Starter Kit][nft-starter-kit]. + +### Comparison of daily NFT sales for different collections +How do the daily sales of NFTs in one collection compare to that of another +collection? This query compares the daily sales of two popular NFT collections: +CryptoKitties and Ape Gang, in the past three months: +```sql +/* Daily number of NFT transactions, "CryptoKitties" vs Ape Gang from past 3 months? */ +SELECT bucket, slug, volume +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +WHERE slug IN ('cryptokitties', 'ape-gang') AND bucket > NOW() - INTERVAL '3 month' +ORDER BY bucket DESC, slug; +``` + +bucket |slug |volume| +-------------------|-------------|------| +2021-10-12 02:00:00|ape-gang | 58| +2021-10-12 02:00:00|cryptokitties| 48| +2021-10-11 02:00:00|ape-gang | 208| +2021-10-11 02:00:00|cryptokitties| 61| +2021-10-10 02:00:00|ape-gang | 248| +2021-10-10 02:00:00|cryptokitties| 84| +... + +![comparison of different collections](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/comparison-of-different-collections.jpg) + +This sort of query is useful to track sales activity in collections you're +interested in or own assets in, so you can see the activity of other NFT holders. +Also, you can modify the time-period under consideration to look at larger +(such as 9 months), or smaller (such as 14 days) periods of time. + + +### Snoop Dogg's NFT activity (or individual account activity) +How many NFTs did a particular person buy in a certain period of time? This +sort of query is useful to monitor the activity of popular NFT collectors, +like American rapper Snoop Dogg (or [Cozomo_de_Medici][snoop-dogg-opensea]) or +African NFT evangelist [Daliso Ngoma][daliso-opensea] or even compare trading +patterns of multiple collectors. Since NFT transactions are public on the Ethereum +blockchain and our database contains seller (`seller_account`) and +buyer (`winner_account`) columns as well, you can analyze the purchase +activity of a specific account. + +This query analyzes [Snoop Dogg’s](https://twitter.com/cozomomedici) address to +analyze his trades, but you can edit the query to add any address in the `WHERE` +clause to see the specified account's transactions: +```sql +/* Snoop Dogg's transactions in the past 3 months aggregated */ +WITH snoop_dogg AS ( + SELECT id FROM accounts + WHERE address = '0xce90a7949bb78892f159f428d0dc23a8e3584d75' +) +SELECT +COUNT(*) AS trade_count, +COUNT(DISTINCT asset_id) AS nft_count, +COUNT(DISTINCT collection_id) AS collection_count, +COUNT(*) FILTER (WHERE seller_account = (SELECT id FROM snoop_dogg)) AS sale_count, +COUNT(*) FILTER (WHERE winner_account = (SELECT id FROM snoop_dogg)) AS buy_count, +SUM(total_price) AS total_volume_eth, +AVG(total_price) AS avg_price, +MIN(total_price) AS min_price, +MAX(total_price) AS max_price +FROM nft_sales +WHERE payment_symbol = 'ETH' AND ( seller_account = (SELECT id FROM snoop_dogg) OR winner_account = (SELECT id FROM snoop_dogg) ) +AND time > NOW()-INTERVAL '3 months' +``` + +trade_count|nft_count|collection_count|sale_count|buy_count|total_volume_eth |avg_price |min_price|max_price| +-----------|---------|----------------|----------|---------|------------------|------------------|---------|---------| + 59| 57| 20| 1| 58|1835.5040000000006|31.110237288135604| 0.0| 1300.0| + +From the result of the query, we can see that Snoop Dogg made 59 trades overall in the past 3 months (bought 58 times, +and sold only once). His trades included 57 individual NFTs and 23 collections, totaling 1835.504 ETH spent, with +minimum paid price of 0 and max of 1300 ETH. + +### Most expensive asset in a collection +Whats the most expensive NFT in a certain collection? This query looks at a +specific collection (CryptoKitties) and finds the most expensive NFT sold from it. +This can help you find the rarest items in a collection and look at the properties +that make it rare in order to help you buy items with similar properties from that collection: +```sql +/* Top 5 most expensive NFTs in the CryptoKitties collection */ +SELECT a.name AS nft, total_price, time, a.url FROM nft_sales s +INNER JOIN collections c ON c.id = s.collection_id +INNER JOIN assets a ON a.id = s.asset_id +WHERE slug = 'cryptokitties' AND payment_symbol = 'ETH' +ORDER BY total_price DESC +LIMIT 5 +``` + +nft |total_price|time |url | +---------------|-----------|-------------------|-----------------------------------------------------------------------| +Founder Cat #40| 225.0|2021-09-03 14:59:16|https://opensea.io/assets/0x06012c8cf97bead5deae237070f9587f8e7a266d/40| +Founder Cat #17| 177.0|2021-09-03 01:58:13|https://opensea.io/assets/0x06012c8cf97bead5deae237070f9587f8e7a266d/17| +润龙🐱‍👓创世猫王44# | 150.0|2021-09-03 02:01:11|https://opensea.io/assets/0x06012c8cf97bead5deae237070f9587f8e7a266d/44| +grey | 149.0|2021-09-03 02:32:26|https://opensea.io/assets/0x06012c8cf97bead5deae237070f9587f8e7a266d/16| +Founder Cat #38| 148.0|2021-09-03 01:58:13|https://opensea.io/assets/0x06012c8cf97bead5deae237070f9587f8e7a266d/38| + +### Daily ETH volume of assets in a collection +What is the daily volume of Ether (ETH) for a specific collection? Using the +example of CryptoKitties, this query calculates the daily total ETH spent in +sales of NFTs in a certain collection: + +```sql +/* Daily ETH volume of CryptoKitties NFT transactions? */ +SELECT bucket, slug, volume_eth +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +WHERE slug = 'cryptokitties' +ORDER BY bucket DESC; +``` +bucket |slug |volume_eth | +-------------------|-------------|-------------------| +2021-10-12 02:00:00|cryptokitties| 1.6212453906698892| +2021-10-11 02:00:00|cryptokitties| 1.8087566697786246| +2021-10-10 02:00:00|cryptokitties| 2.839395250444516| +2021-10-09 02:00:00|cryptokitties| 4.585460691370447| +2021-10-08 02:00:00|cryptokitties| 5.36784615406771| +2021-10-07 02:00:00|cryptokitties| 16.591879406085422| +2021-10-06 02:00:00|cryptokitties| 11.390538587035808| +... + +![daily eth volume of assets](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/daily-eth-volume-of-assets.jpg) + + +This graph uses a logarithmic scale, which you can configure in the graph's settings in Superset. + + +### Comparison of daily ETH volume of multiple collections +How does the daily volume of ETH spent on assets in one collection compare to +others? This query uses CryptoKitties and Ape Gang as examples, to find the daily +ETH spent on buying assets in those collections in the past three months. You +can extend this query to monitor and compare the daily volume spent on your +favorite NFT collections and find patterns in sales: + +```sql +/* Daily ETH volume of NFT transactions: CryptoKitties vs Ape Gang? */ +SELECT bucket, slug, volume_eth +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +WHERE slug IN ('cryptokitties', 'ape-gang') AND bucket > NOW() - INTERVAL '3 month' +ORDER BY bucket, slug DESC; +``` + +bucket |slug |volume_eth | +-------------------|-------------|------------------| +2021-10-12 02:00:00|ape-gang | 54.31030000000001| +2021-10-12 02:00:00|cryptokitties|1.6212453906698896| +2021-10-11 02:00:00|ape-gang |205.19786218340954| +2021-10-11 02:00:00|cryptokitties|1.8087566697786257| +2021-10-10 02:00:00|ape-gang | 240.0944201232798| +2021-10-10 02:00:00|cryptokitties| 2.839395250444517| +... + +![comparison-daily-eth-volume-collections](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/comparison-daily-eth-volume-collections.jpg) + + +The graph above uses a logarithmic scale, which we configured in the graph's +settings in Superset. + + +### Daily mean and median sale price of assets in a collection +When you are analyzing the daily price of assets in a specific collection, two +useful statistics to use are the mean price and the median price. This query +finds the daily mean and median sale prices of assets in the CryptoKitties collection: +```sql +/* Mean vs median sale price of CryptoKitties? */ +SELECT bucket, slug, mean_price, median_price +FROM collections_daily cagg +INNER JOIN collections c ON cagg.collection_id = c.id +WHERE slug = 'cryptokitties' +ORDER BY bucket DESC; +``` + +bucket |slug |mean_price |median_price | +-------------------|-------------|--------------------|---------------------| +2021-10-12 02:00:00|cryptokitties| 0.03377594563895602| 0.00600596459124994| +2021-10-11 02:00:00|cryptokitties|0.029651748684895486| 0.008995758681494385| +2021-10-10 02:00:00|cryptokitties| 0.03380232441005376| 0.00600596459124994| +2021-10-09 02:00:00|cryptokitties| 0.06281453001877325| 0.010001681651251936| +2021-10-08 02:00:00|cryptokitties| 0.09585439560835196| 0.010001681651251936| +... + +![daily mean median](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/daily-mean-median.jpg) + +Since calculating the mean and median are computationally expensive for large +datasets, we use the [`percentile_agg` hyperfunction][percentile-agg], a SQL +function that is part of the Timescale Toolkit extension. It accurately +approximates both statistics, as shown in the definition of `mean_price` and +`median_price` in the continuous aggregate we created earlier in the tutorial: + +```sql +CREATE MATERIALIZED VIEW collections_daily +WITH (timescaledb.continuous) AS +SELECT +collection_id, +time_bucket('1 day', time) AS bucket, +mean(percentile_agg(total_price)) AS mean_price, +approx_percentile(0.5, percentile_agg(total_price)) AS median_price, +COUNT(*) AS volume, +SUM(total_price) AS volume_eth, +LAST(asset_id, total_price) AS most_expensive_nft, +MAX(total_price) AS max_price +FROM nft_sales s +GROUP BY bucket, collection_id; +``` + +### Daily total volume of top buyers +What days do the most prolific accounts buy on? To answer that question, you +can analyze the top five NFT buyer accounts based on the number of NFT purchases, +and their total daily volume of NFT bought over time. This is a good starting +point to dig deeper into the analysis, as it can help you find days when something +happened that made these users buy a lot of NFTs. For example a dip in ETH prices, +leading to lower gas fees, or drops of high anticipated collections: +```sql +/* Daily total volume of the 5 top buyers */ +WITH top_five_buyers AS ( + SELECT winner_account FROM nft_sales + GROUP BY winner_account + ORDER BY count(*) DESC + LIMIT 5 +) +SELECT time_bucket('1 day', time) AS bucket, count(*) AS total_volume FROM nft_sales +WHERE winner_account IN (SELECT winner_account FROM top_five_buyers) +GROUP BY bucket +ORDER BY bucket DESC +``` + +![volume top buyers](https://assets.timescale.com/docs/images/tutorials/nft-tutorial/volume-top-buyers.jpg) + +## Complex queries +Let's take a look at some more complex questions you can ask about the NFT +dataset, as well as more complex queries to +retrieve interesting things. + +### Calculating 30-min mean and median sale prices of highest trade count NFT from yesterday +What are the mean and median sales prices of the highest traded NFT from the +past day, in 30-minute intervals? + +```sql +/* Calculating 15-min mean and median sale prices of highest trade count NFT on 2021-10-17 */ +WITH one_day AS ( + SELECT time, asset_id, total_price FROM nft_sales + WHERE time >= '2021-10-17' AND time < '2021-10-18' AND payment_symbol = 'ETH' +) +SELECT time_bucket('30 min', time) AS bucket, +assets.name AS nft, +mean(percentile_agg(total_price)) AS mean_price, +approx_percentile(0.5, percentile_agg(total_price)) AS median_price +FROM one_day +INNER JOIN assets ON assets.id = one_day.asset_id +WHERE asset_id = (SELECT asset_id FROM one_day GROUP BY asset_id ORDER BY count(*) DESC LIMIT 1) +GROUP BY bucket, nft +ORDER BY bucket DESC; +``` + +bucket |nft |mean_price |median_price | +-------------------|--------------|-------------------|--------------------| +2021-10-17 23:30:00|Zero [Genesis]| 0.06| 0.06002456177152414| +2021-10-17 23:00:00|Zero [Genesis]| 0.118| 0.1180081944620535| +2021-10-17 22:30:00|Zero [Genesis]| 0.0785333333| 0.06002456177152414| +2021-10-17 22:00:00|Zero [Genesis]| 0.0775| 0.09995839119153871| +2021-10-17 21:30:00|Zero [Genesis]| 0.0555| 0.05801803032917102| + +This is a more complex query which uses PostgreSQL Common Table Expressions (CTE) +to first create a sub-table of the data from the past day, called `one_day`. +Then you use the hyperfunction time_bucket to create 30-minute buckets of our data +and use the [percentile_agg hyperfunction][percentile-agg] to find the mean and +median prices for each interval period. Finally, you JOIN on the `assets` table +to get the name of the specific NFT in order to return it along with the mean and +median price for each time interval. + +### Daily OHLCV data per asset + +Open-high-low-close-volume (OHLCV) charts are most often used to illustrate the +price of a financial instrument, most commonly stocks, over time. You can create +OHLCV charts for a single NFT, or get the OHLCV values for a set of NFTs. + +This query finds the OHLCV for NFTs with more than 100 sales in a day, as well +as the day on which the trades occurred: + +```sql +/* Daily OHLCV per asset */ +SELECT time_bucket('1 day', time) AS bucket, asset_id, +FIRST(total_price, time) AS open_price, LAST(total_price, time) AS close_price, +MIN(total_price) AS low_price, MAX(total_price) AS high_price, +count(*) AS volume +FROM nft_sales +WHERE payment_symbol = 'ETH' +GROUP BY bucket, asset_id +HAVING count(*) > 100 +ORDER BY bucket +LIMIT 5; +``` + +bucket |asset_id|open_price|close_price|low_price |high_price|volume| +-------------------|--------|----------|-----------|-----------|----------|------| +2021-02-03 01:00:00|17790698| 0.56| 1.25| 0.07| 7.0| 148| +2021-02-05 01:00:00|17822636| 7.0| 0.7| 0.7| 8.4| 132| +2021-02-11 01:00:00|17927258| 0.8| 0.2| 0.1| 2.0| 103| +2021-02-26 01:00:00|18198072| 0.1| 0.1| 0.1| 0.1| 154| +2021-02-26 01:00:00|18198081| 0.25| 0.25| 0.25| 0.25| 155| + +In this query, you used the TimescaleDB hyperfunctions [`first()`][first-docs] and +[`last()`][last-docs] to find the open and close prices respectively. These +hyperfunctions allow you to find the value of one column as ordered by another, +by performing a sequential scan through their groups. In this case, you get the +first and last values of the `total_price` column, as ordered by +the `time` column. [See the docs for more information.][first-docs] + +If you want to run this query regularly, you can create a continuous aggregate +for it, which greatly improves the query performance. Moreover, you can remove +the `LIMIT 5` and replace it with an additional WHERE clause filtering for a +specific time-period to make the query more useful. + +### Assets with the biggest intraday price change +Which assets had the biggest intraday sale price change? You can identify +interesting behaviour such as an asset being bought and then sold again for a +much higher (or lower) amount within the same day. This can help you +identify good flips of NFTs, or perhaps owners whose brand elevated the +NFT price thanks to it being part of their collection. + +This query finds the assets with the biggest intraday sale price change in the +last six months: +```sql +/* Daily assets sorted by biggest intraday price change in the last 6 month*/ +WITH top_assets AS ( + SELECT time_bucket('1 day', time) AS bucket, asset_id, + FIRST(total_price, time) AS open_price, LAST(total_price, time) AS close_price, + MAX(total_price)-MIN(total_price) AS intraday_max_change + FROM nft_sales s + WHERE payment_symbol = 'ETH' AND time > NOW() - INTERVAL '6 month' + GROUP BY bucket, asset_id + ORDER BY intraday_max_change DESC + LIMIT 5 +) +SELECT bucket, nft, url, + open_price, close_price, + intraday_max_change +FROM top_assets ta +INNER JOIN LATERAL ( + SELECT name AS nft, url FROM assets a + WHERE a.id = ta.asset_id +) assets ON TRUE;``` +``` + +bucket |nft |url |open_price|close_price|intraday_max_change| +-------------------|--------------|------------------------------------------------------------------------------|----------|-----------|-------------------| +2021-09-22 02:00:00|Page |https://opensea.io/assets/0xa7206d878c5c3871826dfdb42191c49b1d11f466/1 | 0.72| 0.9999| 239.2889| +2021-09-23 02:00:00|Page |https://opensea.io/assets/0xa7206d878c5c3871826dfdb42191c49b1d11f466/1 | 0.9999| 1.14| 100.0| +2021-09-27 02:00:00|Skulptuur #647|https://opensea.io/assets/0xa7d8d9ef8d8ce8992df33d8b8cf4aebabd5bd270/173000647| 25.0| 90.0| 65.0| +2021-09-25 02:00:00|Page |https://opensea.io/assets/0xa7206d878c5c3871826dfdb42191c49b1d11f466/1 | 1.41| 1.475| 61.3| +2021-09-26 02:00:00|Page |https://opensea.io/assets/0xa7206d878c5c3871826dfdb42191c49b1d11f466/1 | 1.48| 4.341| 43.05| + +## Resources and next steps +This section contains information about what to do when you've completed the +tutorial, and some links to more resources. + +### Claim your limited edition Time Travel Tigers NFT +The first 20 people to complete this tutorial can earn a limited edition NFT +from the +[Time Travel Tigers collection][eon-collection], for free! + +Now that you’ve completed the tutorial, all you need to do is answer the questions +in [this form][nft-form] (including the challenge question), and we’ll send one +of the limited-edition Eon NFTs to your ETH address (at no cost to you!). + +You can see all NFTs in the Time Travel Tigers collection live on [OpenSea][eon-collection]. + +### Build on the NFT Starter Kit +Congratulations! You’re now up and running with NFT data and TimescaleDB. Check out +our [NFT Starter Kit][nft-starter-kit] to use as your starting point to +build your own, more complex NFT analysis projects. + +The Starter Kit contains: +* A data ingestion script, which collects real-time data from OpenSea and ingests it into TimescaleDB +* A sample dataset, to get started quickly, if you don't want to ingest real-time data +* A schema for storing NFT sales, assets, collections, and owners +* A local TimescaleDB database, pre-loaded with sample NFT data +* Pre-built dashboards and charts in [Apache Superset][superset] and [Grafana][grafana] +for visualizing your data analysis +* Queries to use as a starting point for your own analysis + +### Learn more about how to use TimescaleDB to store and analyze crypto data +Check out these resources for more about using TimescaleDB with crypto data: +* [Analyze cryptocurrency market data][analyze-cryptocurrency] +* [Analyzing Analyzing Bitcoin, Ethereum, and 4100+ other cryptocurrencies using PostgreSQL and TimescaleDB][analyze-bitcoin] +* [Learn how TimescaleDB user Messari uses data to open the crypto economy to everyone][messari] +* [How one TimescaleDB user built a successful crypto trading bot][trading-bot] + + +[cont-agg]: /how-to-guides/continuous-aggregates +[nft-starter-kit]: https://github.com/timescale/nft-starter-kit +[queries]: https://github.com/timescale/nft-starter-kit/blob/master/queries.sql +[snoop-dogg-opensea]: https://opensea.io/Cozomo_de_Medici +[daliso-opensea]: https://opensea.io/daliso +[superset]: https://superset.apache.org +[grafana]: https://grafana.com + +[percentile-agg]: /api/:currentVersion:/hyperfunctions/percentile-approximation/percentile_agg/ +[first-docs]: /api/:currentVersion:/hyperfunctions/first/ +[last-docs]: /api/:currentVersion:/hyperfunctions/last + +[analyze-cryptocurrency]: /tutorials/analyze-cryptocurrency-data/ +[analyze-bitcoin]: https://blog.timescale.com/blog/analyzing-bitcoin-ethereum-and-4100-other-cryptocurrencies-using-postgresql-and-timescaledb/ +[messari]: https://blog.timescale.com/blog/how-messari-uses-data-to-open-the-cryptoeconomy-to-everyone/ +[trading-bot]: https://blog.timescale.com/blog/how-i-power-a-successful-crypto-trading-bot-with-timescaledb/ +[eon-collection]: https://opensea.io/collection/time-travel-tigers-by-timescale +[nft-form]: https://docs.google.com/forms/d/e/1FAIpQLSdZMzES-vK8K_pJl1n7HWWe5-v6D9A03QV6rys18woGTZr0Yw/viewform?usp=sf_link \ No newline at end of file diff --git a/timescaledb/tutorials/analyze-nft-data/index.md b/timescaledb/tutorials/analyze-nft-data/index.md new file mode 100644 index 000000000000..0565c1818bb7 --- /dev/null +++ b/timescaledb/tutorials/analyze-nft-data/index.md @@ -0,0 +1,64 @@ +# Analyze Non-fungible token (NFT) sales data + +This tutorial is a step-by-step guide to collecting, storing, and analyzing NFT +([non-fungible tokens][nft-wiki]) sales data from the largest NFT marketplace, +[OpenSea][opensea]. + +NFTs, like much of the data related to blockchains and cryptocurrencies, can +seem complicated at first, but in this tutorial we take you from zero to NFT hero +and give you a foundation for analyzing NFT trends. + +This tutorial shows you how to: +* Design a schema for NFT transactions +* Ingest time-series NFT data and additional relevant relational data +* Query the dataset using PostgreSQL and TimescaleDB to unlock insights from the data + +## NFT Starter Kit + +This tutorial is part of the [Timescale NFT Starter Kit][starter-kit], designed +to get you up and running with analyzing NFT data, and give you the inspiration +to build your own, more complex projects. +The NFT Starter Kit contains: +* Data ingestion script, which collects historical data from OpenSea and ingests +it into TimescaleDB +* Sample dataset, to get started quickly, if you don't want to wait too much time ingesting data +* Schema for storing NFT sales, assets, collections, and accounts +* Local TimescaleDB database, pre-loaded with sample NFT data +* Pre-built dashboards and charts in [Apache Superset][superset] and [Grafana][grafana] +for visualizing your data analysis +* Queries to use as a starting point for your own analysis + +To get started, clone the NFT Starter Kit [Github repo][starter-kit] and follow +along with this tutorial. + +## Complete this tutorial. Earn an NFT! +Because we love NFTs as much as you do, we created [Time Travel Tigers][eon-collection], +a limited edition set of 20 +NFT about our Timescale mascot, Eon! The first 20 people to complete this tutorial +can earn a limited edition NFT from +the collection, for free! + +Claiming your NFT is simple. All you need to do is complete the tutorial below, +answer the questions in [this form][nft-form], and we’ll send one of the +limited-edition Eon NFTs to your ETH address (at no cost to you!). + +You can see all NFTs in the Time Travel Tigers collection live on [OpenSea][eon-collection]. + +1. [NFT schema design and ingestion][nft-schema] +1. [Analyzing NFT transactions][nft-analyze] + +## Prerequisites +* TimescaleDB ([installation options][install-ts]) +* Psql or any other PostgreSQL client (for example DBeaver, or PgAdmin) + + +[nft-wiki]: https://en.wikipedia.org/wiki/Non-fungible_token +[opensea]: https://opensea.io +[starter-kit]: https://github.com/timescale/nft-starter-kit +[install-ts]: /how-to-guides/install-timescaledb/ +[nft-schema]: /tutorials/analyze-nft-data/nft-schema-ingestion +[nft-analyze]: /tutorials/analyze-nft-data/analyzing-nft-transactions +[superset]: https://superset.apache.org +[grafana]: https://grafana.com +[eon-collection]: https://opensea.io/collection/time-travel-tigers-by-timescale +[nft-form]: https://docs.google.com/forms/d/e/1FAIpQLSdZMzES-vK8K_pJl1n7HWWe5-v6D9A03QV6rys18woGTZr0Yw/viewform?usp=sf_link \ No newline at end of file diff --git a/timescaledb/tutorials/analyze-nft-data/nft-schema-ingestion.md b/timescaledb/tutorials/analyze-nft-data/nft-schema-ingestion.md new file mode 100644 index 000000000000..a42a1b2c6857 --- /dev/null +++ b/timescaledb/tutorials/analyze-nft-data/nft-schema-ingestion.md @@ -0,0 +1,295 @@ +# NFT schema design and ingestion +A properly designed database schema is essential to efficiently store and +analyze data. This tutorial uses NFT time-series data with multiple supporting +relational tables. + +To help you get familiar with NFT data, here are some of the questions that +could be answered with this dataset: +* Which collections have the highest trading volume? +* What’s the number of daily transactions of a given collection or asset? +* Which collections have the most trading volume in Ether (ETH)? +* Which account made the most NFT trades? +* How are the mean and median sale prices correlated? + +One theme across all these questions is that most of the insights are about the +sale itself, or the aggregation of sales. So you need to create a schema which +focuses on the time-series aspect of the data. It's also important to make sure +that you can JOIN supporting tables, so you can more easily make queries that +touch both the time-series and the relational tables. TimescaleDB's PostgreSQL +foundation and full-SQL support allows you to easily combine time-series and +relational tables during your analysis. + +## Tables and field descriptions +You need these tables: + +TimescaleDB hypertable: +* **nft_sales**: successful NFT transactions + +Relational tables (regular PostgreSQL tables): +* **assets**: unique NFT items +* **collections**: NFT collections +* **accounts**: NFT trading accounts/users + +### The nft_sales table +The `nft_sales` table contains information about successful sale transactions +in time-series form. One row represents one successful sale event on the +OpenSea platform. + +* `id` field is a unique field provided by the OpenSea API. +* `total_price` field is the price paid for the NFTs in ETH (or other +cryptocurrency payment symbol available on OpenSea). +* `quantity` field indicates how many NFTs were sold in the transaction +(can be more than 1). +* `auction_type` field is NULL by default, unless the transaction happened +as part of an auction. +* `asset_id` and `collection_id` fields can be used to JOIN the supporting +relational tables. + +| Data field | Description | +|---|---| +| id | OpenSea ID (unique) | +| time | Time of the sale | +| asset_id | ID of the NFT, FK: assets(id) | +| collection_id | ID of the collection this NFT belongs to, FK: collections(id)) | +| auction_type | Auction type ('dutch', 'english', 'min_price') | +| contract_address | Address of the smart contract | +| quantity | NFT quantity sold | +| payment_symbol | Payment symbol (usually ETH, depends on the blockchain where the NFT is minted) | +| total_price | Total price paid for the NFT | +| seller_account | Seller's account, FK: accounts(id) | +| from_account | Account used to transfer from, FK: accounts(id) | +| to_account | Account used to transfer to, FK: accounts(id) | +| winner_account | Buyer's account, FK: accounts(id) | + +### The assets table +The `assets` table contains information about the assets (NFTs) that are in the +transactions. One row represents a unique NFT asset on the OpenSea platform. + +* `name` field is the name of the NFT, and is not unique. +* `id` field is the primary key, provided by the OpenSea API. +* One asset can be referenced from multiple transactions (traded multiple times). + +| Data field | Description | +|---|---| +| id | OpenSea ID (PK) | +| name | Name of the NFT | +| description | Description of the NFT | +| contract_date | Creation date of the smart contract | +| url | OpenSea URL of the NFT | +| owner_id | ID of the NFT owner account, FK: accounts(id) | +| details | Other extra data fields (JSONB) | + +### The collections table +The `collections` table holds information about the NFT collections. One row +represents a unique NFT collection. +One collection includes multiple unique NFTs (that are in the `assets` table). + +* `slug` field is a unique identifier of the collection. + +| Data field | Description | +|---|---| +| id | Auto-increment (PK) | +| slug | Slug of the collection (unique) | +| name | Name of the collection | +| url | OpenSea url of the collection | +| details | Other extra data fields (JSONB) | + + +### The accounts table +The `accounts` table includes the accounts that have participated in at least +one transaction from the nft_sales table. +One row represents one unique account on the OpenSea platform. + +* `address` is never NULL and it’s unique +* `user_name` is NULL unless it’s been submitted on the OpenSea profile by the user + +| Data field | Description | +|---|---| +| id | Auto-increment (PK) | +| user_name | OpenSea user name | +| address | Account address, unique | +| details | Other extra data fields (JSONB) | + +## Database schema +The data types used in the schema for this tutorial have been determined based +on our research and hands-on experience working with the OpenSea API and the +data pulled from OpenSea. Start by running these SQL commands to create the schema. +Alternatively, you can download and run the `schema.sql` +file from our [NFT Starter Kit GitHub repository][nft-schema]. +```sql +CREATE TABLE collections ( + id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY, + slug TEXT UNIQUE, + name TEXT, + url TEXT, + details JSONB +); + +CREATE TABLE accounts ( + id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY, + user_name TEXT, + address TEXT UNIQUE NOT NULL, + details JSONB +); + +CREATE TABLE assets ( + id BIGINT PRIMARY KEY, + name TEXT, + collection_id BIGINT REFERENCES collections (id), -- collection + description TEXT, + contract_date TIMESTAMP WITH TIME ZONE, + url TEXT UNIQUE, + img_url TEXT, + owner_id BIGINT REFERENCES accounts (id), -- account + details JSONB +); + +CREATE TYPE auction AS ENUM ('dutch', 'english', 'min_price'); +CREATE TABLE nft_sales ( + id BIGINT, + "time" TIMESTAMP WITH TIME ZONE, + asset_id BIGINT REFERENCES assets (id), -- asset + collection_id BIGINT REFERENCES collections (id), -- collection + auction_type auction, + contract_address TEXT, + quantity NUMERIC, + payment_symbol TEXT, + total_price DOUBLE PRECISION, + seller_account BIGINT REFERENCES accounts (id), -- account + from_account BIGINT REFERENCES accounts (id), -- account + to_account BIGINT REFERENCES accounts (id), -- account + winner_account BIGINT REFERENCES accounts (id), -- account + CONSTRAINT id_time_unique UNIQUE (id, time) +); + +SELECT create_hypertable('nft_sales', 'time'); + +CREATE INDEX idx_asset_id ON nft_sales (asset_id); +CREATE INDEX idx_collection_id ON nft_sales (collection_id); +CREATE INDEX idx_payment_symbol ON nft_sales (payment_symbol); +``` + +### Schema design +The `id` field in each table is `BIGINT` because its storage size is 8 bytes in +PostgreSQL (as opposed to `INT`’s 4 bytes) which is needed to make sure this +value doesn’t overflow. + +For the `quantity` field we suggest using numeric or decimal (which works the +same way in PostgreSQL) as the data type, because in some edge cases we +experience transactions where the quantity was too big even for BIGINT. + +`total_price` needs to be `double precision` because NFT prices often include +many decimals, especially in the case of Ether (ETH) and similar cryptocurrencies +which are, functionally, infinitely divisible. + +We created an `ENUM` for `auction_type` as this value can only be 'dutch', +'english', or 'min_price', representing the different types of auctions used +to sell an NFT. + +We decided to not store all the data fields that are available from the +OpenSea API, only those that we deem interesting or useful for future analysis. +But we still wanted to keep all of the unused data fields somewhere close, +so we added a `details` JSONB column to each relational table. This column +contains additional information about the record. For example, it includes a +`background_color` as a field for the assets. + +Note: In our sample dataset, we chose not to include the JSONB data to keep the +size of the dataset easily managable. If you want a dataset with the full JSON +data included, you need to fetch the data directly from the OpenSea API +(see below for steps). + +## Ingest NFT data +When you have your database and schema created, you can ingest some data to play +with! You have two options to ingest NFT data for this tutorial: + +* Fetch data directly from the OpenSea API +* Download sample data and import it + + +### Fetch data directly from the OpenSea API +To ingest data from the OpenSea API, you can use the `opensea_ingest.py` script included +in the starter kit repository on GitHub. The script connects to the OpenSea +API `/events` endpoint, and fetches data from the specified time period (no API +key required!). + + + +### Fetching data directly from the OpenSea API + +1. Clone the nft-starter-kit repository on Github: + ```bash + git clone https://github.com/timescale/nft-starter-kit.git + cd nft-starter-kit + ``` +1. Create a new Python virtual environment and install the requirements: + ```bash + virtualenv env && source env/bin/activate + pip install -r requirements.txt + ``` +1. Replace the parameters in the `config.py` file: + ```python + DB_NAME="tsdb" + HOST="YOUR_HOST_URL" + USER="tsdbadmin" + PASS="YOUR_PASSWORD_HERE" + PORT="PORT_NUMBER" + OPENSEA_START_DATE="2021-10-01T00:00:00" # example start date (UTC) + OPENSEA_END_DATE="2021-10-06T23:59:59" # example end date (UTC) + ``` +1. Run the Python script: + ```python + python opensea_ingest.py + ``` + This will start ingesting data in batches, 300 rows at a time: + ```bash + Start ingesting data between 2021-10-01 00:00:00+00:00 and 2021-10-06 23:59:59+00:00 + --- + Fetching transactions from OpenSea... + Data loaded into temp table! + Data ingested! + Data has been backfilled until this time: 2021-10-06 23:51:31.140126+00:00 + --- + ``` + You can stop the ingesting process anytime (Ctrl+C), otherwise the script + will run until all the transactions have been ingested from the given time period. + + + +### Download sample NFT data +You can download and insert sample CSV files that contain NFT sales data from +1 October 2021 to 7 October 2021. + + + +### Downloading sample NFT data + +1. Download sample [CSV files containing one week of sample data][sample-data]. +1. Uncompress the ZIP file: + ```bash + unzip nft_sample.zip + ``` +1. Connect to your database: + ```bash + psql -x "postgres://host:port/tsdb?sslmode=require" + ``` + If you're using Timescale Cloud, the instructions under `How to Connect` provide a + customized command to run to connect directly to your database. +1. Import the CSV files in this order (it can take a few minutes in total): + ```bash + \copy accounts FROM 001_accounts.csv CSV HEADER; + \copy collections FROM 002_collections.csv CSV HEADER; + \copy assets FROM 003_assets.csv CSV HEADER; + \copy nft_sales FROM 004_nft_sales.csv CSV HEADER; + ``` + + + +After ingesting NFT data, you can try running some queries on your database: +```sql +SELECT count(*), MIN(time) AS min_date, MAX(time) AS max_date FROM nft_sales +``` + +[nft-starter-kit]: https://github.com/timescale/nft-starter-kit +[ingest-script]: https://github.com/timescale/nft-starter-kit/blob/master/opensea_ingest.py +[sample-data]: https://assets.timescale.com/docs/downloads/nft_sample.zip +[nft-schema]: https://github.com/timescale/nft-starter-kit/blob/master/schema.sql \ No newline at end of file diff --git a/timescaledb/tutorials/page-index/page-index.js b/timescaledb/tutorials/page-index/page-index.js index adaa3bcdf5b6..34cd44089068 100644 --- a/timescaledb/tutorials/page-index/page-index.js +++ b/timescaledb/tutorials/page-index/page-index.js @@ -25,6 +25,30 @@ module.exports = [ keywords: ['cryptocurrency', 'finance', 'tutorial', 'TimescaleDB'], excerpt: 'Use TimescaleDB to analyze cryptocurrency data', }, + { + title: 'Analyze Non-fungible token (NFT) sales data', + href: 'analyze-nft-data', + tags: ['nft', 'finance', 'analyze', 'learn', 'timescaledb'], + keywords: ['NFT', 'finance', 'tutorial', 'TimescaleDB'], + excerpt: 'Analyze NFT sales data with TimescaleDB', + children: [ + { + title: 'NFT schema design and ingestion', + href: 'nft-schema-ingestion', + tags: ['nft', 'finance', 'analyze', 'learn', 'timescaledb'], + keywords: ['NFT', 'finance', 'tutorial', 'TimescaleDB'], + excerpt: 'Design schema and ingest NFT data with TimescaleDB', + }, + { + title: 'Analyzing NFT transactions', + href: 'analyzing-nft-transactions', + tags: ['nft', 'finance', 'analyze', 'learn', 'timescaledb'], + keywords: ['NFT', 'finance', 'tutorial', 'TimescaleDB'], + excerpt: + 'Analyze NFT data transactions with PostgreSQL and hyperfunctions in TimescaleDB', + } + ], + }, { title: 'Analyze intraday stock data', href: 'analyze-intraday-stocks',