Update raw data in docs
LadyChristina authored and dimkarakostas committed Aug 22, 2023
1 parent 2d635c2 commit a8f625b
Showing 2 changed files with 22 additions and 60 deletions.
56 changes: 9 additions & 47 deletions docs/data.md
@@ -7,12 +7,10 @@ Note that when saving results from BigQuery you should select the option "JSONL

## Sample data & queries

-### Bitcoin
-
-Sample raw Bitcoin data are available
-[here](https://drive.google.com/file/d/12sd6xn8aZE5yBrUFoxE6vVPFIuE_Sw4F/view?usp=sharing).
+Sample data for all blockchains can be found [here](https://uoe-my.sharepoint.com/:f:/g/personal/s2125265_ed_ac_uk/Eg0L2n9P-txOtibKu9CXfloBt6_D-3D1AEsS2evtXIatVA?e=qHhFp4).
+Alternatively, one can retrieve the data directly from BigQuery using the queries below.

-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
+### Bitcoin

```
SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin.transactions`.outputs
@@ -24,11 +22,6 @@ AND timestamp > '2017-12-31'

### Bitcoin Cash

-Sample raw Bitcoin Cash data are available
-[here](https://drive.google.com/file/d/1560i6dbicv5h8bqcYMIPNKZTMgT44qNE/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin_cash.transactions`.outputs
FROM `bigquery-public-data.crypto_bitcoin_cash.transactions`
@@ -39,11 +32,6 @@ AND timestamp > '2018-12-31'

### Cardano

-Sample raw Cardano data are available
-[here](https://drive.google.com/file/d/1qetBLr7sHccnoTbVTD_t_bCOeWfIjH3D/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT `iog-data-analytics.cardano_mainnet.block`.slot_no as number, `iog-data-analytics.cardano_mainnet.pool_offline_data`.ticker_name as identifiers, `iog-data-analytics.cardano_mainnet.block`.block_time as timestamp,`iog-data-analytics.cardano_mainnet.block`.pool_hash as reward_addresses
FROM `iog-data-analytics.cardano_mainnet.block`
@@ -53,11 +41,6 @@ WHERE `iog-data-analytics.cardano_mainnet.block`.block_time > '2020-12-31'

### Dogecoin

-Sample raw Dogecoin data are available
-[here](https://drive.google.com/file/d/1eVHop4ciT0f8veLiy2nEd61AeC1iyd-1/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_dogecoin.transactions`.outputs
FROM `bigquery-public-data.crypto_dogecoin.transactions`
@@ -68,11 +51,6 @@ AND timestamp > '2019-12-31'

### Ethereum

-Sample raw Ethereum data are available
-[here](https://drive.google.com/file/d/1i9veHVB8nuitXMXWxk3oTOXY6rI_Ex8a/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT number, timestamp, miner as reward_addresses, extra_data as identifiers
FROM `bigquery-public-data.crypto_ethereum.blocks`
@@ -81,11 +59,6 @@ WHERE timestamp > '2018-12-31'

### Litecoin

-Sample raw Litecoin data are available
-[here](https://drive.google.com/file/d/1XR8t0NZ9IfuBBzCGyBw8c_nFyFjSsxd7/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_litecoin.transactions`.outputs
FROM `bigquery-public-data.crypto_litecoin.transactions`
@@ -96,11 +69,6 @@ AND timestamp > '2018-12-31'

### Tezos

-Sample raw Tezos data are available
-[here](https://drive.google.com/file/d/1Jn4A2hxanwniiexzHySFa6CFU8aS03mB/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT level as number, timestamp, baker as reward_addresses
FROM `public-data-finance.crypto_tezos.blocks`
@@ -109,11 +77,6 @@ WHERE timestamp > '2020-12-31'

### Zcash

-Sample raw Zcash data are available
-[here](https://drive.google.com/file/d/1WEkUAaIaRsGXuemAZj_DbNIlhkWtOn5A/view?usp=sharing).
-
-They can be retrieved using [Google BigQuery](https://console.cloud.google.com/bigquery) with the following query:
-
```
SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_zcash.transactions`.outputs
FROM `bigquery-public-data.crypto_zcash.transactions`
@@ -122,16 +85,15 @@ WHERE is_coinbase is TRUE
AND timestamp > '2018-12-31'
```

-## Automating the data collection process
+## Automating the data collection process

Instead of executing each of these queries separately on the BigQuery console and saving the results manually, it is
-possible to automate the process using
-a [script](https://github.com/Blockchain-Technology-Lab/pooling-analysis/blob/main/src/query.py) and collect all
-relevant data in one go. Executing this
-script will run all queries
+also possible to automate the process using
+a [script](https://github.com/Blockchain-Technology-Lab/pooling-analysis/blob/main/src/collect_data.py) and collect all
+relevant data in one go. Executing this script will run all queries
in [this file](https://github.com/Blockchain-Technology-Lab/pooling-analysis/blob/main/queries.yaml), so you can also
-control which queries are run by adding them
-to or removing them from the file.
+control which queries are run by adding them to or removing them from the file (or turning them into comments by
+adding a `#` in front of each relevant line).

IMPORTANT: the script uses service account credentials for authentication, therefore before running it, you need to
generate the relevant credentials from Google, as described
26 changes: 13 additions & 13 deletions queries.yaml
@@ -1,16 +1,16 @@
-#bitcoin:
-# SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin.transactions`.outputs
-# FROM `bigquery-public-data.crypto_bitcoin.transactions`
-# JOIN `bigquery-public-data.crypto_bitcoin.blocks` ON `bigquery-public-data.crypto_bitcoin.transactions`.block_number = `bigquery-public-data.crypto_bitcoin.blocks`.number
-# WHERE is_coinbase is TRUE
-# AND timestamp > '2017-12-31'
-
-#bitcoin_cash:
-# SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin_cash.transactions`.outputs
-# FROM `bigquery-public-data.crypto_bitcoin_cash.transactions`
-# JOIN `bigquery-public-data.crypto_bitcoin_cash.blocks` ON `bigquery-public-data.crypto_bitcoin_cash.transactions`.block_number = `bigquery-public-data.crypto_bitcoin_cash.blocks`.number
-# WHERE is_coinbase is TRUE
-# AND timestamp > '2018-12-31'
+bitcoin:
+SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin.transactions`.outputs
+FROM `bigquery-public-data.crypto_bitcoin.transactions`
+JOIN `bigquery-public-data.crypto_bitcoin.blocks` ON `bigquery-public-data.crypto_bitcoin.transactions`.block_number = `bigquery-public-data.crypto_bitcoin.blocks`.number
+WHERE is_coinbase is TRUE
+AND timestamp > '2017-12-31'
+
+bitcoin_cash:
+SELECT block_number as number, block_timestamp as timestamp, coinbase_param as identifiers, `bigquery-public-data.crypto_bitcoin_cash.transactions`.outputs
+FROM `bigquery-public-data.crypto_bitcoin_cash.transactions`
+JOIN `bigquery-public-data.crypto_bitcoin_cash.blocks` ON `bigquery-public-data.crypto_bitcoin_cash.transactions`.block_number = `bigquery-public-data.crypto_bitcoin_cash.blocks`.number
+WHERE is_coinbase is TRUE
+AND timestamp > '2018-12-31'

cardano:
SELECT `iog-data-analytics.cardano_mainnet.block`.slot_no as number, `iog-data-analytics.cardano_mainnet.pool_offline_data`.ticker_name as identifiers, `iog-data-analytics.cardano_mainnet.block`.block_time as timestamp,`iog-data-analytics.cardano_mainnet.block`.pool_hash as reward_addresses
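
Note on the `queries.yaml` change: this commit re-enables the `bitcoin` and `bitcoin_cash` entries by removing the leading `#`, which is the same mechanism the updated docs describe for switching queries off. The sketch below illustrates that behaviour using only the Python standard library. It is not the repository's `collect_data.py`, which presumably uses a real YAML loader. The function name `parse_queries`, the simplified file format (an unindented `name:` line followed by indented query lines), and the abbreviated `bitcoin` entry are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def parse_queries(text: str) -> Dict[str, str]:
    """Collect `name -> query` pairs from a simplified queries file.

    Hypothetical helper: lines whose first non-blank character is `#`
    are treated as comments and skipped, so a fully commented-out entry
    never appears in the result.
    """
    collected: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out line
        if raw[0] not in " \t" and line.endswith(":"):
            # An unindented `name:` line starts a new query entry.
            current = line[:-1]
            collected[current] = []
        elif current is not None:
            collected[current].append(line)
    # Join each query's lines into a single string.
    return {name: " ".join(lines) for name, lines in collected.items()}


# Sample input: `bitcoin` is commented out (and abbreviated here),
# `ethereum` is active.
SAMPLE = """\
#bitcoin:
#  SELECT block_number as number
#  FROM `bigquery-public-data.crypto_bitcoin.transactions`

ethereum:
  SELECT number, timestamp, miner as reward_addresses, extra_data as identifiers
  FROM `bigquery-public-data.crypto_ethereum.blocks`
  WHERE timestamp > '2018-12-31'
"""

if __name__ == "__main__":
    for name, query in parse_queries(SAMPLE).items():
        print(f"{name}: {query}")
```

With this input only the `ethereum` query is returned. Removing the `#` prefixes from the `bitcoin` lines, as this commit does in `queries.yaml`, would make that entry active as well.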