This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Fly Brain Dataset #1013

Closed
Megan008 opened this issue Sep 25, 2022 · 39 comments

@Megan008

Megan008 commented Sep 25, 2022

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: Public Data-FlyBrain
  • Website / Social Media: https://registry.opendata.aws/janelia-flylight/
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 3PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 100TiB
  • On-chain address for first allocation: f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

I have participated in several projects and hackathons, so I have experience with this kind of work.

What is the primary source of funding for this project?

Personal income.

What other projects/ecosystem stakeholders is this project associated with?

No.

Use-case details

Describe the data being stored onto Filecoin

It consists of fluorescence images of Drosophila melanogaster driver lines, aligned to standard templates, and stored in formats suitable for rapid searching in the cloud.

Where was the data in this dataset sourced from?

This data set is made available by Janelia's FlyLight project.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://registry.opendata.aws/janelia-flylight/

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, it's a public dataset

What is the expected retrieval frequency for this data?

Multiple times.

For how long do you plan to keep this dataset stored on Filecoin?

2 years.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

North America; Korea; China.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

75% of the data will be distributed by offline data transfer. The rest will be sent by online transfer to storage providers located close to me.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

I will work with one SP that has cooperated with me before on this deal. I am now talking with other SPs. f023495, f0508988

How will you be distributing deals across storage providers?

I have communicated with 4 SPs. At first, I will give 1/4 of the data to each SP. If I find more SPs, I will reduce the percentage of deals going to each of them, to keep the storage decentralized.
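The even split described in this answer can be sketched as follows. This is only an illustration: the provider names are placeholders (the application does not list all four SPs), and the 100 TiB per round is borrowed from the weekly allocation requested above as an assumption.

```python
def equal_shares(total_tib: float, providers: list[str]) -> dict[str, float]:
    """Split a DataCap amount evenly across storage providers."""
    if not providers:
        raise ValueError("at least one storage provider is required")
    share = total_tib / len(providers)
    return {sp: share for sp in providers}


# Hypothetical provider names, used only for illustration.
first_round = equal_shares(100.0, ["sp-a", "sp-b", "sp-c", "sp-d"])
# Adding a fifth provider lowers every provider's share.
later_round = equal_shares(100.0, ["sp-a", "sp-b", "sp-c", "sp-d", "sp-e"])
```

With four providers each one receives 25 TiB of a 100 TiB round; with five, each receives 20 TiB.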

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes.

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and get back to you soon.

@raghavrmadya
Collaborator

[Screenshot attached: Screen Shot 2022-10-07 at 8 26 39 AM]

The dataset is already stored on Slingshot, and its size does not justify 5 PiB. Closing this application now.

@Megan008
Author

Megan008 commented Oct 9, 2022

@raghavrmadya It is a public dataset, so anyone can store it, not only Slingshot. The actual size of this dataset is close to 300 TiB; you can check the size with the following command: aws s3 ls s3://janelia-flylight-imagery/ --no-sign-request --recursive --summarize
342762
The reason I applied for 5 PiB is that I want to store at least 10 copies and leave room for possible storage errors in the future. I have now changed my total request to 3PiB.
Can you help reopen my issue so that I can prepare my application again? Thank you!
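As a rough sketch of the sizing argument in this comment: `aws s3 ls --summarize` ends its output with `Total Objects:` and `Total Size:` lines, the latter in bytes when `--human-readable` is not given. The snippet below parses those two lines and multiplies by the number of copies. The sample output is made up to represent exactly 300 TiB; it is not the real listing of the bucket.

```python
import re

TIB = 1024 ** 4  # bytes per TiB


def parse_summary(output: str) -> tuple[int, int]:
    """Return (object_count, total_bytes) from `aws s3 ls --summarize` output."""
    objects = re.search(r"Total Objects:\s*(\d+)", output)
    size = re.search(r"Total Size:\s*(\d+)", output)
    if objects is None or size is None:
        raise ValueError("summary lines not found")
    return int(objects.group(1)), int(size.group(1))


def datacap_needed_tib(total_bytes: int, copies: int) -> float:
    """DataCap in TiB needed to store `copies` replicas of the dataset."""
    return total_bytes / TIB * copies


# Made-up sample: 300 TiB expressed in bytes, with an arbitrary object count.
sample = "\nTotal Objects: 1000\n   Total Size: 329853488332800\n"
count, size_bytes = parse_summary(sample)
needed = datacap_needed_tib(size_bytes, copies=10)
```

Under these assumptions, 10 copies of a 300 TiB dataset come to 3000 TiB, or roughly 2.93 PiB, which is in line with the revised 3PiB request.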

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and get back to you soon.

@raghavrmadya
Collaborator

Datacap Request Trigger

Total DataCap requested

3PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

@large-datacap-requests

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

DataCap allocation requested

50TiB

@raghavrmadya
Collaborator

Can you also share how your data preparation differs from that of the existing dataset?

@Megan008
Author

I have enough bandwidth and hard-disks for downloading data.


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceb4nfhtjp4yy7x7727zbok7byj37w622iw3smzecxbziu4jhcei64

Address

f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

Datacap Allocated

50.00TiB

Signer Address

f1ktlkcxnmzxcdaoqfsunrg3vocfbmgv4n3mrn74a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb4nfhtjp4yy7x7727zbok7byj37w622iw3smzecxbziu4jhcei64


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceav5s7orw7urk6ajd5zvisud75u7sfw6fhuyb5ddrsvxb23d36gri

Address

f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

Datacap Allocated

50.00TiB

Signer Address

f17xdri3wunqgld7dm23e4f3eqsntjakwc47xjo6i

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceav5s7orw7urk6ajd5zvisud75u7sfw6fhuyb5ddrsvxb23d36gri

@large-datacap-requests

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

Last two approvers

MetaWaveInfo & UnionLabs2020

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

800TiB

Total DataCap granted for client so far

750TiB

Datacap to be granted to reach the total amount requested by the client (3PiB)

2.26PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 18957 | 6 | 800TiB | 44.95 | 90.40TiB |
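The allocation figures in this comment are consistent with simple arithmetic, sketched below. It assumes binary units (1 PiB = 1024 TiB) and takes the 100TiB weekly rate, the 800% rule, the 750TiB granted so far, and the 3PiB total from the thread; the function names are illustrative and are not the bot's actual code.

```python
TIB_PER_PIB = 1024


def next_allocation_tib(weekly_tib: float, rule_percent: float) -> float:
    """Allocation size as a percentage of the weekly DataCap rate."""
    return weekly_tib * rule_percent / 100


def remaining_pib(total_requested_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted before the total request is reached."""
    total_tib = total_requested_pib * TIB_PER_PIB
    return (total_tib - granted_tib) / TIB_PER_PIB


allocation = next_allocation_tib(weekly_tib=100, rule_percent=800)
remaining = remaining_pib(total_requested_pib=3, granted_tib=750)
```

Here `allocation` is 800 TiB and `remaining` is about 2.27 PiB, which matches the reported 2.26PiB if the bot truncates rather than rounds (an assumption).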

@filplus-checker

DataCap and CID Checker Report [1]

  • Organization: Public Data-FlyBrain
  • Client: f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • Storage provider should not exceed 25% of total datacap.
  • Storage provider should not be storing duplicate data for more than 20%.
  • Storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ f01969789 has sealed 46.65% of total datacap.

⚠️ 79.86% of total deals sealed by f01969789 are duplicate data.

⚠️ 60.84% of total deals sealed by f01344987 are duplicate data.

⚠️ 89.94% of total deals sealed by f0136399 are duplicate data.

⚠️ 46.19% of total deals sealed by f01841131 are duplicate data.

⚠️ 89.91% of total deals sealed by f0229199 are duplicate data.

⚠️ 89.75% of total deals sealed by f0681068 are duplicate data.

⚠️ All storage providers are located in the same region.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01969789 | Hong Kong, Central and Western, HK | 304.91 TiB | 46.65% | 61.41 TiB | 79.86% |
| f01344987 | Hong Kong, Central and Western, HK | 119.31 TiB | 18.25% | 46.72 TiB | 60.84% |
| f0136399 | Hong Kong, Central and Western, HK | 115.50 TiB | 17.67% | 11.63 TiB | 89.94% |
| f01841131 | Hong Kong, Central and Western, HK | 59.06 TiB | 9.04% | 31.78 TiB | 46.19% |
| f0229199 | Hong Kong, Central and Western, HK | 34.69 TiB | 5.31% | 3.50 TiB | 89.91% |
| f0681068 | Hong Kong, Central and Western, HK | 20.13 TiB | 3.08% | 2.06 TiB | 89.75% |
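The warnings above can be reproduced from the table with a short sketch. The checker's real implementation is not shown in this thread, so the formulas here are assumptions: a provider's share is its sealed total divided by the sum over all providers, and its duplicate rate is 1 minus unique data over sealed total. Both happen to match the percentages in the table. The `Provider` class and `check_providers` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    provider_id: str
    location: str
    total_tib: float   # "Total Deals Sealed"
    unique_tib: float  # "Unique Data"


MAX_SHARE = 0.25      # a provider should not exceed 25% of total datacap
MAX_DUPLICATE = 0.20  # a provider should not store more than 20% duplicates


def check_providers(providers: list[Provider]) -> list[str]:
    """Return one warning string per violated restriction."""
    warnings = []
    grand_total = sum(p.total_tib for p in providers)
    for p in providers:
        share = p.total_tib / grand_total
        if share > MAX_SHARE:
            warnings.append(f"{p.provider_id} has sealed {share:.2%} of total datacap")
    for p in providers:
        duplicate = 1 - p.unique_tib / p.total_tib
        if duplicate > MAX_DUPLICATE:
            warnings.append(f"{duplicate:.2%} of deals sealed by {p.provider_id} are duplicate data")
    # Mirrors the report's warning, which fires when every provider shares one region.
    if len(providers) > 1 and len({p.location for p in providers}) == 1:
        warnings.append("all storage providers are located in the same region")
    return warnings


HK = "Hong Kong, Central and Western, HK"
providers = [
    Provider("f01969789", HK, 304.91, 61.41),
    Provider("f01344987", HK, 119.31, 46.72),
    Provider("f0136399", HK, 115.50, 11.63),
    Provider("f01841131", HK, 59.06, 31.78),
    Provider("f0229199", HK, 34.69, 3.50),
    Provider("f0681068", HK, 20.13, 2.06),
]
warnings = check_providers(providers)
```

Run on the six rows above, the sketch yields eight warnings: one for provider share, six for duplicate data, and one for region.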

[Chart: Provider Distribution]

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

  • No more than 25% of unique data should be stored with fewer than 4 providers.

⚠️ 100.00% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 132.16 TiB | 571.09 TiB | 1 | 87.38% |
| 12.47 TiB | 82.50 TiB | 2 | 12.62% |
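A minimal sketch of the replication check, assuming the deal percentage is each row's "Total Deals Made" divided by the sum of that column (this reproduces the 87.38% above). The function name and the row layout are illustrative.

```python
def share_below_replication(rows: list[tuple[float, int]], min_providers: int = 4) -> float:
    """Fraction of deal volume whose data sits with fewer than `min_providers` providers.

    Each row is (total_deals_tib, number_of_providers).
    """
    total = sum(tib for tib, _ in rows)
    below = sum(tib for tib, n in rows if n < min_providers)
    return below / total


# Rows taken from the table above.
rows = [(571.09, 1), (82.50, 2)]
fraction = share_below_replication(rows)
first_row_percent = round(rows[0][0] / sum(tib for tib, _ in rows) * 100, 2)
```

Both rows have fewer than 4 providers, so the fraction is 1.0, which corresponds to the 100.00% warning.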

[Chart: Replication Distribution]

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, so their deals should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f1znd7uramgqquelcgefo5xgozopd3nga73hhywpy | BotSmart | 82.94 TiB | 343 | LDN v3 multisig |
| f1tgii4xgcp6um4n2w7eqaw6bwvnvyii2u7cjgzka | Meson.Network | 54.34 TiB | 157 | LDN v3 multisig |
| f127l5waa4pw6k3sm4e7wxevc4fe5uz7dyt4mxa3i | Pangod | 52.69 TiB | 309 | LDN v3 multisig |
| f1quofpadzcqlonmz7v3mv7dfqvm5hdztucdsjqsy | HENAN 863 SOFTWARE CO., LTD | 47.63 TiB | 141 | LDN v3 multisig |
| f1vbe7ze5yknpgu3orinv4zu4rkofsyqfsmpejpui | HELIOS | 46.56 TiB | 151 | LDN v3 multisig |
| f1wr6rwwqckh6um2hynaym2t4mniev5b675kbf5ni | Ctrip Global Shopping | 46.47 TiB | 230 | LDN v3 multisig |
| f1ugo3abkmmb4pb2atxz5oqqgwsd27b4p6k52f2va | Yuepass Technology Company Limited | 42.09 TiB | 285 | LDN v3 multisig |
| f1bb2z36lpq3pnwiiowiraagpzqnpow4bonjacx7a | Hola Space | 23.41 TiB | 75 | LDN v3 multisig |
| f1ibuglt2lzlyf7vnmtzmuykcfgl2pn2fxym5dibi | Appstest | 21.88 TiB | 175 | LDN v3 multisig |
| f1csetl7nor3qie2cehx7axf2ai3nedmowj53xwsa | NOAA GOES---Piero | 18.81 TiB | 154 | LDN v3 multisig |
| f1vykg3elgzoa3lpzf5xo6rbcghzr5ifat757mucq | Hailiang Mingyou Online | 10.81 TiB | 71 | LDN # 51 |
| f1a2rdwwor3kq6mv7nveuxhux7rxtj6iyjs7hfswa | Worldkan | 6.59 TiB | 42 | LDN v3 multisig |
| f1rkmhotssjif6ucrosls7oewjz6pr2v2eygfjyui | Weipaitang | 5.66 TiB | 34 | LDN v3 multisig |
| f1fq6abg47ifgeee2z7q2rps3tvknoo2ztcoqy7ai | DaYe Art Tuition Class (DKArt) | 864.00 GiB | 5 | LDN v3 multisig |
| f3u3unadf654vezf62cd4jo6r7h6qpkx26g5amcdc3oe6rmpmk2nfosfd2kjkdhj4ndvr626gsm7fhpmt7gg2q | Runtu Information Technology | 608.00 GiB | 4 | Steven Li |
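CID sharing of the kind listed above can be detected by grouping deals by CID and looking for CIDs that appear under more than one client. The sketch below is an assumption about the general approach, not the checker's actual code; the client names and CIDs in the example are placeholders.

```python
from collections import defaultdict


def shared_cids(deals: list[tuple[str, str]], client: str) -> dict[str, set[str]]:
    """Map each other client to the CIDs it shares with `client`.

    `deals` is a list of (client_address, cid) pairs.
    """
    clients_by_cid: dict[str, set[str]] = defaultdict(set)
    for address, cid in deals:
        clients_by_cid[cid].add(address)

    shared: dict[str, set[str]] = defaultdict(set)
    for cid, addresses in clients_by_cid.items():
        if client in addresses:
            for other in addresses - {client}:
                shared[other].add(cid)
    return dict(shared)


# Placeholder data: client-a and client-b both made deals for cid-1.
deals = [
    ("client-a", "cid-1"),
    ("client-b", "cid-1"),
    ("client-a", "cid-2"),
    ("client-c", "cid-3"),
]
result = shared_cids(deals, "client-a")
```

In this example only `client-b` is reported, with `cid-1` as the shared CID; the "Unique CIDs" column in the table would correspond to the size of each set.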

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@BDEio

BDEio commented Jan 12, 2023

@Megan008 Hi! Great to see that you have gotten approval for DataCap!
BDE is a verified deals auction house helping you to get paid storing your valuable data with reliable storage providers. If you need any help, please get in touch.

Client f01925065 does not follow the datacap usage rules. More info here.
This application has been failing the requirements for 7 days.
Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| CID Checker score | > 25% | The client has a CID checker score of 8%. This should be greater than 25%. To find out more about CID checker score please look at this issue: filecoin-project/notary-governance#986 |


Thanks for your request!
❗ We have found some problems in the information provided.
We could not find Website / Social Media field in the information provided
We could not find Total amount of DataCap being requested (between 500 TiB and 5 PiB) field in the information provided
We could not find Weekly allocation of DataCap requested (usually between 1-100TiB) field in the information provided
We could not find On-chain address for first allocation field in the information provided
We could not find Data Type of Application field in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.
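A field check like the one reported above could look roughly like the sketch below. The field names are copied from the bot's message; the matching rule (field name, a colon, then a non-empty value on the same line) is an assumption, since the bot's real parser is not shown in this thread.

```python
import re

REQUIRED_FIELDS = [
    "Website / Social Media",
    "Total amount of DataCap being requested (between 500 TiB and 5 PiB)",
    "Weekly allocation of DataCap requested (usually between 1-100TiB)",
    "On-chain address for first allocation",
    "Data Type of Application",
]


def missing_fields(issue_body: str) -> list[str]:
    """Return the required fields that have no value in the issue body."""
    missing = []
    for field in REQUIRED_FIELDS:
        # Field name, optional spaces, a colon, then a value on the same line.
        pattern = re.escape(field) + r"[ \t]*:[ \t]*\S"
        if not re.search(pattern, issue_body):
            missing.append(field)
    return missing


# Example body using the four fields filled in at the top of this issue.
body = """
  • Website / Social Media:https://registry.opendata.aws/janelia-flylight/
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB):3PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB):100TiB
  • On-chain address for first allocation:f1cf2pp6bgu6vvxfuau6bsmiibhrc7v3gvjsyseay
"""
result = missing_fields(body)
```

Under this matching rule, only "Data Type of Application" would be flagged for the example body.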


RootKeyHolders have approved multisig account. You can now request first datacap release

