This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Qingyun Education Fund #31

Closed
QingyunEducation opened this issue Aug 27, 2021 · 86 comments

@QingyunEducation

QingyunEducation commented Aug 27, 2021

Large Dataset Notary Application

To apply for a DataCap allocation for your dataset, please fill out the following information.

Core Information

Please respond to the questions below in paragraph form, replacing the text saying "Please answer here". Include as much detail as you can in your answer!

Project details

Share a brief history of your project and organization.

Qingyun Education Fund of Guangdong Charity Federation was founded in 2021.
Qingyun Education Fund is affiliated with the Guangdong Charity Federation and consists of six major sections: public education service platform, volunteer learning activities, talent pool plan, teacher and student incentive plan, teaching facilities donation plan, and rural education resource improvement plan.
Relying on AI and blockchain technologies, we bridge the gap between college student volunteers, teachers, and rural teenagers. We aim to provide better teachers for students from poor families who lack educational resources, to provide rural schools that lack adequate hardware with better teaching facilities and equipment, and to carry out public welfare projects in support of education.
Qingyun Education Fund advocates equality in education and encourages people from all walks of life to help students and teach voluntarily. Relying on AI and blockchain technologies, we provide rich and interesting teaching content through online volunteer teaching, to help raise the level of education in rural areas.

What is the primary source of funding for this project?

Donations from all walks of life.

What other projects/ecosystem stakeholders is this project associated with?

Guangdong Zhongke Zhiyun Technology Co., LTD is our distributed storage partner.

Use-case details

Describe the data being stored onto Filecoin

The data we intend to store on the Filecoin network is publicly available. It mainly includes the following types:
1. Online learning resources
The learning content covers all subjects and stages of K12 education: primary, junior high and senior high school. At present, the accumulated teaching resources, such as test papers, courseware and teaching plans, amount to more than 12 million sets.
2. Public education videos
Videos on public welfare efforts to relieve education poverty, experience sharing, on-site footage of rural assistance, rural living conditions, public service advertisements, and other videos.
3. Youth learning videos
College professional courses, everyday common sense, little-known facts, and other science periodical videos;
Introductions to and interesting experiences from exams such as the civil service exam, graduate school entrance exam, IELTS, TOEFL, GMAT, etc.
4. K12 subject short videos
Problem-solving skills: one video per problem or exam topic, and videos on working through exercises or solving techniques (quick techniques, unusual techniques, etc.);
Video classes on solving difficult problems; videos on K12 learning methods and preview and review skills.
5. Quality lectures by well-known lecturers and excellent tutors; public class videos

The formats are mainly: ① document ② picture ③ video ④ audio ⑤ system

Where was the data in this dataset sourced from?

The sources are divided into two parts: 
Purchased learning materials and learning videos
Learning materials and public service videos recorded by Qingyun Education Fund

Can you share a sample of what is in the dataset? A link to a file, an image, a table, etc., are good examples of this.

Link: https://pan.baidu.com/s/1Pq82sfc1mLebxAo6Cs2dQA 
Extraction code: 93d2

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, this is a public dataset that can be retrieved by anyone on the Network.

What is the expected retrieval frequency for this data?

About once every three months.

For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

We plan to store it for one year in the first phase.

DataCap allocation plan

In which geographies do you plan on making storage deals?

Chinese mainland.

What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?

At present, about 20TiB will be onboarded per week.

How will you be distributing your data to miners? Is there an offline data transfer process?

Online transmission is preferred.

How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

At present, we cooperate with Guangdong Zhongke Zhiyun Technology Co., LTD. Besides them, we also cooperate with some other miners, from leading miners to small miners.
We plan to post the data on our website.

How will you be distributing data and DataCap across miners storing data?

We would divide the data up and send it to different miners.
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@QingyunEducation QingyunEducation changed the title from "[DataCap Application]" to "[DataCap Application] Qingyun Education Fund" Aug 27, 2021
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@MRJAVAZHAO

Open education resources sound great.
We'll support it.

@dkkapur
Collaborator

dkkapur commented Aug 31, 2021

@QingyunEducation - thanks for submitting your application for a large DataCap allocation to the Fil+ Notaries! As the Notaries and the Fil+ community carry out the due diligence process on your application - we'd love to invite you to participate in the Fil+ Community Governance Call. We're inviting clients applying for large amounts of DataCap to come present and field questions during the second half of each community call to help enable the process. The next set of calls is actually today - Tues Aug 31 at 3pm and 11pm UTC, information to participate can be found here: filecoin-project/notary-governance#221. Looking forward to seeing you there or perhaps at the next one on Sept 14!

@galen-mcandrew
Collaborator

Looks like @ozhtdong is also working directly with the client, and there is some more discussion happening in that client onboarding issue.

@ozhtdong

Yes, Galen. We have discussed this with the client. They are a "data partner": since some traditional clients don't know Filecoin well, they help these clients with storage and allocation.
We have obtained their certificates of authorization from clients and their business license, but we are not sure whether this approach can be accepted by the community.

@ozhtdong

@galen-mcandrew Maybe we can have a discussion in the next governance call.

@jennijuju
Member

The linked website gives me
image
Do you have an updated one maybe?

@QingyunEducation
Author

Thanks for your attention.
We are updating our website.
I will update it here if there is any new update.

@galen-mcandrew galen-mcandrew self-assigned this Oct 1, 2021
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@galen-mcandrew
Collaborator

Multisig Notary requested

Total DataCap requested

2PiB

Expected weekly DataCap usage rate

20TiB

@large-datacap-requests

Multisig created and sent to RKH f01325111

@lvschouwen

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report [1]

  • Organization: Qingyun Education Fund
  • Client: f3uils5cdx3ezyzszjjfnulugknbdsanmqtisd7x7xkfcljdnshp4jspnrgxpldt5b4aafuz4q4rkebpjykeha

Approvers

1 BlockMakeronline
4 MatrixStorage
1 MegTei
4 MRJAVAZHAO
1 ozhtdong
3 PluskitOfficial
1 Reiers
1 swatchliu

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • A storage provider should not exceed 30% of total DataCap.
  • A storage provider should not store more than 20% duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ 39.66% of total deal sealed by f01852325 are duplicate data.

⚠️ 89.37% of total deal sealed by f01919535 are duplicate data.

⚠️ 44.99% of total deal sealed by f01851482 are duplicate data.

⚠️ 40.96% of total deal sealed by f01852023 are duplicate data.

⚠️ 46.50% of total deal sealed by f01852664 are duplicate data.

⚠️ 40.72% of total deal sealed by f01852677 are duplicate data.

⚠️ 66.07% of total deal sealed by f01169691 are duplicate data.

⚠️ f01169691 has unknown IP location.

⚠️ 25.33% of total deal sealed by f0142721 are duplicate data.

⚠️ f0142721 has unknown IP location.

⚠️ 35.19% of total deal sealed by f0142723 are duplicate data.

⚠️ f0142723 has unknown IP location.

⚠️ 60.53% of total deal sealed by f0442383 are duplicate data.

⚠️ f0442383 has unknown IP location.

⚠️ f0883206 has unknown IP location.

⚠️ f0883202 has unknown IP location.

⚠️ f0883203 has unknown IP location.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f01852325 | Hong Kong, Central and Western, HK; BIH-Global Internet Harbor | 111.66 TiB | 18.13% | 67.38 TiB | 39.66% |
| f01919535 | Hong Kong, Central and Western, HK; HONG KONG BRIDGE INFO-TECH LIMITED | 134.97 TiB | 21.92% | 14.34 TiB | 89.37% |
| f01851482 | Busan, Busan, KR; Korea Telecom | 81.06 TiB | 13.16% | 44.59 TiB | 44.99% |
| f01852023 | Busan, Busan, KR; Korea Telecom | 74.53 TiB | 12.10% | 44.00 TiB | 40.96% |
| f01852664 | Singapore, Singapore, SG; StarHub Ltd | 102.22 TiB | 16.60% | 54.69 TiB | 46.50% |
| f01852677 | Morrisville, North Carolina, US; TierPoint, LLC | 109.97 TiB | 17.86% | 65.19 TiB | 40.72% |
| f01169691 | Unknown | 666.00 GiB | 0.11% | 226.00 GiB | 66.07% |
| f0142721 | Unknown | 343.50 GiB | 0.05% | 256.50 GiB | 25.33% |
| f0142723 | Unknown | 224.50 GiB | 0.04% | 145.50 GiB | 35.19% |
| f0442383 | Unknown | 168.50 GiB | 0.03% | 66.50 GiB | 60.53% |
| f0883206 | Unknown | 256.00 MiB | 0.00% | 256.00 MiB | 0.00% |
| f0883202 | Unknown | 16.00 MiB | 0.00% | 16.00 MiB | 0.00% |
| f0883203 | Unknown | 16.00 MiB | 0.00% | 16.00 MiB | 0.00% |
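
The duplicate percentages in the table above appear to follow from the other two columns. The sketch below is an illustration only, not the checker's actual code: it assumes the duplicate share is the non-unique fraction of what a provider sealed, and compares it with the 20% limit quoted in the report. The function names and the choice of three sample rows are hypothetical; the figures are copied from the table.

```python
# Limit quoted in the checker report text.
MAX_DUPLICATE_PCT = 20.0

# (provider, total deals sealed in TiB, unique data in TiB), copied from the table.
ROWS = [
    ("f01852325", 111.66, 67.38),
    ("f01852664", 102.22, 54.69),
    ("f01852677", 109.97, 65.19),
]


def duplicate_pct(total_tib, unique_tib):
    """Assumed formula: the non-unique share of everything a provider sealed."""
    return (total_tib - unique_tib) / total_tib * 100.0


def over_duplicate_limit(rows, limit=MAX_DUPLICATE_PCT):
    """Return (provider, duplicate %) pairs that exceed the limit."""
    flagged = []
    for provider, total_tib, unique_tib in rows:
        pct = round(duplicate_pct(total_tib, unique_tib), 2)
        if pct > limit:
            flagged.append((provider, pct))
    return flagged


for provider, pct in over_duplicate_limit(ROWS):
    print(f"{provider}: {pct:.2f}% duplicate data")
```

Under that assumption the three sample rows reproduce the percentages the report lists for them, and all three exceed the 20% limit.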

Provider Distribution

Deal Data Replication

The table below shows how the unique data is replicated across storage providers.

  • No more than 30% of unique data should be stored with fewer than 4 providers.

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 214.90 TiB | 501.21 TiB | 1 | 81.40% |
| 28.40 TiB | 89.66 TiB | 2 | 14.56% |
| 6.39 TiB | 24.90 TiB | 3 | 4.04% |
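
The replication warning can be reproduced from the table above. This is a minimal sketch, not the checker's implementation: it assumes the percentage is computed over deal volume (the "Total Deals Made" column), which is what the warning text refers to, and all function names are hypothetical.

```python
MIN_PROVIDERS = 4  # replication threshold quoted in the report

# (unique data in TiB, total deals made in TiB, number of providers), from the table.
REPLICATION = [
    (214.90, 501.21, 1),
    (28.40, 89.66, 2),
    (6.39, 24.90, 3),
]


def deal_share_pct(rows):
    """Percentage of total deal volume contributed by each row."""
    total = sum(deals for _, deals, _ in rows)
    return [round(deals / total * 100.0, 2) for _, deals, _ in rows]


def low_replica_share_pct(rows, min_providers=MIN_PROVIDERS):
    """Share of deal volume whose data sits with fewer than min_providers providers."""
    total = sum(deals for _, deals, _ in rows)
    low = sum(deals for _, deals, n in rows if n < min_providers)
    return low / total * 100.0


print(deal_share_pct(REPLICATION))
print(low_replica_share_pct(REPLICATION))
```

Every row in the table has fewer than 4 providers, so the whole deal volume falls under the threshold, which matches the 100.00% warning.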

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
|---|---|---|---|---|
| f3w4wlayytfmsay6gu5phhij5r4yyx7t4xxrlosgotlmqg5eih3co5atsra4h2pe4qd2d6c76bvhj6nwim7lgq | GMWBR INC | 163.25 TiB | 2,712 | 1 BlockMakeronline, 1 fabriziogianni7, 1 liyunzhi-666, 2 MatrixStorage, 1 MegTei, 4 MRJAVAZHAO, 2 PluskitOfficial, 2 psh0691, 1 Reiers, 1 swatchliu |
| f3q7ablez3jqkcjukwbzaql7lmbx4ldouu66nexpdcfvu6kgho3v6gricckt77cgr46tdre2l4zmvha7bsu7qq | MatrixStorage | 146.38 TiB | 1,844 | Unknown |
| f14nyld75bnvr2y3ca4ew7vxmwp4tuwytqwggthcy | Kimoc | 89.03 TiB | 1,269 | Unknown |
| f3qvn7f5u4z5w5pqx3htckp4jcn5dvgmebq6qkqczxdxkicwokf75tt5hbwtrjwldz2sjiyq752ajcn3nd5tgq | Beijing Haishi Hengtong Technology Co., LTD | 67.16 TiB | 930 | 1 BlockMakeronline, 3 MatrixStorage, 2 MRJAVAZHAO, 2 PluskitOfficial, 2 Reiers, 2 swatchliu |
| f14uxcyaoab3qhn42kaquqysga6f6zfry3x4nk3ca | China Tianying INC. | 42.72 TiB | 672 | 1 BlockMakeronline, 1 MatrixStorage, 3 MRJAVAZHAO, 2 PluskitOfficial, 1 Reiers |
| f17tvb3ejs3ev6owqmkpzomtewgrd6v2sofv7upma | Beijing Lexun Technology Co., LTD | 40.34 TiB | 568 | 3 BlockMakeronline, 1 MatrixStorage, 2 MRJAVAZHAO, 2 PluskitOfficial, 1 swatchliu |
| f1pediuk4kncwp4qxawlope7hzfmd2ran35w54o7y | Qistone Information technology | 36.38 TiB | 598 | 2 MatrixStorage, 1 MetaWaveInfo, 2 MRJAVAZHAO, 4 PluskitOfficial, 1 swatchliu |
| f3xffjctbyy7zigopfa3za5ha3pvv4z3xfghlw7kwvyeuabkg4lzsgfwhnghwkvmmvi6yso6k52hq3ca6ckveq | Chengdu Digital Media Industry Base Co., Ltd. | 27.78 TiB | 524 | 2 BlockMakeronline, 2 MatrixStorage, 2 MRJAVAZHAO, 2 PluskitOfficial |
| f3q3eysweh273pwygu27yllyafljznqewpxwlrc7jdaugrpvhmrcnxmkn43wkctxalcg3z42zxas2hinsftgva | Beijing Yibo Technology Co., Ltd. | 17.84 TiB | 272 | 4 MRJAVAZHAO, 1 PluskitOfficial, 1 Reiers, 1 s0nik42, 4 swatchliu, 1 XnMatrixSV |
| f1cuboogcwais57dljrpeltoy6ja2itb7wvwmrl3q | Penglaiju | 13.25 TiB | 249 | 3 BlockMakeronline, 1 Joss-Hua, 1 liyunzhi-666, 3 MatrixStorage, 2 MRJAVAZHAO, 2 PluskitOfficial |
| f3woqxpu6ekmj43nmpcv7j2pgu6lejxtzgxpzl6f2vrueoqlzjntakyhdkghymyffbzfbsio6dvfmy643x4y7q | RICH ST PETE LLC | 13.00 TiB | 202 | 2 BlockMakeronline, 1 DarnellWashington, 1 IreneYoung, 1 MatrixStorage, 1 MegTei, 3 MRJAVAZHAO, 1 psh0691, 2 Reiers, 3 swatchliu, 1 XnMatrixSV |
| f1elncewt3sh356aop52uvcappblxmf6asbmhxlya | Beijing Wanjie Data Technology Co., LTD | 12.03 TiB | 281 | 1 BlockMakeronline, 1 MatrixStorage, 3 MRJAVAZHAO, 2 PluskitOfficial |
| f3skkellc7wegakh2blqeu4kkrlzuqy2siymwli7sec6eq77c2v4kzgz5ozgnasjl5r52ckameba5kds7hdjda | Meizhai Technology | 896.00 GiB | 24 | 1 fabriziogianni7, 1 IreneYoung, 1 KodaRobotDog, 1 MRJAVAZHAO, 1 PluskitOfficial, 1 Reiers, 2 swatchliu, 1 XnMatrixSV |
| f1budai5eqmk7zjfd54bmnkj72vuc7jykmkkttgha | `` | 544.00 GiB | 2 | Unknown |
| f1iach3ih3q5x5k4lxy6wlysmqm65vwegpxwm4oii | Friday Shopping | 512.00 GiB | 7 | |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@Sunnyiscoming
Collaborator

Hi @QingyunEducation
Please explain the abnormal information.

@QingyunEducation
Author

QingyunEducation commented Mar 7, 2023

Hello @Sunnyiscoming, sorry for the delay.
Since submitting the application, we have been looking for SPs. As a client, we only provide the data. During this process, sealing appeared to go smoothly, because we had no way to verify the sealing results. It was not until we saw the CID report that we found some problems.
As an early participant in LDN, we were not professional enough: we relied too much on our familiarity with the SPs and their self-descriptions, so there appear to be some problems with the sealing.
Since then, we have strengthened our understanding of the sealing requirements published by the community, and we intend to pay close attention to the capabilities of future collaborators and to the sealing process:
-- Provide the SP ID, organization, and region
-- A storage provider should not exceed 20-30% of total DataCap
-- Each storage provider has a copy of the data stored and ready for retrieval
-- A storage provider should not store more than 20% duplicate data
-- CID sharing is not allowed

@large-datacap-requests

large-datacap-requests bot commented Mar 9, 2023

DataCap Allocation requested

Request number 9

Multisig Notary address

f02049625

Client address

f3uils5cdx3ezyzszjjfnulugknbdsanmqtisd7x7xkfcljdnshp4jspnrgxpldt5b4aafuz4q4rkebpjykeha

DataCap allocation requested

160TiB

Id

670118b6-5780-4c2c-9cab-0a35d48a75f4

@large-datacap-requests

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3uils5cdx3ezyzszjjfnulugknbdsanmqtisd7x7xkfcljdnshp4jspnrgxpldt5b4aafuz4q4rkebpjykeha

Last two approvers

PluskitOfficial & not found

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

160TiB

Total DataCap granted for client so far

760TiB

Datacap to be granted to reach the total amount requested by the client (2PiB)

1.25PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
|---|---|---|---|---|
| 20195 | 7 | 160TiB | 20.13 | 39.91TiB |
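
The allocation figures in the bot summary above can be checked with simple arithmetic. The sketch below is illustrative only and is not the bot's code: the function names are hypothetical, and it assumes binary units (1 PiB = 1024 TiB). The inputs (20TiB weekly usage, the 800% rule, 2PiB requested, 760TiB granted so far) are taken from the thread.

```python
TIB_PER_PIB = 1024  # assumption: binary units


def tranche_tib(weekly_tib, rule_pct):
    """Size of one allocation request as a percentage of the weekly DataCap rate."""
    return weekly_tib * rule_pct / 100


def remaining_tib(total_requested_pib, granted_tib):
    """DataCap still to be granted before the total request is reached."""
    return total_requested_pib * TIB_PER_PIB - granted_tib


tranche = tranche_tib(20, 800)
left = remaining_tib(2, 760)
print(f"tranche: {tranche} TiB")
print(f"remaining: {left} TiB ({left / TIB_PER_PIB} PiB)")
```

With these inputs the tranche is 160 TiB, matching the requested allocation, and the remainder is 1288 TiB, or about 1.258 PiB, which is consistent with the 1.25PiB shown by the bot if that figure is truncated to two decimals.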

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 10 storage providers sealed too much duplicate data - f01852325: 39.66%, f01919535: 89.37%, f01851482: 44.99%, f01852023: 40.96%, f01852664: 46.50%, f01852677: 40.72%, f01169691: 66.07%, f0142721: 25.33%, f0142723: 35.19%, f0442383: 60.53%

⚠️ 7 storage providers have unknown IP location - f01169691, f0142721, f0142723, f0442383, f0883206, f0883202, f0883203

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@flyworker

As a notary who did early due diligence back on Aug 27, 2021: it has been two years, and @QingyunEducation, you say you are still not skilled in onboarding data to SPs.
I suggest you contact the #fil-filswan channel on Slack to get some help with data onboarding.

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Jul 21, 2023
@github-actions

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

@github-actions github-actions bot closed this as not planned Jul 26, 2023