[Tracking issue] http delegated routing used in production #9150
2022-08-19 conversation: Tasks involved in finishing this off:
Total estimate: 13 ideal dev days. Assuming a velocity of 0.7 ideal dev days per calendar day, that puts us at roughly 19 calendar days (13 ÷ 0.7 ≈ 18.6) with just one person working on this. Next steps here:
@ajnavarro: I moved the estimates over from #9188 (comment) to here. Can you help with the issue cleanup? Specifically:
@ajnavarro: thanks. I updated "# 3" above. I believe the confusion stemmed from the fact that I wrote "#2", which got translated to issue "#2" in the repo.
Total estimate is 13 ideal dev days.
@ajnavarro: should #9157 be combined into #9079, or will the config work be a separate PR? If it will be separate, then I agree we should have two issues.
@BigLep I think we can do them separately.
I was not sure where "productizing"
@guseggert: I added an important done criterion we were missing last week: "ipfs.io gateway operators are able to assess reframe call health with metrics". Feel free to create a separate tracking issue for this work or do it here.
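For context, "assess reframe call health with metrics" generally means instrumenting the HTTP client that makes the delegated routing calls. Below is a minimal sketch of that kind of instrumentation in Go using the Prometheus client library; the metric name is hypothetical, not one of Kubo's actual metrics:

```go
package routingmetrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// NewInstrumentedClient returns an http.Client whose transport records the
// latency of every outgoing call in a histogram labeled by status code and
// HTTP method, so operators can chart error rates and latency percentiles.
func NewInstrumentedClient(reg prometheus.Registerer) *http.Client {
	dur := prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			// Hypothetical metric name; Kubo's real metrics differ.
			Name: "delegated_routing_request_duration_seconds",
			Help: "Latency of delegated routing HTTP calls.",
		},
		[]string{"code", "method"},
	)
	reg.MustRegister(dur)
	return &http.Client{
		Transport: promhttp.InstrumentRoundTripperDuration(dur, http.DefaultTransport),
	}
}
```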
I've expanded the done checklist to account for the migration from Reframe to the updated HTTP delegated routing API: ipfs/go-delegated-routing#63. I have updated the title as well.
Pending verification that client-side metrics are flowing for the ipfs.io gateway and cid.contact in https://github.com/protocol/bifrost-infra/issues/2183
2023-04-25 maintainer conversation:
I suspect https://github.com/ipfs/boxo/blob/main/routing/http/client/transport.go#L24-L28 could be an issue.
IIUC the body is not fully drained in the case where the limit is reached, which is indeed a bug, but are there actually records in cid.contact with >1 MiB of providers? |
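For reference, the failure mode being described is that the client stops reading at the limit and closes the body without consuming the rest. Below is a minimal Go sketch of reading a capped body while still draining the remainder; maxBody is an assumed 1 MiB cap mirroring the limit discussed above, and this is not the actual boxo code:

```go
package main

import (
	"io"
	"net/http"
)

const maxBody = 1 << 20 // assumed 1 MiB cap, mirroring the limit discussed above

// readLimited reads at most maxBody bytes of the response body, then drains
// and closes the rest so the underlying connection can be reused by the pool.
func readLimited(resp *http.Response) ([]byte, error) {
	defer resp.Body.Close()
	data, err := io.ReadAll(io.LimitReader(resp.Body, maxBody))
	if err != nil {
		return nil, err
	}
	// Drain whatever the limit cut off; closing an unread body instead
	// forces Go's transport to discard the connection rather than reuse it.
	_, _ = io.Copy(io.Discard, resp.Body)
	return data, nil
}
```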
My other theory is that it is related to the accelerated DHT client. When I run locally with the FullRT client, no requests are made to cid.contact.
I don't think this explains things: we have cases of providers that are only publishing to cid.contact. Just because you have some peers from FullRT, you don't need to close the channel / cancel the query on the indexer side, right? (In the same way that when indexers get requests asking for a cascade to the DHT, they don't cancel the DHT cascade once responses from the index have been returned.)
~30% of queries to Kubo on the gateways end up timing out with no peers, so at a minimum we should still see those reaching the indexers, right?
Okay, there's a plumbing bug in Kubo's rat's nest of content routing plumbing that is causing this issue: when the experimental DHT client is turned on, the cid.contact router is not used. Working on a fix; it could take me a few hours to untangle that mess.
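For illustration, the intended behavior is that the accelerated DHT client and the HTTP delegated router are composed so that both are queried. A minimal sketch of that composition using the Parallel router from go-libp2p-routing-helpers follows; this is an assumption about the shape of the fix, not the actual Kubo patch:

```go
package main

import (
	routinghelpers "github.com/libp2p/go-libp2p-routing-helpers"
	"github.com/libp2p/go-libp2p/core/routing"
)

// compositeRouter queries both routers for every request and merges their
// results, so an early answer from the accelerated DHT client does not stop
// the HTTP delegated router (e.g. cid.contact) from being consulted too.
// dhtClient and httpClient are assumed to be already-constructed routers.
func compositeRouter(dhtClient, httpClient routing.Routing) routing.Routing {
	return routinghelpers.Parallel{
		Routers: []routing.Routing{dhtClient, httpClient},
	}
}
```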
Thanks @guseggert for the update. As you are working on this, a friendly reminder that we plan to move "accelerated DHT" out of experimental in the next release. If there are followups that need to occur, feel free to drop them in #9703. In terms of releasing this, I assume we'll have it as part of 0.20? (We'll talk during the 2023-04-27 standup. The alternative is that Bifrost uses a custom branch until this comes in a future release.) A challenge here will be code review, given team members being out. I think our best case is that @lidel takes a look at it on Thursday, 2023-04-27.
I would push to get an RC onto Bifrost and validate things are working as expected before releasing.
Agreed, we should one-box this on the gateways before releasing to make sure it works; otherwise the feedback loop is brutal.
2023-04-27 maintainer conversation: this is still @guseggert's top priority and he is actively working on it. We don't have a PR yet.
Okay, I have working code now and can confirm that the default HTTP routers are being invoked alongside the FullRT client. I will open a PR shortly, and the plan is for @Jorropo to take a look on Monday.
Thanks @guseggert. Could you please also give directions on what Bifrost would need to test it? I'd like to get their test/verification happening in parallel.
2023-05-02 update: The fix has been deployed to "one box" on the ipfs.io gateway. Integration looks good. It will be rolled out to the whole fleet over the next day or two.
I'm closing this issue because ipfs.io gateway <> cid.contact integration is rolled out (see https://github.com/protocol/bifrost-infra/issues/2183#issuecomment-1533342956) and confirmed to be working. |
Done Criteria
Notes
This corresponds with "stage 1" in https://www.notion.so/pl-strflt/Indexer-IPFS-Reframe-Q3-Plan-77d0f3fa193b438784d2d85096e315e7
It doesn't matter if STI gives empty results, or if the results are unreachable because they point to Filecoin Storage Providers that don't allow data transfer over Bitswap. Enabling retrieval over Bitswap for Filecoin SPs is a separate effort.
Reframe shipped in Kubo 0.14 as part of #8775. That said, it isn't used in production as far as we know. The purpose of this tracking issue is to track followups that we have already identified as needing to be done, or that come up as a result of attempting a production deployment.
Expanded scope for items observed while in production:
Other followup issues that should be done, but aren't directly tied to the done criteria.
proto.DelegatedRouting_Client.GetIPNS is never used (go-delegated-routing#30).