
handleAddProvider messages don't appear in log tail after some time #6296

Closed
dokterbob opened this issue May 5, 2019 · 8 comments
Labels: kind/bug (a bug in existing code, including security flaws)

@dokterbob (Contributor) commented May 5, 2019

Version information:

go-ipfs version: 0.4.20-
Repo version: 7
System version: amd64/linux
Golang version: go1.12.4

Type: bug

Description:

To index items at ipfs-search, we're continuously tailing the logs and filtering for handleAddProvider messages with the following command:
ipfs log tail | jq -r 'if .Operation == "handleAddProvider" then .Tags.key else empty end'
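
(A mundane cause of a pipeline like this going quiet is jq buffering its output when writing to a pipe; the same command with a forced flush after every object, as a sketch, rules that out:)

ipfs log tail | jq --unbuffered -r 'if .Operation == "handleAddProvider" then .Tags.key else empty end'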

With both 0.4.19 and 0.4.20 (and possibly earlier versions), after outputting hashes for about two hours, IPFS suddenly stops outputting anything. I have yet to pin down exactly how long it takes before this happens.

What I do know is that restarting ipfs log tail doesn't help, and I've confirmed that plenty of other messages are still coming through. Manually requesting hashes should also result in new messages, but none seem to appear. Perhaps I'm missing something?

Ref: ipfs-search/ipfs-search#108

@dokterbob dokterbob changed the title handleAddProvider messages don't appear in logs after some time handleAddProvider messages don't appear in log tail after some time May 5, 2019
@michaelavila michaelavila self-assigned this May 14, 2019
@michaelavila (Contributor) commented May 21, 2019

@dokterbob a few things:

  1. Have you confirmed any versions prior to 0.4.19 behaved differently?
  2. I know that providing in IPFS can quickly get backed up. I don't know if that's the cause here, but I'm bringing it up to see if you've ruled it out; if not, I think that's worth doing.
  3. What's the behavior over time for providing during the 2 hours you are seeing those handleAddProvider messages? Do you see a steady stream of handleAddProvider messages at somewhat regular intervals and then suddenly none? Do you see a lot of handleAddProvider messages at the beginning that then slowly taper off? Something else? (A sketch for eyeballing this follows below.)
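
To make the shape of the stream visible over time, something along these lines could work (a rough sketch, assuming GNU date; it counts handleAddProvider messages per minute, reported with roughly a one-minute lag):

ipfs log tail \
  | jq --unbuffered -r 'if .Operation == "handleAddProvider" then .Tags.key else empty end' \
  | while IFS= read -r key; do date -u +%Y-%m-%dT%H:%M; done \
  | uniq -c

A steady stream, a gradual taper, or a sudden stop should each show up directly in the per-minute counts.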

Thanks for posting the issue.

@michaelavila michaelavila added kind/bug A bug in existing code (including security flaws) need/author-input Needs input from the original author labels May 21, 2019
@Stebalien (Member)

@dokterbob 0.4.21 fixed a long-standing issue with providers where the internal provider logic would start doing more and more work over time, eventually stalling any provider-related operations. Could you try upgrading to 0.4.21 or, preferably, 0.4.22-rc1?

@dokterbob (Contributor, Author) commented Jul 25, 2019 via email

@Stebalien (Member)

@dokterbob unfortunately, I'm quite sure that 0.4.22 won't help (it's a small patch release with a few bug fixes).

@dokterbob (Contributor, Author) commented Aug 25, 2019

@dokterbob a few things:

  1. Have you confirmed any versions prior to 0.4.19 behaved differently?

Not fully sure; before 0.4.19 there was no connection manager, so the daemon would typically just soak up resources until being killed by the OOM killer every 15 minutes or so.
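
(For reference, the connection manager's limits live under Swarm.ConnMgr and can be tuned; the values below are go-ipfs's stock defaults, shown only as a sketch of where the knobs are:)

ipfs config --json Swarm.ConnMgr '{"Type": "basic", "LowWater": 600, "HighWater": 900, "GracePeriod": "20s"}'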

  2. I know that providing in IPFS can quickly get backed up. I don't know if that's the cause here, but I'm bringing it up to see if you've ruled it out; if not, I think that's worth doing.

We're not intentionally providing anything, but I suppose we have a lot of DHT traffic flowing through. What do you suggest I do to look into this?
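
(Would watching per-protocol bandwidth along these lines be a start? A sketch, assuming /ipfs/kad/1.0.0 is the DHT protocol ID:)

ipfs stats bw --proto /ipfs/kad/1.0.0 --poll --interval 10s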

  3. What's the behavior over time for providing during the 2 hours you are seeing those handleAddProvider messages? Do you see a steady stream of handleAddProvider messages at somewhat regular intervals and then suddenly none? Do you see a lot of handleAddProvider messages at the beginning that then slowly taper off? Something else?

Would it suffice if I gave you a log file with timestamps so you can build the time series yourself? I'm quite short on time these days...

I've just upgraded to 0.4.22, curious to see how this behaves.

@dokterbob (Contributor, Author)

Since upgrading to 0.4.22, I can't seem to keep the daemon running long enough to reproduce the problem; there seems to be a serious issue with resource constraints (or the lack thereof, ref #6286).

@Stebalien (Member)

@aschmahmann was correct in #6286 (comment). Effectively, the entire network is trying to connect to you all at once.

@Stebalien (Member)

Closing, as this could have been one of many issues and the DHT has been (mostly) rewritten in 0.5.0.

Please comment if this is still an issue and I'll reopen.

@Stebalien Stebalien removed the need/author-input Needs input from the original author label Mar 11, 2021