Blocks (CAR) of firehose commits could be missed #186
Full stacktrace with
|
Let's isolate from the feed-generator and try to reproduce it with this example: https://github.com/MarshalX/atproto/blob/main/examples/firehose/process_commits.py i changed the 90th line from this:
to this: client = FirehoseSubscribeReposClient(params, 'wss://bsky.network/xrpc') and ran it locally on a Mac; the error doesn't happen, at least at startup. Could you try it too? |
the source of error comes from iroh-car lib: but the main question, for now, is what the value of "commit.blocks" passed to |
I tried running the firehose for a bit and yep, it looks like an empty binary string. These are the arguments in the commit that broke it:
I get the same results when doing CAR.from_bytes(b''): |
my firehose example still running fine. idk what kind of commit you are receiving :( |
That's pretty strange! My code is a little different for handling the firehose. I removed saving the cursor, and it certainly doesn't continually update the client's params (this was back before we had a wonderful rust library to speed up the python sdk.) Could that be the issue? There's also only one worker thread (because of AWS limitations, can't use multiprocessing.Queue) and stuff is passed from the main thread to the worker with a multiprocessing.Pipe object instead. I'd be surprised if that's an issue, though, as I bet Queue uses pipes to communicate between threads too... I'll try running the minimal example linked above (process_commits.py) for a while and see if I can reproduce the issue on my machine. I at least now have a bugfix that stops |
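For readers wondering what the single-worker hand-off described above might look like, here is a minimal sketch. It is not the commenter's actual code: the names `worker` and `run_with_pipe` are hypothetical, the worker is assumed to be a `threading.Thread`, and `None` is used as an assumed shutdown sentinel.

```python
import threading
from multiprocessing import Pipe


def worker(conn, handle):
    """Receive messages from the pipe until the None sentinel arrives."""
    while True:
        message = conn.recv()
        if message is None:  # assumed shutdown sentinel, not part of any library API
            break
        handle(message)


def run_with_pipe(messages, handle):
    """Feed messages from the calling thread to one worker thread through a Pipe."""
    # duplex=False returns (receiving end, sending end)
    recv_conn, send_conn = Pipe(duplex=False)
    thread = threading.Thread(target=worker, args=(recv_conn, handle))
    thread.start()
    for message in messages:
        send_conn.send(message)
    send_conn.send(None)  # tell the worker to stop
    thread.join()
    send_conn.close()
    recv_conn.close()
```

In a real firehose consumer, the loop over `messages` would be replaced by the client's message callback calling `send_conn.send(...)`.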
It didn't crash after 150 minutes, so I assume the issue has something to do with my specific code (maybe the cursor stuff? not sure.) Either way, thanks for the help! ^^ |
We stumbled upon the same error and, for now, work around it by just catching the panic (which is raised as a BaseException and not a normal Exception). This way the malformed message just gets ignored, and so far it has been running stably for 3 days.
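A rough sketch of this kind of workaround is shown below. It is not the commenter's actual code: `decode_or_skip` and `FakePanic` are hypothetical names, and the decoder is passed in as a parameter so the example runs without the SDK. It relies only on the point made above, that the panic surfaces as a `BaseException` subclass.

```python
from typing import Callable, Optional


class FakePanic(BaseException):
    """Hypothetical stand-in for the panic raised by the Rust extension."""


def decode_or_skip(blocks: bytes, decode: Callable[[bytes], object]) -> Optional[object]:
    """Run decode(blocks); return None instead of crashing on a panic."""
    try:
        return decode(blocks)
    except (KeyboardInterrupt, SystemExit):
        # Never swallow interpreter-level shutdown signals.
        raise
    except BaseException as exc:
        # A plain `except Exception` would not catch a BaseException subclass.
        print(f"skipping malformed commit: {exc!r}")
        return None


def always_panics(blocks: bytes) -> object:
    """Simulates a decoder that panics on every input."""
    raise FakePanic("decode failed")
```

In the setting of this thread, `decode` would be `CAR.from_bytes`. Re-raising `KeyboardInterrupt` and `SystemExit` keeps Ctrl+C and normal shutdown working despite the broad catch.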
|
Could you try to replace try-catch with "if not commit.blocks: return" pls |
This is almost exactly what I've been doing (and the feed is stable), may be a bit safer than having a catch all for all kinds of exception:
|
@emilyhunt Did you have to change anything to not do multiprocessing? I think I may have that same limitation and my firehose has not been acting right in production. |
Do you mean with not using a |
pls add if statement like i did in the updated firehose example. feed generator repo has been updated too. i am closing it, thanks 42b74d4 |
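The guard suggested in this thread can be sketched as follows. This is an illustration under assumptions, not the code from the updated example: `Commit` is a hypothetical stand-in for the firehose commit message, `decode_commit_blocks` is a made-up helper name, and the decoder is injected so the snippet runs without the SDK.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Commit:
    """Hypothetical stand-in for a firehose commit; only `blocks` matters here."""
    blocks: bytes


def decode_commit_blocks(commit: Commit, decode: Callable[[bytes], object]) -> Optional[object]:
    """Skip commits with empty blocks instead of handing b'' to the decoder."""
    if not commit.blocks:
        # Empty binary string: nothing to decode, so ignore this commit.
        return None
    return decode(commit.blocks)
```

With the real SDK, `decode` would be `CAR.from_bytes`. Unlike a catch-all exception handler, this check only skips the one case known to fail (empty `blocks`) and lets any other error surface.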
Hello from 2024. I am not satisfied with this at all. It is getting worse. Gonna reopen it. We need to find the real root of the problem and do something with it. I've started and got something around here bluesky-social/atproto#2893 |
One step closer. My PR has been merged. The investigation continues here: bluesky-social/indigo#780 |
As of today, I switched the astronomy feeds' FirehoseSubscribeReposClient to use the new BGS at wss://bsky.network/xrpc. I am getting intermittent errors on a very small fraction of posts when atproto.CAR.from_bytes is called on certain commits. This happens once every ~10 minutes (so must be caused by only a small fraction of posts); otherwise, the feed is running fine.
This is on Ubuntu 20.04, with Python 3.11.4 and atproto 0.0.30. This is the commit in the GitHub repo where issues started happening.
Stack trace of the error:
I've added more logging on the feed and set RUST_BACKTRACE=1. Will update this issue if I can work out which commits are causing the problem.