Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic: putting same entry twice not supported #521

Closed
AskAlexSharov opened this issue Feb 20, 2023 · 18 comments · Fixed by #522
Closed

panic: putting same entry twice not supported #521

AskAlexSharov opened this issue Feb 20, 2023 · 18 comments · Fixed by #522

Comments

@AskAlexSharov
Copy link

Screenshot 2023-02-20 at 13 57 54

@vyzo
Copy link
Collaborator

vyzo commented Feb 20, 2023

Ouch, seems our newly incorporated cache has a bug.

@Wondertan
Copy link
Contributor

Our nodes are crashing because of this, and we need this fixed ASAP. If no one is working on this, we can take it,

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

sure, go ahead.

Correct behaviour is to ignore the message.

@Wondertan
Copy link
Contributor

Wondertan commented Feb 21, 2023

The Has on cache is checked here with correct synchronization. However, false from Has can be returned even if the message already in the cache, because the message is outdated(time since first seen > TTL)

@Wondertan
Copy link
Contributor

Wondertan commented Feb 21, 2023

We can either:

  • Simply don't panic and ignore the message
  • Or sweep the message in place on Has when the time since first seen > TTL
    • This way panic will never be triggered

@Wondertan
Copy link
Contributor

I am fine with both, but my intuition inclines toward the second option. WDYT, @vyzo

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

just ignore the message i would say, panic is very bad.

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

dont do the sweep in Has, will need xlock.

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

Also, we need to implement background sweeping at some point, for both cache impls, as this eager sweep business has implications for locking.

@Wondertan
Copy link
Contributor

There are many places where processLoop is locked unnecessarily :)
Network.Connectedness can look it for a while

@Wondertan
Copy link
Contributor

Wondertan commented Feb 21, 2023

dont do the sweep in Has, will need xlock.

I know, but it seemed fine. Though background sweeping is a better approach so, let's just remove panic for now, as you've said

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

yeah, although we try to make it fast.

But point taken, ok, lets fix Has.

But please do remove the panic, it needs to ignore the message (and maybe log something in debug).

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

We are racing ;)

To summarize:
Lets fix Has while at it.
If you feel so inclined, feel free to do bg sweeping.

Other than that, the panic must go.

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

More generally, if we are using bg sweeping, we can greatly simplify everything.

We dont need queues, just a map of mid to expiry.
Has shouldnt even check time, just map presence.
And bg sweeping just clears everything expired.

We can and should do for both implementations.

@Wondertan
Copy link
Contributor

Ok, bg sweeping can be done in a separate PR. Let's remove the panic as a start: #522.

@Wondertan
Copy link
Contributor

@vyzo, mind releasing a patch?

@vyzo
Copy link
Collaborator

vyzo commented Feb 21, 2023

We certainly can, but i was thinking of doing bg sweeping first (unless you want to do it).

@Wondertan
Copy link
Contributor

Fair, no need to keep that tech debt around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants