A gossip spy node will OOM if zeros are shoveled at it. #8175

Closed
mvines opened this issue Feb 8, 2020 · 1 comment · Fixed by #8328
Labels
security Pull requests that address a security vulnerability

Comments

@mvines
Member

mvines commented Feb 8, 2020

STR:

  1. $ solana-gossip spy --gossip-port 1234
  2. $ dd if=/dev/zero bs=1232 > /dev/udp/127.0.0.1/1234

Within a minute or two, the spy node will be killed by the kernel.

cc: #5414

@sagar-solana
Contributor

I spent some time debugging this.

At first it looked like the calls to "verify" the incoming gossip messages were causing the recycler to blow up. After further inspection, it became clear that if the channel's consumer takes even a few nanoseconds per item, the socket receiver is able to collect items at a much higher rate than they are consumed.
This leads to the Recycler making new allocations instead of recycling old ones.

It is not a memory leak, just resource exhaustion: the channel's consumer isn't fast enough to keep up with the receiver.
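The effect described above can be illustrated with a small deterministic simulation. This is only a sketch: `Pool` is a hypothetical stand-in for a buffer recycler (not Solana's actual `Recycler`), and `simulate` with its per-tick rates is an assumption made for illustration. It shows that when the producer outpaces the consumer, almost every request has to be served by a fresh allocation because no buffers have been returned yet.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a buffer recycler; not Solana's actual `Recycler`.
struct Pool {
    free: Vec<Vec<u8>>,
    fresh_allocations: usize,
}

impl Pool {
    fn new() -> Self {
        Pool { free: Vec::new(), fresh_allocations: 0 }
    }

    /// Reuse a returned buffer if one is available, otherwise allocate a new one.
    fn allocate(&mut self, size: usize) -> Vec<u8> {
        match self.free.pop() {
            Some(buf) => buf,
            None => {
                self.fresh_allocations += 1;
                vec![0u8; size]
            }
        }
    }

    /// Hand a buffer back so a later `allocate` can reuse it.
    fn recycle(&mut self, buf: Vec<u8>) {
        self.free.push(buf);
    }
}

/// Run `ticks` steps. In each step the producer enqueues `produced_per_tick`
/// buffers and the consumer processes (and recycles) up to `consumed_per_tick`.
/// Returns (fresh allocations made, buffers still queued at the end).
fn simulate(ticks: usize, produced_per_tick: usize, consumed_per_tick: usize) -> (usize, usize) {
    let mut pool = Pool::new();
    let mut queue: VecDeque<Vec<u8>> = VecDeque::new();
    for _ in 0..ticks {
        for _ in 0..produced_per_tick {
            queue.push_back(pool.allocate(1232));
        }
        for _ in 0..consumed_per_tick {
            if let Some(buf) = queue.pop_front() {
                pool.recycle(buf);
            }
        }
    }
    (pool.fresh_allocations, queue.len())
}

fn main() {
    // Consumer keeps up: one buffer is recycled over and over.
    let (allocs, queued) = simulate(1000, 1, 1);
    println!("balanced: {} allocations, {} queued", allocs, queued);

    // Producer is 10x faster: allocations and the queue grow without bound.
    let (allocs, queued) = simulate(1000, 10, 1);
    println!("overrun: {} allocations, {} queued", allocs, queued);
}
```

In the overrun case nothing is leaked in the strict sense: every buffer is still reachable from the queue. The memory simply grows for as long as the input rate exceeds the processing rate.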

Suggested fix:
In Gossip, perform a greedy receive on the receiver. If the number of items exceeds some computed limit (number of nodes * expected messages per node * expected messages per second), use stakes to drop lower-staked messages. If no stakes are known, drop items at random.

While that approach seems simple enough, it needs to be efficient, otherwise there will be no improvement. Figuring out the stakes and which ones are "lower" will require 2 passes over the incoming packets, or will need to cache some of the data, possibly increasing memory consumption (by a negligible amount).
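A minimal sketch of the suggested fix follows, under stated assumptions. `Packet`, `compute_limit`, `greedy_recv`, and `shed_load` are hypothetical names that are not taken from the Solana code base, and the stake table is assumed to be a plain `HashMap` keyed by a numeric sender id. To avoid a full sort, the sketch uses the standard library's `select_nth_unstable_by_key`, which partitions the batch around the cut-off stake in linear time on average. The random fallback uses a small xorshift generator only to keep the example free of external crates.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical packet type; only the sender id matters for this sketch.
#[derive(Debug)]
struct Packet {
    from: u64,
    data: Vec<u8>,
}

/// Limit as described in the issue:
/// nodes * expected messages per node * expected messages per second.
fn compute_limit(num_nodes: usize, msgs_per_node: usize, msgs_per_sec: usize) -> usize {
    num_nodes.saturating_mul(msgs_per_node).saturating_mul(msgs_per_sec)
}

/// "Greedy receive": drain everything currently queued without blocking.
fn greedy_recv(receiver: &Receiver<Packet>) -> Vec<Packet> {
    receiver.try_iter().collect()
}

/// Trim `packets` to at most `limit` entries. With known stakes, the
/// highest-staked senders are kept; with no stakes, a random subset is kept.
fn shed_load(packets: &mut Vec<Packet>, stakes: &HashMap<u64, u64>, limit: usize, seed: u64) {
    if packets.len() <= limit {
        return;
    }
    if limit == 0 {
        packets.clear();
        return;
    }
    if stakes.is_empty() {
        // Partial Fisher-Yates shuffle: move a random sample into the first
        // `limit` slots. xorshift64 is used for illustration only.
        let mut state = seed | 1; // xorshift state must be non-zero
        for i in 0..limit {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            let j = i + (state % (packets.len() - i) as u64) as usize;
            packets.swap(i, j);
        }
    } else {
        // After this call, every packet before index `limit` has a stake
        // greater than or equal to the stake of the packet at `limit`.
        packets.select_nth_unstable_by_key(limit, |p| {
            Reverse(stakes.get(&p.from).copied().unwrap_or(0))
        });
    }
    packets.truncate(limit);
}

fn main() {
    let (sender, receiver) = channel();
    for from in 0..10u64 {
        sender.send(Packet { from, data: vec![0u8; 8] }).unwrap();
    }
    // Assumed stake table: sender `id` holds a stake of `id * 100`.
    let stakes: HashMap<u64, u64> = (0..10u64).map(|id| (id, id * 100)).collect();

    let mut batch = greedy_recv(&receiver);
    shed_load(&mut batch, &stakes, compute_limit(3, 1, 1), 42);

    let mut kept: Vec<u64> = batch.iter().map(|p| p.from).collect();
    kept.sort_unstable();
    let bytes: usize = batch.iter().map(|p| p.data.len()).sum();
    println!("kept senders {:?} ({} bytes)", kept, bytes);
}
```

The selection step looks up a stake on every comparison. If those lookups turn out to be expensive, caching each packet's stake alongside it in a first pass would be the "cache some of the data" variant mentioned above.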

@mvines mvines modified the milestones: Tofino v0.23.3, Tofino v0.23.4 Feb 12, 2020
@mvines mvines modified the milestones: Tofino v0.23.5, Tofino v0.23.6 Feb 15, 2020
@leoluk leoluk added the security Pull requests that address a security vulnerability label Sep 16, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2022