Validator can't handle alive accounts spanning just hours-worth slots #8931

ryoqun · 2020-03-18T11:47:07Z

Problem

Currently, a single created-and-forgotten alive account in a slot keeps one AccountsDB AppendVec (and therefore one mmap) alive indefinitely. This means a validator can handle only about 65530 (the default /proc/sys/vm/max_map_count) such slots.
So an attacker just needs to send a small rent-exempt (does it need to be?) amount of lamports to a new random account in each new slot, for around 70000 slots (~10 hours).

The cluster then dies unless validators are specifically configured otherwise (the docs don't currently advise doing so), meaning this is a DoS vulnerability.
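
To make the time scale concrete, here is a rough back-of-the-envelope sketch. It assumes exactly one mapping stays alive per slot and that nothing else in the process uses mappings (in practice other mappings exist, so the limit would be hit sooner). The function name and the two slot durations are assumptions for illustration, not measured values.

```rust
/// Seconds until `limit` mappings are used up, assuming every slot leaves
/// exactly one mapping alive and a new slot arrives every `slot_ms` ms.
fn seconds_until_exhaustion(limit: u64, slot_ms: u64) -> u64 {
    limit * slot_ms / 1000
}

fn main() {
    // Default /proc/sys/vm/max_map_count quoted in the problem statement.
    let limit = 65_530;
    // Assumed slot durations, for illustration only.
    for slot_ms in [400u64, 500] {
        let secs = seconds_until_exhaustion(limit, slot_ms);
        println!(
            "{} ms slots: about {:.1} hours until the limit is reached",
            slot_ms,
            secs as f64 / 3600.0
        );
    }
}
```

Under either assumed slot duration the limit is reached within hours, not days, which is the point of the issue title.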

We previously encountered this error in #5432, but it wasn't regarded as important at the time because it was caused by unrooted banks.

However, as the unit test code and the integration test in #8932 demonstrate, this threat is real.

Also, an ad-hoc test is running here: https://metrics.solana.com:3000/d/V5LPmn_Zk/testnet-monitor-edge-ryoqun?orgId=2&from=now-3h&to=now&var-datasource=Solana%20Metrics%20(read-only)&var-testnet=testnet-dev-ryoqun&var-hostid=All

Proposed Solution

Increasing /proc/sys/vm/max_map_count is one mitigation. Indeed, various other well-known mmap-based middleware products do so (see refs).
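
As a minimal sketch of how a validator-side check for this setting might look, the snippet below reads the current limit on Linux and compares it against a target. The target of 262144 is simply borrowed from the Elasticsearch entry in the refs as an illustrative value, and the helper name is hypothetical; this is not what sys-tuner actually does.

```rust
use std::fs;

const MAX_MAP_COUNT_PATH: &str = "/proc/sys/vm/max_map_count";

/// Parses the contents of the sysctl file (a decimal number plus newline).
fn parse_max_map_count(raw: &str) -> Option<u64> {
    raw.trim().parse().ok()
}

fn main() {
    // Illustrative target only, taken from the Elasticsearch entry in refs.
    let wanted: u64 = 262_144;
    match fs::read_to_string(MAX_MAP_COUNT_PATH) {
        Ok(raw) => match parse_max_map_count(&raw) {
            Some(current) if current >= wanted => {
                println!("vm.max_map_count = {} (ok)", current)
            }
            Some(current) => {
                println!("vm.max_map_count = {} is below {}", current, wanted)
            }
            None => eprintln!("unexpected contents: {:?}", raw),
        },
        // Expected on non-Linux systems, where the file does not exist.
        Err(err) => eprintln!("cannot read {}: {}", MAX_MAP_COUNT_PATH, err),
    }
}
```

Raising the value itself requires root privileges, which is why it fits a privileged helper better than the validator process.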

Nevertheless, to mitigate attacks spanning days, I think we need to introduce LRU eviction of old AppendVecs, or a similar mechanism, to avoid remotely controllable unbounded mmap usage.
Further, a background service that aggregates old slots is also conceivable, but I don't know whether the added complexity is justified for such cases on top of LRU eviction.
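
The LRU idea could be sketched roughly as follows. This only tracks which slots hold an open mapping and tells the caller which one to unmap; `MappedSlots` and `touch` are hypothetical names, not part of AccountsDB, and a real implementation would also need to re-map an evicted AppendVec on demand. A linear scan is used to keep the sketch short.

```rust
use std::collections::VecDeque;

/// Hypothetical tracker of slots that currently hold an open mmap.
struct MappedSlots {
    capacity: usize,
    /// Front is the least recently used slot, back the most recently used.
    order: VecDeque<u64>,
}

impl MappedSlots {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, order: VecDeque::with_capacity(capacity + 1) }
    }

    /// Records an access to `slot`. Returns the slot whose mapping should be
    /// released if this access pushed the tracker over its capacity.
    fn touch(&mut self, slot: u64) -> Option<u64> {
        if let Some(pos) = self.order.iter().position(|&s| s == slot) {
            // Already mapped: move it to the most-recently-used end.
            let _ = self.order.remove(pos);
            self.order.push_back(slot);
            return None;
        }
        self.order.push_back(slot);
        if self.order.len() > self.capacity {
            self.order.pop_front()
        } else {
            None
        }
    }
}

fn main() {
    let mut mapped = MappedSlots::new(2);
    for slot in [1u64, 2, 1, 3] {
        if let Some(evicted) = mapped.touch(slot) {
            println!("access to slot {}: unmap slot {}", slot, evicted);
        }
    }
}
```

With a fixed capacity, the number of live mappings no longer depends on how many slots an attacker manages to keep alive.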

refs

128000: MongoDB: https://docs.mongodb.com/manual/administration/production-checklist-operations/
262144: ElasticSearch: https://www.elastic.co/guide/en/elasticsearch/reference/master/docker.html#docker-prod-prerequisites
524288: Varnish: https://image.slidesharecdn.com/fastlyvarnishnycmeetupfinal-140804141851-phpapp02/95/fastly-inaugural-nyc-varnish-meetup-25-638.jpg?cb=1407162101

sakridge · 2020-03-18T17:22:43Z

We could add to docs and sys-tuner in the short-term.

sakridge · 2020-05-15T01:10:56Z

@ryoqun Want to prioritize this? Maybe increase rent as a solution.

ryoqun · 2020-05-18T23:43:54Z

@sakridge Yeah I think we want to do so. :)

So how about closing this issue (because it is fixed by #8940 and #9527) and creating another issue titled "Increase rent" or something like that?

ryoqun added the security label (Pull requests that address a security vulnerability) Mar 18, 2020

ryoqun mentioned this issue Mar 18, 2020

[wip] Too many alive slots #8932

Closed

mvines added this to the v1.1.0 milestone Mar 18, 2020

sakridge mentioned this issue Mar 18, 2020

Increase vmmap count in sys-tuner #8940

Merged

mvines modified the milestones: v1.1.0, v1.2.0 Mar 30, 2020

ryoqun mentioned this issue Apr 6, 2020

[wip][abi-incompat.] Introduce background stale slot coalescence #9319

Closed

This was referenced May 19, 2020

Introduce eager rent collection #9527

Merged

Fix another unstable test after eager rent #10120

Merged

mvines modified the milestones: v1.2.0, v1.3.0 May 21, 2020

ryoqun mentioned this issue Jun 1, 2020

Update docs for eager rent collection #10348

Merged

sakridge closed this as completed Jul 17, 2020
