Retry hash file allocation #33565

HaoranYi · 2023-10-06T16:37:28Z

Problem

Recently, on Oct 6, 2023, we saw a validator crash on one of the canary boxes.

[2023-10-06T06:29:48.605280562Z INFO  solana_metrics::metrics] datapoint: accounts_db_active shrink=0i
[2023-10-06T06:29:48.653273411Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solAccountsLo03" one=1i message="panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 12, kind: OutOfMemory, message: \"Cannot allocate memory\" }', accounts-db/src/accounts_hash.rs:102:34" location="accounts-db/src/accounts_hash.rs:102:34" version="1.18.0 (src:e0091d69; feat:1091887072, client:SolanaLabs)"

The crash happens when allocating an account hash file during hash dedup.

[2023-10-06T06:29:31.721389079Z INFO  solana_metrics::metrics] datapoint: memory-stats total=269907476480i swap_total=0i free_percent=3.6710358146504354 used_bytes=39749242880i avail_percent=85.27301155255498 buffers_percent=0.6657022085615106 cached_percent=79.78760422961368 swap_free_percent=0
[2023-10-06T06:29:36.744595335Z INFO  solana_metrics::metrics] datapoint: memory-stats total=269907476480i swap_total=0i free_percent=5.081942011716148 used_bytes=35619844096i avail_percent=86.80294278597377 buffers_percent=0.6663881443585271 cached_percent=79.90525739437273 swap_free_percent=0
[2023-10-06T06:29:41.745857461Z INFO  solana_metrics::metrics] datapoint: memory-stats total=269907476480i swap_total=0i free_percent=5.645556638416836 used_bytes=35985874944i avail_percent=86.66732933325522 buffers_percent=0.6665914970063151 cached_percent=79.21095985788381 swap_free_percent=0
[2023-10-06T06:29:46.768160791Z INFO  solana_metrics::metrics] datapoint: memory-stats total=269907476480i swap_total=0i free_percent=0.6511670469158839 used_bytes=37269983232i avail_percent=86.19157063818434 buffers_percent=0.6666142603624108 cached_percent=83.64288057827423 swap_free_percent=0

Between the last two samples before the crash, used memory increases by roughly 1.3 GB (35985874944 to 37269983232 bytes) and free memory drops to 0.65 percent.

It appears that the box still has enough physical memory, yet the allocation fails. This is probably due to memory fragmentation combined with multiple threads allocating mmaps at the same time.

Summary of Changes

Account hash calculation is highly parallel. It seems that when all threads start allocating files to store account hashes at the same time, the system may hit OOM for some of those allocations. To mitigate that, we add a retry for account hash file allocation. If a thread fails its allocation, it sleeps for some time and then retries. This helps stagger the memory allocations across the parallel threads.

Hopefully, the kernel has time to defragment memory between retries, so that a later allocation attempt succeeds.
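
For illustration only, the retry-with-sleep idea can be sketched as below. This is not the code in this PR: `retry_alloc`, `max_attempts`, and `delay` are hypothetical names, and the allocation itself is passed in as a closure rather than being the real mmap call in accounts_hash.rs.

```rust
use std::io;
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: calls `alloc` until it succeeds or `max_attempts`
/// is exhausted, sleeping `delay` after each failed attempt.
/// `alloc` is always called at least once.
fn retry_alloc<T>(
    max_attempts: usize,
    delay: Duration,
    mut alloc: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match alloc() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                attempt += 1;
                // Back off so other threads' allocations can complete first.
                sleep(delay);
            }
        }
    }
}
```

Taking the allocation as a closure means the retry path could be exercised without real memory pressure, for example by passing a closure that returns `io::ErrorKind::OutOfMemory` a fixed number of times before succeeding.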

Fixes #

brooksprumo · 2023-10-06T17:07:09Z

Hopefully, the kernel has defrag the memory and retry allocation will succeed.

I'm not familiar with this part of the kernel. How likely is it that (1) fragmentation was the issue, and (2) that a defrag will run before we've hit our retry limit?

brooksprumo · 2023-10-06T17:11:45Z

Is there any way to test this? I can't think of anything deterministic off the top of my head. But that would be nice...

codecov · 2023-10-06T17:42:20Z