solana-validator leaks memory (but at very slow pace) #14366

ryoqun · 2020-12-31T04:37:32Z

Problem

solana-validator (tested v1.4.19) definitely leaks memory needing periodic restart of once per a week or so.

The pace seems stable across nodes at the rate of 1-2G/day.

Proposed Solution

Debug.

we don't know this existed on the v1.3 line as well.
But this leak is observed from both RPC and non-RPC nodes.
All, the leak happening on RssAnon. This excludes AppendVec (mmap) as it's accounted under RssFile

So, remaining culprits: gossip, blockstore, runtime, rocksdb, etc.

For runtime, blockstore, I think we can just run loong ledger-tool verify session.

CC: @carllin

The text was updated successfully, but these errors were encountered:

ryoqun · 2020-12-31T05:52:43Z

All, the leak happening on RssAnon. This excludes AppendVec (mmap) as it's accounted under RssFile
Well, I was lying. AccoundsDB is leaking AppendVec as well. This might be the root cause...

Two validators' (which are started at different timings) AppendVec/snapshot store entry count is increasing likewise when the timescale is aligned, indicating reproducibility. Restarting seems reduce AppendVec count.

ryoqun · 2020-12-31T16:54:43Z

what bothering me a bit is that accountdb leak (unit: mmap count) and rssanon leak (unit: bytes) doesn't correlate well. we should expect roughtly same rssanon leak per mmap (if any), so it should correlate. Hmm.

RssAnon really looks straight. maybe there is yet another leak source?

ryoqun · 2021-01-04T15:02:58Z

status update: I've run another experiment by hourly executing Bank::exhaustively_.. inside solana-validator (this should basically be equivalent to solana-ledger verify ...) and found that this alleviated the pace of leak abit but the leak still is happening at steady state; and just restarting valdiator reduced appendvec count..
no conclusive clue yet; but AccountsDB and some other are leaking.

leoluk · 2021-01-05T19:30:39Z

Not that slow on tds with 1.4.20:

Anon pages as well:

(accounts not on a tmpfs)

ryoqun · 2021-01-06T05:17:02Z

I'm getting clue, it seems that these are outstanding mem leaks:

~230MB (47M => 272M) PinnedVec leak with duration of about ~2 hours.

EDIT: well, this could be a false alarm; the peak memory doesn't tell this as a definitive leak. maybe only validator session B is killed while processing actively?

Validator session A (run for 10499sec against mainnet-beta):

(1,67GB peak memory consumed over 3475383 calls from)
...
47.40MB leaked over 3475383 calls from
  alloc_raw_vec_finish_grow_h26e00165dd5a25df
  alloc_raw_vec_RawVec_T_A_grow_amortized_h7987cbf1907e2ec3
  alloc_raw_vec_RawVec_T_A_try_reserve_hb4c832486ee65910
  alloc_raw_vec_RawVec_T_A_reserve_hb9061ae855f6838d
  alloc_vec_Vec_T_A_reserve_h24b35d80fdc3c2ec
  solana_perf_cuda_runtime_PinnedVec_T_reserve_and_pin_h2455234a841a6b82
  solana_perf_packet_Packets_new_with_recycler_ha8355c0dd6beae8f
  solana_streamer_streamer_recv_loop_hba901ce15171c4c1
  solana_streamer_streamer_receiver_closure_hd2ad5bcdb43fe168
  std_sys_common_backtrace_rust_begin_short_backtrace_h6c4abc2617c6acd6
  std_thread_Builder_spawn_unchecked_closure_closure_h638166842d896381
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h731bac9f8ae6c0bd
  std_panicking_try_do_call_hc95c6398442b3f6d
  std_panicking_try_hcf56f7d0552102ce
  std_panic_catch_unwind_hb71a283413e2b9a8
  std_thread_Builder_spawn_unchecked_closure_hced44ecbbfcb1434
  core_ops_function_FnOnce_call_once_vtable_shim_h53f87cc1cbe0d172
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

Validator session B (run for 17822 secs against mainnet-beta):

(900,99MB peak memory consumed over 6105123 calls from)
...
272.40MB leaked over 6105123 calls from
  alloc_raw_vec_finish_grow_h26e00165dd5a25df
  alloc_raw_vec_RawVec_T_A_grow_amortized_h7987cbf1907e2ec3
  alloc_raw_vec_RawVec_T_A_try_reserve_hb4c832486ee65910
  alloc_raw_vec_RawVec_T_A_reserve_hb9061ae855f6838d
  alloc_vec_Vec_T_A_reserve_h24b35d80fdc3c2ec
  solana_perf_cuda_runtime_PinnedVec_T_reserve_and_pin_h2455234a841a6b82
  solana_perf_packet_Packets_new_with_recycler_ha8355c0dd6beae8f
  solana_streamer_streamer_recv_loop_hba901ce15171c4c1
  solana_streamer_streamer_receiver_closure_hd2ad5bcdb43fe168
  std_sys_common_backtrace_rust_begin_short_backtrace_h6c4abc2617c6acd6
  std_thread_Builder_spawn_unchecked_closure_closure_h638166842d896381
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h731bac9f8ae6c0bd
  std_panicking_try_do_call_hc95c6398442b3f6d
  std_panicking_try_hcf56f7d0552102ce
  std_panic_catch_unwind_hb71a283413e2b9a8
  std_thread_Builder_spawn_unchecked_closure_hced44ecbbfcb1434
  core_ops_function_FnOnce_call_once_vtable_shim_h53f87cc1cbe0d172
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

ryoqun · 2021-01-06T05:26:43Z

another very outstanding PinnedVec memory use (I dunno this is legitimate; but this is the top 2 consumer of validator heap memory, comprising roughly 50%!):

Validator session A (out of total memory leaked: 6.76GB):

2.18GB leaked over 3380117 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  alloc_alloc_Global_alloc_impl_h6e8f952ef784c80f
  alloc_alloc_Global as core_alloc_Allocator_allocate_hcd5aa98f8fdb82d8
  alloc_raw_vec_RawVec_T_A_allocate_in_h818863092046a12c
  alloc_raw_vec_RawVec_T_A_with_capacity_in_h3a765724eed00dc0
  alloc_vec_Vec_T_A_with_capacity_in_ha008ade098d67d69
  alloc_vec_Vec_T_with_capacity_hd2ecc761f1daf365
  solana_perf_cuda_runtime_PinnedVec_T_with_capacity_h5dbe6aefe33e2adf
  solana_perf_packet_Packets as core_default_Default_default_h4342c11fecffb599
  solana_perf_packet_to_packets_chunked_hfafd8f8486b85a43
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_verify_votes_h7d3ddbeff06cdcdc
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_recv_loop_h91e5dda1a5fc3fd7
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_new_closure_h3e1d03dbb6524a56
  std_sys_common_backtrace_rust_begin_short_backtrace_h5cce6b87eca7c9a3
  std_thread_Builder_spawn_unchecked_closure_closure_he045a66354cc8797
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h42d6191854d9b03c
  std_panicking_try_do_call_ha5f2c90129a5ee4b
  std_panicking_try_hae34b99540f9c3c6
  std_panic_catch_unwind_h27769507ffb95343
  std_thread_Builder_spawn_unchecked_closure_hd1a7253f0ccbfa41
  core_ops_function_FnOnce_call_once_vtable_shim_hdcfce3903c63d73d
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

1.13GB leaked over 113 calls from
  alloc_alloc_realloc_h3bb58a04aa0093ca
  alloc_alloc_Global_grow_impl_h5a622a115c5de61d
  alloc_alloc_Global as core_alloc_Allocator_grow_h28f68d43e02db30b
  alloc_raw_vec_finish_grow_h8053b204cf294204
  alloc_raw_vec_RawVec_T_A_grow_amortized_h030f13e40c75802b
  alloc_raw_vec_RawVec_T_A_try_reserve_h9238ce87d53e24f9
  alloc_raw_vec_RawVec_T_A_reserve_hde62ac898ea7be48
  alloc_vec_Vec_T_A_reserve_hbaaa39a6d2daa347
  alloc_vec_Vec_T_A_push_h15d09b8fa551ae7d
  solana_perf_cuda_runtime_PinnedVec_T_push_h20af81875f38d657
  solana_core_cluster_info_ClusterInfo_handle_pull_requests_h9daa2e58efe9890f
  solana_core_cluster_info_ClusterInfo_handle_batch_pull_requests_h22b763c82a7fad3b
  solana_core_cluster_info_ClusterInfo_process_packets_ha400200fdda1ab71
  solana_core_cluster_info_ClusterInfo_run_listen_h9d26647f78aabc30
  solana_core_cluster_info_ClusterInfo_listen_closure_h0261f2d147aae602
  std_sys_common_backtrace_rust_begin_short_backtrace_h629427e066150c45
  std_thread_Builder_spawn_unchecked_closure_closure_h43d3ede5b3af3248
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h66fd89bcad0e645f
  std_panicking_try_do_call_hf7267000084bddf2
  std_panicking_try_hfb87fb2d879d069c
  std_panic_catch_unwind_hddf958bf20e4592c
  std_thread_Builder_spawn_unchecked_closure_h7748ff5da9f6e17c
  core_ops_function_FnOnce_call_once_vtable_shim_h275a6c4e6b4b80b1
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

Validator session B (out of total memory leaked: 8.96GB):

2.28GB leaked over 5782773 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  alloc_alloc_Global_alloc_impl_h6e8f952ef784c80f
  alloc_alloc_Global as core_alloc_Allocator_allocate_hcd5aa98f8fdb82d8
  alloc_raw_vec_RawVec_T_A_allocate_in_h818863092046a12c
  alloc_raw_vec_RawVec_T_A_with_capacity_in_h3a765724eed00dc0
  alloc_vec_Vec_T_A_with_capacity_in_ha008ade098d67d69
  alloc_vec_Vec_T_with_capacity_hd2ecc761f1daf365
  solana_perf_cuda_runtime_PinnedVec_T_with_capacity_h5dbe6aefe33e2adf
  solana_perf_packet_Packets as core_default_Default_default_h4342c11fecffb599
  solana_perf_packet_to_packets_chunked_hfafd8f8486b85a43
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_verify_votes_h7d3ddbeff06cdcdc
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_recv_loop_h91e5dda1a5fc3fd7
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_new_closure_h3e1d03dbb6524a56
  std_sys_common_backtrace_rust_begin_short_backtrace_h5cce6b87eca7c9a3
  std_thread_Builder_spawn_unchecked_closure_closure_he045a66354cc8797
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h42d6191854d9b03c
  std_panicking_try_do_call_ha5f2c90129a5ee4b
  std_panicking_try_hae34b99540f9c3c6
  std_panic_catch_unwind_h27769507ffb95343
  std_thread_Builder_spawn_unchecked_closure_hd1a7253f0ccbfa41
  core_ops_function_FnOnce_call_once_vtable_shim_hdcfce3903c63d73d
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

2.08GB leaked over 188 calls from
  alloc_alloc_realloc_h3bb58a04aa0093ca
  alloc_alloc_Global_grow_impl_h5a622a115c5de61d
  alloc_alloc_Global as core_alloc_Allocator_grow_h28f68d43e02db30b
  alloc_raw_vec_finish_grow_h8053b204cf294204
  alloc_raw_vec_RawVec_T_A_grow_amortized_h030f13e40c75802b
  alloc_raw_vec_RawVec_T_A_try_reserve_h9238ce87d53e24f9
  alloc_raw_vec_RawVec_T_A_reserve_hde62ac898ea7be48
  alloc_vec_Vec_T_A_reserve_hbaaa39a6d2daa347
  alloc_vec_Vec_T_A_push_h15d09b8fa551ae7d
  solana_perf_cuda_runtime_PinnedVec_T_push_h20af81875f38d657
  solana_core_cluster_info_ClusterInfo_handle_pull_requests_h9daa2e58efe9890f
  solana_core_cluster_info_ClusterInfo_handle_batch_pull_requests_h22b763c82a7fad3b
  solana_core_cluster_info_ClusterInfo_process_packets_ha400200fdda1ab71
  solana_core_cluster_info_ClusterInfo_run_listen_h9d26647f78aabc30
  solana_core_cluster_info_ClusterInfo_listen_closure_h0261f2d147aae602
  std_sys_common_backtrace_rust_begin_short_backtrace_h629427e066150c45
  std_thread_Builder_spawn_unchecked_closure_closure_h43d3ede5b3af3248
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h66fd89bcad0e645f
  std_panicking_try_do_call_hf7267000084bddf2
  std_panicking_try_hfb87fb2d879d069c
  std_panic_catch_unwind_hddf958bf20e4592c
  std_thread_Builder_spawn_unchecked_closure_h7748ff5da9f6e17c
  core_ops_function_FnOnce_call_once_vtable_shim_h275a6c4e6b4b80b1
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

ryoqun · 2021-01-06T05:58:48Z

another supposedly steady memory increase. (not conclusively leaks):

50MB leak per 2 hours if true

From validator session A:

49.78MB leaked over 134311 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  hashbrown_raw_RawTable_T_new_uninitialized_hc1cc1aef131c8958
  hashbrown_raw_RawTable_T_fallible_with_capacity_hae9dc6ce2849dd15
  hashbrown_raw_RawTable_T_resize_h611fee0f5ddf5b09
  hashbrown_raw_RawTable_T_reserve_rehash_h48355927ea38d0bc
  hashbrown_raw_RawTable_T_reserve_h46c074a3daa7c4e4
  hashbrown_raw_RawTable_T_insert_h4241b356b8ebdea2
  hashbrown_map_HashMap_K_V_S_insert_h1798adbcfaaf0c6a
  std_collections_hash_map_HashMap_K_V_S_insert_hbd7f5b3fafcd801d
  solana_core_cluster_slots_ClusterSlots_insert_node_id_h84b28722a32ccd59
  solana_core_cluster_slots_ClusterSlots_update_internal_h873c459b277f284e
  solana_core_cluster_slots_ClusterSlots_update_h7208012ccf02cdf4
  solana_core_cluster_slots_service_ClusterSlotsService_run_h9e545025a4f45ef9
  solana_core_cluster_slots_service_ClusterSlotsService_new_closure_hb0a066dbddffd586
  std_sys_common_backtrace_rust_begin_short_backtrace_haa4297b7cdc10993
  std_thread_Builder_spawn_unchecked_closure_closure_h613cc1608c5b6671
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h2a9d265dda2c3049
  std_panicking_try_do_call_hfe77d8723e265be2
  std_panicking_try_hae9d694bd176d6b2
  std_panic_catch_unwind_h9e0cf8a53ea16883
  std_thread_Builder_spawn_unchecked_closure_h4babb5ac5a74a080
  core_ops_function_FnOnce_call_once_vtable_shim_h813ef8a1abcc4028
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

From validator session B:

106.87MB leaked over 238165 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  hashbrown_raw_RawTable_T_new_uninitialized_hc1cc1aef131c8958
  hashbrown_raw_RawTable_T_fallible_with_capacity_hae9dc6ce2849dd15
  hashbrown_raw_RawTable_T_resize_h611fee0f5ddf5b09
  hashbrown_raw_RawTable_T_reserve_rehash_h48355927ea38d0bc
  hashbrown_raw_RawTable_T_reserve_h46c074a3daa7c4e4
  hashbrown_raw_RawTable_T_insert_h4241b356b8ebdea2
  hashbrown_map_HashMap_K_V_S_insert_h1798adbcfaaf0c6a
  std_collections_hash_map_HashMap_K_V_S_insert_hbd7f5b3fafcd801d
  solana_core_cluster_slots_ClusterSlots_insert_node_id_h84b28722a32ccd59
  solana_core_cluster_slots_ClusterSlots_update_internal_h873c459b277f284e
  solana_core_cluster_slots_ClusterSlots_update_h7208012ccf02cdf4
  solana_core_cluster_slots_service_ClusterSlotsService_run_h9e545025a4f45ef9
  solana_core_cluster_slots_service_ClusterSlotsService_new_closure_hb0a066dbddffd586
  std_sys_common_backtrace_rust_begin_short_backtrace_haa4297b7cdc10993
  std_thread_Builder_spawn_unchecked_closure_closure_h613cc1608c5b6671
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h2a9d265dda2c3049
  std_panicking_try_do_call_hfe77d8723e265be2
  std_panicking_try_hae9d694bd176d6b2
  std_panic_catch_unwind_h9e0cf8a53ea16883
  std_thread_Builder_spawn_unchecked_closure_h4babb5ac5a74a080
  core_ops_function_FnOnce_call_once_vtable_shim_h813ef8a1abcc4028
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

ryoqun · 2021-01-06T06:39:43Z

another hashbrown (HashMap) suspicious one:

30MB leak per 2 hours if true

A:

42.86MB leaked over 125939 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  hashbrown_raw_RawTable_T_new_uninitialized_hbf62b4c513840750
  hashbrown_raw_RawTable_T_fallible_with_capacity_h2cb2c7e333174c57
  hashbrown_raw_RawTable_T_resize_h13252e14bbd1149f
  hashbrown_raw_RawTable_T_reserve_rehash_h01b9065b2752922d
  hashbrown_raw_RawTable_T_reserve_he0884e694923a683
  hashbrown_raw_RawTable_T_insert_h803df403061dcc17
  hashbrown_map_HashMap_K_V_S_insert_hf3d639ab33766cc2
  std_collections_hash_map_HashMap_K_V_S_insert_h85055d8dc3007680
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_filter_and_confirm_with_new_votes_h7a77086eea0fcfb6
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_listen_and_confirm_votes_h448c3d85d16b1eb0
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_process_votes_loop_h9e0f7d6545ed1179
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_new_closure_hd067ad09352620a3
  std_sys_common_backtrace_rust_begin_short_backtrace_h95d5da4959cff159
  std_thread_Builder_spawn_unchecked_closure_closure_hf2fa6c9721c5db72
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h2a87ffdf409c228d
  std_panicking_try_do_call_ha0bd8d57c16f64bb
  std_panicking_try_h6494b96aec7a7848
  std_panic_catch_unwind_hb82738547a14e8ac
  std_thread_Builder_spawn_unchecked_closure_h256f174acea8f9ba
  core_ops_function_FnOnce_call_once_vtable_shim_h53a3f4a3c1007e8e
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

B:

73.34MB leaked over 216940 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  hashbrown_raw_RawTable_T_new_uninitialized_hbf62b4c513840750
  hashbrown_raw_RawTable_T_fallible_with_capacity_h2cb2c7e333174c57
  hashbrown_raw_RawTable_T_resize_h13252e14bbd1149f
  hashbrown_raw_RawTable_T_reserve_rehash_h01b9065b2752922d
  hashbrown_raw_RawTable_T_reserve_he0884e694923a683
  hashbrown_raw_RawTable_T_insert_h803df403061dcc17
  hashbrown_map_HashMap_K_V_S_insert_hf3d639ab33766cc2
  std_collections_hash_map_HashMap_K_V_S_insert_h85055d8dc3007680
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_filter_and_confirm_with_new_votes_h7a77086eea0fcfb6
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_listen_and_confirm_votes_h448c3d85d16b1eb0
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_process_votes_loop_h9e0f7d6545ed1179
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_new_closure_hd067ad09352620a3
  std_sys_common_backtrace_rust_begin_short_backtrace_h95d5da4959cff159
  std_thread_Builder_spawn_unchecked_closure_closure_hf2fa6c9721c5db72
  std_panic_AssertUnwindSafe_F_ as core_ops_function_FnOnce_LP_RP_call_once_h2a87ffdf409c228d
  std_panicking_try_do_call_ha0bd8d57c16f64bb
  std_panicking_try_h6494b96aec7a7848
  std_panic_catch_unwind_hb82738547a14e8ac
  std_thread_Builder_spawn_unchecked_closure_h256f174acea8f9ba
  core_ops_function_FnOnce_call_once_vtable_shim_h53a3f4a3c1007e8e
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc_boxed_Box_F_A_ as core_ops_function_FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

ryoqun · 2021-01-06T08:46:07Z

status update: I'm testing now against testnet

ryoqun · 2021-01-06T09:21:42Z

solana_core_cluster_slots_ClusterSlots_insert_node_id_h84b28722a32ccd59

solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_filter_and_confirm_with_new_votes_h7a77086eea0fcfb6

I looked both code, but I couldn't find anything smelly quickly. From newbie me in this code, it looks like it's pruning correctly older ClusterSlots and VoteTracker entries.

Hmm, HashMap needs periodic shrink_to_fit? I doubt it.

@carllin does something ring? If heaptrack is right, it says we keep some references to these heap-allocated objects somehow or HashMap having too many elements or HashMap doesn't shrink its capacity after moderate .retain?:

sakridge · 2021-01-06T16:57:12Z

Hmm, HashMap needs periodic shrink_to_fit? I doubt it.

@carllin does something ring? If heaptrack is right, it says we keep some references to these heap-allocated objects somehow or HashMap having too many elements or HashMap doesn't shrink its capacity after moderate .retain?:

I think that might be right if we are not ok with the excess capacity. when I looked both Vec and HashMap I don't think size-down capacity for .resize, .retain .remove etc.

ryoqun · 2021-01-07T04:28:28Z

status update: I'm testing now against testnet

Now I got tds version of heaptrack; also longest run of heaptrack.

Continuing on #14366 (comment):

it seems that PinnedVec or its Recycler continues to leak... I haven't looked the code yet.

4.91GB leaked over 458 calls from
  alloc_alloc_realloc_h3bb58a04aa0093ca
  alloc_alloc_Global_grow_impl_h5a622a115c5de61d
  alloc..alloc..Global as core..alloc..Allocator_grow_h28f68d43e02db30b
  alloc_raw_vec_finish_grow_h521693843e05a20a
  alloc_raw_vec_RawVec_T_A_grow_amortized_h1949ef58f6ba0446
  alloc_raw_vec_RawVec_T_A_try_reserve_h784c56450bdd9b1e
  alloc_raw_vec_RawVec_T_A_reserve_h4f82937dd041a25c
  alloc_vec_Vec_T_A_reserve_h96ea896b37c41c76
  alloc_vec_Vec_T_A_push_h4aa03ecdea73ffa9
  solana_perf_cuda_runtime_PinnedVec_T_push_h802ea5fe799d02bf
  solana_core_cluster_info_ClusterInfo_handle_pull_requests_h7a6992e7b5966709
  solana_core_cluster_info_ClusterInfo_handle_batch_pull_requests_h3f355c136d728d4c
  solana_core_cluster_info_ClusterInfo_process_packets_h0aff03a6db0a96b0
  solana_core_cluster_info_ClusterInfo_run_listen_hb024688c955e655e
  solana_core_cluster_info_ClusterInfo_listen_closure_h6e59cb16f1543135
  std_sys_common_backtrace_rust_begin_short_backtrace_h3b0f02be958f2a9a
  std_thread_Builder_spawn_unchecked_closure_closure_h1cfc9f4df993f27d
  std..panic..AssertUnwindSafe_F_ as core..ops..function..FnOnce_LP_RP_call_once_h9b03ce0b4d9139da
  std_panicking_try_do_call_h5ba1fb73b8d3db43
  std_panicking_try_hd525d1c4662904b4
  std_panic_catch_unwind_h57020ade5973cc18
  std_thread_Builder_spawn_unchecked_closure_h5f931b810fbd5f64
  core_ops_function_FnOnce_call_once_vtable.shim_h1febae6956c944b8
  alloc..boxed..Box_F_A_ as core..ops..function..FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc..boxed..Box_F_A_ as core..ops..function..FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

2.95GB leaked over 40905549 calls from
  alloc_alloc_alloc_h42bff26f59c33bd6
  alloc_alloc_Global_alloc_impl_h6e8f952ef784c80f
  alloc..alloc..Global as core..alloc..Allocator_allocate_hcd5aa98f8fdb82d8
  alloc_raw_vec_RawVec_T_A_allocate_in_hbec8986bcb4e6771
  alloc_raw_vec_RawVec_T_A_with_capacity_in_h488bda26ed159c98
  alloc_vec_Vec_T_A_with_capacity_in_h6946bff660ccb632
  alloc_vec_Vec_T_with_capacity_hb4b49148fd807899
  solana_perf_cuda_runtime_PinnedVec_T_with_capacity_h19467fa98be7cc54
  solana_perf..packet..Packets as core..default..Default_default_h5ab40008ecb917f2
  solana_perf_packet_to_packets_chunked_h52b727f50cb224ac
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_verify_votes_h94ed2643ccf2413b
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_recv_loop_hd4bfe483d6c83604
  solana_core_cluster_info_vote_listener_ClusterInfoVoteListener_new_closure_hc2bbb1d5f3ae4abe
  std_sys_common_backtrace_rust_begin_short_backtrace_h5583620442f4d659
  std_thread_Builder_spawn_unchecked_closure_closure_hfa62a37b8e2cbe9f
  std..panic..AssertUnwindSafe_F_ as core..ops..function..FnOnce_LP_RP_call_once_h0d49f9fc8fa39431
  std_panicking_try_do_call_h6c9ed7abb438b74c
  std_panicking_try_h203875b2fc963079
  std_panic_catch_unwind_h9e38bdfa511d5169
  std_thread_Builder_spawn_unchecked_closure_hacc05c54b81724eb
  core_ops_function_FnOnce_call_once_vtable.shim_h7bd16b4340250e18
  alloc..boxed..Box_F_A_ as core..ops..function..FnOnce_Args_call_once_hea1090dbdcecbf5a
  alloc..boxed..Box_F_A_ as core..ops..function..FnOnce_Args_call_once_h8d5723d3912bd325
  std_sys_unix_thread_Thread_new_thread_start_hc17a425ca2995724
  start_thread
  clone

total runtime_ 58470.06s.
calls to allocation functions_ 20024280554 342470_s_
temporary memory allocations_ 2290735512 39177_s_
peak heap memory consumption_ 11.89GB
peak RSS including heaptrack overhead_ 26.67GB
total memory leaked_ 11.30GB

@sakridge could this also be a legimate leak aside from the Vec HashMap capacity leak? I don't think we need 5GB / 3GB heap data at any given moment to run validator...

ryoqun · 2021-01-12T04:53:04Z

@behzadnouri this pr (#14467) could be a fix to one of the above suspicious memory leaks, especially like this? #14366 (comment)? Or completely different one? (I'm not seeing crds in my backtraces). Maybe, did you find #14467 via metrics?

behzadnouri · 2021-01-12T14:33:33Z

@behzadnouri this pr (#14467) could be a fix to one of the above suspicious memory leaks, especially like this? #14366 (comment)? Or completely different one? (I'm not seeing crds in my backtraces). Maybe, did you find #14467 via metrics?

I doubt that that is the case. You mention:

needing periodic restart of once per a week or so.

but the issue with #14467 does not go away with restart. If you restart a node it quickly syncs up to the previous table it had in memory. Also as you also mentioned, those stack traces do not show relevant crds or ClusterInfo either. There is one with ClusterInfo_handle_pull_requests which is not good, but that seems unrelated to the crds table size thing.

did you find #14467 via metrics?

yes, it is cluster_info_stats.table_size.

ryoqun · 2021-01-15T09:25:36Z

@behzadnouri I see! thanks for clear explanations. :)

sakridge · 2021-01-16T02:17:06Z

@sakridge could this also be a legimate leak aside from the Vec HashMap capacity leak? I don't think we need 5GB / 3GB heap data at any given moment to run validator...

The recyclers just allocate to cover the transitional load and don't size down generally. Potentially we can size down with some heuristic if we can detect there is too much being held onto.

ryoqun · 2021-01-28T13:04:02Z

status update!:

The recyclers just allocate to cover the transitional load and don't size down generally. Potentially we can size down with some heuristic if we can detect there is too much being held onto.

I see. Yeah, it seems that commenting-out recycling code seems to reduce leaks in general. I'm now thinking about the heuristics..

Also, I'm seeing odd RSS leakage via recycled AppendVec, which shouldn't count toward RSS, but toward SHR. Like the PinnedVec recycler I'm just disabled and watching to see the leakage are gone.

Lastly, I'm now tracking down to ClusterSlots leaks... I'm now thinking this will be the last leak problem...

ryoqun · 2021-01-28T14:28:56Z

Lastly, I'm now tracking down to ClusterSlots leaks... I'm now thinking this will be the last leak problem...

Whoa, an odd thing is happening here. It seems that ClusterSlots are holding too much, even if the validator is started up from recent snapshot.

Given this patch, on tds:

@@ -37,14 +37,18 @@ impl ClusterSlots {
         for epoch_slots in epoch_slots_list {
             let slots = epoch_slots.to_slots(root);
             for slot in &slots {
-                if *slot <= root {
+                if *slot <= root || *slot > root + 10000 { // HACK
                     continue;
                 }
                 let unduplicated_pubkey = self.keys.get_or_insert(&epoch_slots.from);
                 self.insert_node_id(*slot, unduplicated_pubkey);
             }
         }
-        self.cluster_slots.write().unwrap().retain(|x, _| *x > root);
+        if let Ok(mut w) = self.cluster_slots.write() {
+            let old_len = w.len();
+            w.retain(|x, _| *x > root);
+            info!("ClusterSlots: root: {}, len: {} => {} cpacity: {}", root, old_len, w.len(), w.capacity());
+        }
         self.keys.purge();
         *self.since.write().unwrap() = since;
     }

Without HACK:

[2021-01-28T13:45:00.202154454Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60506700, len: 707622 => 707621 cpacity: 915159
[2021-01-28T13:45:00.412907732Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60506700, len: 707621 => 707621 cpacity: 915159
[2021-01-28T13:45:00.622265577Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60506701, len: 707621 => 707620 cpacity: 915159

With HACK (I assumed validator won't ever need to hold info of far-future slots (+10000)... lol)

[2021-01-28T14:21:54.590700027Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60511161, len: 40 => 39 cpacity: 896
[2021-01-28T14:21:54.792363269Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60511161, len: 39 => 39 cpacity: 896
[2021-01-28T14:21:54.993943670Z INFO  solana_core::cluster_slots] ClusterSlots: root: 60511162, len: 39 => 38 cpacity: 896

And indeed, heaptrack doesn't report leaks of this anymore: #14366 (comment)

@carllin Do you have any idea? Maybe, possibly compressed EpochSlots (Flate2 thing) is acting up via the .to_slots() from ClusterSlots?

carllin · 2021-01-28T23:14:55Z

@ryoqun, hmmm weird, is the node caught up with the cluster? I can only imagine those far-future slots if:

Node is very far behind and others are completing slots in the future
Pollution in gossip from another network
Malicious spam
Flate2 errors

I think we can distinguish between the above by seeing how many nodes are in the ClusterSlots that have completed a slot > root + 10,000. If it's a few, it might be some pollution, if it's a lot AND we're sure we're near the tip, then probably something is wrong with the compression/decompression path.

For context, when thinking about whether we can do a blanket filter like *slot > root + 10000 the primary two places where ClusterSlots is used:

Propagation status for your own leader slots: https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L1545-L1549. Here it's fine to ignore far future slots since you only care about your own leader slots and slots built on top of your leader slot, which should be in a reasonable range from your current root
For weighting repairs, to find validators who have actually completed that slot: https://github.com/solana-labs/solana/blob/master/core/src/cluster_slots.rs#L110. This currently magnifies the weight of nodes that have completed the slot by a factor of 2. I imagine this might be useful in catchup scenarios where validators are trying to repair slots that are far in the future, for instance if a node is > 10,000 slots behind. To get around this, we may be able to leverage information based on votes in the cluster about which slots in the future are actaully relevant. This is already done here: https://github.com/solana-labs/solana/blob/master/core/src/repair_weight.rs#L142-L148 to find the best orphans to repair. We could do something like, ignore *slot > root + 10000 && slot > best_orphan.

ryoqun · 2021-01-29T16:08:14Z

Pollution in gossip from another network

Verdict? it seems that this is the case... Maybe, mainnet-beta and tds are mixed up?

[2021-01-29T09:43:52.778521424Z INFO  solana_core::cluster_slots] EpochSlots: EpochSlots { from: J78SNwDW6G86sMmh7djnBKGjewXNpjD74sJTjJ1iNgTH num_slots: 22168 lowest_slot: 0 wallclock: 1611715675556 }
[2021-01-29T09:43:52.778558015Z INFO  solana_core::cluster_slots] Slots: [62553872, 62553873, 62553874, 62553875, 62553876, 62553877, 62553878, 62553879, 62553880, 62553881, 62553882, 62553883, 62553884, 62553885>

ryoqun@ubuqun:~/work/solana/solana$ solana -ut gossip | grep J78SNwDW6G86sMmh7djnBKGjewXNpjD74sJTjJ1iNgTH
195.201.199.116 | J78SNwDW6G86sMmh7djnBKGjewXNpjD74sJTjJ1iNgTH | 8000   | 8003  | 195.201.199.116:8899  | 1.5.5
==> 2021-01-30 01:02:01.845503751|0
ryoqun@ubuqun:~/work/solana/solana$ solana  gossip | grep J78SNwDW6G86sMmh7djnBKGjewXNpjD74sJTjJ1iNgTH
135.181.136.246 | J78SNwDW6G86sMmh7djnBKGjewXNpjD74sJTjJ1iNgTH | 8000   | 8003  | 135.181.136.246:8899  | 1.4.24

t-nelson · 2021-01-29T16:41:42Z

Verdict? it seems that this is the case... Maybe, mainnet-beta and tds are mixed up?

This smells like someone who was invited to MB from TdS and copypasta'd their way to victory 😔

Shred version should be keeping the networks apart though

behzadnouri · 2021-08-25T22:38:11Z

81965-leaks.txt.gz
81965-leaks-nomerge.txt.gz
81965-peaks-nomerge.txt.gz

Some more heaptrack profiles; which seem very confusing. Some of the reported leaks are just plain rust vectors and I do not see how they can be leaked.
A lot of rocksdb stack traces as well, which might be suspicious, since they are calling into c++ code and they have already been reports of rocksdb having memory leak: rust-rocksdb/rust-rocksdb#41

Page fault events: page_faults.svg.gz

…on (#17899) Inspecting TDS gossip table shows that crds values of nodes with different shred-versions are creeping in. Their epoch-slots are accumulated in ClusterSlots causing bogus slots very far from current root which are not purged and so cause ClusterSlots keep consuming more memory: #17789 #14366 (comment) #14366 (comment) This commit updates ClusterInfo::get_epoch_slots, and discards entries from nodes with unknown or different shred-version. Follow up commits will patch gossip not to waste bandwidth and memory over crds values of nodes with different shred-version. (cherry picked from commit 985280e) # Conflicts: # core/src/cluster_info.rs

…on (backport #17899) (#19551) * excludes epoch-slots from nodes with unknown or different shred version (#17899) Inspecting TDS gossip table shows that crds values of nodes with different shred-versions are creeping in. Their epoch-slots are accumulated in ClusterSlots causing bogus slots very far from current root which are not purged and so cause ClusterSlots keep consuming more memory: #17789 #14366 (comment) #14366 (comment) This commit updates ClusterInfo::get_epoch_slots, and discards entries from nodes with unknown or different shred-version. Follow up commits will patch gossip not to waste bandwidth and memory over crds values of nodes with different shred-version. (cherry picked from commit 985280e) # Conflicts: # core/src/cluster_info.rs * removes backport merge conflicts Co-authored-by: behzad nouri <behzadnouri@gmail.com>

behzadnouri · 2021-09-27T16:06:23Z

Updating the issue with what was discussed on discord:

v1.6 is using jemalloc, but #16346 removed jemalloc, so 1.7 is using system allocator.

Above is bvb ram usage.

Left chunk is 1.7.11 with jemalloc: ram usage growth was noticeably slowing down, but it was interrupted by the cluster restart.
Middle chunk is 1.6.25: It takes 3.5 days until ram usage stops growing.
Right chunk is 1.6.25 without jemalloc: much larger ram usage with no sign of slowing down.

Noticeably from the 3 chunks, 1.7.11 with jemalloc is using least amount of memory.

So if:

these plots are not overly impacted by the fact that the cluster was restarted.
memleak commit was not backported to 1.6.25.

then it shows that removing jemalloc was the culprit for 1.7 memory regression.

This is also bve restarted at the same time as bvb but instead running 1.6.25 (which uses jemalloc). Much less ram usage and slowing growth.

#20149 reverted back the allocator to jemalloc.

im-0 · 2021-09-29T17:47:59Z

Hi @behzadnouri, sorry for bothering, but it looks like unprefixed_malloc_on_supported_platforms feature should be enabled in tikv-jemalloc-sys. Without this, jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). This seems wrong.

behzadnouri · 2021-09-29T17:49:50Z

Hi @behzadnouri, sorry for bothering, but it looks like unprefixed_malloc_on_supported_platforms feature should be enabled in tikv-jemalloc-sys. Without this, jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). This seems wrong.

thanks, for letting me know. I will add that

Without this feature jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). #14366 (comment)

Without this feature jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). #14366 (comment) (cherry picked from commit 4bf6d0c)

…20325) Without this feature jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). #14366 (comment) (cherry picked from commit 4bf6d0c) Co-authored-by: behzad nouri <behzadnouri@gmail.com>

…s#20317) Without this feature jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). solana-labs#14366 (comment)

ryoqun · 2021-11-29T13:55:05Z

btw, I'm looking into this again. things to try:

jemalloc's malloc_stats_print
dhat?
https://crates.io/crates/allocation-counter/0.5.0

bji · 2021-12-10T16:27:28Z

ERR: out of memory cuda-ecc-ed25519/gpu_ctx.cu 97
solana-validator.hack: common/gpu_common.h:22: void cuda_assert(cudaError_t, const char*, int): Assertion `0' failed.

This was with 1.8.5 after about 16 days of uptime. RTX 2080ti, 11 GB memory.

ryoqun · 2021-12-13T12:39:23Z

btw, I'm looking into this again. things to try:
* jemalloc's `malloc_stats_print`

* dhat?

* https://crates.io/crates/allocation-counter/0.5.0

I think i've found the last remaining leak cause.... in rocksdb:

here's the patch for the bus factor

diff --git b/ledger/src/blockstore_db.rs a/ledger/src/blockstore_db.rs
index 46b97e121e..0939e95e24 100644
--- b/ledger/src/blockstore_db.rs
+++ a/ledger/src/blockstore_db.rs
@@ -396,6 +396,7 @@ impl Rocks {
         let cf_names: Vec<_> = cfs.iter().map(|c| c.0).collect();
 
         // Open the database
+        db_options.set_max_open_files(200); // needs to consider any downsides.
         let db = match access_type {
             AccessType::PrimaryOnly | AccessType::PrimaryOnlyForMaintenance => Rocks(
                 DB::open_cf_descriptors(&db_options, path, cfs.into_iter().map(|c| c.1))?,

I'll detail later

without fix:

with fix:

steviez · 2022-01-03T17:08:29Z

I'll detail later

Hi @ryoqun - are you still looking actively looking into this / have any updates? There was some chatter last week on Discord about the memleaks that seem to be present in both v1.9 and master and wanted to avoid any duplicate work

ryoqun · 2022-01-04T15:07:39Z

I'll detail later

Hi @ryoqun - are you still looking actively looking into this / have any updates? There was some chatter last week on Discord about the memleaks that seem to be present in both v1.9 and master and wanted to avoid any duplicate work

hi, i haven't actively investigating the possible memory leak bug, which i thought i came up with. seems it's false alarm... also, i was testing v1.8.x line.

steviez · 2022-03-19T23:02:06Z

I've had a node running v1.9.9 against mainnet-beta for a couple weeks; it showed a 4-5 GB / day ramp shortly after starting; however, memory has looked pretty stable the last 2+ weeks:

CriesofCarrots · 2023-02-16T17:55:03Z

@ryoqun , @steviez can this issue be closed?

steviez · 2023-02-16T18:07:48Z

We do seemingly have another memory leak/growth (master & v1.14) that is in the early stages of investigation at the moment. That being said, I'm in favor of closing this issue due to its' age. Releases are different and code is so different that I think new investigation would be worthy of a new issue (currently in Discord).

We could always reference this issue as a "prior work" in a new issue.

ryoqun mentioned this issue Jan 5, 2021

Save 7G mem on mainnet fixing AccIndex overalloc. #14435

Merged

mvines added this to the v1.4.21 milestone Jan 5, 2021

CriesofCarrots modified the milestones: v1.4.21, v1.4.22 Jan 6, 2021

jackcmay modified the milestones: v1.4.23, v1.2.24 Jan 12, 2021

mvines modified the milestones: v1.4.24, v1.4.25 Jan 20, 2021

ryoqun mentioned this issue Jan 24, 2021

Reduce ~2 GBs mem by avoiding another overalloc. #14806

Merged

mvines modified the milestones: v1.6.20, v1.7.11 Aug 30, 2021

behzadnouri mentioned this issue Sep 29, 2021

adds unprefixed_malloc_on_supported_platforms to jemalloc #20317

Merged

behzadnouri added a commit that referenced this issue Sep 29, 2021

adds unprefixed_malloc_on_supported_platforms to jemalloc (#20317)

4bf6d0c

Without this feature jemalloc is used only for Rust code but not for bundled C/C++ libraries (like rocksdb). #14366 (comment)

CriesofCarrots modified the milestones: v1.7.11, v1.7.15 Sep 30, 2021

jstarry modified the milestones: v1.7.15, v1.7.17 Oct 22, 2021

CriesofCarrots modified the milestones: v1.7.17, v1.7.18 Oct 28, 2021

mvines modified the milestones: v1.7.18, The Future! Dec 3, 2021

steviez closed this as completed Feb 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

solana-validator leaks memory (but at very slow pace) #14366

solana-validator leaks memory (but at very slow pace) #14366

ryoqun commented Dec 31, 2020

ryoqun commented Dec 31, 2020

ryoqun commented Dec 31, 2020

ryoqun commented Jan 4, 2021 •

edited

Loading

leoluk commented Jan 5, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021

ryoqun commented Jan 6, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021

ryoqun commented Jan 6, 2021

sakridge commented Jan 6, 2021

ryoqun commented Jan 7, 2021

ryoqun commented Jan 12, 2021

behzadnouri commented Jan 12, 2021

ryoqun commented Jan 15, 2021

sakridge commented Jan 16, 2021

ryoqun commented Jan 28, 2021

ryoqun commented Jan 28, 2021

carllin commented Jan 28, 2021

ryoqun commented Jan 29, 2021

t-nelson commented Jan 29, 2021

behzadnouri commented Aug 25, 2021 •

edited

Loading

behzadnouri commented Sep 27, 2021

im-0 commented Sep 29, 2021

behzadnouri commented Sep 29, 2021

ryoqun commented Nov 29, 2021

bji commented Dec 10, 2021

ryoqun commented Dec 13, 2021

steviez commented Jan 3, 2022 •

edited

Loading

ryoqun commented Jan 4, 2022

steviez commented Mar 19, 2022

CriesofCarrots commented Feb 16, 2023

steviez commented Feb 16, 2023

solana-validator leaks memory (but at very slow pace) #14366

solana-validator leaks memory (but at very slow pace) #14366

Comments

ryoqun commented Dec 31, 2020

Problem

Proposed Solution

ryoqun commented Dec 31, 2020

ryoqun commented Dec 31, 2020

ryoqun commented Jan 4, 2021 • edited Loading

leoluk commented Jan 5, 2021 • edited Loading

ryoqun commented Jan 6, 2021 • edited Loading

ryoqun commented Jan 6, 2021

ryoqun commented Jan 6, 2021 • edited Loading

ryoqun commented Jan 6, 2021 • edited Loading

ryoqun commented Jan 6, 2021

ryoqun commented Jan 6, 2021

sakridge commented Jan 6, 2021

ryoqun commented Jan 7, 2021

ryoqun commented Jan 12, 2021

behzadnouri commented Jan 12, 2021

ryoqun commented Jan 15, 2021

sakridge commented Jan 16, 2021

ryoqun commented Jan 28, 2021

ryoqun commented Jan 28, 2021

carllin commented Jan 28, 2021

ryoqun commented Jan 29, 2021

t-nelson commented Jan 29, 2021

behzadnouri commented Aug 25, 2021 • edited Loading

behzadnouri commented Sep 27, 2021

im-0 commented Sep 29, 2021

behzadnouri commented Sep 29, 2021

ryoqun commented Nov 29, 2021

bji commented Dec 10, 2021

ryoqun commented Dec 13, 2021

steviez commented Jan 3, 2022 • edited Loading

ryoqun commented Jan 4, 2022

steviez commented Mar 19, 2022

CriesofCarrots commented Feb 16, 2023

steviez commented Feb 16, 2023

ryoqun commented Jan 4, 2021 •

edited

Loading

leoluk commented Jan 5, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021 •

edited

Loading

ryoqun commented Jan 6, 2021 •

edited

Loading

behzadnouri commented Aug 25, 2021 •

edited

Loading

steviez commented Jan 3, 2022 •

edited

Loading