Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable AppendVec recycling under accounts caching #15188

Closed
wants to merge 1 commit into from

Conversation

ryoqun
Copy link
Member

@ryoqun ryoqun commented Feb 8, 2021

Problem

For some reason, recycling AppendVec seems to accmulate RSS usage:

The followings are top excerpts over the time for easier illustration. (I have somewhat more detailed metrics on the tested machines, if wanted)

Without recycling (this pr; good)

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1845269 sol       20   0   42.5g  21.6g   5.9g S  1262  34.3 571:23.41 solana-valida                                                                                                                                                   
1845269 sol       20   0   45.3g  17.7g   5.9g S 600.0  28.2   1338:33 solana-validato
1845269 sol       20   0   46.3g  20.7g   5.9g S 960.0  32.9   3540:31 solana-validato                                                                                               
1845269 sol       20   0   51.0g  18.6g   5.9g S 570.1  29.7   7567:33 solana-validato                                                                                               
1845269 sol       20   0   52.4g  19.5g   5.9g S 350.0  31.1   9450:12 solana-validato                                                                                               
1845269 sol       20   0   53.0g  20.7g   6.0g S 680.4  33.0  10730:30 solana-validato                                                                                               
1845269 sol       20   0   54.2g  21.5g   5.9g S 780.0  34.2  12364:17 solana-validato                                                                                               
1845269 sol       20   0   55.9g  24.4g   6.0g S 706.2  38.9  17610:31 solana-validato                                                                                                                      
1845269 sol       20   0   56.9g  24.2g   6.0g S 635.6  38.6  19783:25 solana-validato                                                                                               
1845269 sol       20   0   58.6g  23.4g   6.0g S 866.7  37.3  23263:47 solana-valida                                                                                    
1845269 sol       20   0   63.1g  26.1g   6.1g S 504.8  41.5  28105:47 solana-validato
1845269 sol       20   0   63.1g  24.4g   6.0g S 726.7  38.9  29958:35 solana-valida                                                                                   
1845269 sol       20   0   65.7g  24.5g   6.1g S 318.8  39.0  39694:00 solana-valida                                                                                   
1845269 sol       20   0   65.9g  23.1g   6.1g S 543.8  36.8  43028:56 solana-valida                                                                                                                    

With recycling (current code; bad;):

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 881078 ubuntu    20   0   47.2g  33.0g   6.1g S  1094  26.7   9795:01 solana-validato                                                                                               
 881078 ubuntu    20   0   49.1g  35.3g   6.4g S  1325  28.6  16664:41 solana-validato                                                                                               
 881078 ubuntu    20   0   52.1g  37.8g   6.3g S  1086  30.6  33565:52 solana-validato                                                                                                                      
 881078 ubuntu    20   0   52.4g  38.2g   6.5g S 286.7  30.9  37263:25 solana-validato                                                                                               
 881078 ubuntu    20   0   56.4g  42.0g   8.7g S 693.8  34.0  43183:25 solana-valida                                                                                  
 881078 ubuntu    20   0   67.5g  52.6g  17.2g S 787.5  42.5  51273:52 solana-validato         
 881078 ubuntu    20   0   70.5g  52.2g  16.5g S  1050  42.2  54339:02 solana-validato
 881078 ubuntu    20   0   81.6g  47.7g   9.8g S 773.3  38.6  65919:56 solana-valida                                                                                   
 881078 ubuntu    20   0   85.4g  52.3g  13.9g S 712.5  42.3  70357:35 solana-valida                                                                                   
 881078 ubuntu    20   0   87.9g  53.4g  14.9g S 944.2  43.2  75894:56 solana-valida                                                                                                                    
 881078 ubuntu    20   0   94.6g  46.9g   7.8g S  1100  37.9  88728:44 solana-valida                                                                                                                     
 881078 ubuntu    20   0   98.3g  51.4g  12.3g S  1812  41.6 101525:57 solana-valida                                                                                                                    

Note that there is still other leak sources... So the above good excerpt still increases RSS gradually.

The odd fact:

  • While SHR matches to the du path/to/accounts for the good validator, it doesn't for the bad validator. However pmap -p does match to ls -l path/to/accounts for the both good and bad validator.
  • du output is usually larger than SHR for the bad validator, so it's looks like mmap is counted toward RSS, considering bad validator 's unexplained leak pace.
  • Maybe, this is the reason for the recently reported bug about rather fast mem growth on discord.

I tried madvise DONTNEED to tell linux kernel to free seeing this quote; but failed. (I thought undue mem usage is due to we don't discard dirty pages by msync or munmap between recyling)

https://man7.org/linux/man-pages/man2/madvise.2.html:

... The resident set size (RSS) of the calling process will be immediately reduced however.

I also looked /proc/NNN/smaps somewhat in detail; but no definitive clue...

Summary of Changes

Considering AccountsCaching is the future, I think disabling recycling won't hurt as much. (At least, such a validator can catchup with mainnet-beta; will run perf system test).

context: #14366

@codecov
Copy link

codecov bot commented Feb 8, 2021

Codecov Report

Merging #15188 (d88241c) into master (774416a) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##           master   #15188   +/-   ##
=======================================
  Coverage    79.5%    79.5%           
=======================================
  Files         402      402           
  Lines      102297   102288    -9     
=======================================
- Hits        81416    81412    -4     
+ Misses      20881    20876    -5     

@sakridge
Copy link
Member

sakridge commented Feb 8, 2021

Interesting, yes with caching enabled the performance from creating stores from the recycler is not as big of a concern, but seems like it still makes sense in theory to avoid unmapping and thus flushing those stores if we can. Does the capacity of the append_vecs in the recycler match up to the extra usage?

@ryoqun
Copy link
Member Author

ryoqun commented Feb 9, 2021

Interesting, yes with caching enabled the performance from creating stores from the recycler is not as big of a concern, but seems like it still makes sense in theory to avoid unmapping and thus flushing those stores if we can. Does the capacity of the append_vecs in the recycler match up to the extra usage?

Yeah, it's possible that .recycled_store could be holding too much stale large AppendVecs. Originally, I thought recycled_stores are mostly empty. But it seems not...

@sakridge
Copy link
Member

sakridge commented Feb 9, 2021

Interesting, yes with caching enabled the performance from creating stores from the recycler is not as big of a concern, but seems like it still makes sense in theory to avoid unmapping and thus flushing those stores if we can. Does the capacity of the append_vecs in the recycler match up to the extra usage?

Yeah, it's possible that .recycled_store could be holding too much stale large AppendVecs. Originally, I thought recycled_stores are mostly empty. But it seems not...

I see. AppendVecs which are not new will also not be as sparse on average than if you have a new one each time.

@ryoqun
Copy link
Member Author

ryoqun commented Feb 24, 2021

closing in favor of #15513

@ryoqun ryoqun closed this Feb 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants