Improve get_block_by_seqno / lt / unix time speed #666

Closed
wants to merge 2 commits into from


@tvorogme tvorogme commented Apr 5, 2023

This is not for production, but for getting feedback.

The main problem is that the methods get_file_desc_by_seqno, get_file_desc_by_unix_time and get_file_desc_lt funnel queries coming from multiple threads into a single thread, and they also walk the whole map on every call without using any index. This degrades performance.

This is my proposed improvement. Currently it contains only a test file plus index and async+index examples for get_block_by_seqno, but if everything looks OK, the same approach can be applied to all 3 methods.

For the test, I requested blocks by seqno from the masterchain (MC) and a workchain (WC), with seqno from 3 600 000 to 3 610 000.

Current implementation:

[!Pure TestMC #1]	Start test: Pure TestMC #1 Start from: 3600000 End at: 3610000 Is masterchain: true
[!Pure TestMC #1]	Test Pure TestMC #1 done, results:
[!Pure TestMC #1]	AVG: 8.980766
[!Pure TestMC #1]	Done at: 8.997811

[!Pure TestWC #2]	Start test: Pure TestWC #2 Start from: 3600000 End at: 3610000 Is masterchain: false
[!Pure TestWC #2]	Test Pure TestWC #2 done, results:
[!Pure TestWC #2]	AVG: 27.886684
[!Pure TestWC #2]	Done at: 27.905077

This can be improved by precalculating min_seqno for each file in ArchiveManager::load_package, which lets a lookup skip a lot of td::map traversal. The minimal lt and unix time can be calculated in the same place. With this min index and the skipping in place, we get the following improvement:

[!Index TestMC #1]	Start test: Index TestMC #1 Start from: 3600000 End at: 3610000 Is masterchain: true
[!Index TestMC #1]	Test Index TestMC #1 done, results:
[!Index TestMC #1]	AVG: 4.828474
[!Index TestMC #1]	Done at: 4.846530

[!Index TestWC #2]	Start test: Index TestWC #2 Start from: 3600000 End at: 3610000 Is masterchain: false
[!Index TestWC #2]	Test Index TestWC #2 done, results:
[!Index TestWC #2]	AVG: 8.406131
[!Index TestWC #2]	Done at: 8.427663
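
The min-index idea above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `PackageDesc`, `finalize_package` and `find_package` are hypothetical names, and the real descriptor in ArchiveManager holds more fields. The only point shown is that a value precomputed once at load time lets a lookup discard a whole file without touching its per-shard map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for an archive file descriptor.
struct PackageDesc {
  std::uint32_t id = 0;
  // shard id -> seqno of the first block of that shard stored in this file
  std::map<std::uint64_t, std::uint32_t> first_seqno;
  // Smallest value in first_seqno, precomputed once when the file is loaded.
  std::uint32_t min_seqno = std::numeric_limits<std::uint32_t>::max();
};

// Assumed to be called once per file at load time.
void finalize_package(PackageDesc& p) {
  p.min_seqno = std::numeric_limits<std::uint32_t>::max();
  for (const auto& kv : p.first_seqno) {
    p.min_seqno = std::min(p.min_seqno, kv.second);
  }
  // Sanity check: the minimum cannot exceed any stored entry.
  assert(p.first_seqno.empty() || p.min_seqno <= p.first_seqno.begin()->second);
}

// Returns the newest file whose first block for `shard` has seqno <= `seqno`,
// or nullptr if there is none. `packages` is assumed ordered oldest to newest.
// `map_lookups` (optional) counts how many per-shard maps were actually searched.
const PackageDesc* find_package(const std::vector<PackageDesc>& packages,
                                std::uint64_t shard, std::uint32_t seqno,
                                std::size_t* map_lookups = nullptr) {
  for (auto it = packages.rbegin(); it != packages.rend(); ++it) {
    // Cheap skip: every shard in this file starts after the requested seqno.
    if (it->min_seqno > seqno) {
      continue;
    }
    if (map_lookups != nullptr) {
      ++*map_lookups;
    }
    auto found = it->first_seqno.find(shard);
    if (found != it->first_seqno.end() && found->second <= seqno) {
      return &*it;
    }
  }
  return nullptr;
}
```

Requests for old blocks benefit the most, because every newer file is rejected by a single integer comparison instead of a map search. The same pattern would apply to a precomputed minimal lt or unix time.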

As you can see, masterchain requests for 10k blocks run almost 2x faster (9.0 sec to 4.8 sec), while workchain requests for 10k blocks speed up from 27.9 sec to 8.4 sec. On top of that, we can add an async worker that calculates file_desc in separate threads, which improves the situation further:

[!Async TestMC #1]	Start test: Async TestMC #1 Start from: 3600000 End at: 3610000 Is masterchain: true
[!Async TestMC #1]	Test Async TestMC #1 done, results:
[!Async TestMC #1]	AVG: 2.749450
[!Async TestMC #1]	Done at: 2.852702

[!Async TestWC #2]	Start test: Async TestWC #2 Start from: 3600000 End at: 3610000 Is masterchain: false
[!Async TestWC #2]	Test Async TestWC #2 done, results:
[!Async TestWC #2]	AVG: 3.310955
[!Async TestWC #2]	Done at: 4.184114
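
The async step can be pictured with the sketch below. It is only an illustration under stated assumptions: `package_for` and `resolve_parallel` are hypothetical names, the index is reduced to a sorted vector of first seqnos, and `std::async` is swapped in to keep the example self-contained; none of this is taken from the PR's actual worker code. What it shows is that once the index is read-only, lookups can be split across threads instead of being serialized in one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical read-only index: first_seqno[i] is the first seqno stored in
// file i, sorted ascending. Returns the index of the file that holds `seqno`.
std::size_t package_for(const std::vector<std::uint32_t>& first_seqno,
                        std::uint32_t seqno) {
  // upper_bound finds the first file that starts after `seqno`;
  // the file just before it is the one that contains `seqno`.
  auto it = std::upper_bound(first_seqno.begin(), first_seqno.end(), seqno);
  assert(it != first_seqno.begin());  // seqno must not precede the first file
  return static_cast<std::size_t>(it - first_seqno.begin()) - 1;
}

// Resolves all queries using up to `workers` concurrent tasks.
std::vector<std::size_t> resolve_parallel(
    const std::vector<std::uint32_t>& first_seqno,
    const std::vector<std::uint32_t>& queries, std::size_t workers) {
  assert(workers > 0);
  std::vector<std::size_t> out(queries.size());
  std::vector<std::future<void>> tasks;
  const std::size_t chunk = (queries.size() + workers - 1) / workers;
  for (std::size_t begin = 0; begin < queries.size(); begin += chunk) {
    const std::size_t end = std::min(begin + chunk, queries.size());
    // Each task reads the shared index and writes a disjoint slice of `out`,
    // so no locking is needed.
    tasks.push_back(std::async(std::launch::async, [&, begin, end] {
      for (std::size_t i = begin; i < end; ++i) {
        out[i] = package_for(first_seqno, queries[i]);
      }
    }));
  }
  for (auto& t : tasks) {
    t.get();  // wait for completion and propagate any exception
  }
  return out;
}
```

Chunking the queries keeps the number of threads bounded by `workers` rather than spawning one task per block request.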

MC: 9.0 sec -> 2.9 sec
WC: 27.9 sec -> 4.2 sec
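
For reference, the speedup factors implied by the "Done at" values in the logs above can be computed with a small helper. `speedup` and `print_speedups` are just illustrative names; the inputs are copied from the test output.

```cpp
#include <cassert>
#include <cstdio>

// Ratio of the old wall-clock time to the new one.
double speedup(double before_sec, double after_sec) {
  assert(after_sec > 0.0);
  return before_sec / after_sec;
}

// Prints the factors for the "Done at" values reported in the logs.
void print_speedups() {
  std::printf("MC index only:    %.2fx\n", speedup(8.997811, 4.846530));
  std::printf("WC index only:    %.2fx\n", speedup(27.905077, 8.427663));
  std::printf("MC async + index: %.2fx\n", speedup(8.997811, 2.852702));
  std::printf("WC async + index: %.2fx\n", speedup(27.905077, 4.184114));
}
```

This gives roughly 1.9x and 3.3x for the index alone, and roughly 3.2x and 6.7x for async plus index, on masterchain and workchain respectively.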

This affects all get-block requests, including those from liteservers, validators, indexers, etc.


I must admit that I'm not great at C++ and could have made a mistake, but our dton.io indexer now indexes old blocks much faster :)

@tvorogme

Fixed in #685

@tvorogme tvorogme closed this Apr 28, 2023