
Improve performance of database::update_expired_feeds() #1093

Closed · 4 tasks done

abitmore opened this issue Jun 25, 2018 · 3 comments
Comments

abitmore (Member) commented Jun 25, 2018

According to the profiling data mentioned in #1083, update_expired_feeds() accounts for a significant share of replay time.

---------------------- first 27 M blocks ----------------------------
764718ms th_a db_management.cpp:124 reindex ] Done reindexing, elapsed time: 5063.73962899999969522 sec

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  9.62    215.94   215.94 448458421     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_object, ... >::find(graphene::db::object_id_type) const
  8.47    406.11   190.17 20731743387     0.00     0.00  graphene::chain::operator>(graphene::chain::price const&, graphene::chain::price const&)
  7.18    567.18   161.07 390785433     0.00     0.00  graphene::chain::database::adjust_balance(graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, graphene::chain::asset)
  4.68    672.30   105.12 307448392     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_statistics_object, ... >::find(graphene::db::object_id_type) const
  4.64    776.50   104.20 22967573183     0.00     0.00  graphene::chain::operator<(graphene::chain::price const&, graphene::chain::price const&)
  4.39    874.95    98.45                             sha256_block_data_order_avx
  3.92    962.82    87.87 3183637020     0.00     0.00  graphene::chain::generic_index<graphene::chain::asset_bitasset_data_object, ... >::find(graphene::db::object_id_type) const
  3.62   1044.04    81.22 386682199     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_balance_object, ... >::modify(graphene::db::object const&, std::function<void (graphene::db::object&)> const&)
  3.32   1118.50    74.46 805850110     0.00     0.00  graphene::chain::generic_index<graphene::chain::asset_object, ... >::find(graphene::db::object_id_type) const
  2.85   1182.43    63.93 245765371     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_statistics_object, ... >::modify(graphene::db::object const&, std::function<void (graphene::db::object&)> const&)
  2.76   1244.32    61.89 27000000     0.00     0.00  graphene::chain::database::update_expired_feeds()
  2.16   1292.90    48.58 72953692     0.00     0.00  graphene::chain::generic_index<graphene::chain::limit_order_object, ... >::create(std::function<void (graphene::db::object&)> const&)

Three of these entries belong to this code path (the asset_bitasset_data_object find() at 3.92%, the asset_object find() at 3.32%, and update_expired_feeds() itself at 2.76%); together they sum to 10% of replay time.

The current code iterates through every asset_object that is a bitasset or a prediction market on each new block, fetches its asset_bitasset_data_object, and then checks whether the feed has expired. However, with more than 500 bitassets / prediction markets on the chain now, iterating through all of them on every block is inefficient.
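
For reference, here is a minimal sketch of that per-block scan, simplified from the graphene codebase (helpers like bitasset_data() and feed_is_expired() exist there, but exact signatures and details may differ):

```cpp
// Simplified sketch of the current behavior, not the verbatim source.
void database::update_expired_feeds()
{
   const auto head_time = head_block_time();
   // Visits every asset object on the chain, on every single block:
   for( const asset_object& a : get_index_type<asset_index>().indices() )
   {
      if( !a.is_market_issued() )          // skip assets without price feeds
         continue;
      const asset_bitasset_data_object& b = a.bitasset_data(*this);
      if( b.feed_is_expired( head_time ) ) // false for almost all of them
      {
         modify( b, [head_time]( asset_bitasset_data_object& o ) {
            o.update_median_feeds( head_time );
         });
         check_call_orders( a );           // the median may have moved
      }
   }
}
```

The cost is O(number of bitassets) per block even when no feed expires at all, which is why this path shows up so prominently in the profile above.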

Another issue relates to cryptonomex/graphene#615. For roughly the first 5,000,000 blocks, update_median_feeds() is called for almost every bitasset on every block. Although the bug was fixed with a hard fork, the buggy code path still has to run when processing blocks from before the hard fork, so it wastes time on every sync/replay.

To optimize, the following needs to be done:

  • update the median feed and check call orders only when a feed has expired
    • add a by_expiration index to asset_bitasset_index, and only iterate through and process the expired entries (see the sketch after this list)
    • refactor the pre-HF615 code path for better performance
  • update the CER in asset_object when
    • the median CER changes, or
    • the CER in asset_object is updated directly (e.g. by asset_update_operation)
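
To illustrate the proposed by_expiration index, here is a standalone Boost.MultiIndex sketch; the struct and field names are hypothetical, not graphene's actual types. An additional ordered index over the feed expiration time keeps expired entries at the front, so per-block maintenance can stop at the first unexpired entry instead of scanning all ~500 bitassets:

```cpp
// Standalone illustration of a by_expiration secondary index (hypothetical types).
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/tag.hpp>
#include <cstdint>
#include <iostream>

struct bitasset_data
{
   uint64_t asset_id;
   int64_t  feed_expiration_time; // when the current median feed expires
};

namespace bmi = boost::multi_index;
struct by_id;
struct by_feed_expiration;

typedef bmi::multi_index_container<
   bitasset_data,
   bmi::indexed_by<
      bmi::ordered_unique< bmi::tag<by_id>,
         bmi::member<bitasset_data, uint64_t, &bitasset_data::asset_id> >,
      // Sorted by expiration time, so expired entries cluster at begin().
      bmi::ordered_non_unique< bmi::tag<by_feed_expiration>,
         bmi::member<bitasset_data, int64_t, &bitasset_data::feed_expiration_time> >
   >
> bitasset_index;

int main()
{
   bitasset_index idx;
   idx.insert( bitasset_data{ 1, 100 } );
   idx.insert( bitasset_data{ 2, 500 } );
   idx.insert( bitasset_data{ 3, 200 } );

   const int64_t now = 250; // stand-in for head_block_time()
   const auto& by_exp = idx.get<by_feed_expiration>();
   // Process only expired entries; stop at the first unexpired one.
   for( auto itr = by_exp.begin();
        itr != by_exp.end() && itr->feed_expiration_time <= now; ++itr )
      std::cout << "feed expired for asset " << itr->asset_id << "\n"; // prints 1, 3
   return 0;
}
```

One caveat for a real implementation: processing an expired entry updates its expiration time, which re-sorts it within this index, so the production loop must guard against iterator invalidation (e.g. by re-reading begin() after each modification).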

This is a sub-task of #982.

abitmore added the performance, 6 Performance and 9b Small labels, and removed the 9b Small label, Jun 25, 2018
abitmore self-assigned this Jun 28, 2018
abitmore added the 2d Developing, 3c Enhancement, 4b Normal Priority and 9b Small labels Jun 28, 2018
abitmore (Member, Author) commented Jul 1, 2018

Almost done on top of #1099.

Total replay time for 27,000,000 blocks was reduced from 5063 seconds to 3761 seconds, a reduction of 1 - 3761/5063 ≈ 25.7%.

Will create a pull request after cleanup is done.

oxarbitrage (Member) commented:

Excellent and very appreciated!

abitmore added the 9c Large label and removed the 9b Small label Jul 1, 2018
abitmore added a commit that referenced this issue Jul 27, 2018:
…formance-2 · Improve update_expired_feeds performance #1093
abitmore (Member, Author) commented:

Done with #1180.
