Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support store block meta to RocksBlockStore, store location to heap #15238

Open
wants to merge 2 commits into
base: master-2.x
Choose a base branch
from

Conversation

maobaolong
Copy link
Contributor

@maobaolong maobaolong commented Mar 31, 2022

What changes are proposed in this pull request?

The PR #14565 is the dependency PR of this.

image

From this figure, we can see one of the bottleneck of getFileInfo performance is getLocation, so I propose to use HEAP to store the location of blocks, but store the block meta into rocks store.

Why are the changes needed?

With this PR, we can get a tradeoff between performance and scalability.

Does this PR introduce any user facing changes?

Add a configure key alluxio.master.block.metastore to ROCKS_BLOCK_META_ONLY

@alluxio-bot alluxio-bot added the API Change Changes covering public API label Mar 31, 2022
@kevincai
Copy link
Contributor

very interested in performance comparison before and after this change. Also, is there any HEAP memory estimation about the location info?

@maobaolong
Copy link
Contributor Author

@kevincai We had a test through stressmasterbench these days, it shows that this PR can improve the performance obviously.

  • For zero byte file getBlockLocation, metaonly vs rocks , the master rpc throughput are 26w cps vs 22w cps.
  • For 1kb file getBlockLocation, metaonly vs rocks, the master rpc throughput are 16w vs 7w.

@jiacheliu3
Copy link
Contributor

@kevincai We had a test through stressmasterbench these days, it shows that this PR can improve the performance obviously.

  • For zero byte file getBlockLocation, metaonly vs rocks , the master rpc throughput are 26w cps vs 22w cps.
  • For 1kb file getBlockLocation, metaonly vs rocks, the master rpc throughput are 16w vs 7w.

Of course using heap will be faster than rocks. I'd be more interested to see the performance difference compared to the memory usage. After all we are comparing the pros and cons and we need to see the cons.

@jiacheliu3
Copy link
Contributor

@kevincai We had a test through stressmasterbench these days, it shows that this PR can improve the performance obviously.

  • For zero byte file getBlockLocation, metaonly vs rocks , the master rpc throughput are 26w cps vs 22w cps.
  • For 1kb file getBlockLocation, metaonly vs rocks, the master rpc throughput are 16w vs 7w.

Of course using heap will be faster than rocks. I'd be more interested to see the performance difference compared to the memory usage. After all we are comparing the pros and cons and we need to see the cons.

@maobaolong Pls see comments above and let's discuss on this PR in the next community meeting. Thx :)

@maobaolong maobaolong force-pushed the supportRocksBlockStoreMetaOnly branch from 5fc4b5a to cdbcf8b Compare November 4, 2022 02:32
@alluxio-bot alluxio-bot removed the API Change Changes covering public API label Nov 4, 2022
@maobaolong
Copy link
Contributor Author

@jiacheliu3 Is there any further suggestion about this PR?

Copy link
Contributor

@jiacheliu3 jiacheliu3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maobaolong Thanks for the improvements they make sense to me. Could you please add below stats?

  1. block_meta size:
    (a) on heap
    (b) on disk

  2. block_location size:
    (a) on heap
    (b) on disk

  3. Can you provide some use cases on how many about block_meta and block_location you have in your environment?

* @param baseDir the base directory in which to store block store metadata
*/
public RocksBlockMetaStoreMetaOnly(String baseDir) {
super(baseDir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RocksBlockMetaStore() creates BlockMeta and BlockLocation tables in RocksDB, and BlockLocation is not needed.

You can probably do

class RocksBlockMetaStore {
    RocksBlockMetaStore() {
      .. 
      initDb();
    }
    public initDb() {
      // create rocks configs etc
      createBlockMetaTable();
      createBlockLocationsTable();
    }
}

public class RocksBlockMetaStoreMetaOnly extends RocksBlockMetaStore {
    RocksBlockMetaStoreMetaOnly() {
      super();
      // create your TwoKeyHashMap
    }

    @Override
    public initDb() {
      // create rocks configs etc
      createBlockMetaTable();
      // do not create BlockLocation table
    }
}

@maobaolong
Copy link
Contributor Author

maobaolong commented Jan 6, 2023

I get the BlockLocation jmap as following

index            count      total bytes
1543:             2             96  alluxio.proto.meta.Block$BlockLocation

@maobaolong
Copy link
Contributor Author

image

@@ -57,7 +59,10 @@ public List<BlockLocation> getLocations(long blockid) {

@Override
public void addLocation(long blockId, BlockLocation location) {
mBlockLocations.addInnerValue(blockId, location.getWorkerId(), location);
// NOTICE(maobaolong): mLocationCacheMap can be increase to WOKRER_COUNT X TIER_COUNT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider when to remove from this map

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove in LostWorkerDetectionExecutor heartbeat thread, when a worker is removed from mWorkers into mLostWorkers, we remove all worker location objects after processLostWorker()

@@ -35,6 +36,7 @@ public class RocksBlockMetaStoreMetaOnly extends RocksBlockMetaStore {
// Map from block id to block locations.
public final TwoKeyConcurrentMap<Long, Long, BlockLocation, Map<Long, BlockLocation>>
mBlockLocations = new TwoKeyConcurrentMap<>(() -> new HashMap<>(4));
private final Map<BlockLocation, BlockLocation> mLocationCacheMap = new ConcurrentHashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very useful change! Let's separate it into a different PR. Thanks a lot in advance!

Also I think this applies to all BlockMetaStore. The change should be in

mBlockMetaStore.addLocation(blockId, BlockLocation.newBuilder()
So when we need a BlockLocation, we always go to a cache instead of creating a new one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

In fact we have two caller as above figure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiacheliu3 Create a PR #16763 and move the cache logic to DefaultBlockMaster

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants