Support multi-disks for PageStorage #1128

Closed
8 tasks done
flowbehappy opened this issue Sep 28, 2020 · 6 comments

flowbehappy commented Sep 28, 2020

Currently, a PageStorage can only be stored under a single path. When TiFlash is deployed in multi-path mode, this puts much more IO pressure (from Delta and Region snapshots) on the first disk than on the other disks. Supporting multiple paths for PageStorage would distribute the IO pressure across all disks.

Tasks:

JaySon-Huang commented Nov 3, 2020

The IOPS of StoragePool.meta_storage is almost the same as that of StoragePool.log_storage, although meta_storage requires less bandwidth.
To get better performance in multi-disk deployments, we should also distribute the IOPS pressure from meta_storage.

@JaySon-Huang JaySon-Huang changed the title Support multi-paths for PageStorage Support multi-disks for PageStorage Nov 3, 2020
@JaySon-Huang

Currently, StoragePathPool only considers the used size when choosing a disk for a new DTFile/PageFile. This may leave the data imbalanced across disks.
We could use the used/capacity ratio from PathCapacityMetrics instead of the used size of each storage.
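
To illustrate the difference, here is a minimal Python sketch (not TiFlash code). The `DiskStat` type and both selection functions are hypothetical names made up for this example, and the disk sizes are invented. It compares picking by used size with picking by used/capacity ratio when the disks have different capacities.

```python
from dataclasses import dataclass


@dataclass
class DiskStat:
    """Hypothetical per-path statistics; not the real PathCapacityMetrics API."""
    path: str
    used: int      # bytes already used on this path
    capacity: int  # max bytes allowed on this path


def pick_by_used_size(disks):
    # Behaviour described above: the path with the fewest used bytes wins.
    return min(disks, key=lambda d: d.used)


def pick_by_used_ratio(disks):
    # Suggested alternative: the path with the lowest used/capacity ratio wins.
    return min(disks, key=lambda d: d.used / d.capacity)


GiB = 1024 ** 3
disks = [
    DiskStat("/data0/tiflash", used=400 * GiB, capacity=500 * GiB),   # 80% full
    DiskStat("/data1/tiflash", used=600 * GiB, capacity=2048 * GiB),  # about 29% full
]

print(pick_by_used_size(disks).path)   # /data0/tiflash
print(pick_by_used_ratio(disks).path)  # /data1/tiflash
```

With used size alone, the small disk keeps receiving new files even though it is nearly full. The ratio-based choice steers new files to the larger, emptier disk.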

JaySon-Huang commented Nov 6, 2020

## Deprecated storage path setting style. Check [storage] section for new style.
# path = "/tmp/tiflash/data/db"
# capacity = "10737418240"
## Deprecated storage path setting style of multi-disks. Check [storage] section for new style.
# path = "/tmp/tiflash/tiflash0,/tmp/tiflash1,/tmp/tiflash2"
## If you set `path_realtime_mode` to `true` and multiple directories are deployed in
## the path, the latest data is stored in the first directory and older data is stored in
## the remaining directories.
# path_realtime_mode = false
# capacity = "0,0,0"

# new style storage paths
[storage]
## If there are multiple SSD disks on the machine,
## specifying the path list in `storage.main.dir` can improve TiFlash performance.

## If there are multiple disks with different IO metrics (e.g. one SSD and some HDDs)
## on the machine,
## setting `storage.latest.dir` to store the latest data on SSD (disks with higher IOPS metrics)
## and `storage.main.dir` to store the main data on HDD (disks with lower IOPS metrics)
## can improve TiFlash performance.

[storage.main]
## The path to store main data.
# e.g.
# dir = [ "/data0/tiflash" ]
# or
# dir = [ "/data0/tiflash", "/data1/tiflash" ]

## Store capacity of each path, i.e. max data size allowed.
## If it is not set, or is set to 0s, disk capacity will be used.
# capacity = [ ]

[storage.latest]
## The path(s) to store latest data.
## If not set, it will be the same as `storage.main.dir`.
dir = [ ]

## Store capacity of each path, i.e. max data size allowed.
## If it is not set, or is set to 0s, disk capacity will be used.
# capacity = [ ]

[storage.raft]
## The path(s) to store Raft data.
## If not set, it will be the paths in `storage.latest.dir` appended with "/kvstore".
# dir = [ ]
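
The fallback rules in the comments above can be sketched as follows. This is a hedged Python illustration, not TiFlash code: `resolve_storage_dirs` is a hypothetical helper that takes the `[storage]` table as a plain dict. Treating an empty list as "not set" and requiring at least one `storage.main.dir` entry are assumptions of this sketch.

```python
def resolve_storage_dirs(storage):
    """Resolve effective directories from a parsed [storage] table (a plain dict).

    Hypothetical helper mirroring the comments in the proposal; not TiFlash code.
    """
    main_dirs = list(storage.get("main", {}).get("dir", []))
    if not main_dirs:
        # Assumption of this sketch: main must list at least one path.
        raise ValueError("storage.main.dir must list at least one path")

    # storage.latest.dir falls back to storage.main.dir when unset or empty.
    latest_dirs = list(storage.get("latest", {}).get("dir", [])) or main_dirs

    # storage.raft.dir falls back to each latest path with "/kvstore" appended.
    raft_dirs = list(storage.get("raft", {}).get("dir", [])) or [
        d.rstrip("/") + "/kvstore" for d in latest_dirs
    ]
    return {"main": main_dirs, "latest": latest_dirs, "raft": raft_dirs}


resolved = resolve_storage_dirs(
    {"main": {"dir": ["/data0/tiflash", "/data1/tiflash"]}, "latest": {"dir": []}}
)
print(resolved["latest"])  # ['/data0/tiflash', '/data1/tiflash']
print(resolved["raft"])    # ['/data0/tiflash/kvstore', '/data1/tiflash/kvstore']
```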

I think we'd better use TOML's list format instead of a comma-separated string.
The capacity should also be a list, with one entry per path.
For example, if the disks are 1 × 500 GB NVMe SSD and 10 × 2 TB HDD, we need to set a different capacity for each path.
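
As a rough illustration of moving from the deprecated style to the list style, the Python sketch below splits legacy comma-separated `path` and `capacity` strings into parallel lists. `legacy_to_lists` is a hypothetical helper, not part of TiFlash. Filling missing capacities with 0 follows the rule in the config comments that 0 means the disk capacity is used.

```python
def legacy_to_lists(path, capacity=""):
    """Split legacy comma-separated `path`/`capacity` strings into parallel lists.

    Hypothetical helper for illustration only.
    """
    dirs = [p.strip() for p in path.split(",") if p.strip()]
    caps = [int(c) for c in capacity.split(",") if c.strip()]
    if caps and len(caps) != len(dirs):
        raise ValueError("capacity must have one entry per path")
    # A capacity of 0 means "use the disk capacity", per the config comments.
    return dirs, caps or [0] * len(dirs)


dirs, caps = legacy_to_lists("/tmp/tiflash/tiflash0,/tmp/tiflash1,/tmp/tiflash2", "0,0,0")
print(dirs)  # ['/tmp/tiflash/tiflash0', '/tmp/tiflash1', '/tmp/tiflash2']
print(caps)  # [0, 0, 0]
```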

What's more, we should choose a path for a new DTFile/PageFile by the global free disk space instead of the used disk space of each storage.
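
One way this could look is sketched below in Python. This is not TiFlash code. `choose_path` is a hypothetical function, and weighting a random choice by free space is an assumption of this sketch rather than part of the proposal. The input maps each path to the free bytes on its whole disk, counted across all storages.

```python
import random


def choose_path(free_bytes, rng=random):
    """Pick a path with probability proportional to its free space.

    `free_bytes` maps path -> free bytes on the whole disk (across all storages).
    Hypothetical sketch; the weighted random policy is an assumption.
    """
    candidates = {p: f for p, f in free_bytes.items() if f > 0}
    if not candidates:
        raise RuntimeError("no path has free space left")
    paths = list(candidates)
    weights = [candidates[p] for p in paths]
    return rng.choices(paths, weights=weights, k=1)[0]


GiB = 1024 ** 3
chosen = choose_path({"/data0/tiflash": 100 * GiB, "/data1/tiflash": 1400 * GiB})
print(chosen)  # usually /data1/tiflash, since it holds most of the free space
```

A weighted random choice keeps every disk with free space in use while favouring emptier disks. Always picking the disk with the most free space would be a simpler alternative, but it would send every new file to the same disk until the others catch up.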

JaySon-Huang commented Nov 6, 2020

storage:
    ## If there are multiple SSD disks on the machine,
    ## specifying the path list in `storage.main.dir` can improve TiFlash performance.

    ## If there are multiple disks with different IO metrics (e.g. one SSD and some HDDs)
    ## on the machine,
    ## setting `storage.latest.dir` to store the latest data on SSD (disks with higher IOPS metrics)
    ## and `storage.main.dir` to store the main data on HDD (disks with lower IOPS metrics)
    ## can improve TiFlash performance.

    main:
        ## The path to store main data.
        dir: 
        - /data1/tiflash
        - /data2/tiflash
        ## Store capacity of each path, i.e. max data size allowed.
        ## If it is not set, or is set to 0s, disk capacity will be used.
        capacity:
        - 107374182400
        - 107374182400
    latest:
        ## The path(s) to store latest data.
        ## If not set, it will be the same as `storage.main.dir`.
        #dir:
        #    - ""
        ## Store capacity of each path, i.e. max data size allowed.
        ## If it is not set, or is set to 0s, disk capacity will be used.
        #capacity:
        #    - 0
    raft:
        ## The path(s) to store Raft data.
        ## If not set, it will be the paths in `storage.latest.dir` appended with "/kvstore".
        #dir:
        #    - ""

The configuration in the topo.yaml file of TiUP should follow this format.
