---
layout: post
title: REBALANCE
permalink: /docs/rebalance
redirect_from:
---
To maintain consistent distribution of user data at all times, AIStore rebalances itself based on new versions of its cluster map.
More exactly:
- When storage targets join or leave the cluster, the current primary (leader) proxy transactionally creates the next, updated version of the cluster map;
- Synchronizes the new map across the entire cluster so that every node receives that version;
- Which, in turn, causes each AIS target to traverse its locally stored content and recompute object locations,
- Sending at least some of the objects to their respective new locations;
- Whereby object migration is carried out via an optimized intra-cluster communication mechanism and over a separate physical or logical network, if provisioned.
Thus, cluster-wide rebalancing is fully decentralized. When a single server joins (or leaves) a cluster of N servers, approximately 1/Nth of the entire namespace gets rebalanced via direct target-to-target transfers.
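The "approximately 1/Nth" figure can be illustrated with a small simulation. The sketch below assumes that a rendezvous (highest-random-weight) hash maps object names to targets; that placement function, the target IDs, and the helper names `hrw_owner` and `plan_rebalance` are assumptions made for illustration and are not taken from AIS itself. Each simulated target walks only its own objects against the new map and lists the ones that must move, so no central coordinator is involved.

```python
import hashlib


def hrw_owner(obj_name, targets):
    """Return the target with the highest hash(target, object) score."""
    def score(target):
        digest = hashlib.sha256(f"{target}/{obj_name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(targets, key=score)


def plan_rebalance(self_id, local_objects, new_map):
    """One target's view: recompute each local object's owner under the
    new cluster map and return {object: destination} for those that move."""
    moves = {}
    for obj in local_objects:
        owner = hrw_owner(obj, new_map)
        if owner != self_id:
            moves[obj] = owner
    return moves


old_map = ["t1", "t2", "t3", "t4", "t5"]   # hypothetical target IDs
new_map = old_map + ["t6"]                 # a single target joins
objects = [f"obj-{i}" for i in range(20000)]

# Initial placement under the old map.
stored = {tid: [] for tid in old_map}
for obj in objects:
    stored[hrw_owner(obj, old_map)].append(obj)

# Every target plans its own transfers independently.
plans = {tid: plan_rebalance(tid, objs, new_map) for tid, objs in stored.items()}
moved = sum(len(plan) for plan in plans.values())
destinations = {dst for plan in plans.values() for dst in plan.values()}
moved_fraction = moved / len(objects)

print(f"moved {moved} of {len(objects)} objects ({moved_fraction:.1%})")
print("destinations:", sorted(destinations))
```

Under this assumption, the moved share should land near one sixth (six targets after the join), and every transfer goes straight from an existing target to the new one.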
Further, cluster-wide rebalancing does not require any downtime. Incoming GET requests for objects that haven't yet migrated (or are being moved) are handled internally via a mechanism that we call "get-from-neighbor". The (rebalancing) target that, according to the new cluster map, must have the object but doesn't will locate its "neighbor", get the object, and satisfy the original GET request transparently to the user.
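A toy model may help make "get-from-neighbor" concrete. In the sketch below, the `Target` class and its `get` method are hypothetical, and the way the neighbor is found (probing the other targets in turn) is an assumption; the text above does not describe how AIS actually locates the neighbor.

```python
class Target:
    """Toy storage target holding objects in an in-memory dict."""

    def __init__(self, tid):
        self.tid = tid
        self.store = {}

    def get(self, name, cluster):
        """Serve a GET; fall back to a neighbor if the object has not arrived yet."""
        if name in self.store:
            return self.store[name], self.tid
        for neighbor in cluster:
            if neighbor is not self and name in neighbor.store:
                return neighbor.store[name], neighbor.tid
        raise KeyError(name)


old_owner = Target("t1")
new_owner = Target("t2")            # owns "img-001" under the new cluster map
old_owner.store["img-001"] = b"payload"
cluster = [old_owner, new_owner]

# Rebalance is still in progress: the object has not reached t2 yet,
# so t2 fetches it from its neighbor and still answers the request.
data, served_from = new_owner.get("img-001", cluster)
print(data, "fetched from", served_from)

# After the migration completes, the same GET is served locally.
new_owner.store["img-001"] = old_owner.store.pop("img-001")
data_after, served_after = new_owner.get("img-001", cluster)
print(data_after, "fetched from", served_after)
```

In both cases the caller talks only to the target that owns the object under the new map; where the bytes came from is an internal detail.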
Similar to all other AIS modules and sub-systems, global rebalance is controlled and monitored via the documented RESTful API. It might be easier and faster, though, to use AIS CLI - see next section.
- Disable automated global rebalance (for instance, to perform maintenance or upgrade operations) and show resulting config in JSON on a randomly selected target:
$ ais config cluster rebalance.enabled=false
config successfully updated
$ ais show config 361179t8088 --json | grep -A 6 rebalance
"rebalance": {
"dest_retry_time": "2m",
"quiescent": "20s",
"compression": "never",
"multiplier": 4,
"enabled": false
},
- Re-enable automated global rebalance and show resulting config section as a simple name/value list:
$ ais config cluster rebalance.enabled=true
config successfully updated
$ ais show config <TAB-TAB>
125210p8082 181883t8089 249630t8087 361179t8088 477343p8081 675515t8084 70681p8080 782227p8083 840083t8086 911875t8085
$ ais show config 840083t8086 rebalance
PROPERTY VALUE DEFAULT
rebalance.compression never -
rebalance.dest_retry_time 2m -
rebalance.enabled true -
rebalance.multiplier 2 -
rebalance.quiescent 10s -
- Monitoring: notice per-target statistics and the `EndTime` column:
$ ais show rebalance
DaemonID RebID ObjRcv SizeRcv ObjSent SizeSent StartTime EndTime Aborted
====== ====== ====== ====== ====== ====== ====== ====== ======
181883t8089 1 0 0B 1058 1.27MiB 04-28 16:05:35 <not completed> false
249630t8087 1 0 0B 988 1.18MiB 04-28 16:05:35 <not completed> false
361179t8088 1 5029 6.02MiB 0 0B 04-28 16:05:35 <not completed> false
675515t8084 1 0 0B 989 1.18MiB 04-28 16:05:35 <not completed> false
840083t8086 1 0 0B 974 1.17MiB 04-28 16:05:35 <not completed> false
911875t8085 1 0 0B 1020 1.22MiB 04-28 16:05:35 <not completed> false
$ ais show rebalance
DaemonID RebID ObjRcv SizeRcv ObjSent SizeSent StartTime EndTime Aborted
====== ====== ====== ====== ====== ====== ====== ====== ======
181883t8089 1 0 0B 1058 1.27MiB 04-28 16:05:35 04-28 16:05:53 false
249630t8087 1 0 0B 988 1.18MiB 04-28 16:05:35 04-28 16:05:53 false
361179t8088 1 5029 6.02MiB 0 0B 04-28 16:05:35 04-28 16:05:53 false
675515t8084 1 0 0B 989 1.18MiB 04-28 16:05:35 04-28 16:05:53 false
840083t8086 1 0 0B 974 1.17MiB 04-28 16:05:35 04-28 16:05:53 false
911875t8085 1 0 0B 1020 1.22MiB 04-28 16:05:35 04-28 16:05:53 false
- Since global rebalance is an extended action (xaction), it can also be monitored via the generic `show xaction` API:
$ ais show job xaction rebalance
NODE ID KIND BUCKET OBJECTS BYTES START END STATE
181883t8089 g2 rebalance - 1058 1.27MiB 04-28 16:10:14 - Running
...
- Finally, you can always start and stop global rebalance administratively, for instance:
$ ais start rebalance
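For scripted monitoring, the tabular `ais show rebalance` output shown above can be parsed to decide whether rebalance has finished. The following is a minimal sketch that assumes the exact column layout of that sample (which may differ between CLI versions); `parse_rebalance` and `is_finished` are hypothetical helper names. A real script would feed it captured CLI output instead of the embedded sample.

```python
IN_PROGRESS = """\
DaemonID RebID ObjRcv SizeRcv ObjSent SizeSent StartTime EndTime Aborted
====== ====== ====== ====== ====== ====== ====== ====== ======
181883t8089 1 0 0B 1058 1.27MiB 04-28 16:05:35 <not completed> false
249630t8087 1 0 0B 988 1.18MiB 04-28 16:05:35 <not completed> false
361179t8088 1 5029 6.02MiB 0 0B 04-28 16:05:35 <not completed> false
675515t8084 1 0 0B 989 1.18MiB 04-28 16:05:35 <not completed> false
840083t8086 1 0 0B 974 1.17MiB 04-28 16:05:35 <not completed> false
911875t8085 1 0 0B 1020 1.22MiB 04-28 16:05:35 <not completed> false
"""
# The second sample differs only in the EndTime column.
FINISHED = IN_PROGRESS.replace("<not completed>", "04-28 16:05:53")


def parse_rebalance(text):
    """Turn the table into one dict per target (assumed 11 tokens per data row)."""
    rows = []
    for line in text.splitlines():
        t = line.split()
        if len(t) != 11:            # skips the header and separator lines
            continue
        rows.append({
            "daemon": t[0],
            "reb_id": int(t[1]),
            "obj_rcv": int(t[2]),
            "obj_sent": int(t[4]),
            "start": " ".join(t[6:8]),
            "end": " ".join(t[8:10]),
            "aborted": t[10] == "true",
        })
    return rows


def is_finished(rows):
    """True once every target reports an end time and none was aborted."""
    return bool(rows) and all(
        r["end"] != "<not completed>" and not r["aborted"] for r in rows
    )


running = parse_rebalance(IN_PROGRESS)
done = parse_rebalance(FINISHED)
total_sent = sum(r["obj_sent"] for r in done)
total_rcv = sum(r["obj_rcv"] for r in done)

print("finished (first sample): ", is_finished(running))
print("finished (second sample):", is_finished(done))
print("objects sent / received: ", total_sent, "/", total_rcv)
```

In the sample, the objects sent by five targets add up to the objects received by the sixth, which is a handy sanity check when reading the table.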
While rebalance (previous section) takes care of cluster grow and shrink events, resilver, as the name implies, is responsible for the mountpath-added and mountpath-removed events handled locally within (and by) each storage target.
In other words, global rebalance handles scaling (up and down) of the entire AIS cluster while automated resilvering takes care of disk attachments and disk faults within a given storage node.
- A mountpath is a single disk or a volume (a RAID) formatted with a local filesystem of choice, and a local directory that AIS utilizes to store user data and AIS metadata. A mountpath can be disabled and (re)enabled, automatically or administratively, at any point during runtime. In a given cluster, the total number of mountpaths would normally compute as a direct product of (number of storage targets) x (number of disks in each target).
As stated, mountpath removal can be done administratively (via API) or be triggered by a disk fault (see filesystem health checking). Irrespective of the original cause, mountpath-level events activate resilver, which in many ways performs the same set of steps as rebalance. The one salient difference is that all object migrations are local (and, therefore, relatively fast(er)).
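The local nature of resilvering can be sketched in the same spirit. The example assumes, purely for illustration, that a target spreads objects over its mountpaths with a rendezvous hash and that a removed mountpath is still readable while its content is moved off; the paths and the helpers `pick_mountpath` and `resilver` are hypothetical and not AIS APIs.

```python
import hashlib


def pick_mountpath(obj_name, mountpaths):
    """Choose the mountpath with the highest hash(mountpath, object) score."""
    def score(mpath):
        digest = hashlib.sha256(f"{mpath}|{obj_name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(mountpaths, key=score)


def resilver(layout, available):
    """Recompute every object's mountpath within one target and
    return the required local moves as (object, src, dst) tuples."""
    moves = []
    for src, objs in layout.items():
        for obj in objs:
            dst = pick_mountpath(obj, available)
            if dst != src:
                moves.append((obj, src, dst))
    return moves


mountpaths = ["/ais/disk1", "/ais/disk2", "/ais/disk3", "/ais/disk4"]
layout = {mpath: [] for mpath in mountpaths}
for i in range(8000):
    name = f"obj-{i}"
    layout[pick_mountpath(name, mountpaths)].append(name)

removed = "/ais/disk3"                       # mountpath-removed event
remaining = [m for m in mountpaths if m != removed]
moves = resilver(layout, remaining)

sources = {src for _, src, _ in moves}
targets_of_moves = {dst for _, _, dst in moves}
print(f"{len(moves)} local moves, all from {sorted(sources)}")
print("redistributed onto:", sorted(targets_of_moves))
```

Only objects that lived on the removed mountpath move, and they never leave the node, which is why resilvering tends to be faster than a cluster-wide rebalance.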
Resilvering can be run on a specific target node or the entire cluster (when all targets execute resilvering in parallel).
Similar to global rebalancing, resilvering is a managed eXtended operation or xaction.
All xactions execute asynchronously and support a common set of documented APIs to start and terminate the xaction, inquire about its progress, etc. The progress of resilvering can be monitored via the `ais show job xaction` CLI.
Examples:
$ ais advanced resilver # all targets will be resilvered
Started resilver "NGxmOthtE", use 'ais show job xaction NGxmOthtE' to monitor the progress
$ ais advanced resilver BUQOt8086 # resilver a single node
Started resilver "NGxmOthtE", use 'ais show job xaction NGxmOthtE' to monitor the progress
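The start / terminate / inquire pattern that xactions share can be modeled in a few lines. The following is a toy Python model rather than AIS code: the `Xaction` class, its method names, and the state strings are assumptions chosen to mirror the description above, with a counter standing in for real object migration.

```python
import threading


class Xaction:
    """Toy model of an asynchronous, abortable job with progress reporting."""

    def __init__(self, kind, work_items):
        self.kind = kind
        self._items = list(work_items)
        self._done = 0
        self._lock = threading.Lock()
        self._abort = threading.Event()
        self._thread = None

    def start(self):
        """Kick off the work in the background and return immediately."""
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def abort(self):
        """Ask the job to stop before processing the next item."""
        self._abort.set()

    def wait(self, timeout=None):
        if self._thread is not None:
            self._thread.join(timeout)

    def snapshot(self):
        """Report progress without blocking the running job."""
        started = self._thread is not None
        alive = started and self._thread.is_alive()
        with self._lock:
            done = self._done
        if not started:
            state = "Idle"
        elif alive:
            state = "Running"
        elif done < len(self._items):
            state = "Aborted"
        else:
            state = "Finished"
        return {"kind": self.kind, "objects": done, "state": state}

    def _run(self):
        for _ in self._items:
            if self._abort.is_set():
                return
            with self._lock:
                self._done += 1     # stand-in for migrating one object


job = Xaction("resilver", range(1000))
job.start()
job.wait()
print(job.snapshot())

cancelled = Xaction("resilver", range(1000))
cancelled.abort()                   # abort requested before any work is done
cancelled.start()
cancelled.wait()
print(cancelled.snapshot())
```

The caller never blocks on the work itself: it starts the job, polls `snapshot()` when it wants progress, and can request termination at any time.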
Automated resilvering can also be disabled. Just like with rebalance, the resulting config can be viewed through the CLI:
NOTE: When automated resilvering is disabled, removing a mountpath may result in data loss.
$ ais config cluster resilver.enabled=false
config successfully updated
$ ais show config 361179t8088 resilver --json | grep -A 2 resilver
"resilver": {
"enabled": false
},
$ ais config cluster resilver.enabled=true
config successfully updated
$ ais show config <TAB-TAB>
125210p8082 181883t8089 249630t8087 361179t8088 477343p8081 675515t8084 70681p8080 782227p8083 840083t8086 911875t8085
$ ais show config 361179t8088 resilver
PROPERTY VALUE
resilver.enabled true
During rebalancing, response latency and overall cluster throughput may substantially degrade.