This issue is to list a few considerations that should be taken into account to improve the autoscaling controller when handling storage resources:

- ECK doesn't know, or can't predict, the capacity actually available on a volume because some filesystems reserve space (mostly the case for `ext4`, 5% by default; only a few MB for `xfs`).
- For data tiers (except frozen?), the total storage capacity required by the autoscaling API is always at least the total observed storage capacity of all the Pods in the tier: `required_capacity.total = Σ(current_capacity.node.storage) + unassigned_data` (see the worked example after this list).
- K8S may bind a volume with a larger capacity than the one claimed (Volume Capacity > Volume Claim).
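As a quick illustration with made-up numbers: a data tier with 3 nodes, each reporting 10Gi of observed storage, plus 5Gi of unassigned data, yields `required_capacity.total = 3 × 10Gi + 5Gi = 35Gi`, so the required capacity is never lower than the total capacity already observed in the tier.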
If not handled properly, these considerations may lead to 3 issues:

1. Because of the filesystem reserved capacity, the capacity available to Elasticsearch might be smaller than the one in the K8S claim, which may delay a scale up event. We should compare the required capacity to the "observed" capacity as reported by the autoscaling API to understand when a scale up must be triggered.
2. If the actual capacity is higher than the claimed one, then Elasticsearch reports that value as the required one (even if it's technically not required), which can lead to cascading scale up events, up to the limit specified by the user. The reported capacity can also exceed the limit specified by the user, in which case irrelevant `HorizontalScalingLimitReached` events are generated.
3. If the actual capacity of a volume is greater than the claim, then the nodes may hold more data than the maximum specified in the autoscaling specification, which may lead to overloaded nodes. For example, assuming an autoscaling policy along the lines of the sketch below:
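For illustration only, a minimal sketch of such a policy; the resource bounds are assumptions picked to match the numbers in the example that follows:

```yaml
# Hypothetical "data" tier policy; the bounds are assumptions chosen to match
# the numbers in the example below, not taken from a real deployment.
policies:
  - name: data
    roles: ["data"]
    resources:
      nodeCount:
        min: 3       # arbitrary, for illustration
        max: 6
      memory:
        min: 2Gi     # the "2Gi of memory" the example starts from
        max: 6Gi     # the upper bound the text suggests scaling up to
      storage:
        min: 1Gi     # results in 1Gi volume claims
        max: 2Gi     # arbitrary, for illustration
```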
Say the 1Gi claims have been bound to volumes of 1Ti each; chances are that the 2Gi of memory are not enough to handle that amount of data. We should maybe notify the user that the total storage capacity is "unexpected", and maybe immediately scale the memory up to 6Gi?