bugfix: Add docs for recommend, revert expression change #579

Merged
merged 2 commits into from
Oct 14, 2022
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,15 +36,15 @@ https://user-images.githubusercontent.com/35299017/186680122-d7756b47-06be-44cb-

**Recommendation Framework**

Provide a pluggable framework for analytics and give recommendation for cloud resources, support out-of-box recommenders: Workload Resources/Replicas, Idle Resources.
Provide a pluggable framework for analytics and give recommendation for cloud resources, support out-of-box recommenders: Workload Resources/Replicas, Idle Resources. [learn more](https://gocrane.io/docs/tutorials/recommendation/).

**Prediction-driven Horizontal Autoscaling**

EffectiveHorizontalPodAutoscaler supports prediction-driven autoscaling. With this capability, user can forecast the incoming peak flow and scale up their application ahead, also user can know when the peak flow will end and scale down their application gracefully. [learn more](docs/tutorials/using-effective-hpa-to-scaling-with-effectiveness.md).
EffectiveHorizontalPodAutoscaler supports prediction-driven autoscaling. With this capability, user can forecast the incoming peak flow and scale up their application ahead, also user can know when the peak flow will end and scale down their application gracefully. [learn more](https://gocrane.io/docs/tutorials/using-effective-hpa-to-scaling-with-effectiveness/).

**Load-Aware Scheduling**

Provide a simple but efficient scheduler that schedule pods based on actual node utilization data,and filters out those nodes with high load to balance the cluster. [learn more](docs/tutorials/scheduling-pods-based-on-actual-node-load.md).
Provide a simple but efficient scheduler that schedule pods based on actual node utilization data,and filters out those nodes with high load to balance the cluster. [learn more](https://gocrane.io/docs/tutorials/scheduling-pods-based-on-actual-node-load/).

**Colocation with Enhanced QoS**

Expand Down
6 changes: 3 additions & 3 deletions README_zh.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,15 +36,15 @@ Crane Dashboard **在线 Demo**: http://dashboard.gocrane.io/

**推荐框架**

提供了一个可扩展的推荐框架以支持多种云资源的分析,内置了多种推荐器:资源推荐,副本推荐,闲置资源推荐。
提供了一个可扩展的推荐框架以支持多种云资源的分析,内置了多种推荐器:资源推荐,副本推荐,闲置资源推荐。[了解更多](https://gocrane.io/zh-cn/docs/tutorials/recommendation/)

**基于预测的水平弹性器**

EffectiveHorizontalPodAutoscaler 支持了预测驱动的弹性。它基于社区 HPA 做底层的弹性控制,支持更丰富的弹性触发策略(预测,观测,周期),让弹性更加高效,并保障了服务的质量。[了解更多](docs/tutorials/using-effective-hpa-to-scaling-with-effectiveness.zh.md)。
EffectiveHorizontalPodAutoscaler 支持了预测驱动的弹性。它基于社区 HPA 做底层的弹性控制,支持更丰富的弹性触发策略(预测,观测,周期),让弹性更加高效,并保障了服务的质量。[了解更多](https://gocrane.io/zh-cn/docs/tutorials/using-effective-hpa-to-scaling-with-effectiveness/)

**负载感知的调度器**

动态调度器根据实际的节点利用率构建了一个简单但高效的模型,并过滤掉那些负载高的节点来平衡集群。[了解更多](docs/tutorials/scheduling-pods-based-on-actual-node-load.zh.md)。
动态调度器根据实际的节点利用率构建了一个简单但高效的模型,并过滤掉那些负载高的节点来平衡集群。[了解更多](https://gocrane.io/zh-cn/docs/tutorials/scheduling-pods-based-on-actual-node-load/)

**基于 QoS 的混部**

Expand Down
4 changes: 2 additions & 2 deletions pkg/utils/ehpa.go
Original file line number Diff line number Diff line change
Expand Up @@ -113,15 +113,15 @@ func GetExpressionQueryDefault(metric autoscalingv2.MetricSpec, namespace string
labels = append(labels, k+"="+`"`+v+`"`)
}
}
expressionQuery = GetCustumerExpression(metric.Pods.Metric.Name, strings.Join(labels, ","))
expressionQuery = GetCustomerExpression(metric.Pods.Metric.Name, strings.Join(labels, ","))
case autoscalingv2.ExternalMetricSourceType:
var labels []string
if metric.External.Metric.Selector != nil {
for k, v := range metric.External.Metric.Selector.MatchLabels {
labels = append(labels, k+"="+`"`+v+`"`)
}
}
expressionQuery = GetCustumerExpression(metric.External.Metric.Name, strings.Join(labels, ","))
expressionQuery = GetCustomerExpression(metric.External.Metric.Name, strings.Join(labels, ","))
}

return expressionQuery
Expand Down
26 changes: 11 additions & 15 deletions pkg/utils/expression_prom_default.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,9 @@ import (
// todo: later we change these templates to configurable like prometheus-adapter
const (
// WorkloadCpuUsageExprTemplate is used to query workload cpu usage by promql, param is namespace,workload-name,duration str
WorkloadCpuUsageExprTemplate = `sum(irate(container_cpu_usage_seconds_total{namespace="%s",pod=~"^%s-%s"}[%s]))`
WorkloadCpuUsageExprTemplate = `sum(irate(container_cpu_usage_seconds_total{namespace="%s",pod=~"^%s-.*$"}[%s]))`
// WorkloadMemUsageExprTemplate is used to query workload mem usage by promql, param is namespace, workload-name
WorkloadMemUsageExprTemplate = `sum(container_memory_working_set_bytes{namespace="%s",pod=~"^%s-%s"})`
WorkloadMemUsageExprTemplate = `sum(container_memory_working_set_bytes{namespace="%s",pod=~"^%s-.*$"})`

// following is node exporter metric for node cpu/memory usage
// NodeCpuUsageExprTemplate is used to query node cpu usage by promql, param is node name which prometheus scrape, duration str
Expand All @@ -23,35 +23,31 @@ const (
PodMemUsageExprTemplate = `sum(container_memory_working_set_bytes{container!="POD",namespace="%s",pod="%s"})`

// ContainerCpuUsageExprTemplate is used to query container cpu usage by promql, param is namespace,pod,container duration str
ContainerCpuUsageExprTemplate = `irate(container_cpu_usage_seconds_total{container!="POD",namespace="%s",pod=~"^%s-%s",container="%s"}[%s])`
ContainerCpuUsageExprTemplate = `irate(container_cpu_usage_seconds_total{container!="POD",namespace="%s",pod=~"^%s.*$",container="%s"}[%s])`
// ContainerMemUsageExprTemplate is used to query container memory usage by promql, param is namespace,pod,container
ContainerMemUsageExprTemplate = `container_memory_working_set_bytes{container!="POD",namespace="%s",pod=~"^%s-%s",container="%s"}`
ContainerMemUsageExprTemplate = `container_memory_working_set_bytes{container!="POD",namespace="%s",pod=~"^%s.*$",container="%s"}`

CustumerExprTemplate = `sum(%s{%s})`
CustomerExprTemplate = `sum(%s{%s})`
)

const (
RegMatchesPodName = `[a-z0-9]+-[a-z0-9]{5}$`
)

func GetCustumerExpression(metricName string, labels string) string {
return fmt.Sprintf(CustumerExprTemplate, metricName, labels)
func GetCustomerExpression(metricName string, labels string) string {
return fmt.Sprintf(CustomerExprTemplate, metricName, labels)
}

func GetWorkloadCpuUsageExpression(namespace string, name string) string {
return fmt.Sprintf(WorkloadCpuUsageExprTemplate, namespace, name, RegMatchesPodName, "3m")
return fmt.Sprintf(WorkloadCpuUsageExprTemplate, namespace, name, "3m")
}

func GetWorkloadMemUsageExpression(namespace string, name string) string {
return fmt.Sprintf(WorkloadMemUsageExprTemplate, namespace, name, RegMatchesPodName)
return fmt.Sprintf(WorkloadMemUsageExprTemplate, namespace, name)
}

func GetContainerCpuUsageExpression(namespace string, workloadName string, containerName string) string {
return fmt.Sprintf(ContainerCpuUsageExprTemplate, namespace, workloadName, RegMatchesPodName, containerName, "3m")
return fmt.Sprintf(ContainerCpuUsageExprTemplate, namespace, workloadName, containerName, "3m")
}

func GetContainerMemUsageExpression(namespace string, workloadName string, containerName string) string {
return fmt.Sprintf(ContainerMemUsageExprTemplate, namespace, workloadName, RegMatchesPodName, containerName)
return fmt.Sprintf(ContainerMemUsageExprTemplate, namespace, workloadName, containerName)
}

func GetPodCpuUsageExpression(namespace string, name string) string {
Expand Down
5 changes: 4 additions & 1 deletion pkg/web/src/i18n/resources/en/translation.json
Original file line number Diff line number Diff line change
Expand Up @@ -157,5 +157,8 @@
"当前副本数": "Current Replicas",
"更新时间": "Update Time",
"当前资源(容器/CPU/Memory)": "Current Resource(Container/CPU/Memory)",
"成本分布": "Cost by Dimension"
"成本分布": "Cost by Dimension",
"最近1小时": "Last 1 Hour",
"节点名": "Node Name",
"闲置节点": "Idle Node"
}
5 changes: 4 additions & 1 deletion pkg/web/src/i18n/resources/zh/translation.json
Original file line number Diff line number Diff line change
Expand Up @@ -157,5 +157,8 @@
"当前副本数": "当前副本数",
"更新时间": "更新时间",
"当前资源(容器/CPU/Memory)": "当前资源(容器/CPU/Memory)",
"成本分布": "成本分布"
"成本分布": "成本分布",
"最近1小时": "最近1小时",
"节点名": "节点名",
"闲置节点": "闲置节点"
}
2 changes: 1 addition & 1 deletion pkg/web/src/router/modules/recommend.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ export const useRecommendRouteConfig = () => {
path: 'idleNode',
Component: lazy(() => import('pages/Recommend/IdleNode')),
meta: {
title: t('闭置节点'),
title: t('闲置节点'),
},
},
],
Expand Down
4 changes: 4 additions & 0 deletions site/content/en/docs/Contributing/developer-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@ description: "Getting started to develop crane"
First, please make sure you've got a working [Go environment](https://golang.org/doc/install)
and [Docker environment](https://docs.docker.com/engine).

## Prepare local crane environment

Please refer to the [quick start](/docs/getting-started/quick-start).

## Clone crane

Clone the repository,
Expand Down
2 changes: 1 addition & 1 deletion site/content/en/docs/Getting started/introduction.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="tru

**Recommendation Framework**

Provide a pluggable framework for analytics and give recommendation for cloud resources, support out-of-box recommenders: Workload Resources/Replicas, Idle Resources.
Provide a pluggable framework for analytics and give recommendation for cloud resources, support out-of-box recommenders: Workload Resources/Replicas, Idle Resources. [learn more](/docs/tutorials/recommendation).

**Prediction-driven Horizontal Autoscaling**

Expand Down
7 changes: 7 additions & 0 deletions site/content/en/docs/Tutorials/Recommendation/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@

---
title: "Recommendation"
weight: 10
description: >
Docs for Recommendation.
---
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
---
title: "How to develop Recommender"
description: "How to develop and extend a Recommender based on the framework"
weight: 100
---

The Recommendation Framework provides an extensible framework for Recommenders and ships with several built-in Recommenders. Users can implement a custom Recommender or modify the existing ones.

## Recommender Interface

```go
type Recommender interface {
	Name() string
	framework.Filter
	framework.PrePrepare
	framework.Prepare
	framework.PostPrepare
	framework.PreRecommend
	framework.Recommend
	framework.PostRecommend
	framework.Observe
}

// Phase: Filter

// Filter interface
type Filter interface {
	// Filter rejects resources that cannot be recommended by the target recommender.
	Filter(ctx *RecommendationContext) error
}

// Phase: Prepare

// PrePrepare interface
type PrePrepare interface {
	CheckDataProviders(ctx *RecommendationContext) error
}

// Prepare interface
type Prepare interface {
	CollectData(ctx *RecommendationContext) error
}

// PostPrepare interface
type PostPrepare interface {
	PostProcessing(ctx *RecommendationContext) error
}

// Phase: Recommend

// PreRecommend interface
type PreRecommend interface {
	PreRecommend(ctx *RecommendationContext) error
}

// Recommend interface
type Recommend interface {
	Recommend(ctx *RecommendationContext) error
}

// PostRecommend interface
type PostRecommend interface {
	Policy(ctx *RecommendationContext) error
}

// Phase: Observe

// Observe interface
type Observe interface {
	Observe(ctx *RecommendationContext) error
}

```
The Recommender interface defines four phases and eight extension points that a recommender must implement. These extension points are called sequentially during the recommendation process. Some of them can change the recommendation decision, while others are purely informational.
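To make the sequential calling order concrete, here is a minimal sketch. It does not use Crane's framework package: `RecommendationContext`, `step`, `runSteps`, `defaultSteps`, `succeed`, and `failWith` are hypothetical names introduced for illustration, and stopping at the first error for every extension point is an assumption (the text only states this for Filter).

```go
package main

import (
	"errors"
	"fmt"
)

// RecommendationContext is a simplified stand-in for the framework's context.
type RecommendationContext struct {
	Target string   // hypothetical: the resource being analyzed
	Trace  []string // hypothetical: names of the extension points that ran
}

// step pairs an extension point name with the function to call.
type step struct {
	name string
	fn   func(ctx *RecommendationContext) error
}

// succeed is a placeholder extension point that does nothing.
func succeed(ctx *RecommendationContext) error { return nil }

// failWith returns an extension point that always fails with msg.
func failWith(msg string) func(ctx *RecommendationContext) error {
	return func(ctx *RecommendationContext) error { return errors.New(msg) }
}

// defaultSteps lists the eight extension points by method name,
// in the order the Recommender interface declares them.
func defaultSteps() []step {
	return []step{
		{"Filter", succeed},
		{"CheckDataProviders", succeed},
		{"CollectData", succeed},
		{"PostProcessing", succeed},
		{"PreRecommend", succeed},
		{"Recommend", succeed},
		{"Policy", succeed},
		{"Observe", succeed},
	}
}

// runSteps calls each extension point in order and stops at the first error
// (an assumption of this sketch).
func runSteps(ctx *RecommendationContext, steps []step) error {
	for _, s := range steps {
		ctx.Trace = append(ctx.Trace, s.name)
		if err := s.fn(ctx); err != nil {
			return fmt.Errorf("%s: %w", s.name, err)
		}
	}
	return nil
}

func main() {
	ctx := &RecommendationContext{Target: "deployment/web"}
	if err := runSteps(ctx, defaultSteps()); err != nil {
		fmt.Println("recommendation stopped:", err)
		return
	}
	fmt.Println("extension points called:", ctx.Trace)
}
```

Replacing any entry with `failWith("...")` shows how a failing extension point would end the run early in this sketch.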

## Architecture

![](/images/recommendation-framework.png)

## Phases

The recommendation process is divided into four phases: Filter, Prepare, Recommend, and Observe. The input to the process is the Kubernetes resource to analyze, and the output is the recommendation. The sections below describe the inputs, outputs, and capabilities of each phase.

`RecommendationContext` stores the context of a recommendation process, including the recommendation target, the RecommendationConfiguration, and so on. Users can add more content as needed.

### Filter

The Filter phase preprocesses the recommendation data. In general, preprocessing decides whether the recommendation target matches the Recommender. For example, the Resource Recommender only handles workloads (Deployment, StatefulSet). It can also determine whether the state of the target is suitable for recommendation, for example whether it is being deleted or was just created. The recommendation is terminated when Filter returns an error. BaseRecommender implements the basic preprocessing functions, and users can call it to reuse them.
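As an illustration of the checks described for this phase, the following sketch rejects unsupported kinds, targets that are being deleted, and targets that were created very recently. The `Target` and `RecommendationContext` types, the `ResourceFilter` name, and the 10-minute `minAge` threshold are all assumptions made for this example; they are not Crane's real types or defaults.

```go
package main

import (
	"fmt"
	"time"
)

// Target is a hypothetical, simplified view of the object under analysis.
type Target struct {
	Kind     string
	Name     string
	Deleting bool          // true when the object has a deletion timestamp
	Age      time.Duration // time since the object was created
}

// RecommendationContext is a simplified stand-in for the framework's context.
type RecommendationContext struct {
	Target Target
}

// supportedKinds mirrors the workload kinds named in the text.
var supportedKinds = map[string]bool{"Deployment": true, "StatefulSet": true}

// minAge is an illustrative threshold for "just created".
const minAge = 10 * time.Minute

// ResourceFilter implements the Filter extension point for this sketch.
type ResourceFilter struct{}

// Filter returns an error when the target should not be recommended.
func (ResourceFilter) Filter(ctx *RecommendationContext) error {
	t := ctx.Target
	if !supportedKinds[t.Kind] {
		return fmt.Errorf("kind %q is not supported", t.Kind)
	}
	if t.Deleting {
		return fmt.Errorf("%s %q is being deleted", t.Kind, t.Name)
	}
	if t.Age < minAge {
		return fmt.Errorf("%s %q was created less than %s ago", t.Kind, t.Name, minAge)
	}
	return nil
}

// newContext builds a context for a target named "web".
func newContext(kind string, deleting bool, ageMinutes int) *RecommendationContext {
	return &RecommendationContext{Target: Target{
		Kind:     kind,
		Name:     "web",
		Deleting: deleting,
		Age:      time.Duration(ageMinutes) * time.Minute,
	}}
}

func main() {
	f := ResourceFilter{}
	fmt.Println(f.Filter(newContext("Deployment", false, 60)))
	fmt.Println(f.Filter(newContext("DaemonSet", false, 60)))
	fmt.Println(f.Filter(newContext("StatefulSet", true, 60)))
	fmt.Println(f.Filter(newContext("Deployment", false, 1)))
}
```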

### Prepare

The Prepare phase prepares data: it queries an external monitoring system and saves the time series data in the context. The PrePrepare extension point checks the connection status of the monitoring system. The Prepare extension point queries the time series data. The PostPrepare extension point processes the time series data, for example handling abnormal cold-start data and partial data loss, aggregating data, and clearing abnormal data.
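The PostPrepare step can be pictured as a cleaning pass over the collected samples. The sketch below drops samples from a warm-up window and samples with invalid values. The `Sample` type, the `Cleaner` struct, its `WarmupEnd` field, and the choice of which values count as abnormal are assumptions for illustration, not Crane's actual data model.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Sample is a hypothetical time series point.
type Sample struct {
	Timestamp int64 // seconds
	Value     float64
}

// RecommendationContext is a simplified stand-in for the framework's context.
type RecommendationContext struct {
	Samples []Sample
}

// Cleaner implements the PostPrepare extension point for this sketch.
type Cleaner struct {
	WarmupEnd int64 // samples before this timestamp are treated as cold-start data
}

// PostProcessing removes cold-start and invalid samples from the context.
func (c Cleaner) PostProcessing(ctx *RecommendationContext) error {
	cleaned := make([]Sample, 0, len(ctx.Samples))
	for _, s := range ctx.Samples {
		if s.Timestamp < c.WarmupEnd {
			continue // cold-start data
		}
		if math.IsNaN(s.Value) || math.IsInf(s.Value, 0) || s.Value < 0 {
			continue // abnormal data
		}
		cleaned = append(cleaned, s)
	}
	if len(cleaned) == 0 {
		return errors.New("no usable samples after cleaning")
	}
	ctx.Samples = cleaned
	return nil
}

// exampleSamples returns a small series with one cold-start spike and two bad values.
func exampleSamples() []Sample {
	return []Sample{
		{Timestamp: 60, Value: 3.5},
		{Timestamp: 120, Value: 0.4},
		{Timestamp: 180, Value: math.NaN()},
		{Timestamp: 240, Value: -1},
		{Timestamp: 300, Value: 0.6},
	}
}

func main() {
	ctx := &RecommendationContext{Samples: exampleSamples()}
	if err := (Cleaner{WarmupEnd: 120}).PostProcessing(ctx); err != nil {
		fmt.Println("post-processing failed:", err)
		return
	}
	fmt.Println(ctx.Samples)
}
```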

### Recommend

The Recommend phase produces optimization recommendations based on time series data and resource allocation. The kind of output depends on the type of recommendation. For example, for a resource recommendation the output is the resource configuration for the Kubernetes workload. The Recommend extension point analyzes and computes the data using Crane's algorithm module, and the analysis result is finally processed in the PostRecommend extension point. Users can customize this behavior by implementing their own Recommend phase.
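A custom Recommend phase might look roughly like the sketch below. It uses a simple nearest-rank percentile with a safety margin, which is a stand-in chosen for illustration and not Crane's algorithm module. `PercentileRecommender`, its fields, and the millicore units are hypothetical; only the `Recommend` and `Policy` method names come from the interface above.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// RecommendationContext is a simplified stand-in for the framework's context.
type RecommendationContext struct {
	CPUSamples     []float64 // observed CPU usage in millicores
	RecommendedCPU float64   // output of the Recommend phase, in millicores
}

// percentile returns the nearest-rank percentile p (0 < p <= 100) of values.
// values must not be empty.
func percentile(values []float64, p float64) float64 {
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// PercentileRecommender is a hypothetical recommender for this sketch.
type PercentileRecommender struct {
	Percentile float64 // e.g. 90
	Margin     float64 // e.g. 0.25 adds 25% headroom
	MinCPU     float64 // lower bound applied in Policy
}

// Recommend computes the percentile of the samples plus a safety margin.
func (r PercentileRecommender) Recommend(ctx *RecommendationContext) error {
	if len(ctx.CPUSamples) == 0 {
		return errors.New("no CPU samples available")
	}
	ctx.RecommendedCPU = percentile(ctx.CPUSamples, r.Percentile) * (1 + r.Margin)
	return nil
}

// Policy post-processes the result by enforcing a minimum value.
func (r PercentileRecommender) Policy(ctx *RecommendationContext) error {
	if ctx.RecommendedCPU < r.MinCPU {
		ctx.RecommendedCPU = r.MinCPU
	}
	return nil
}

func main() {
	ctx := &RecommendationContext{
		CPUSamples: []float64{100, 200, 300, 400, 500, 600, 700, 800, 900, 1000},
	}
	r := PercentileRecommender{Percentile: 90, Margin: 0.25, MinCPU: 50}
	if err := r.Recommend(ctx); err != nil {
		fmt.Println("recommend failed:", err)
		return
	}
	if err := r.Policy(ctx); err != nil {
		fmt.Println("policy failed:", err)
		return
	}
	fmt.Printf("recommended CPU: %.0fm\n", ctx.RecommendedCPU)
}
```

Keeping the bound check in `Policy` rather than in `Recommend` follows the split described above, where the analysis result is adjusted after it has been computed.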

### Observe

The Observe phase is used to observe the recommendation result. For example, for a resource recommendation, information about the optimization proposal is saved to the monitoring system as metrics, and the benefit generated by the proposal can be observed through the Dashboard.
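The sketch below shows one possible shape for an Observe implementation that publishes the result as gauges. It writes to an in-memory sink so that it runs without a monitoring system. The `MetricSink` interface, `memorySink`, `MetricObserver`, and the metric names are hypothetical and are not metrics that Crane actually exports.

```go
package main

import "fmt"

// RecommendationContext is a simplified stand-in for the framework's context.
type RecommendationContext struct {
	Target         string
	CurrentCPU     float64 // cores currently requested
	RecommendedCPU float64 // cores recommended
}

// MetricSink is a hypothetical abstraction over the monitoring system.
type MetricSink interface {
	SetGauge(name, target string, value float64)
}

// memorySink keeps gauges in memory, keyed by metric name and target label.
type memorySink struct {
	gauges map[string]float64
}

func newMemorySink() *memorySink {
	return &memorySink{gauges: map[string]float64{}}
}

// SetGauge stores the latest value for the given metric and target.
func (s *memorySink) SetGauge(name, target string, value float64) {
	s.gauges[name+"{target=\""+target+"\"}"] = value
}

// MetricObserver implements the Observe extension point for this sketch.
type MetricObserver struct {
	Sink MetricSink
}

// Observe records the recommendation and the difference to the current request.
func (o MetricObserver) Observe(ctx *RecommendationContext) error {
	if o.Sink == nil {
		return fmt.Errorf("no metric sink configured")
	}
	o.Sink.SetGauge("example_recommended_cpu_cores", ctx.Target, ctx.RecommendedCPU)
	o.Sink.SetGauge("example_cpu_cores_saved", ctx.Target, ctx.CurrentCPU-ctx.RecommendedCPU)
	return nil
}

func main() {
	sink := newMemorySink()
	ctx := &RecommendationContext{Target: "deployment/web", CurrentCPU: 2, RecommendedCPU: 0.5}
	if err := (MetricObserver{Sink: sink}).Observe(ctx); err != nil {
		fmt.Println("observe failed:", err)
		return
	}
	for name, value := range sink.gauges {
		fmt.Println(name, value)
	}
}
```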