Skip to content

Commit

Permalink
fix(retry): fix the retry policy do not take affect
Browse files Browse the repository at this point in the history
  • Loading branch information
whalecold committed Sep 2, 2024
1 parent 3820016 commit cee5056
Show file tree
Hide file tree
Showing 12 changed files with 303 additions and 58 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@ jobs:
unit-benchmark-test:
strategy:
matrix:
go: [ 1.17, 1.18, 1.19 ]
os: [ X64, ARM64 ]
go: [ "1.18", "1.19", "1.20", "1.21", "1.22"]
os: [ X64 ]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
Expand Down
71 changes: 69 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,10 @@ To enable xDS mode in Kitex, we should invoke `xds.Init()` to initialize the xds
#### Bootstrap
The xdsClient is responsible for the interaction with the xDS Server (i.e. Istio). It needs some environment variables for initialization, which need to be set inside the `spec.containers.env` of the Kubernetes Manifest file in YAML format.

* `POD_NAMESPACE`: the namespace of the current service.
* `POD_NAMESPACE`: the namespace of the current service.
* `POD_NAME`: the name of this pod.
* `INSTANCE_IP`: the ip of this pod.
* `KITEX_XDS_METAS`: the metadata of this xDS node.

Add the following part to the definition of your container that uses xDS-enabled Kitex client.

Expand All @@ -50,6 +51,8 @@ valueFrom:
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: KITEX_XDS_METAS
value: '{"CLUSTER_ID":"Kubernetes","DNS_AUTO_ALLOCATE":"true","DNS_CAPTURE":"true","INSTANCE_IPS":"$(INSTANCE_IP)","NAMESPACE":"$(POD_NAMESPACE)"}'
```

### Client-side
Expand Down Expand Up @@ -231,6 +234,10 @@ spec:
failure_percentage_threshold: 10
# the failure percentage request volume
failure_percentage_request_volume: 101
workloadSelector:
labels:
# the label of the client pod.
app.kubernetes.io/name: kitex-client
```

#### RateLimit
Expand Down Expand Up @@ -262,12 +269,72 @@ spec:
max_tokens: 4
workloadSelector:
labels:
# the label of the service pod.
# the label of the server pod.
app.kubernetes.io/name: kitex-server
```

#### Retry

Support using VirtualService and EnvoyFilter to config retry policy, the EnvoyFilter has more configuration.

```
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
name: retry-sample
namespace: default
spec:
hosts:
- hello.prod.svc.cluster.local:21001
http:
- route:
- destination:
host: hello.prod.svc.cluster.local:21001
retries:
attempts: 1
perTryTimeout: 2s
```

```
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: retry-enhance
namespace: default
spec:
configPatches:
- applyTo: HTTP_ROUTE
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
# service name, should obey FQDN
name: hello.default.svc.cluster.local:21001
vhost:
# service name, should obey FQDN
name: hello.default.svc.cluster.local:21001
patch:
operation: MERGE
value:
route:
retryPolicy:
numRetries: 3
perTryTimeout: 100ms
retryBackOff:
baseInterval: 100ms
maxInterval: 100ms
retriableHeaders:
- name: "kitexRetryErrorRate"
stringMatch:
exact: "0.29"
- name: "kitexRetryMethods"
stringMatch:
exact: "Echo,Greet"
workloadSelector:
labels:
# the label of the service pod.
app.kubernetes.io/name: kitex-client
```

## Example
The usage is as follows:
Expand Down
70 changes: 69 additions & 1 deletion README_CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,10 @@ Kitex 通过外部扩展 [kitex-contrib/xds](https://github.com/kitex-contrib/xd
xdsClient 负责与控制面(例如 Istio)交互,以获得所需的 xDS 资源。在初始化时,需要读取环境变量用于构建 node 标识。所以,需要在K8S 的容器配置文件 `spec.containers.env` 部分加入以下几个环境变量。


* `POD_NAMESPACE`: 当前 pod 所在的 namespace。
* `POD_NAMESPACE`: 当前 pod 所在的 namespace。
* `POD_NAME`: pod 名。
* `INSTANCE_IP`: pod 的 ip。
* `KITEX_XDS_METAS`: 用于构建 node 标识的元信息,格式为 json 字符串。

在需要使用 xDS 功能的容器配置中加入以下定义即可:

Expand All @@ -51,6 +52,8 @@ valueFrom:
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: KITEX_XDS_METAS
value: '{"CLUSTER_ID":"Kubernetes","DNS_AUTO_ALLOCATE":"true","DNS_CAPTURE":"true","INSTANCE_IPS":"$(INSTANCE_IP)","NAMESPACE":"$(POD_NAMESPACE)"}'
```

### Kitex 客户端
Expand Down Expand Up @@ -221,6 +224,10 @@ spec:
failure_percentage_threshold: 10
# 触发熔断请求量
failure_percentage_request_volume: 101
workloadSelector:
labels:
# the label of the client pod.
app.kubernetes.io/name: kitex-client
```

#### 限流配置
Expand Down Expand Up @@ -256,6 +263,67 @@ spec:
app.kubernetes.io/name: kitex-server
```
#### 重试配置

重试支持两个配置方式:VirtualService 和 EnvoyFilter,EnvoyFilter 支持更丰富的重试策略。

```
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
name: retry-sample
namespace: default
spec:
hosts:
- hello.prod.svc.cluster.local:21001
http:
- route:
- destination:
host: hello.prod.svc.cluster.local:21001
retries:
attempts: 1
perTryTimeout: 2s
```

```
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: retry-enhance
namespace: default
spec:
configPatches:
- applyTo: HTTP_ROUTE
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
# service name, should obey FQDN
name: hello.default.svc.cluster.local:21001
vhost:
# service name, should obey FQDN
name: hello.default.svc.cluster.local:21001
patch:
operation: MERGE
value:
route:
retryPolicy:
numRetries: 3
perTryTimeout: 100ms
retryBackOff:
baseInterval: 100ms
maxInterval: 100ms
retriableHeaders:
- name: "kitexRetryErrorRate"
stringMatch:
exact: "0.29"
- name: "kitexRetryMethods"
stringMatch:
exact: "Echo,Greet"
workloadSelector:
labels:
# the label of the service pod.
app.kubernetes.io/name: kitex-client
```

## 示例
完整的客户端用法如下:
Expand Down
31 changes: 31 additions & 0 deletions core/xdsresource/rds.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,21 @@ package xdsresource
import (
"encoding/json"
"fmt"
"strconv"
"strings"
"time"

v3routepb "github.com/envoyproxy/go-control-plane/envoy/config/route/v3"
"github.com/golang/protobuf/ptypes/any"
"google.golang.org/protobuf/proto"
)

const (
// If the error rate large than it, the retry policy do not take affect.
cBErrorRateKey = "kitexRetryErrorRate"
retryMethods = "kitexRetryMethods"
)

// RouteConfigResource is used for routing
// HTTPRouteConfig is the native http route config, which consists of a list of virtual hosts.
// ThriftRouteConfig is converted from the routeConfiguration in thrift proxy, which can only be configured in the listener filter
Expand All @@ -34,6 +42,7 @@ type RouteConfigResource struct {
HTTPRouteConfig *HTTPRouteConfig
ThriftRouteConfig *ThriftRouteConfig
MaxTokens uint32
TokensPerFill uint32
}

type HTTPRouteConfig struct {
Expand Down Expand Up @@ -64,7 +73,9 @@ type RetryPolicy struct {
NumRetries int
PerTryTimeout time.Duration
PerTryIdleTimeout time.Duration
CBErrorRate float64
RetryBackOff *RetryBackOff
Methods []string
}

type Route struct {
Expand Down Expand Up @@ -181,6 +192,26 @@ func unmarshalRoutes(rs []*v3routepb.Route) ([]*Route, error) {
PerTryTimeout: retryPolicy.GetPerTryTimeout().AsDuration(),
PerTryIdleTimeout: retryPolicy.GetPerTryIdleTimeout().AsDuration(),
}
// used for config the errRate.
for _, header := range retryPolicy.GetRetriableHeaders() {
match := header.GetStringMatch()
if match == nil {
continue
}
value := match.GetExact()
if value == "" {
continue
}
switch header.Name {
case cBErrorRateKey:
errRate, err := strconv.ParseFloat(value, 64)
if err == nil {
route.RetryPolicy.CBErrorRate = errRate
}
case retryMethods:
route.RetryPolicy.Methods = strings.Split(value, ",")
}
}
if backoff := retryPolicy.GetRetryBackOff(); backoff != nil {
route.RetryPolicy.RetryBackOff = &RetryBackOff{
BaseInterval: backoff.GetMaxInterval().AsDuration(),
Expand Down
14 changes: 7 additions & 7 deletions go.mod
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
module github.com/kitex-contrib/xds

go 1.17
go 1.18

require (
github.com/bytedance/gopkg v0.1.0
github.com/bytedance/gopkg v0.1.1
github.com/cenkalti/backoff/v4 v4.1.0
github.com/cloudwego/kitex v0.10.3
github.com/cncf/udpa/go v0.0.0-20220112060539-c52dc94e7fbe
github.com/envoyproxy/go-control-plane v0.11.1
github.com/golang/protobuf v1.5.3
github.com/google/go-cmp v0.5.9
github.com/stretchr/testify v1.8.3
github.com/stretchr/testify v1.9.0
go.uber.org/atomic v1.11.0
google.golang.org/genproto/googleapis/rpc v0.0.0-20230526203410-71b5a4ffd15e
google.golang.org/protobuf v1.30.0
Expand Down Expand Up @@ -52,10 +52,10 @@ require (
github.com/tidwall/pretty v1.2.0 // indirect
github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
golang.org/x/arch v0.2.0 // indirect
golang.org/x/net v0.17.0 // indirect
golang.org/x/sync v0.1.0 // indirect
golang.org/x/sys v0.13.0 // indirect
golang.org/x/text v0.13.0 // indirect
golang.org/x/net v0.24.0 // indirect
golang.org/x/sync v0.8.0 // indirect
golang.org/x/sys v0.19.0 // indirect
golang.org/x/text v0.14.0 // indirect
google.golang.org/genproto v0.0.0-20230526203410-71b5a4ffd15e // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20230526203410-71b5a4ffd15e // indirect
google.golang.org/grpc v1.56.3 // indirect
Expand Down
Loading

0 comments on commit cee5056

Please sign in to comment.