Skip to content

Commit

Permalink
[STRMCMP-1659] k8s v1.24 upgrade (#296)
Browse files Browse the repository at this point in the history
This PR updates the operator to use Kubernetes 1.24 APIs, tested on
Kubernetes 1.27.

## Changes
1. Updates project dependencies in `go.mod` to use the K8S 1.24 APIs
2. Updates the WordCount example application to use Flink 1.16
(previously Flink 1.8, which predates [Flink's new memory management
configs](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_setup/))
3. Configures the `SubmitJobRequest` struct type to consider the fields
all as `omitempty` so they're included in the SubmitJob only if
non-empty. This resolves a problem with newer versions of Flink that
attempted to start the job from an empty-string savepoint URI, which
would consistently fail. These fields are
[optional](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jars-jarid-run),
so omitting them should be fine.
4. Introduces a Zap logging wrapper since the Prometheus logging package
is dead.

###Testing
This was developed and tested against a [k3d cluster](https://k3d.io) of
Kubernetes 1.27 with the following specs:
```
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4+k3s1", GitCommit:"36645e7311e9bdbbf2adb79ecd8bd68556bc86f6", GitTreeState:"clean", BuildDate:"2023-07-28T09:46:05Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/arm64"}
```

### Steps
#### 1. Build Operator
```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator
→ make
IMAGE_NAME=$REPOSITORY ./boilerplate/lyft/docker_build/docker_build.sh

------------------------------------
           DOCKER BUILD
------------------------------------

[+] Building 28.6s (17/17) FINISHED                                                                                                                  docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                                                                                 0.0s
 => => transferring dockerfile: 895B                                                                                                                                 0.0s
 => [internal] load .dockerignore                                                                                                                                    0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [internal] load metadata for docker.io/library/alpine:3.17                                                                                                       1.0s
 => [internal] load metadata for docker.io/library/golang:1.20.2-alpine3.17                                                                                          1.8s
 => [auth] library/alpine:pull token for registry-1.docker.io                                                                                                        0.0s
 => [auth] library/golang:pull token for registry-1.docker.io                                                                                                        0.0s
 => [builder 1/7] FROM docker.io/library/golang:1.20.2-alpine3.17@sha256:87734b78d26a52260f303cf1b40df45b0797f972bd0250e56937c42114bf472c                            0.0s
 => CACHED [stage-1 1/2] FROM docker.io/library/alpine:3.17@sha256:f71a5f071694a785e064f05fed657bf8277f1b2113a8ed70c90ad486d6ee54dc                                  0.0s
 => [internal] load build context                                                                                                                                    0.1s
 => => transferring context: 203.13kB                                                                                                                                0.1s
 => CACHED [builder 2/7] RUN apk add git openssh-client make curl bash                                                                                               0.0s
 => CACHED [builder 3/7] COPY go.mod go.sum /go/src/github.com/lyft/flinkk8soperator/                                                                                0.0s
 => CACHED [builder 4/7] WORKDIR /go/src/github.com/lyft/flinkk8soperator                                                                                            0.0s
 => CACHED [builder 5/7] RUN go mod download                                                                                                                         0.0s
 => [builder 6/7] COPY . /go/src/github.com/lyft/flinkk8soperator/                                                                                                   0.2s
 => [builder 7/7] RUN go mod vendor && make linux_compile                                                                                                           25.8s
 => [stage-1 2/2] COPY --from=builder /artifacts /bin                                                                                                                0.1s
 => exporting to image                                                                                                                                               0.1s
 => => exporting layers                                                                                                                                              0.1s
 => => writing image sha256:01be975d7ac4f16e3dce41a98088a6ea5e7865fd6ab08f40960e1312f8f0f63c                                                                         0.0s
 => => naming to docker.io/library/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b                                                                         0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview
flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b built locally.
```

Next, tag the image and push to my local Docker image registry:
```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator
→ docker tag flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b localhost:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b

scott /Users/scott/go/src/github.com/lyft/flinkk8soperator
→ docker push localhost:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b
The push refers to repository [localhost:56230/flinkk8soperator]
1b411adf446d: Pushed
d2d9d24a8c2a: Layer already exists
ed05140: digest: sha256:3e203d359db91d5ea61b0bbc47b28f2b5bfc79505552b7ed07663f8b8abff94a size: 739
```

#### 2. Deploy the Operator YAMLs
Update the `deploy/flinkk8soperator.yaml` file to use the image tag and
name of my local image registry:
```
        image: flink-operator-dev-registry:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b
```

Apply all the YAMLs in `deploy` using the [Quick Start
Guide](https://github.com/lyft/flinkk8soperator/blob/master/docs/quick-start-guide.md#operator-installation).

#### 3. Build the Word Count app
```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount
→ docker build . -t flink-wordcount:latest
[+] Building 9.4s (15/15) FINISHED                                                                                                                   docker:desktop-linux
 => [internal] load .dockerignore                                                                                                                                    0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [internal] load build definition from Dockerfile                                                                                                                 0.0s
 => => transferring dockerfile: 362B                                                                                                                                 0.0s
 => [internal] load metadata for docker.io/library/flink:1.16.2-scala_2.12-java11                                                                                    1.0s
 => [internal] load metadata for docker.io/library/maven:3.9.3                                                                                                       1.0s
 => [auth] library/flink:pull token for registry-1.docker.io                                                                                                         0.0s
 => [auth] library/maven:pull token for registry-1.docker.io                                                                                                         0.0s
 => [builder 1/4] FROM docker.io/library/maven:3.9.3@sha256:de70becda2f183567e105530460b6cbedfc014b64f27c2003d36d7fc85bc222f                                         0.0s
 => [internal] load build context                                                                                                                                    0.0s
 => => transferring context: 1.74kB                                                                                                                                  0.0s
 => CACHED [stage-1 1/3] FROM docker.io/library/flink:1.16.2-scala_2.12-java11@sha256:f11b1cd44c964cd644d0ee6b01281f3c5a57f2c7a519939c9529e6cbbd96f615               0.0s
 => CACHED [builder 2/4] COPY src /usr/src/app/src                                                                                                                   0.0s
 => [builder 3/4] COPY pom.xml /usr/src/app                                                                                                                          0.0s
 => [builder 4/4] RUN mvn -f /usr/src/app/pom.xml clean package                                                                                                      8.2s
 => [stage-1 2/3] COPY --from=builder /usr/src/app/target/ /code/target                                                                                              0.0s
 => [stage-1 3/3] RUN ln -s /code/target /opt/flink/flink-web-upload                                                                                                 0.1s
 => exporting to image                                                                                                                                               0.0s
 => => exporting layers                                                                                                                                              0.0s
 => => writing image sha256:b1615553beb1c657957d2825d7302b472961abe93e63ce87fcb8658b5525b18e                                                                         0.0s
 => => naming to docker.io/library/flink-wordcount:latest                                                                                                            0.0s
```

Tag and push the image to my local registry:
```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount
→ docker tag flink-wordcount:latest localhost:56230/flink-wordcount:latest

scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount
→ docker push localhost:56230/flink-wordcount:latest
The push refers to repository [localhost:56230/flink-wordcount]
662da9bda804: Pushed
103047c74d48: Pushed
18c49fee7a8c: Pushed
67376cc00cc7: Pushed
63bd38732135: Pushed
a70c997c3eb4: Pushed
f087e5bef299: Layer already exists
d493a7b98dcf: Layer already exists
4fb17f51b932: Layer already exists
5a04c4a9f49c: Layer already exists
4b97ee001b98: Layer already exists
22236f4b6e33: Layer already exists
1b9698f962dc: Layer already exists
latest: digest: sha256:bf33ff2f8b9e2fb23ee9845744a174a94775b308bc34884c1335295c17c3574f size: 3040
```

#### 4. Deploy Word Count
Update the `examples/wordcount/flink-operator-custom-resource.yaml` file
to reference the new image with:
```
  image: flink-operator-dev-registry:56230/flink-wordcount:latest
  imagePullPolicy: Always
```

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount
→ kubectl get FlinkApplication
No resources found in flink-operator namespace.
```

Apply the YAML:
```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount
→ kc apply -f flink-operator-custom-resource.yaml
flinkapplication.flink.k8s.io/wordcount-operator-example created
```

Monitor the Flink Application:
```
→ kubectl get FlinkApplication -w
NAME                         PHASE             CLUSTER HEALTH   JOB HEALTH   JOB RESTARTS   AGE
wordcount-operator-example   ClusterStarting                                                29s
wordcount-operator-example   Savepointing                                                   49s
wordcount-operator-example   SubmittingJob                                                  49s
wordcount-operator-example   SubmittingJob                                                  49s
wordcount-operator-example   SubmittingJob     Green                                        49s
```

---------

Co-authored-by: Scott Kidder <scott@kidder.io>
  • Loading branch information
sethsaperstein-lyft and skidder authored Nov 14, 2023
1 parent bea4e54 commit 52e4c60
Show file tree
Hide file tree
Showing 49 changed files with 2,108 additions and 1,755 deletions.
4 changes: 2 additions & 2 deletions cmd/flinkk8soperator/cmd/root.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ import (
"strings"
"syscall"

"k8s.io/klog"
klog "k8s.io/klog/v2"

"sigs.k8s.io/controller-runtime/pkg/cache"

Expand Down Expand Up @@ -180,5 +180,5 @@ func operatorEntryPoint(ctx context.Context, metricsScope promutils.Scope, contr
// Start the Cmd
logger.Infof(ctx, "Starting the Cmd.")
ctx, _ = signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)
return mgr.Start(ctx.Done())
return mgr.Start(ctx)
}
Loading

0 comments on commit 52e4c60

Please sign in to comment.