Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add TLS support to the Operator #4796

Merged
merged 4 commits into from
Dec 2, 2024

Conversation

tchughesiv
Copy link
Contributor

@tchughesiv tchughesiv commented Nov 27, 2024

What this PR does / why we need it:

With this PR, an Operator user will be able to configure feast services with TLS. If the operator detects it's running in an OpenShift cluster, TLS is enabled by default through the use of the service serving certificates feature.

  • adds option to enable TLS on a per-service basis. k8s Secret required. TLS is not required across all services. the operator will configured them to work together on a per-service basis.
  • adds option to configure client TLS for remote registries. k8s ConfigMap required.
  • adds OpenShift cluster detection
  • when running in an OpenShift cluster, automatically configures TLS, for all services... unless disabled or overridden by the user
  • similarly, if a remote registry (in the same OpenShift cluster) is using auto-TLS, the operator will automatically configure client TLS... unless overridden by the user
  • changes OnlineStore liveness & readiness probes to httpGet type. checks /health endpoint
  • adds grpcurl initContainer to the online & offlineStore deployments, which will check the registry's grpc.health.v1.Health/Check endpoint before attempting to start those feast services.

Which issue(s) this PR fixes:

Fixes #4770

Misc

Example of an OpenShift deployment and TLS being automatically configured. In a k8s cluster, TLS would need to be manually configured in the FeatureStore.spec. Any required certs, keys, and k8s secrets/configmaps would also have to be created by the user -

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: example
  namespace: default
spec:
  feastProject: my_project
  services:
    offlineStore: {}
    onlineStore: {}
status:
  applied:
    feastProject: my_project
    services:
      offlineStore:
        image: 'feastdev/feature-server:0.41.0'
        persistence:
          file:
            type: dask
        tls:
          secretKeyNames:
            tlsCrt: tls.crt
            tlsKey: tls.key
          secretRef:
            name: feast-example-registry-tls
      onlineStore:
        image: 'feastdev/feature-server:0.41.0'
        persistence:
          file:
            path: /tmp/online_store.db
        tls:
          secretKeyNames:
            tlsCrt: tls.crt
            tlsKey: tls.key
          secretRef:
            name: feast-example-registry-tls
      registry:
        local:
          image: 'feastdev/feature-server:0.41.0'
          persistence:
            file:
              path: /tmp/registry.db
          tls:
            secretKeyNames:
              tlsCrt: tls.crt
              tlsKey: tls.key
            secretRef:
              name: feast-example-registry-tls
  clientConfigMap: feast-example-client
  conditions:
    - lastTransitionTime: '2024-11-26T19:04:50Z'
      message: Offline Store installation complete
      reason: Ready
      status: 'True'
      type: OfflineStore
    - lastTransitionTime: '2024-11-25T21:43:28Z'
      message: Online Store installation complete
      reason: Ready
      status: 'True'
      type: OnlineStore
    - lastTransitionTime: '2024-11-25T19:59:23Z'
      message: Client installation complete
      reason: Ready
      status: 'True'
      type: Client
    - lastTransitionTime: '2024-11-26T19:04:50Z'
      message: FeatureStore installation complete
      reason: Ready
      status: 'True'
      type: FeatureStore
    - lastTransitionTime: '2024-11-25T22:26:19Z'
      message: Registry installation complete
      reason: Ready
      status: 'True'
      type: Registry
  feastVersion: 0.41.0
  phase: Ready
  serviceHostnames:
    offlineStore: 'feast-example-offline.default.svc.cluster.local:443'
    onlineStore: 'feast-example-online.default.svc.cluster.local:443'
    registry: 'feast-example-registry.default.svc.cluster.local:443'

Example of a remote registry reference -

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: example
  namespace: default
spec:
  feastProject: my_project
  services:
    offlineStore: {}
    onlineStore: {}
    registry:
      remote:
        feastRef:
          name: central
          namespace: test
status:
  applied:
    feastProject: my_project
    services:
      offlineStore:
        image: 'feastdev/feature-server:0.41.0'
        persistence:
          file:
            type: dask
        tls:
          secretKeyNames:
            tlsCrt: tls.crt
            tlsKey: tls.key
          secretRef:
            name: feast-example-online-tls
      onlineStore:
        image: 'feastdev/feature-server:0.41.0'
        persistence:
          file:
            path: /tmp/online_store.db
        tls:
          secretKeyNames:
            tlsCrt: tls.crt
            tlsKey: tls.key
          secretRef:
            name: feast-example-online-tls
      registry:
        remote:
          feastRef:
            name: central
            namespace: test
          tls:
            certName: service-ca.crt
            configMapRef:
              name: feast-example-client-ca
  clientConfigMap: feast-example-client
  conditions:
    - lastTransitionTime: '2024-11-27T15:47:57Z'
      message: Offline Store installation complete
      reason: Ready
      status: 'True'
      type: OfflineStore
    - lastTransitionTime: '2024-11-25T21:43:28Z'
      message: Online Store installation complete
      reason: Ready
      status: 'True'
      type: OnlineStore
    - lastTransitionTime: '2024-11-25T19:59:23Z'
      message: Client installation complete
      reason: Ready
      status: 'True'
      type: Client
    - lastTransitionTime: '2024-11-27T15:53:42Z'
      message: FeatureStore installation complete
      reason: Ready
      status: 'True'
      type: FeatureStore
  feastVersion: 0.41.0
  phase: Ready
  serviceHostnames:
    offlineStore: 'feast-example-offline.default.svc.cluster.local:443'
    onlineStore: 'feast-example-online.default.svc.cluster.local:443'
    registry: 'feast-central-registry.test.svc.cluster.local:443'

Which results in the following client configMap & feature_store.yaml -

kind: ConfigMap
apiVersion: v1
metadata:
  name: feast-example-client
  namespace: default
  labels:
    feast.dev/name: example
    feast.dev/service-type: client
  ownerReferences:
    - apiVersion: feast.dev/v1alpha1
      kind: FeatureStore
      name: example
      uid: e4f030dc-23b1-45bc-ba86-6ef66c491701
      controller: true
      blockOwnerDeletion: true
data:
  feature_store.yaml: |
    project: my_project
    provider: local
    offline_store:
        host: feast-example-offline.default.svc.cluster.local
        type: remote
        port: 443
        scheme: https
        cert: /tls/offline/tls.crt
    online_store:
        path: https://feast-example-online.default.svc.cluster.local:443
        type: remote
        cert: /tls/online/tls.crt
    registry:
        path: feast-central-registry.test.svc.cluster.local:443
        registry_type: remote
        cert: /tls/registry/service-ca.crt
    auth:
        type: no_auth
    entity_key_serialization_version: 3

@tchughesiv tchughesiv force-pushed the operator-tls branch 3 times, most recently from 6ffa5ec to 84909fc Compare November 27, 2024 19:18
@tchughesiv tchughesiv marked this pull request as ready for review November 27, 2024 19:19
@tchughesiv tchughesiv requested a review from a team as a code owner November 27, 2024 19:19
@tchughesiv tchughesiv force-pushed the operator-tls branch 4 times, most recently from b2f5292 to b72cdac Compare November 27, 2024 23:41
Copy link
Contributor

@dmartinol dmartinol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

type OfflineTlsConfigs struct {
TlsConfigs `json:",inline"`
// verify the client TLS certificate.
VerifyClient *bool `json:"verifyClient,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why for offline store we need a different TLS config?

Copy link
Contributor Author

@tchughesiv tchughesiv Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offline offers that additional TLS related flag for some reason, whereas the others don't.

"--verify_client",

that said, its not mentioned in the docs, so maybe its unnecessary? i guess i just felt like we should offer the user all the options. it's certainly added a layer of complexity to the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lokeshrangineni thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding verify_client flag - flight server is very strict with the client validation when we pass the TLS self signed certificates which made it difficult for the integration tests so I had to add this flag for the integration tests. By default verify_client flag is enabled and in the production environment it is supposed to be enabled as well. for the integration tests i marked verify_client as false. I will add about this flag to the documentation as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add the verify_client options to all the servers @lokeshrangineni ?
this way we can keep a consistent CR definition and offer the same set of security functions to all of them

@tchughesiv
Copy link
Contributor Author

i think this needs the ok-to-test label

Signed-off-by: Tommy Hughes <tohughes@redhat.com>
Signed-off-by: Tommy Hughes <tohughes@redhat.com>
@tchughesiv tchughesiv force-pushed the operator-tls branch 3 times, most recently from e344b31 to 146c6fb Compare December 1, 2024 19:20
Signed-off-by: Tommy Hughes <tohughes@redhat.com>
@tchughesiv tchughesiv force-pushed the operator-tls branch 5 times, most recently from 759ed2f to 3c70419 Compare December 1, 2024 20:57
Signed-off-by: Tommy Hughes <tohughes@redhat.com>
@dmartinol dmartinol enabled auto-merge (squash) December 2, 2024 18:25
@dmartinol dmartinol merged commit a617a6c into feast-dev:master Dec 2, 2024
20 checks passed
tmihalac pushed a commit to tmihalac/feast that referenced this pull request Dec 3, 2024
* add tls support to the operator

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* operator tls review fix: if statement

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* rebase fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* authz rbac fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

---------

Signed-off-by: Tommy Hughes <tohughes@redhat.com>
Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
tmihalac pushed a commit to tmihalac/feast that referenced this pull request Dec 4, 2024
* add tls support to the operator

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* operator tls review fix: if statement

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* rebase fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* authz rbac fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

---------

Signed-off-by: Tommy Hughes <tohughes@redhat.com>
Signed-off-by: Theodor Mihalache <tmihalac@redhat.com>
lokeshrangineni pushed a commit to lokeshrangineni/feast that referenced this pull request Dec 5, 2024
* add tls support to the operator

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* operator tls review fix: if statement

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* rebase fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

* authz rbac fixes

Signed-off-by: Tommy Hughes <tohughes@redhat.com>

---------

Signed-off-by: Tommy Hughes <tohughes@redhat.com>
franciscojavierarceo pushed a commit that referenced this pull request Dec 5, 2024
# [0.42.0](v0.41.0...v0.42.0) (2024-12-05)

### Bug Fixes

* Add adapters for sqlite datetime conversion ([#4797](#4797)) ([e198b17](e198b17))
* Added grpcio extras to default feature-server image ([#4737](#4737)) ([e9cd373](e9cd373))
* Changing node version in release ([7089918](7089918))
* Feast create empty online table when FeatureView attribute online=False ([#4666](#4666)) ([237c453](237c453))
* Fix db store types in Operator CRD ([#4798](#4798)) ([f09339e](f09339e))
* Fix the config issue for postgres ([#4776](#4776)) ([a36f7e5](a36f7e5))
* Fixed example materialize-incremental and improved explanation ([#4734](#4734)) ([ca8a7ab](ca8a7ab))
* Fixed SparkSource docstrings so it wouldn't used inhereted class docstrings ([#4722](#4722)) ([32e6aa1](32e6aa1))
* Fixing PGVector integration tests ([#4778](#4778)) ([88a0320](88a0320))
* Incorrect type passed to assert_permissions in materialize endpoints ([#4727](#4727)) ([b72c2da](b72c2da))
* Issue of DataSource subclasses using parent abstract class docstrings ([#4730](#4730)) ([b24acd5](b24acd5))
* Operator envVar positioning & tls.SecretRef.Name ([#4806](#4806)) ([1115d96](1115d96))
* Populates project created_time correctly according to created ti… ([#4686](#4686)) ([a61b93c](a61b93c))
* Reduce feast-server container image size & fix dev image build ([#4781](#4781)) ([ccc9aea](ccc9aea))
* Removed version func from feature_store.py ([#4748](#4748)) ([f902bb9](f902bb9))
* Support registry instantiation for read-only users ([#4719](#4719)) ([ca3d3c8](ca3d3c8))
* Syntax Error in BigQuery While Retrieving Columns that Start wit… ([#4713](#4713)) ([60fbc62](60fbc62))
* Update release version in a pertinent Operator file ([#4708](#4708)) ([764a8a6](764a8a6))

### Features

* Add api contract to fastapi docs ([#4721](#4721)) ([1a165c7](1a165c7))
* Add Couchbase as an online store ([#4637](#4637)) ([824859b](824859b))
* Add Operator support for spec.feastProject & status.applied fields ([#4656](#4656)) ([430ac53](430ac53))
* Add services functionality to Operator ([#4723](#4723)) ([d1d80c0](d1d80c0))
* Add TLS support to the Operator ([#4796](#4796)) ([a617a6c](a617a6c))
* Added feast Go operator db stores support ([#4771](#4771)) ([3302363](3302363))
* Added support for setting env vars in feast services in feast controller  ([#4739](#4739)) ([84b24b5](84b24b5))
* Adding docs outlining native Python transformations on singletons ([#4741](#4741)) ([0150278](0150278))
* Adding first feast operator e2e test. ([#4791](#4791)) ([8339f8d](8339f8d))
* Adding github action to run the operator end-to-end tests. ([#4762](#4762)) ([d8ccb00](d8ccb00))
* Adding ssl support for registry server. ([#4718](#4718)) ([ccf7a55](ccf7a55))
* Adding SSL support for the React UI server and feast UI command. ([#4736](#4736)) ([4a89252](4a89252))
* Adding support for native Python transformations on a single dictionary  ([#4724](#4724)) ([9bbc1c6](9bbc1c6))
* Adding TLS support for offline server. ([#4744](#4744)) ([5d8d03f](5d8d03f))
* Building the feast image ([#4775](#4775)) ([6635dde](6635dde))
* File persistence definition and implementation ([#4742](#4742)) ([3bad4a1](3bad4a1))
* Object store persistence in operator ([#4758](#4758)) ([0ae86da](0ae86da))
* OIDC authorization in Feast Operator ([#4801](#4801)) ([eb111d6](eb111d6))
* Operator will create k8s serviceaccount for each feast service ([#4767](#4767)) ([cde5760](cde5760))
* Printing more verbose logs when we start the offline server  ([#4660](#4660)) ([9d8d3d8](9d8d3d8))
* PVC configuration and impl ([#4750](#4750)) ([785a190](785a190))
* Qdrant vectorstore support ([#4689](#4689)) ([86573d2](86573d2))
* RBAC Authorization in Feast Operator ([#4786](#4786)) ([0ef5acc](0ef5acc))
* Support for nested timestamp fields in Spark Offline store ([#4740](#4740)) ([d4d94f8](d4d94f8))
* Update the go feature server from Expedia code repo. ([#4665](#4665)) ([6406625](6406625))
* Updated feast Go operator db stores ([#4809](#4809)) ([2c5a6b5](2c5a6b5))
* Updated sample secret following review ([#4811](#4811)) ([dc9f825](dc9f825))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add TLS support to the Operator
3 participants