feat: Refactor feature server helm charts to allow passing feature_store.yaml in environment variables #3113

Merged
merged 10 commits into from
Aug 23, 2022
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -49,7 +49,7 @@ jobs:
needs: get-version
strategy:
matrix:
component: [feature-server-python, feature-server-python-aws, feature-server-java, feature-transformation-server]
env:
MAVEN_CACHE: gs://feast-templocation-kf-feast/.m2.2020-08-19.tar
REGISTRY: feastdev
33 changes: 32 additions & 1 deletion CONTRIBUTING.md
@@ -33,6 +33,9 @@
- [(Contrib) Running tests for HBase online store](#contrib-running-tests-for-hbase-online-store)
- [(Experimental) Feast UI](#experimental-feast-ui)
- [Feast Java Serving](#feast-java-serving)
- [Developing the Feast Helm charts](#developing-the-feast-helm-charts)
- [Feast Java Feature Server Helm Chart](#feast-java-feature-server-helm-chart)
- [Feast Python / Go Feature Server Helm Chart](#feast-python--go-feature-server-helm-chart)
- [Feast Go Client](#feast-go-client)
- [Environment Setup](#environment-setup-1)
- [Building](#building)
@@ -197,7 +200,6 @@ To test across clouds, on top of setting up Redis, you also need GCP / AWS / Sno
> and commenting out tests that are added to `DEFAULT_FULL_REPO_CONFIGS`

**GCP**
### Setup your GCP BigQuery Instance
1. You can get free credits [here](https://cloud.google.com/free/docs/free-cloud-features#free-trial).
2. You will need to set up a service account, enable the BigQuery API, and create a staging location for a bucket.
   * Set up your service account and project using steps 1-5 [here](https://codelabs.developers.google.com/codelabs/cloud-bigquery-python#0).
@@ -347,6 +349,35 @@ See [Feast contributing guide](ui/CONTRIBUTING.md)
## Feast Java Serving
See [Java contributing guide](java/CONTRIBUTING.md)

See also development instructions related to the helm chart below at [Developing the Feast Helm charts](#developing-the-feast-helm-charts)

## Developing the Feast Helm charts
There are 3 helm charts:
- Feast Java feature server
- Feast Python / Go feature server
- (deprecated) Feast Python feature server

Generally, you can override the images in the helm charts with locally built Docker images, and install the local helm
chart.

All READMEs for helm charts are generated using [helm-docs](https://github.com/norwoodj/helm-docs). You can install it
(e.g. with `brew install norwoodj/tap/helm-docs`) and then run `make build-helm-docs`.

### Feast Java Feature Server Helm Chart
See the [Java demo example](examples/java-demo/README.md), which also includes development instructions using minikube.

It will:
- run `make build-java-docker-dev` to build local Java feature server binaries
- configure the included `application-override.yaml` to override the image tag to use the locally built dev images.
- install the local chart with `helm install feast-release ../../../infra/charts/feast --values application-override.yaml`

### Feast Python / Go Feature Server Helm Chart
See the [Python demo example](examples/python-helm-demo/README.md), which also includes development instructions using minikube.

It will:
- run `make build-feature-server-dev` to build a local Python feature server Docker image
- install the local chart with `helm install feast-release ../../../infra/charts/feast-feature-server --set image.tag=dev --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
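The `feature_store_yaml_base64` value passed to `helm install` above is the base64 encoding of `feature_store.yaml`. The following is a minimal Python sketch of that encoding, assuming the value should be a single unwrapped line (GNU `base64` wraps output at 76 columns unless `-w 0` is given, which may matter when the result is passed through `--set`). The helper name `encode_feature_store_yaml` and the sample YAML are hypothetical and not part of Feast.

```python
import base64


def encode_feature_store_yaml(yaml_text: str) -> str:
    """Return yaml_text as a single-line base64 string (no line wrapping)."""
    return base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")


# Hypothetical sample config, for illustration only.
sample_yaml = "project: feast_python_demo\nprovider: gcp\n"
encoded = encode_feature_store_yaml(sample_yaml)

# Decoding recovers the original text, which is what a consumer of the
# encoded value would need to do before reading the config.
decoded = base64.b64decode(encoded).decode("utf-8")
print(encoded)
```

The printed string could then be supplied as the `feature_store_yaml_base64` value in place of the `$(base64 feature_store.yaml)` shell substitution.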

## Feast Go Client
### Environment Setup
Setting up your development environment for Feast Go SDK:
30 changes: 25 additions & 5 deletions Makefile
@@ -330,7 +330,7 @@ lint-go: compile-protos-go compile-go-lib

# Docker

build-docker: build-ci-docker build-feature-server-python-docker build-feature-server-python-aws-docker build-feature-transformation-server-docker build-feature-server-java-docker

push-ci-docker:
	docker push $(REGISTRY)/feast-ci:$(VERSION)
@@ -339,13 +339,21 @@ push-ci-docker:
build-ci-docker:
	docker buildx build -t $(REGISTRY)/feast-ci:$(VERSION) -f infra/docker/ci/Dockerfile --load .

push-feature-server-python-docker:
	docker push $(REGISTRY)/feature-server:$$VERSION

build-feature-server-python-docker:
	docker buildx build --build-arg VERSION=$$VERSION \
		-t $(REGISTRY)/feature-server:$$VERSION \
		-f sdk/python/feast/infra/feature_servers/multicloud/Dockerfile --load .

push-feature-server-python-aws-docker:
	docker push $(REGISTRY)/feature-server-python-aws:$$VERSION

build-feature-server-python-aws-docker:
	docker buildx build --build-arg VERSION=$$VERSION \
		-t $(REGISTRY)/feature-server-python-aws:$$VERSION \
		-f sdk/python/feast/infra/feature_servers/aws_lambda/Dockerfile --load .

push-feature-transformation-server-docker:
	docker push $(REGISTRY)/feature-transformation-server:$(VERSION)
@@ -363,6 +371,13 @@ build-feature-server-java-docker:
		-t $(REGISTRY)/feature-server-java:$(VERSION) \
		-f java/infra/docker/feature-server/Dockerfile --load .

# Dev images

build-feature-server-dev:
	docker buildx build --build-arg VERSION=dev \
		-t feastdev/feature-server:dev \
		-f sdk/python/feast/infra/feature_servers/multicloud/Dockerfile.dev --load .

build-java-docker-dev:
	make build-java-no-tests REVISION=dev
	docker buildx build --build-arg VERSION=dev \
@@ -402,6 +417,11 @@ build-sphinx: compile-protos-python
build-templates:
	python infra/scripts/compile-templates.py

build-helm-docs:
	cd ${ROOT_DIR}/infra/charts/feast; helm-docs
	cd ${ROOT_DIR}/infra/charts/feast-feature-server; helm-docs
	cd ${ROOT_DIR}/infra/charts/feast-python-server; helm-docs

# Web UI

# Note: requires node and yarn to be installed
43 changes: 40 additions & 3 deletions docs/getting-started/concepts/feature-retrieval.md
@@ -6,7 +6,9 @@ Generally, Feast supports several patterns of feature retrieval:

1. Training data generation (via `feature_store.get_historical_features(...)`)
2. Offline feature retrieval for batch scoring (via `feature_store.get_historical_features(...)`)
3. Online feature retrieval for real-time model predictions
   - via the SDK: `feature_store.get_online_features(...)`
   - via deployed feature server endpoints: `requests.post('http://localhost:6566/get-online-features', data=json.dumps(online_request))`

Each of these retrieval mechanisms accepts:

@@ -100,7 +102,6 @@ batch_scoring_features = store.get_historical_features(

```python
from feast import FeatureStore
import pandas as pd

store = FeatureStore(repo_path=".")

@@ -124,13 +125,23 @@ batch_scoring_features = store.get_historical_features(

<details>

<summary>How to: retrieve online features for real-time model inference (Python SDK)</summary>

Feast will ensure the latest feature values for registered features are available. At retrieval time, you need to supply a list of **entities** and the corresponding **features** to be retrieved. Similar to `get_historical_features`, we recommend using feature services as a mechanism for grouping features in a model version.

_Note: unlike `get_historical_features`, the `entity_rows` **do not need timestamps** since you only want one feature value per entity key._

```python
from feast import RepoConfig, FeatureStore
from feast.repo_config import RegistryConfig

repo_config = RepoConfig(
    registry=RegistryConfig(path="gs://feast-test-gcs-bucket/registry.pb"),
    project="feast_demo_gcp",
    provider="gcp",
)
store = FeatureStore(config=repo_config)

features = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
@@ -147,6 +158,32 @@ features = store.get_online_features(

</details>

<details>

<summary>How to: retrieve online features for real-time model inference (Feature Server)</summary>

Feast will ensure the latest feature values for registered features are available. At retrieval time, you need to supply a list of **entities** and the corresponding **features** to be retrieved. Similar to `get_historical_features`, we recommend using feature services as a mechanism for grouping features in a model version.

_Note: unlike `get_historical_features`, the `entity_rows` **do not need timestamps** since you only want one feature value per entity key._

This approach requires you to deploy a feature server (see [Python feature server](../../reference/feature-servers/python-feature-server)).

```python
import requests
import json

online_request = {
    "features": [
        "driver_hourly_stats:conv_rate",
    ],
    "entities": {"driver_id": [1001, 1002]},
}
r = requests.post('http://localhost:6566/get-online-features', data=json.dumps(online_request))
print(json.dumps(r.json(), indent=4, sort_keys=True))
```

</details>
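The request body sent to the feature server in the example above is plain JSON with a `features` list and an `entities` mapping. The following is a minimal sketch of a helper that builds and sanity-checks that body before it is posted. The function name `build_online_request` and its validation rules are assumptions made for illustration and are not part of the Feast SDK.

```python
import json
from typing import Dict, List


def build_online_request(features: List[str], entities: Dict[str, list]) -> str:
    """Serialize a get-online-features request body after basic sanity checks."""
    if not features:
        raise ValueError("at least one feature reference is required")
    # Each entity key maps to a column of values; the columns should line up row by row.
    lengths = {len(values) for values in entities.values()}
    if len(lengths) > 1:
        raise ValueError("all entity columns must have the same number of rows")
    return json.dumps({"features": features, "entities": entities})


body = build_online_request(
    ["driver_hourly_stats:conv_rate"],
    {"driver_id": [1001, 1002]},
)
print(body)
```

The resulting string can be passed as the `data` argument of `requests.post`, as in the snippet above.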

## Feature Services

A feature service is an object that represents a logical group of features from one or more [feature views](feature-view.md#feature-view). Feature services allow features from within a feature view to be used as needed by an ML model. Users can expect to create one feature service per model version, allowing for tracking of the features used by models.
9 changes: 6 additions & 3 deletions examples/java-demo/README.md
@@ -30,18 +30,21 @@ For this tutorial, we setup Feast with Redis, using the Feast CLI to register an
2. Make a bucket in GCS (or S3)
3. The feature repo is already setup here, so you just need to swap in your GCS bucket and Redis credentials.
We need to modify the `feature_store.yaml`, which has two fields for you to replace:
```yaml
registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db
project: feast_java_demo
provider: gcp
online_store:
  type: redis
  # Note: this would normally be using instance URLs to access Redis
  connection_string: localhost:6379,password=[YOUR PASSWORD]
offline_store:
  type: file
entity_key_serialization_version: 2
```
4. Run `feast apply` to apply your local features to the remote registry
   - Note: you may need to authenticate to gcloud first with `gcloud auth login`
5. Materialize features to the online store:
```bash
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
1 change: 1 addition & 0 deletions examples/java-demo/feature_repo/feature_store.yaml
@@ -3,6 +3,7 @@ project: feast_java_demo
provider: gcp
online_store:
  type: redis
  # Note: this would normally be using instance URLs to access Redis
  connection_string: localhost:6379,password=[YOUR PASSWORD]
offline_store:
  type: file
89 changes: 89 additions & 0 deletions examples/python-helm-demo/README.md
@@ -0,0 +1,89 @@

# Running Feast Python / Go Feature Server with Redis on Kubernetes

For this tutorial, we set up Feast with Redis.

We use the Feast CLI to register and materialize features, and then retrieve them via a Feast Python feature server deployed in Kubernetes.

## First, let's set up a Redis cluster
1. Start minikube (`minikube start`)
2. Use helm to install a default Redis cluster
```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-redis bitnami/redis
```
![](redis-screenshot.png)
3. Port forward Redis so we can materialize features to it

```bash
kubectl port-forward --namespace default svc/my-redis-master 6379:6379
```
4. Get your Redis password using the command (pasted below for convenience). We'll need this to tell Feast how to communicate with the cluster.

```bash
export REDIS_PASSWORD=$(kubectl get secret --namespace default my-redis -o jsonpath="{.data.redis-password}" | base64 --decode)
echo $REDIS_PASSWORD
```
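The `kubectl` command above reads a base64-encoded secret and decodes it. For readers who prefer to do this step in Python, here is a minimal sketch that decodes such a value and assembles the `host:port,password=...` connection string format used in `feature_store.yaml` in this tutorial. The function name `redis_connection_string` and the example secret are hypothetical.

```python
import base64


def redis_connection_string(encoded_password: str, host: str = "localhost", port: int = 6379) -> str:
    """Decode a base64-encoded Redis password and build the connection string."""
    password = base64.b64decode(encoded_password).decode("utf-8")
    return f"{host}:{port},password={password}"


# Hypothetical secret value, encoded the way Kubernetes stores secret data.
example_secret = base64.b64encode(b"s3cret").decode("ascii")
print(redis_connection_string(example_secret))
```

Passing `host="my-redis-master"` would produce the in-cluster variant of the same string.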

## Next, we set up a local Feast repo
1. Install Feast with Redis dependencies `pip install "feast[redis]"`
2. Make a bucket in GCS (or S3)
3. The feature repo is already set up here, so you just need to swap in your GCS bucket and Redis credentials.
We need to modify the `feature_store.yaml`, which has two fields for you to replace:
```yaml
registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db
project: feast_python_demo
provider: gcp
online_store:
  type: redis
  # Note: this would normally be using instance URLs to access Redis
  connection_string: localhost:6379,password=[YOUR PASSWORD]
offline_store:
  type: file
entity_key_serialization_version: 2
```
4. Run `feast apply` from within the `feature_repo` directory to apply your local features to the remote registry
- Note: you may need to authenticate to gcloud first with `gcloud auth login`
5. Materialize features to the online store:
```bash
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
```
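The `date -u +"%Y-%m-%dT%H:%M:%S"` call above produces the current UTC time in the format passed to `feast materialize-incremental`. A minimal Python sketch of the same formatting, useful when the command is driven from a script; the function name `materialize_end_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def materialize_end_timestamp(now: datetime) -> str:
    """Format a datetime as UTC in the same layout as `date -u +"%Y-%m-%dT%H:%M:%S"`."""
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


print(materialize_end_timestamp(datetime.now(timezone.utc)))
```

Taking the time as an argument rather than reading the clock inside the function keeps the formatting easy to test.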

## Now let's set up the Feast Server
1. Add the gcp-auth addon to mount GCP credentials:
```bash
minikube addons enable gcp-auth
```
2. Add Feast's Python/Go feature server chart repo
```bash
helm repo add feast-charts https://feast-helm-charts.storage.googleapis.com
helm repo update
```
3. For this tutorial, because we don't have a direct hosted endpoint into Redis, we need to change `feature_store.yaml` to talk to the Kubernetes Redis service
```bash
sed -i '' 's/localhost:6379/my-redis-master:6379/g' feature_store.yaml
```
4. Install the Feast helm chart: `helm install feast-release feast-charts/feast-feature-server --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
> **Dev instructions**: if you're changing the Python logic or chart, you can do
1. `eval $(minikube docker-env)`
2. `make build-feature-server-dev`
3. `helm install feast-release ../../../infra/charts/feast-feature-server --set image.tag=dev --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
5. (Optional): check logs of the server to make sure it’s working
```bash
kubectl logs svc/feast-feature-server
```
6. Port forward to expose the HTTP endpoint:
```bash
kubectl port-forward svc/feast-feature-server 6566:80
```
7. Run test fetches for online features:
- First: change back the Redis connection string to allow localhost connections to Redis
```bash
sed -i '' 's/my-redis-master:6379/localhost:6379/g' feature_store.yaml
```
- Then run the included fetch script, which fetches via the HTTP endpoint and, for comparison, via the Python SDK
```bash
python test_python_fetch.py
```
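The two `sed -i ''` commands in this walkthrough use the BSD/macOS form of in-place editing; GNU `sed` expects `-i` without the empty argument. As a portable alternative, the host swap can be sketched in Python. The function name `swap_redis_host` is hypothetical, and the sample string mirrors the `connection_string` line from `feature_store.yaml`.

```python
def swap_redis_host(yaml_text: str, old_host: str, new_host: str) -> str:
    """Replace old_host with new_host, failing loudly if old_host is absent."""
    if old_host not in yaml_text:
        raise ValueError(f"{old_host!r} not found in config")
    return yaml_text.replace(old_host, new_host)


config = "connection_string: localhost:6379,password=[YOUR PASSWORD]\n"

# Point at the in-cluster Redis service before installing the chart...
in_cluster = swap_redis_host(config, "localhost:6379", "my-redis-master:6379")
# ...and switch back to localhost for the local test fetch.
local = swap_redis_host(in_cluster, "my-redis-master:6379", "localhost:6379")
print(in_cluster, end="")
```

Raising an error when the old host is missing avoids silently installing the chart with a connection string that was never updated.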
Empty file.
Binary file not shown.
61 changes: 61 additions & 0 deletions examples/python-helm-demo/feature_repo/driver_repo.py
@@ -0,0 +1,61 @@
from datetime import timedelta

import pandas as pd

from feast.data_source import RequestSource
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float32, Float64, Int64, String
from feast.field import Field

from feast import Entity, FileSource, FeatureView

driver_hourly_stats = FileSource(
    path="data/driver_stats_with_string.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created",
)
driver = Entity(name="driver_id", description="driver id",)
driver_hourly_stats_view = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],
    ttl=timedelta(days=365),
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="acc_rate", dtype=Float32),
        Field(name="avg_daily_trips", dtype=Int64),
        Field(name="string_feature", dtype=String),
    ],
    online=True,
    source=driver_hourly_stats,
    tags={},
)

# Define a request data source which encodes features / information only
# available at request time (e.g. part of the user initiated HTTP request)
input_request = RequestSource(
    name="vals_to_add",
    schema=[
        Field(name="val_to_add", dtype=Int64),
        Field(name="val_to_add_2", dtype=Int64),
    ],
)


# Define an on demand feature view which can generate new features based on
# existing feature views and RequestSource features
@on_demand_feature_view(
    sources=[
        driver_hourly_stats_view,
        input_request,
    ],
    schema=[
        Field(name="conv_rate_plus_val1", dtype=Float64),
        Field(name="conv_rate_plus_val2", dtype=Float64),
    ],
)
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
    df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
    return df

10 changes: 10 additions & 0 deletions examples/python-helm-demo/feature_repo/feature_store.yaml
@@ -0,0 +1,10 @@
registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db
project: feast_python_demo
provider: gcp
online_store:
  type: redis
  # Note: this would normally be using instance URLs to access Redis
  connection_string: localhost:6379,password=[YOUR PASSWORD]
offline_store:
  type: file
entity_key_serialization_version: 2
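The `feature_store.yaml` above ships with `[YOUR ...]` placeholders that must be replaced before `feast apply` will work. A minimal sketch of a pre-flight check that lists any placeholders still present; the function name `find_placeholders` and the pattern are assumptions made for illustration, not part of Feast.

```python
import re
from typing import List

# Matches placeholders such as "[YOUR GCS BUCKET]" or "[YOUR PASSWORD]".
PLACEHOLDER_PATTERN = re.compile(r"\[YOUR [^\]]*\]")


def find_placeholders(yaml_text: str) -> List[str]:
    """Return every unreplaced "[YOUR ...]" placeholder, in order of appearance."""
    return PLACEHOLDER_PATTERN.findall(yaml_text)


template = (
    "registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db\n"
    "online_store:\n"
    "  connection_string: localhost:6379,password=[YOUR PASSWORD]\n"
)
remaining = find_placeholders(template)
if remaining:
    print("Unreplaced placeholders:", remaining)
```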