Add Feast CLI / python SDK documentation (#199)
* Storage registration quickstart

For Feast admins.

* Minimum requirements for Feast GKE cluster

* Storage specs quickstart for admins

* WIP - End user quickstart

* Example spec files

* Updated protobuf generated files

Using vendored protobuf, here are the new generated files

* Removed references to granularity from the docs

* Removed references to granularity from the docs

* Added info on how to register features and run an import job from CLI;
other fixes in the docs;

* fix typos

* - WIP on python SDK quickstart documentation

* - add training/serving data retrieval sections

* - setting core/serving tags in helm to latest release version

* - removed storage options from feature declaration specs

* Update doc for CLI

* Typo

* Update default values.yaml for Feast

Use Feast image tag 0.1.1 because it fixes some templating in BigQuery
https://github.com/gojek/feast/releases/tag/v0.1.1

* Fix incorrect destination path when installing feast cli
romanwozniak authored and feast-ci-bot committed May 28, 2019
1 parent 7548e61 commit 1bdbfb5
Showing 16 changed files with 683 additions and 52 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -44,6 +44,12 @@ For Feast administrators:
 * [Installation quickstart](docs/install.md)
 * [Helm charts](charts/README.md) details

+For Feast end users:
+* [Creating features](docs/endusers.md)
+
+For Feast developers:
+* [Building the CLI](cli/README.md)
+
 ## Notice

 Feast is still under active development. Your feedback and contributions are important to us. Please check our [contributing guide](CONTRIBUTING.md) for details.
2 changes: 1 addition & 1 deletion charts/README.md
@@ -58,7 +58,7 @@ kubectl delete persistentvolumeclaim feast-postgresql
 The following table lists the configurable parameters of the Feast chart and their default values.

 | var | desc | default |
-| -- | -- | -- |
+| --- | --- | --- |
 | `core.image.registry` | core docker image registry | feast |
 | `core.image.repository` | core docker image repository | feast-core |
 | `core.image.tag` | core docker image version | 0.1.0 |
4 changes: 2 additions & 2 deletions charts/feast/values.yaml
@@ -5,7 +5,7 @@ core:
     pullPolicy: IfNotPresent
     registry: gcr.io/kf-feast
     repository: feast-core
-    tag: "1234"
+    tag: "0.1.1"
   replicaCount: 1
   resources:
     limits:
@@ -78,7 +78,7 @@ serving:
     pullPolicy: IfNotPresent
     registry: gcr.io/kf-feast
     repository: feast-serving
-    tag: "1234"
+    tag: "0.1.1"
   replicaCount: 1
   resources:
     limits:
50 changes: 17 additions & 33 deletions cli/README.md
@@ -5,45 +5,29 @@ feast, as well as manage and run ingestion jobs.

 ## Installation

-### Download the compiled binary
+The quickest way to get the CLI is to download the compiled binary:

-The quickest way to get the CLI is to download the compiled binary: #TODO
-
-### Building from source
-
-The following dependencies are required to build the CLI from source:
-* [`go`](https://golang.org/)
-* [`protoc`](https://developers.google.com/protocol-buffers/)
-* [`dep`](https://github.com/golang/dep)
-
-See below for specific instructions on how to install the dependencies.
-
-After the dependencies are installed, you can build the CLI using:
 ```sh
-# at feast top-level directory
-$ make build-cli
+# For Mac OS users
+wget https://github.com/gojek/feast/releases/download/v0.1.1/feast-cli-v0.1.1-darwin-amd64
+chmod +x feast-cli-v0.1.1-darwin-amd64
+sudo mv feast-cli-v0.1.1-darwin-amd64 /usr/local/bin/feast
+
+# For Linux users
+wget https://github.com/gojek/feast/releases/download/v0.1.1/feast-cli-v0.1.1-linux-amd64
+chmod +x feast-cli-v0.1.1-linux-amd64
+sudo mv feast-cli-v0.1.1-linux-amd64 /usr/local/bin/feast
 ```

-### Dependencies
-
-#### `protoc-gen-go`
-
-To ensure you have a matching version of `protoc-gen-go`, install the vendored version:
-```sh
-$ go install ./vendor/github.com/golang/protobuf/protoc-gen-go
-$ which protoc-gen-go
-~/go/bin/protoc-gen-go
-```
+### Building from source

-#### `dep`
+To develop the CLI or build it from source, you need Go 1.11 or later, because Feast uses Go modules.

-On MacOS you can install or upgrade to the latest released version with Homebrew:
 ```sh
-$ brew install dep
-$ brew upgrade dep
-```
+git clone https://github.com/gojek/feast
+cd feast
+go build -o feast ./cli/feast

-On other platforms you can use the `install.sh` script:
-```sh
-$ curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh
+# Test running feast CLI
+./feast
 ```
1 change: 0 additions & 1 deletion docs/concepts.md
@@ -19,7 +19,6 @@ A Feature is an individual measurable property or characteristic of an Entity. I
 * Entity - It must be associated with a known Entity within Feast
 * ValueType - The feature type must be defined, e.g. String, Bytes, Int64, Int32, Float etc.
 * Requirements - Properties related to how a feature should be stored for serving and training
-* Granularity - Time series features require a defined granularity
 * StorageType - For both serving and training a storage type must be defined

 Feast needs to know these attributes in order to be able to ingest, store and serve a feature. A Feature is only a feature when Feast knows about it; This seems contrite, but it introduces a best practice whereby a feature only becomes available for ingestion, serving and training in production when Feast has added the feature to its catalog.
143 changes: 143 additions & 0 deletions docs/endusers.md
@@ -0,0 +1,143 @@
# Feast End Users Quickstart Guide

## Prerequisites

* A working Feast Core: Consult your Feast admin or [install your own](install.md).
* Feast CLI tools: Use [pre-built
binaries](https://github.com/gojek/feast/releases) or [compile your
own](../cli/README.md).

Make sure your CLI is configured to point at your Feast Core. For a
Feast Core running locally, that is:
```sh
feast config set coreURI localhost
```

## Introduction

There are several stages to using Feast:
1. Register your feature
2. Ingest data for your feature
3. Query feature data for training your models
4. Query feature data for serving your models

## Registering your feature

In order to register a feature, you will first need to register a:
* Storage location (typically done by your Feast admin)
* Entity

All registrations are done using [specs](specs.md).

### Registering an entity

An entity groups features under a unique key or ID. Entities typically
map to a domain object, e.g., a customer, a merchant, or a sales
region.

[`wordEntity.yml`](../examples/wordEntity.yml)
```yaml
name: word
description: word found in shakespearean works
```

Register the entity spec:
```sh
feast apply entity wordEntity.yml
```

### Registering a feature

Next, define your feature:

[`wordCountFeature.yml`](../examples/wordCountFeature.yml)
```yaml
id: word.count
name: count
entity: word
owner: bob@feast.com
description: number of times the word appears
valueType: INT64
uri: https://github.com/bob/example
```

Register it:
```sh
feast apply feature wordCountFeature.yml
```
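
In the example spec above, the feature `id` is the entity name and the feature name joined by a dot (`word.count`). The following is a minimal sketch of a pre-flight check, assuming that this `<entity>.<name>` pattern holds in general (the guide only shows this one example). `check_feature_id` is a hypothetical helper, not part of the Feast CLI or SDK.

```python
def check_feature_id(spec: dict) -> bool:
    """Return True if spec['id'] equals '<entity>.<name>' (assumed convention)."""
    expected = f"{spec['entity']}.{spec['name']}"
    return spec["id"] == expected


# The contents of wordCountFeature.yml, written out as a plain dict.
word_count_feature = {
    "id": "word.count",
    "name": "count",
    "entity": "word",
    "owner": "bob@feast.com",
    "description": "number of times the word appears",
    "valueType": "INT64",
    "uri": "https://github.com/bob/example",
}

print(check_feature_id(word_count_feature))
```

Running a check like this before `feast apply` may catch a mismatched `id` early, but Feast Core remains the authority on what it accepts.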

## Ingest data for your feature

Feast supports ingesting features from four types of sources:

* File (either CSV or JSON)
* BigQuery table
* Pub/Sub topic
* Pub/Sub subscription

Let's take a look at how to create an import job spec and ingest some data from a CSV file.
For details on ingesting data from other sources, see
[Import Specs](specs.md#import-spec).

### Prepare your data
`word_counts.csv`
```csv
count,word
28944,the
27317,and
21120,i
20136,to
17181,of
14945,a
13989,you
12949,my
11513,in
11488,that
9545,is
8855,not
8293,with
8043,me
8003,it
...
```
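
If you do not already have such a file, the sketch below shows one way to produce a CSV with the same `count,word` layout from raw text. It is an illustration only: the helper names, the tokenization rule (lowercase letters and apostrophes), and the sample sentence are assumptions, not how the example data above was generated.

```python
import csv
import io
import re
from collections import Counter


def count_words(text: str) -> Counter:
    """Count lowercase words (runs of letters and apostrophes) in text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def write_word_counts(counts: Counter, out) -> None:
    """Write a header row, then count,word rows, most frequent first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["count", "word"])
    for word, count in counts.most_common():
        writer.writerow([count, word])


# Placeholder input; replace with the text you want to count.
buffer = io.StringIO()
write_word_counts(count_words("to be or not to be"), buffer)
print(buffer.getvalue())
```

To write a real file, pass `open("word_counts.csv", "w", newline="")` instead of the in-memory buffer.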

Then upload `word_counts.csv` to your Google Cloud Storage bucket:

```sh
gsutil cp word_counts.csv gs://your-bucket
```

### Define the import job spec
`shakespeareWordCountsImport.yml`
```yaml
type: file.csv
sourceOptions:
  path: gs://your-bucket/word_counts.csv
entities:
  - word
schema:
  entityIdColumn: word
  timestampValue: 2019-01-01T00:00:00.000Z
  fields:
    - name: count
      featureId: word.count
    - name: word
```

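
Before submitting the job, it can help to confirm that the CSV and the spec agree. The sketch below assumes that `entityIdColumn` and each field `name` refer to columns in the CSV header row, which the example suggests but this guide does not state outright. `missing_columns` is a hypothetical helper, and the spec is written as a plain dict to avoid depending on a YAML library.

```python
import csv
import io


def missing_columns(import_spec: dict, csv_file) -> list:
    """Return column names the spec refers to that the CSV header lacks."""
    header = next(csv.reader(csv_file))
    schema = import_spec["schema"]
    wanted = [schema["entityIdColumn"]] + [f["name"] for f in schema["fields"]]
    # Keep the original order but drop duplicates: the entity column
    # also appears in the field list.
    unique = list(dict.fromkeys(wanted))
    return [name for name in unique if name not in header]


# The contents of shakespeareWordCountsImport.yml as a dict.
spec = {
    "type": "file.csv",
    "sourceOptions": {"path": "gs://your-bucket/word_counts.csv"},
    "entities": ["word"],
    "schema": {
        "entityIdColumn": "word",
        "timestampValue": "2019-01-01T00:00:00.000Z",
        "fields": [
            {"name": "count", "featureId": "word.count"},
            {"name": "word"},
        ],
    },
}

sample = io.StringIO("count,word\n28944,the\n27317,and\n")
print(missing_columns(spec, sample))
```

An empty list means every column the spec mentions is present in the header; any names returned point to a mismatch worth fixing before running the job.
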
### Start the ingestion job
Next, use the `feast` CLI to run the ingestion job defined in
`shakespeareWordCountsImport.yml`:
```sh
feast jobs run shakespeareWordCountsImport.yml
```

You can also list recent ingestion jobs by running:
```sh
feast list jobs
```

Or get detailed information about the results of ingestion with:
```sh
feast get job <id>
```
