Add Feast CLI / python SDK documentation #199

Merged
merged 21 commits into from
May 28, 2019
Changes from 13 commits
0a28e5c
Storage registration quickstart
thirteen37 Mar 25, 2019
9620262
Minimum requirements for Feast GKE cluster
thirteen37 Mar 25, 2019
14382b2
Storage specs quickstart for admins
thirteen37 Mar 28, 2019
2dd62bd
WIP - End user quickstart
thirteen37 Mar 28, 2019
a3b29b0
Example spec files
thirteen37 May 2, 2019
d3df0b9
Updated protobuf generated files
thirteen37 May 2, 2019
99dc4e4
Merge branch 'upstream-master' into end-user-quickstart
romanwozniak May 23, 2019
49d6979
Removed references to granularity from the docs
romanwozniak May 23, 2019
1e57da0
Removed references to granularity from the docs
romanwozniak May 23, 2019
daee329
Added info on how to register features and run an import job from CLI;
romanwozniak May 23, 2019
ceb3198
fix typos
romanwozniak May 23, 2019
a748191
- WIP on python SDK quickstart documentation
romanwozniak May 24, 2019
2e55d1a
- add training/serving data retrieval sections
romanwozniak May 24, 2019
3afaec0
- setting core/serving tags in helm to latest release version
romanwozniak May 27, 2019
ce631cd
- removed storage options from feature declaration specs
romanwozniak May 27, 2019
cf16b9f
Update doc for CLI
davidheryanto May 27, 2019
b5482ed
Typo
davidheryanto May 27, 2019
6a8766a
Update default values.yaml for Feast
davidheryanto May 27, 2019
574f1e3
Merge branch 'master' into end-user-quickstart
davidheryanto May 27, 2019
9bc5de5
Fix incorrect destination path when installing feast cli
davidheryanto May 27, 2019
f055cc4
Merge pull request #1 from davidheryanto/cli-doc
romanwozniak May 28, 2019
6 changes: 6 additions & 0 deletions README.md
@@ -44,6 +44,12 @@ For Feast administrators:
* [Installation quickstart](docs/install.md)
* [Helm charts](charts/README.md) details

For Feast end users:
* [Creating features](docs/endusers.md)

For Feast developers:
* [Building the CLI](cli/README.md)

## Notice

Feast is still under active development. Your feedback and contributions are important to us. Please check our [contributing guide](CONTRIBUTING.md) for details.
2 changes: 1 addition & 1 deletion charts/README.md
@@ -58,7 +58,7 @@ kubectl delete persistentvolumeclaim feast-postgresql
The following table lists the configurable parameters of the Feast chart and their default values.

| var | desc | default |
| -- | -- | -- |
| --- | --- | --- |
| `core.image.registry` | core docker image registry | feast |
| `core.image.repository` | core docker image repository | feast-core |
| `core.image.tag` | core docker image version | 0.1.0 |
1 change: 0 additions & 1 deletion docs/concepts.md
@@ -19,7 +19,6 @@ A Feature is an individual measurable property or characteristic of an Entity. I
* Entity - It must be associated with a known Entity within Feast
* ValueType - The feature type must be defined, e.g. String, Bytes, Int64, Int32, Float etc.
* Requirements - Properties related to how a feature should be stored for serving and training
* Granularity - Time series features require a defined granularity
* StorageType - For both serving and training a storage type must be defined

Feast needs to know these attributes in order to be able to ingest, store and serve a feature. A Feature is only a feature when Feast knows about it. This may seem trite, but it introduces a best practice whereby a feature only becomes available for ingestion, serving and training in production when Feast has added the feature to its catalog.
148 changes: 148 additions & 0 deletions docs/endusers.md
@@ -0,0 +1,148 @@
# Feast End Users Quickstart Guide

## Prerequisites

* A working Feast Core: Consult your Feast admin or [install your own](install.md).
* Feast CLI tools: Use [pre-built
binaries](https://github.com/gojek/feast/releases) or [compile your
own](../cli/README.md).

Make sure your CLI is correctly configured for your Feast Core. If
you're running a local Feast Core, it would be:
```sh
feast config set coreURI localhost
```

## Introduction

There are several stages to using Feast:
1. Register your feature
2. Ingest data for your feature
3. Query feature data for training your models
4. Query feature data for serving your models

## Registering your feature

In order to register a feature, you will first need to register a:
* Storage location (typically done by your Feast admin)
* Entity

All registrations are done using [specs](specs.md).

### Registering an entity

First register an entity, which groups features under a unique key or
ID. Entities typically map to a domain object, e.g. a customer, a
merchant, or a sales region.

[`wordEntity.yml`](../examples/wordEntity.yml)
```
name: word
description: word found in shakespearean works
```

Register the entity spec:
```sh
feast apply entity wordEntity.yml
```

### Registering your feature

Next, define your feature:

[`wordCountFeature.yml`](../examples/wordCountFeature.yml)
```yaml
id: word.count
name: count
entity: word
owner: bob@feast.com
description: number of times the word appears
valueType: INT64
uri: https://github.com/bob/example
dataStores:
  serving:
    id: REDIS1
  warehouse:
    id: BIGQUERY1
```

Register it:
```sh
feast apply feature wordCountFeature.yml
```

## Ingest data for your feature

Feast supports ingesting features from four types of sources:

* File (either CSV or JSON)
* BigQuery table
* Pub/Sub topic
* Pub/Sub subscription

Let's look at how to create an import job spec and ingest some data from a CSV file.
You can find more information on ingesting data from the other sources
in [Import Specs](specs.md#import-spec).

### Prepare your data
`word_counts.csv`
```csv
count,word
28944,the
27317,and
21120,i
20136,to
17181,of
14945,a
13989,you
12949,my
11513,in
11488,that
9545,is
8855,not
8293,with
8043,me
8003,it
...
```
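
Before uploading, it can be worth checking that the file has the columns the import spec will refer to. The following is a minimal sketch using only the Python standard library. The function name `check_word_counts` and its error messages are illustrative assumptions and not part of Feast.

```python
import csv
import io


def check_word_counts(csv_text):
    """Return the number of data rows, raising ValueError on malformed input.

    Hypothetical helper, not part of Feast: it only checks the two columns
    used in this guide (`count` and `word`).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or set(reader.fieldnames) != {"count", "word"}:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = 0
    for line_no, row in enumerate(reader, start=2):
        if not row["word"]:
            raise ValueError(f"line {line_no}: empty word")
        int(row["count"])  # raises ValueError if the count is not an integer
        rows += 1
    return rows


# First three data rows from the example file above.
sample = "count,word\n28944,the\n27317,and\n21120,i\n"
print(check_word_counts(sample))  # prints 3
```

To check a real file, pass it the file contents, e.g. `check_word_counts(open("word_counts.csv").read())`.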

Then upload it to your Google Cloud Storage bucket:

```sh
gsutil cp word_counts.csv gs://your-bucket
```

### Define the job import spec
`shakespeareWordCountsImport.yml`
```yaml
type: file.csv
sourceOptions:
  path: gs://your-bucket/word_counts.csv
entities:
  - word
schema:
  entityIdColumn: word
  timestampValue: 2019-01-01T00:00:00.000Z
  fields:
    - name: count
      featureId: word.count
    - name: word

### Start the ingestion job
Next, use the `feast` CLI to run the ingestion job defined in
`shakespeareWordCountsImport.yml`:
```sh
feast jobs run shakespeareWordCountsImport.yml
```

You can also list recent ingestion jobs by running:
```sh
feast list jobs
```

Or get detailed information about the results of ingestion with:
```sh
feast get job <id>
```
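
The registration and ingestion steps in this guide can be strung together in a small script. This is a sketch, assuming the `feast` binary is on your `PATH` and already configured for your Feast Core. `feast_commands` and `run_all` are hypothetical helper names, and only the subcommands shown in this guide are used.

```python
import subprocess


def feast_commands(entity_spec, feature_spec, import_spec):
    """Return the CLI invocations from this guide, in the order they are run."""
    return [
        ["feast", "apply", "entity", entity_spec],
        ["feast", "apply", "feature", feature_spec],
        ["feast", "jobs", "run", import_spec],
        ["feast", "list", "jobs"],
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first non-zero exit status."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = feast_commands(
    "wordEntity.yml", "wordCountFeature.yml", "shakespeareWordCountsImport.yml"
)
# Print the commands for review instead of executing them.
for argv in commands:
    print(" ".join(argv))

# run_all(commands)  # uncomment once the CLI is installed and configured
```

The entity is applied before the feature because the feature spec refers to the entity by name.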
