Merge feast-snowflake plugin into main repo with documentation (#2193)
* Add backticks to left_table_query_string (#2250)

Signed-off-by: david <davidmiller252@gmail.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Delete entity key from Redis only when all attached feature views are gone (#2240)

* Delete entity from redis when the last attached feature view is deleted

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>

* Delete entity key from Redis only when all attached feature views are gone

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>

* make lint happy

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>

* make lint happy

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>

* one more try with mypy

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* historical_field_mappings2 merge for one sign off commit (#2252)

Signed-off-by: Michelle Rascati <michelle.rascati@sailpoint.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Correct inconsistent dependency (#2255)

Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Add snowflake environment variables to allow testing on snowflake infra (#2258)

* add snowflake environment vars to test framework

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* add snowflake environment vars to test framework

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Return `UNIX_TIMESTAMP` as Python `datetime` (#2244)

* Refactor `UNIX_TIMESTAMP` conversion

Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>

* Return `UNIX_TIMESTAMP` types as `datetime` to user

Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>

* Fix linting errors

Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>

* Rename variable to something more sensible

Signed-off-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Feast plan clean up (#2256)

* Run validation and inference on views and entities during plan

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Do not log objects that are unchanged

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Rename Fco to FeastObject

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Remove useless method

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Lint

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Always initialize registry during feature store initialization

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Fix usage test

Signed-off-by: Felix Wang <wangfelix98@gmail.com>

* Remove print statements

Signed-off-by: Felix Wang <wangfelix98@gmail.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Squash commits

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Add error type and refactor query execution to have retries

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Handle more snowflake errors

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Fix lint errors

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Fix lint errors

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Fix lint errors

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* Fix wrong import

Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* modify registry.db s3 object initialization to work in S3 subdirectory with Java Feast Server (#2259)

Signed-off-by: NalinGHub <nalinm01@gmail.com>
Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* clean up docs

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* lint-python

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* fixed historical test

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

* fixed historical test

Signed-off-by: sfc-gh-madkins <miles.adkins@snowflake.com>

Co-authored-by: David Miller <david@patagona.ca>
Co-authored-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
Co-authored-by: Michelle Rascati <44408275+michelle-rascati-sp@users.noreply.github.com>
Co-authored-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Co-authored-by: Felix Wang <wangfelix98@gmail.com>
Co-authored-by: Danny Chiao <danny@tecton.ai>
Co-authored-by: Nalin Mehra <37969183+NalinGHub@users.noreply.github.com>
8 people authored Jan 31, 2022
1 parent 2e4f3e5 commit f2bc411
Showing 45 changed files with 2,005 additions and 54 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -136,24 +136,24 @@ The list below contains the functionality that contributors are planning to deve
* Want to speak to a Feast contributor? We are more than happy to jump on a call. Please schedule a time using [Calendly](https://calendly.com/d/x2ry-g5bb/meet-with-feast-team).

* **Data Sources**
* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
* [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] Kafka source (with [push support into the online store](reference/alpha-stream-ingestion.md))
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source
* **Offline Stores**
* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] [In-memory / Pandas](https://docs.feast.dev/reference/offline-stores/file)
* [x] [Custom offline store support](https://docs.feast.dev/how-to-guides/adding-a-new-offline-store)
* [x] [Snowflake (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [x] [Trino (community plugin)](https://github.com/Shopify/feast-trino)
* **Online Stores**
* [x] [DynamoDB](https://docs.feast.dev/reference/online-stores/dynamodb)
@@ -208,7 +208,7 @@ The list below contains the functionality that contributors are planning to deve
Please refer to the official documentation at [Documentation](https://docs.feast.dev/)
* [Quickstart](https://docs.feast.dev/getting-started/quickstart)
* [Tutorials](https://docs.feast.dev/tutorials/tutorials-overview)
* [Running Feast with GCP/AWS](https://docs.feast.dev/how-to-guides/feast-gcp-aws)
* [Running Feast with Snowflake/GCP/AWS](https://docs.feast.dev/how-to-guides/feast-snowflake-gcp-aws)
* [Change Log](https://github.com/feast-dev/feast/blob/master/CHANGELOG.md)
* [Slack (#Feast)](https://slack.feast.dev/)

@@ -224,4 +224,4 @@ Thanks goes to these incredible people:

<a href="https://github.com/feast-dev/feast/graphs/contributors">
<img src="https://contrib.rocks/image?repo=feast-dev/feast" />
</a>
</a>
2 changes: 1 addition & 1 deletion docs/README.md
@@ -52,6 +52,6 @@ Explore the following resources to get started with Feast:
* [Concepts](getting-started/concepts/) describes all important Feast API concepts
* [Architecture](getting-started/architecture-and-components/) describes Feast's overall architecture.
* [Tutorials](tutorials/tutorials-overview.md) shows full examples of using Feast in machine learning applications.
* [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/) provides a more in-depth guide to using Feast.
* [Running Feast with Snowflake/GCP/AWS](how-to-guides/feast-snowflake-gcp-aws/) provides a more in-depth guide to using Feast.
* [Reference](reference/feast-cli-commands.md) contains detailed API and design documents.
* [Contributing](project/contributing.md) contains resources for anyone who wants to contribute to Feast.
5 changes: 4 additions & 1 deletion docs/SUMMARY.md
@@ -33,10 +33,11 @@
* [Driver ranking](tutorials/driver-ranking-with-feast.md)
* [Fraud detection on GCP](tutorials/fraud-detection.md)
* [Real-time credit scoring on AWS](tutorials/real-time-credit-scoring-on-aws.md)
* [Driver Stats using Snowflake](tutorials/driver-stats-using-snowflake.md)

## How-to Guides

* [Running Feast with GCP/AWS](how-to-guides/feast-gcp-aws/README.md)
* [Running Feast with Snowflake/GCP/AWS](how-to-guides/feast-snowflake-gcp-aws/README.md)
* [Install Feast](how-to-guides/feast-gcp-aws/install-feast.md)
* [Create a feature repository](how-to-guides/feast-gcp-aws/create-a-feature-repository.md)
* [Deploy a feature store](how-to-guides/feast-gcp-aws/deploy-a-feature-store.md)
@@ -54,10 +55,12 @@

* [Data sources](reference/data-sources/README.md)
* [File](reference/data-sources/file.md)
* [Snowflake](reference/data-sources/snowflake.md)
* [BigQuery](reference/data-sources/bigquery.md)
* [Redshift](reference/data-sources/redshift.md)
* [Offline stores](reference/offline-stores/README.md)
* [File](reference/offline-stores/file.md)
* [Snowflake](reference/offline-stores/snowflake.md)
* [BigQuery](reference/offline-stores/bigquery.md)
* [Redshift](reference/offline-stores/redshift.md)
* [Online stores](reference/online-stores/README.md)
4 changes: 2 additions & 2 deletions docs/getting-started/third-party-integrations.md
@@ -13,26 +13,26 @@ Don't see your offline store or online store of choice here? Check out our guide

### **Data Sources**

* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
* [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] Kafka source (with [push support into the online store](https://docs.feast.dev/reference/alpha-stream-ingestion))
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source

### Offline Stores

* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
* [x] [Hive (community plugin)](https://github.com/baineng/feast-hive)
* [x] [Postgres (community plugin)](https://github.com/nossrannug/feast-postgres)
* [x] [In-memory / Pandas](https://docs.feast.dev/reference/offline-stores/file)
* [x] [Custom offline store support](https://docs.feast.dev/how-to-guides/adding-a-new-offline-store)
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [x] [Trino (community plugin)](https://github.com/Shopify/feast-trino)

### Online Stores
File renamed without changes.
@@ -13,6 +13,21 @@ Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% tabs %}
{% tab title="Snowflake template" %}
```bash
feast init -t snowflake
Snowflake Deployment URL: ...
Snowflake User Name: ...
Snowflake Password: ...
Snowflake Role Name: ...
Snowflake Warehouse Name: ...
Snowflake Database Name: ...

Creating a new Feast repository in /<...>/tiny_pika.
```
{% endtab %}

{% tab title="GCP template" %}
```text
feast init -t gcp
@@ -30,7 +45,7 @@ Redshift Database Name: ...
Redshift User Name: ...
Redshift S3 Staging Location (s3://*): ...
Redshift IAM Role for S3 (arn:aws:iam::*:role/*): ...
Should I upload example data to Redshift (overwriting 'feast_driver_hourly_stats' table)? (Y/n):
Should I upload example data to Redshift (overwriting 'feast_driver_hourly_stats' table)? (Y/n):
Creating a new Feast repository in /<...>/tiny_pika.
```
@@ -63,4 +78,3 @@ You can now use this feature repository for development. You can try the followi
* Run `feast apply` to apply these definitions to Feast.
* Edit the example feature definitions in `example.py` and run `feast apply` again to change feature definitions.
* Initialize a git repository in the same directory and check the feature repository into version control.

@@ -6,6 +6,12 @@ Install Feast using [pip](https://pip.pypa.io):
pip install feast
```

Install Feast with Snowflake dependencies (required when using Snowflake):

```
pip install 'feast[snowflake]'
```
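
One way to confirm that the Snowflake dependencies are present is to check whether the Snowflake connector can be imported. This is a minimal sketch: `has_module` is a hypothetical helper, not part of Feast, and it assumes the `snowflake` extra provides the `snowflake.connector` module.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported in the current environment.

    Hypothetical helper for illustration; not part of Feast.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return False
```

Calling `has_module("snowflake.connector")` after installation should return `True` if the assumption above holds.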

Install Feast with GCP dependencies (required when using BigQuery or Firestore):

```
3 changes: 2 additions & 1 deletion docs/reference/data-sources/README.md
@@ -4,7 +4,8 @@ Please see [Data Source](../../getting-started/concepts/feature-view.md#data-sou

{% page-ref page="file.md" %}

{% page-ref page="snowflake.md" %}

{% page-ref page="bigquery.md" %}

{% page-ref page="redshift.md" %}

44 changes: 44 additions & 0 deletions docs/reference/data-sources/snowflake.md
@@ -0,0 +1,44 @@
# Snowflake

## Description

Snowflake data sources allow for the retrieval of historical feature values from Snowflake for building training datasets as well as materializing features into an online store.

* Either a table reference or a SQL query can be provided.

## Examples

Using a table reference

```python
from feast import SnowflakeSource

my_snowflake_source = SnowflakeSource(
    database="FEAST",
    schema="PUBLIC",
    table="FEATURE_TABLE",
)
```

Using a query

```python
from feast import SnowflakeSource

my_snowflake_source = SnowflakeSource(
    query="""
    SELECT
        timestamp_column AS "ts",
        "created",
        "f1",
        "f2"
    FROM
        "FEAST"."PUBLIC"."FEATURE_TABLE"
    """,
)
```

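Keep Snowflake's identifier rules in mind when naming tables and columns: by default, unquoted identifiers are stored and resolved in uppercase, while double-quoted identifiers are case-sensitive.
You can read more about quoted identifiers [here](https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html).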

Configuration options are available [here](https://rtd.feast.dev/en/latest/index.html#feast.data_source.SnowflakeSource).
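
The effect of quoting on Snowflake identifiers can be sketched in plain Python. This only illustrates Snowflake's default behavior and makes no calls to Snowflake or Feast; `resolve_identifier` and `quote_identifier` are hypothetical helpers.

```python
def resolve_identifier(identifier: str) -> str:
    """Return the name Snowflake would look up for one identifier (default settings)."""
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Quoted: case is preserved; a doubled quote stands for one literal quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: folded to uppercase.
    return identifier.upper()


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes so its case is preserved."""
    return '"' + name.replace('"', '""') + '"'
```

Under these rules, an unquoted `timestamp_column` refers to `TIMESTAMP_COLUMN`, while `"f1"` refers to a lowercase `f1` column.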
3 changes: 2 additions & 1 deletion docs/reference/offline-stores/README.md
@@ -4,7 +4,8 @@ Please see [Offline Store](../../getting-started/architecture-and-components/off

{% page-ref page="file.md" %}

{% page-ref page="snowflake.md" %}

{% page-ref page="bigquery.md" %}

{% page-ref page="redshift.md" %}

30 changes: 30 additions & 0 deletions docs/reference/offline-stores/snowflake.md
@@ -0,0 +1,30 @@
# Snowflake

## Description

The Snowflake offline store provides support for reading [SnowflakeSources](../data-sources/snowflake.md).

* Snowflake tables and views are allowed as sources.
* All joins happen within Snowflake.
* Entity dataframes can be provided as a SQL query or can be provided as a Pandas dataframe. Pandas dataframes will be uploaded to Snowflake in order to complete join operations.
* A [SnowflakeRetrievalJob](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/snowflake.py#L185) is returned when calling `get_historical_features()`.

## Example

{% code title="feature_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
    type: snowflake.offline
    account: snowflake_deployment.us-east-1
    user: user_login
    password: user_password
    role: sysadmin
    warehouse: demo_wh
    database: FEAST
```
{% endcode %}
Configuration options are available [here](https://github.com/feast-dev/feast/blob/bf557bcb72c7878a16dccb48443bbbe9dc3efa49/sdk/python/feast/infra/offline_stores/snowflake.py#L39).
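
The Snowflake settings in the example belong under the `offline_store` key. The sketch below checks a parsed configuration (a plain `dict`, as a YAML loader would produce) for that nesting. `missing_offline_store_keys` is a hypothetical helper, and treating every key from the example as required is an assumption made for illustration only.

```python
# Keys shown in the example above; treating all of them as required is an assumption.
EXAMPLE_KEYS = ("type", "account", "user", "password", "role", "warehouse", "database")


def missing_offline_store_keys(config: dict) -> list:
    """Return the example keys that are not nested under `offline_store`."""
    store = config.get("offline_store")
    if not isinstance(store, dict):
        # Nothing is nested under offline_store, so every key counts as missing.
        return list(EXAMPLE_KEYS)
    return [key for key in EXAMPLE_KEYS if key not in store]


nested_config = {
    "project": "my_feature_repo",
    "registry": "data/registry.db",
    "provider": "local",
    "offline_store": {
        "type": "snowflake.offline",
        "account": "snowflake_deployment.us-east-1",
        "user": "user_login",
        "password": "user_password",
        "role": "sysadmin",
        "warehouse": "demo_wh",
        "database": "FEAST",
    },
}
```

A configuration whose Snowflake keys sit at the top level instead of under `offline_store` would report all of them as missing.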
26 changes: 0 additions & 26 deletions docs/reference/offline-stores/untitled.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/reference/online-stores/README.md
@@ -9,4 +9,3 @@ Please see [Online Store](../../getting-started/architecture-and-components/onli
{% page-ref page="datastore.md" %}

{% page-ref page="dynamodb.md" %}

1 change: 0 additions & 1 deletion docs/reference/providers/README.md
@@ -7,4 +7,3 @@ Please see [Provider](../../getting-started/architecture-and-components/provider
{% page-ref page="google-cloud-platform.md" %}

{% page-ref page="amazon-web-services.md" %}

2 changes: 2 additions & 0 deletions docs/roadmap.md
@@ -8,6 +8,7 @@ The list below contains the functionality that contributors are planning to deve
* Want to speak to a Feast contributor? We are more than happy to jump on a call. Please schedule a time using [Calendly](https://calendly.com/d/x2ry-g5bb/meet-with-feast-team).

* **Data Sources**
* [x] [Snowflake source](https://docs.feast.dev/reference/data-sources/snowflake)
* [x] [Redshift source](https://docs.feast.dev/reference/data-sources/redshift)
* [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery)
* [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file)
@@ -18,6 +19,7 @@ The list below contains the functionality that contributors are planning to deve
* [x] [Snowflake source (community plugin)](https://github.com/sfc-gh-madkins/feast-snowflake)
* [ ] HTTP source
* **Offline Stores**
* [x] [Snowflake](https://docs.feast.dev/reference/offline-stores/snowflake)
* [x] [Redshift](https://docs.feast.dev/reference/offline-stores/redshift)
* [x] [BigQuery](https://docs.feast.dev/reference/offline-stores/bigquery)
* [x] [Synapse (community plugin)](https://github.com/Azure/feast-azure)
22 changes: 18 additions & 4 deletions docs/specs/offline_store_format.md
@@ -7,8 +7,8 @@ One of the design goals of Feast is being able to plug seamlessly into existing

Feast provides first class support for the following data warehouses (DWH) to store feature data offline out of the box:
* [BigQuery](https://cloud.google.com/bigquery)
* [Snowflake](https://www.snowflake.com/) (Coming Soon)
* [Redshift](https://aws.amazon.com/redshift/) (Coming Soon)
* [Snowflake](https://www.snowflake.com/)
* [Redshift](https://aws.amazon.com/redshift/)

The integration between Feast and the DWH is highly configurable, but at the same time there are some non-configurable implications and assumptions that Feast imposes on table schemas and mapping between database-native types and Feast type system. This is what this document is about.

@@ -28,14 +28,14 @@ Feature data is stored in tables in the DWH. There is one DWH table per Feast Fe
## Type mappings

#### Pandas types
Here's how Feast types map to Pandas types for Feast APIs that take in or return a Pandas dataframe:
Here's how Feast types map to Pandas types for Feast APIs that take in or return a Pandas dataframe:

| Feast Type | Pandas Type |
|-------------|--|
| Event Timestamp | `datetime64[ns]` |
| BYTES | `bytes` |
| STRING | `str` , `category`|
| INT32 | `int32`, `uint32` |
| INT32 | `int16`, `uint16`, `int32`, `uint32` |
| INT64 | `int64`, `uint64` |
| UNIX_TIMESTAMP | `datetime64[ns]`, `datetime64[ns, tz]` |
| DOUBLE | `float64` |
@@ -80,3 +80,17 @@ Here's how Feast types map to BigQuery types when using BigQuery for offline sto
| BOOL\_LIST | `ARRAY<BOOL>`|

Values that are not specified by the table above will cause an error on conversion.

#### Snowflake Types
Here's how Feast types map to Snowflake types when using Snowflake for offline storage.
See the source here:
https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#snowflake-to-pandas-data-mapping

| Feast Type | Snowflake Python Type |
|-------------|--|
| Event Timestamp | `DATETIME64[NS]` |
| UNIX_TIMESTAMP | `DATETIME64[NS]` |
| STRING | `STR` |
| INT32 | `INT8 / UINT8 / INT16 / UINT16 / INT32 / UINT32` |
| INT64 | `INT64 / UINT64` |
| DOUBLE | `FLOAT64` |
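
The Snowflake table above can be expressed as a lookup from dtype name to Feast type. This is an illustrative sketch built only from the rows listed; `feast_type_for` is a hypothetical helper, not Feast's own conversion code, and it maps `datetime64[ns]` to `UNIX_TIMESTAMP` because the event timestamp role is decided by the feature view rather than by the dtype.

```python
# Rows from the Snowflake table above, keyed by Feast type.
FEAST_TO_DTYPES = {
    "UNIX_TIMESTAMP": ("datetime64[ns]",),
    "STRING": ("str",),
    "INT32": ("int8", "uint8", "int16", "uint16", "int32", "uint32"),
    "INT64": ("int64", "uint64"),
    "DOUBLE": ("float64",),
}

# Reverse lookup: dtype name -> Feast type.
DTYPE_TO_FEAST = {
    dtype: feast_type
    for feast_type, dtypes in FEAST_TO_DTYPES.items()
    for dtype in dtypes
}


def feast_type_for(dtype_name: str) -> str:
    """Return the Feast type listed for a dtype name, or raise if it is not listed."""
    try:
        return DTYPE_TO_FEAST[dtype_name.lower()]
    except KeyError:
        raise ValueError(f"no Feast type listed for dtype {dtype_name!r}") from None
```

Dtypes that do not appear in the table, such as `float32`, raise a `ValueError` in this sketch rather than being silently coerced.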