Handle retry for redis io flow #274

khorshuheng · 2019-10-26T14:45:10Z

Currently, the Redis write to the serving store does not handle transient connection failures, which results in data loss. This PR augments the pipeline to retry the connection up to a maximum number of times. Both the retry limit and the backoff time are configurable via RedisConfig.
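The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function `write_with_retry`, its `delays_ms` parameter, and the use of Python's built-in `ConnectionError` to stand in for a transient Redis failure are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def write_with_retry(
    write: Callable[[], T],
    delays_ms: Iterable[int],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call write(); on ConnectionError wait for the next delay and try again.

    One retry is made per entry in delays_ms (hypothetical parameter). Once
    the delays are used up, a further ConnectionError reaches the caller.
    """
    for delay_ms in delays_ms:
        try:
            return write()
        except ConnectionError:
            # Transient failure: back off, then move on to the next attempt.
            sleep(delay_ms / 1000.0)
    # Final attempt: a failure here is no longer retried.
    return write()
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. An empty `delays_ms` means the write is attempted exactly once.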

feast-ci-bot · 2019-10-26T14:45:25Z

Hi @khorshuheng. Thanks for your PR.

I'm waiting for a gojek member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

davidheryanto · 2019-10-29T07:26:06Z

/ok-to-test

davidheryanto · 2019-10-29T07:41:56Z

/test test-golang-sdk
The test cases in the golang-sdk unit tests need to be updated so that they ignore the order of the actual results.
Retrying will likely make the test pass for now.
This failure is not related to this pull request.

davidheryanto · 2019-10-29T07:58:35Z

protos/feast/core/Store.proto

@@ -110,6 +110,8 @@ message Store {
  message RedisConfig {
    string host = 1;
    int32 port = 2;
+    int32 backoff_ms = 3;


Suggest some comments to the new proto fields.

Suggested change

-    int32 backoff_ms = 3;
+    // Optional. The number of milliseconds to wait before retrying failed Redis connection.
+    // By default, Feast uses exponential backoff policy and "backoff_ms" sets the initial wait duration.
+    int32 backoff_ms = 3;
+    // Optional. Maximum total number of retries for connecting to Redis. Default to zero retries.
+    int32 max_retries = 4;

pradithya · 2019-10-30

Just a nitpick: the initial backoff duration could be renamed as follows

int32 initial_backoff_ms = 3;

so that the generated code will be clearer.

khorshuheng · 2019-11-25

@davidheryanto @pradithya Thanks for the suggestions. I have implemented the changes as suggested.
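For reference, here is a rough sketch of how the two settings could translate into wait times. The thread only says that the policy is exponential and that the backoff field sets the initial wait; the doubling factor and the helper name `backoff_schedule_ms` are assumptions for illustration.

```python
from typing import List


def backoff_schedule_ms(initial_backoff_ms: int, max_retries: int) -> List[int]:
    """Wait in milliseconds before each retry, doubling after every failure.

    The factor of 2 is an assumption; the thread only says "exponential".
    """
    return [initial_backoff_ms * (2 ** attempt) for attempt in range(max_retries)]


print(backoff_schedule_ms(100, 4))  # [100, 200, 400, 800]
print(backoff_schedule_ms(100, 0))  # [] (zero retries, the documented default)
```

With the default of zero retries the schedule is empty, so a failed connection would not be retried at all.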

zhilingc · 2019-10-29T08:06:20Z

Looks great!
Is it possible to output failed rows to write to the deadletter table?

khorshuheng · 2019-10-29T08:27:55Z

Thanks! There are a few things that need to be discussed before the implementation:

Where do we plan to store the dead letters? Should we store them in Kafka or BigQuery? Should it be configurable?
Apart from the error message, what kind of information would be required for the dead letters? Is it sufficient to pass in the key/value bytes? Something like:

FailedRedisMessage

keyBytes
valueBytes
errMessage?

zhilingc · 2019-10-29T08:58:13Z

We currently write dead letters to BigQuery using the WriteFailedElementToBigQuery PTransform; we can extend it to other types of sinks (Kafka, file, etc.) later.

As for the failed element message, just key/value bytes together with the error message and timestamp would be great!
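A minimal sketch of a record carrying the fields mentioned above (key/value bytes, error message, timestamp). The field names and the helper `to_failed_message` are hypothetical and are not taken from the Feast codebase; only the name `FailedRedisMessage` comes from this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FailedRedisMessage:
    """One Redis write that could not be completed (hypothetical layout)."""
    key_bytes: bytes
    value_bytes: bytes
    error_message: str
    timestamp: datetime


def to_failed_message(
    key: bytes,
    value: bytes,
    error: object,
    now: Optional[datetime] = None,
) -> FailedRedisMessage:
    """Build a dead-letter record, stamping it with the current UTC time
    unless an explicit timestamp is supplied."""
    if now is None:
        now = datetime.now(timezone.utc)
    return FailedRedisMessage(
        key_bytes=key,
        value_bytes=value,
        error_message=str(error),
        timestamp=now,
    )
```

Keeping the record immutable (`frozen=True`) is a design choice for the sketch: a dead letter describes something that already happened and should not change afterwards.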

khorshuheng · 2019-11-25T07:12:57Z

I have implemented this. However, as discussed earlier, the byte values are currently represented as strings because of the BigQuery schema.
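The thread does not say how the bytes are turned into strings. As one possible illustration only, base64 is a lossless way to carry arbitrary bytes in a string column; the helper names below are hypothetical, and the actual PR may use a different representation.

```python
import base64


def bytes_to_string(raw: bytes) -> str:
    """Encode arbitrary bytes as an ASCII string (base64 is an assumption)."""
    return base64.b64encode(raw).decode("ascii")


def string_to_bytes(text: str) -> bytes:
    """Recover the original bytes from the base64 string."""
    return base64.b64decode(text)


print(bytes_to_string(b"feast"))  # ZmVhc3Q=
```

A reversible encoding matters here because Redis keys and values may not be valid UTF-8, so decoding them directly as text could lose information.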

woop · 2019-11-28T05:03:13Z

/lgtm

woop · 2019-11-28T05:03:26Z

/approve

feast-ci-bot · 2019-11-28T05:03:30Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: khorshuheng, woop

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [woop]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

woop · 2019-11-28T06:25:09Z

/retest

woop · 2019-12-22T13:51:16Z

@khorshuheng Happy to merge this in if we can resolve the conflicts.

woop · 2019-12-25T13:37:59Z

/retest

woop · 2020-01-05T04:34:14Z

/lgtm

khorshuheng requested review from davidheryanto, pradithya, thirteen37, tims, woop and zhilingc as code owners October 26, 2019 14:45

feast-ci-bot added the needs-ok-to-test label Oct 26, 2019

feast-ci-bot added the size/L label Oct 26, 2019

feast-ci-bot added ok-to-test and removed needs-ok-to-test labels Oct 29, 2019

davidheryanto reviewed Oct 29, 2019

khorshuheng force-pushed the redis-with-retry branch 2 times, most recently from 1db929d to 51c5ec2 Compare November 5, 2019 01:43

woop changed the base branch from 0.3-dev to master November 17, 2019 10:51

khorshuheng force-pushed the redis-with-retry branch from f44241b to a97fdc2 Compare November 22, 2019 06:29

feast-ci-bot assigned woop Nov 28, 2019

feast-ci-bot added the lgtm label Nov 28, 2019

feast-ci-bot added the approved label Nov 28, 2019

Handle retry for redis io flow

b8c3b30

khorshuheng force-pushed the redis-with-retry branch from a97fdc2 to b8c3b30 Compare December 24, 2019 03:51

feast-ci-bot removed the lgtm label Dec 24, 2019

feast-ci-bot added the lgtm label Jan 5, 2020

feast-ci-bot merged commit 8a0f53b into feast-dev:master Jan 5, 2020

Review comments on protos/feast/core/Store.proto: davidheryanto (Oct 29, 2019), pradithya (Oct 30, 2019), khorshuheng (Nov 25, 2019)
