This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

storage: S3 test refactoring and fix some bugs #548

Merged
merged 6 commits on Oct 9, 2020

Conversation

kennytm
Collaborator

@kennytm kennytm commented Oct 5, 2020

What problem does this PR solve?

What is changed and how it works?

  1. Upgrade fake-gcs-server from 1.17.0 to 1.19.0 to include Support Windows fsouza/fake-gcs-server#228 (note: the latest version is 1.21.1, but that upgrades the gRPC dependency to 1.31.1, which may be too new for other dependencies such as etcd and the TiDB cluster; we chose 1.19.0 to avoid upgrading the deps too much).

  2. Removed the storage.s3Handlers type, because s3iface.S3API exists. We also removed the mockS3Handler type in s3_test.go in favor of a real mocking framework.

  3. Naively applying step 2 caused a dependency cycle between pkg/mock and pkg/storage. Therefore we moved s3_test.go into its own package storage_test, which leads to some package-visibility problems. We have performed the following changes:

    • RangeInfo is now public.
    • The TestS3Others test, which relied on defineS3Flags but performed no actual tests, is removed.
    • The checkS3 hack is removed, in favor of options passed to the ExternalStorage constructor (see below).
  4. Following the change in step 3, we generalized the storage.Create function to accept an ExternalStorageOptions struct replacing the sendCreds boolean. The options are:

    • SendCredentials — same as the old sendCreds boolean
    • SkipCheckPath — same as enabling the checkS3 hack, and generalized to Local and GCS
    • HTTPClient — used to unify the newGCSStorage and newGCSStorageWithHTTPClient functions, and generalized to S3.
  5. While running the tests, the following bugs are also fixed in this PR:

    • S3Storage.FileExists was returning true, nil if the HeadObject request returned any non-AWS error. It now returns false, err.
    • s3StorageReader.Read returned 0, nil when reaching EOF, because we used io.ReadFull rather than (io.Reader).Read (for reasons unclear). This caused io.ErrNoProgress when wrapping the reader in a bufio.Reader, which affected Lightning's S3 data source. Changed to the latter so it properly returns 0, io.EOF.
    • WalkDir was previously implemented using the ListObjects API, with the NextMarker field as the continuation token. Unfortunately, this field is missing on AWS S3 when Delimiter is empty (it is always populated on minio and Aliyun OSS), so when there are over 1000 objects, WalkDir would enter an infinite loop! We now compute the next Marker from the last Key in the Contents instead, as recommended in the docs.

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change
    • storage.Create is soft-deprecated and storage.NewS3Storage is hard-deprecated. They can still be used but are discouraged. In particular, TiCDC should stop using storage.NewS3Storage; there is no reason not to use storage.ParseBackend + storage.New.
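The recommended path can be sketched roughly as follows. This is a sketch only: the exact signatures of ParseBackend, New, and the option struct fields are assumptions based on the description above, not verified against BR's code.

```go
// Sketch: assumes BR's pkg/storage API as described in this PR.
ctx := context.Background()

// Parse the URL into a backend descriptor instead of constructing
// an S3 storage directly.
backend, err := storage.ParseBackend("s3://my-bucket/prefix", &storage.BackendOptions{})
if err != nil {
	return err
}

// Construct the ExternalStorage with the new explicit options struct.
store, err := storage.New(ctx, backend, &storage.ExternalStorageOptions{
	SendCredentials: true,
})
if err != nil {
	return err
}
_ = store
```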

Side effects

Related changes

  • Need to cherry-pick to the release branch

Release Note

  • (Lightning, not relevant to BR) Fixed several errors when using AWS S3 as data source, including: (1) not supporting reading directories with over 1000 objects; (2) reading schemas always failed with "failed to extract table schema" and the error "multiple Read calls return no data or error".

this requires s3_test to be placed into a separate storage_test package
to avoid circular dependency.

this makes the `checkS3` hack unusable. instead, we introduced a new
option struct to the constructor to disable the check.
@kennytm
Collaborator Author

kennytm commented Oct 5, 2020

The io.ReadFull was introduced in 7581dac and 3554ada because

s3 api may not return enough data, so we need to loop fetch enough

...

TODO: currently, if read to the end, s3 will return io.EOF

Previously it was just Read in #446.

However, according to the interface comment of Reader.Read, this is exactly the intended behavior:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered.

...

Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0. Callers should treat a return of 0 and nil as indicating that nothing happened; in particular it does not indicate EOF.

So @glorv do you have the actual error which leads to the Read loop in the first place, so we could add it to the test case?

@tirsen
Contributor

tirsen commented Oct 5, 2020

I can confirm that this fixes our issue.

@kennytm kennytm changed the title storage: S3 test refactoring and some fix some minor bugs storage: S3 test refactoring and fix some minor bugs Oct 5, 2020
@kennytm kennytm changed the title storage: S3 test refactoring and fix some minor bugs storage: S3 test refactoring and fix some bugs Oct 5, 2020
@kennytm kennytm requested a review from glorv October 5, 2020 20:00
@overvenus
Member

LGTM

@ti-srebot ti-srebot added the status/LGT1 LGTM1 label Oct 9, 2020
@glorv
Collaborator

glorv commented Oct 9, 2020

The io.ReadFull was introduced in 7581dac and 3554ada because

s3 api may not return enough data, so we need to loop fetch enough
...
TODO: currently, if read to the end, s3 will return io.EOF

Previously it was just Read in #446.

However, according to the interface comment of Reader.Read, this is exactly the intended behavior:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered.
...
Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0. Callers should treat a return of 0 and nil as indicating that nothing happened; in particular it does not indicate EOF.

So @glorv do you have the actual error which leads to the Read loop in the first place, so we could add it to the test case?

The root cause is that some code in parquet-go uses reader.Read instead of io.ReadFull to read into a buffer but does not check whether the buffer is fully filled. This should be a bug; I filed a PR to fix it.

@glorv
Collaborator

glorv commented Oct 9, 2020

LGTM

@ti-srebot ti-srebot added status/LGT2 LGTM2 and removed status/LGT1 LGTM1 labels Oct 9, 2020
@kennytm kennytm merged commit 9066e09 into pingcap:master Oct 9, 2020
@kennytm kennytm deleted the s3iface branch October 9, 2020 05:49
@ti-srebot
Contributor

cherry pick to release-4.0 failed

coocood added a commit to coocood/br that referenced this pull request Oct 14, 2020
3pointer pushed a commit to 3pointer/br that referenced this pull request Oct 16, 2020
* storage: replace s3Handlers by the standard s3iface.S3API interface

* storage: publish RangeInfo for testing

* storage: use gomock instead of mockS3Handlers for proper mocking

this requires s3_test to be placed into a separate storage_test package
to avoid circular dependency.

this makes the `checkS3` hack unusable. instead, we introduced a new
option struct to the constructor to disable the check.

* storage/s3: fix FileExists always `return true, nil` on non-AWS error

* storage/s3: do not use ReadFull to implement Read

* storage/s3: fix incorrect usage of NextMarker in WalkDir
5 participants