Pass in request payer for ListObjectsV2Request #23906

Merged (1 commit) on Oct 25, 2024

zhaner08
Contributor

@zhaner08 zhaner08 commented Oct 24, 2024

Description

Without this change, queries fail with an S3 403 (Access Denied) error when listing objects in Requester Pays buckets, as shown in the stack trace below.

With the request payer passed in on the ListObjectsV2Request, all of the previously failing queries succeed.

According to the AWS public documentation: "You must authenticate all requests involving Requester Pays buckets. The request authentication enables Amazon S3 to identify and charge the requester for their use of the Requester Pays bucket."
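
For context, here is a minimal sketch of what the request payer setting amounts to on the wire, assuming the standard S3 `x-amz-request-payer: requester` header. In the AWS SDK for Java v2 the corresponding option is `ListObjectsV2Request.Builder.requestPayer(RequestPayer.REQUESTER)`; the sketch uses only the JDK's `java.net.http` so it runs without the SDK. The class name, method name, bucket, and prefix are hypothetical, the request is built but never sent, and a real request would also have to be signed. This is not the code from this PR.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class RequesterPaysListing {
    // Header S3 uses to mark the caller as the paying party.
    static final String REQUEST_PAYER_HEADER = "x-amz-request-payer";

    // Builds (but does not send) an unsigned ListObjectsV2 request.
    // The bucket and prefix are illustrative placeholders.
    static HttpRequest buildListRequest(String bucket, String prefix, boolean requesterPays) {
        String query = "list-type=2&prefix=" + URLEncoder.encode(prefix, StandardCharsets.UTF_8);
        HttpRequest.Builder builder = HttpRequest.newBuilder()
                .uri(URI.create("https://" + bucket + ".s3.amazonaws.com/?" + query))
                .GET();
        if (requesterPays) {
            // Without this header, a Requester Pays bucket rejects the listing with 403.
            builder.header(REQUEST_PAYER_HEADER, "requester");
        }
        return builder.build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildListRequest("example-bucket", "warehouse/table/", true);
        System.out.println(request.uri());
        System.out.println(request.headers().firstValue(REQUEST_PAYER_HEADER).orElse("<absent>"));
    }
}
```

The flag is kept as an explicit parameter because the header only matters for buckets that enforce requester pays; other listings can omit it.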

Error Stacktrace: FailureException
	at io.trino.plugin.hive.fs.HiveFileIterator$FileStatusIterator.processException(HiveFileIterator.java:210)
	at io.trino.plugin.hive.fs.HiveFileIterator$FileStatusIterator.<init>(HiveFileIterator.java:178)
	at io.trino.plugin.hive.fs.HiveFileIterator.getLocatedFileStatusRemoteIterator(HiveFileIterator.java:98)
	at io.trino.plugin.hive.fs.HiveFileIterator.<init>(HiveFileIterator.java:69)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.createInternalHiveSplitIterator(BackgroundHiveSplitLoader.java:1115)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:726)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:463)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:371)
	at io.trino.plugin.hive.util.ResumableTasks$1.run(ResumableTasks.java:38)
	at io.trino.$xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
	at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: FailureException at Failed to list location: s3://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
	xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
	at io.trino.plugin.hive.fs.TransactionScopeCachingDirectoryLister.createListingRemoteIterator(TransactionScopeCachingDirectoryLister.java:97)
	at io.trino.plugin.hive.fs.TransactionScopeCachingDirectoryLister.lambda$listInternal$0(TransactionScopeCachingDirectoryLister.java:78)
	at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4927)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3571)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2313)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4012)
	at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4922)
	at io.trino.cache.EvictableCache.get(EvictableCache.java:112)
	at io.trino.plugin.hive.fs.TransactionScopeCachingDirectoryLister.listInternal(TransactionScopeCachingDirectoryLister.java:78)
	at io.trino.plugin.hive.fs.TransactionScopeCachingDirectoryLister.listFilesRecursively(TransactionScopeCachingDirectoryLister.java:70)
	at io.trino.plugin.hive.fs.HiveFileIterator$FileStatusIterator.<init>(HiveFileIterator.java:168)
	at io.trino.plugin.hive.fs.HiveFileIterator.getLocatedFileStatusRemoteIterator(HiveFileIterator.java:98)
	at io.trino.plugin.hive.fs.HiveFileIterator.<init>(HiveFileIterator.java:69)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.createInternalHiveSplitIterator(BackgroundHiveSplitLoader.java:1115)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:726)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:463)
	at io.trino.plugin.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:371)
	at io.trino.plugin.hive.util.ResumableTasks$1.run(ResumableTasks.java:38)
	at io.trino.$xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
	at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: FailureException at Access Denied (Service: S3, Status Code: 403, Request ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, Extended Request ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
	at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:52)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:37)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$Composing

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Hive
* Fix S3 file listing of buckets that enforce requester pays ({issue}`23906`)

@pettyjamesm pettyjamesm merged commit 41ebd0f into trinodb:master Oct 25, 2024
58 checks passed
@pettyjamesm
Member

Thanks, @zhaner08!

@github-actions github-actions bot added this to the 464 milestone Oct 25, 2024
@findinpath
Contributor

@zhaner08 can you please add a more detailed description of how the call fails?
Thank you in advance for the fix.

@zhaner08
Contributor Author

@findinpath Done, added more details in the description.

3 participants