🎉Destination-s3: added support for AWS Glue crawler #11173
/test connector=connectors/destination-s3
…-for-aws-glue-crawler
LGTM. Please reformat the code and update the related documentation.
/test connector=connectors/destination-s3
/test connector=connectors/destination-s3
// more details https://docs.aws.amazon.com/glue/latest/dg/crawler-s3-folder-table-partition.html
paths.add(streamName);
} else {
paths.add(NAME_TRANSFORMER.convertStreamName(streamName));
Shouldn't we create a specific NamingConventionTransformer nameTransformer for S3 when instantiating the S3Destination? It would be S3NameTransformer instead of a generic ExtendedNameTransformer:
Line 12 in ea92a24
public static final ExtendedNameTransformer NAME_TRANSFORMER = new ExtendedNameTransformer();
That would allow us to specify which characters are allowed in the bucket name/path, such as = or other characters, no?
Non-allowed characters would then follow the existing rule: all special symbols are transformed to the "_" symbol.
Here are some examples of how it's done in other connectors:
- Postgres (Line 38 in ea92a24): super(DRIVER_CLASS, new PostgresSQLNameTransformer(), new PostgresSqlOperations());
- Snowflake (Line 33 in ea92a24): this(new SnowflakeSQLNameTransformer());
- MongoDB (Line 64 in ea92a24): namingResolver = new MongodbNameTransformer();
- etc.
Or at least, when we construct the consumer, we pass it to the factory:
- Redshift (Line 59 in ea92a24): return new RedshiftSQLNameTransformer();
WDYT @tuliren?
I agree. We should hide that complexity in a custom NameTransformer
As far as I understand, this converter is used only for creating the stream name. A user wants to use a specific feature (AWS Glue crawler). That's why I removed the auto-replacement when the "=" symbol is used.
@etsybaev the logic is sound, but we should encapsulate it in an S3NameTransformer.
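To make the suggestion concrete, here is a minimal, self-contained sketch of what such a transformer might look like. It is not the S3NameTransformer that was eventually added in this PR: the class name S3NameTransformerSketch, the exact allowed-character set, and the standalone (non-Airbyte) design are all assumptions for illustration only.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical sketch of an S3-specific name transformer.
 * Assumption: letters, digits, "_" and "=" are allowed; every other
 * character is replaced with "_", mirroring the rule quoted in the review.
 */
final class S3NameTransformerSketch {

  // Matches any single character that is NOT in the assumed allowed set.
  private static final Pattern DISALLOWED = Pattern.compile("[^A-Za-z0-9_=]");

  private S3NameTransformerSketch() {}

  /** Replaces every disallowed character with "_" and keeps "=" intact. */
  static String convertStreamName(final String input) {
    return DISALLOWED.matcher(input).replaceAll("_");
  }
}
```

With this sketch, a Glue-style partition segment such as year=2022 would pass through unchanged, while a name such as my stream! would become my_stream_. Keeping the rule in one class means callers never need to branch on the presence of "=" themselves.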
/test connector=connectors/destination-s3
...tination-s3/src/main/java/io/airbyte/integrations/destination/s3/util/S3NameTransformer.java
/publish connector=connectors/destination-s3
What
A user wants to use the AWS Glue crawler feature, but currently all special symbols are transformed to the "_" symbol.
How
Updated the logic so that special symbols in the stream name are not transformed if the user includes the "=" symbol in the prefix.
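The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the class GluePathSketch, the method names, the assumption that "prefix" means the configured bucket path, and the stand-in sanitize rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the conditional path-building logic described in "How". */
final class GluePathSketch {

  private GluePathSketch() {}

  // Stand-in for the generic transformer: every character outside
  // letters, digits and "_" becomes "_".
  static String sanitize(final String name) {
    return name.replaceAll("[^A-Za-z0-9_]", "_");
  }

  /**
   * Builds "prefix/streamName". If the prefix contains "=" (assumed to signal
   * Glue-style partition folders), the stream name is kept untouched;
   * otherwise it is sanitized as before.
   */
  static String outputPath(final String prefix, final String streamName) {
    final List<String> paths = new ArrayList<>();
    paths.add(prefix);
    if (prefix.contains("=")) {
      paths.add(streamName);
    } else {
      paths.add(sanitize(streamName));
    }
    return String.join("/", paths);
  }
}
```

For example, outputPath("data/year=2022", "my-stream") keeps the stream name as my-stream, whereas outputPath("data", "my-stream") yields data/my_stream.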
Tested locally:
🚨 User Impact 🚨
No breaking changes expected