Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Connector-V2] Add File Sink Connector #2117

Merged
merged 101 commits into from
Jul 7, 2022
Merged
Show file tree
Hide file tree
Changes from 100 commits
Commits
Show all changes
101 commits
Select commit Hold shift + click to select a range
3282ce3
tmp commit
EricJoy2048 Jun 18, 2022
385af98
add hadoop2 and hadoop3 shade jar
EricJoy2048 Jun 18, 2022
e97035b
add hadoop2 and hadoop3 shade jar
EricJoy2048 Jun 18, 2022
03abf08
add license head
EricJoy2048 Jun 18, 2022
de02eeb
change know denpendencies
EricJoy2048 Jun 18, 2022
97e8fee
merge from remote
EricJoy2048 Jun 20, 2022
14b8da0
tmp commit
EricJoy2048 Jun 20, 2022
7356bbc
merge from remote
EricJoy2048 Jun 20, 2022
6b61086
tmp commit
EricJoy2048 Jun 21, 2022
f9be235
tmp commit
EricJoy2048 Jun 22, 2022
7cd6fac
change hadoop dependency scope to provide
EricJoy2048 Jun 23, 2022
92142b7
back pom
EricJoy2048 Jun 23, 2022
28062cd
fix checkstyle
EricJoy2048 Jun 23, 2022
26fd27a
add example
EricJoy2048 Jun 23, 2022
ef996f1
Merge remote-tracking branch 'apache/api-draft' into new_file_sink_co…
EricJoy2048 Jun 23, 2022
e282b61
fix example bug
EricJoy2048 Jun 23, 2022
7815578
remove file connector from example and e2e because hadoop2 can not co…
EricJoy2048 Jun 24, 2022
c746b0a
no need jdk8 and jdk11 profile because we don't use hadoop shade jar
EricJoy2048 Jun 24, 2022
8fe742a
change hadoop jar dependency scope to provided
EricJoy2048 Jun 24, 2022
5f694fa
back
EricJoy2048 Jun 24, 2022
d61bd98
file connector can not build in jdk11
EricJoy2048 Jun 24, 2022
4690544
merge api-draft code
EricJoy2048 Jun 24, 2022
baf3fb9
drop hadoop shade
EricJoy2048 Jun 24, 2022
68a8dea
add gitignore item
EricJoy2048 Jun 25, 2022
7a1f529
add hadoop and local file sink
EricJoy2048 Jun 26, 2022
d5949df
tmp commit
EricJoy2048 Jun 29, 2022
97a92c4
fix pom error
EricJoy2048 Jun 29, 2022
bb6be0d
fix pom error
EricJoy2048 Jun 29, 2022
9d7bbdb
fix pom error
EricJoy2048 Jun 29, 2022
a436f34
implement new interface
EricJoy2048 Jun 29, 2022
5622197
fix UT error
EricJoy2048 Jun 29, 2022
a9e097c
fix e2e error
EricJoy2048 Jun 29, 2022
30e4f46
update build timeout from 30min to 40min
EricJoy2048 Jun 29, 2022
72cc766
fix e2e error
EricJoy2048 Jun 29, 2022
2aeeb60
remove auto service
EricJoy2048 Jun 29, 2022
70dbbec
fix e2e error
EricJoy2048 Jun 30, 2022
24987ec
fix e2e error
EricJoy2048 Jun 30, 2022
bea876d
fix e2e error
EricJoy2048 Jun 30, 2022
e7f3600
found e2e error
EricJoy2048 Jun 30, 2022
320ab08
fix e2e error
EricJoy2048 Jun 30, 2022
4d86ac3
merge from upstream
EricJoy2048 Jun 30, 2022
1310e02
fix e2e error
EricJoy2048 Jun 30, 2022
c2391b8
fix e2e error
EricJoy2048 Jun 30, 2022
e40a64c
merge from upstream
EricJoy2048 Jun 30, 2022
821d6af
merge from upstream
EricJoy2048 Jun 30, 2022
ff1b76a
merge from upstream
EricJoy2048 Jun 30, 2022
c4a02c3
merge from upstream
EricJoy2048 Jun 30, 2022
bf14737
merge from upstream
EricJoy2048 Jun 30, 2022
5eb2c7d
add mvn jvm option
EricJoy2048 Jun 30, 2022
1d592aa
add mvn jvm option
EricJoy2048 Jun 30, 2022
2ccfd7e
merge from upstream
EricJoy2048 Jun 30, 2022
f33aff0
add license
EricJoy2048 Jun 30, 2022
29c7774
add license
EricJoy2048 Jun 30, 2022
4c8c3ef
add licnese
EricJoy2048 Jun 30, 2022
ed0eb26
add licnese
EricJoy2048 Jul 1, 2022
3e1289d
fix dependency
EricJoy2048 Jul 1, 2022
0ed249d
fix build jvm oom
EricJoy2048 Jul 1, 2022
1185212
fix build jvm oom
EricJoy2048 Jul 1, 2022
8d7b27b
fix build jvm oom
EricJoy2048 Jul 1, 2022
80aee0e
fix dependency
EricJoy2048 Jul 1, 2022
5fc62ef
fix dependency
EricJoy2048 Jul 1, 2022
d2abc46
fix e2e error
EricJoy2048 Jul 1, 2022
5eac1e5
add codeql check timeout from 30min to 60min
EricJoy2048 Jul 1, 2022
541542a
add file connector v2
EricJoy2048 Jul 2, 2022
a3acd92
merge from dev
EricJoy2048 Jul 2, 2022
95a47f7
merge from dev
EricJoy2048 Jul 2, 2022
c59b8a3
fix ci error
EricJoy2048 Jul 2, 2022
31f8701
fix checkstyle
EricJoy2048 Jul 2, 2022
b8341d0
fix ci
EricJoy2048 Jul 2, 2022
c0b9dcd
fix ci
EricJoy2048 Jul 2, 2022
61d2905
aa
EricJoy2048 Jul 2, 2022
e2b4a93
aa
EricJoy2048 Jul 2, 2022
a4eb828
aa
EricJoy2048 Jul 2, 2022
a72f00f
add .idea
EricJoy2048 Jul 2, 2022
1096e09
del .idea
EricJoy2048 Jul 2, 2022
4b77455
del .idea
EricJoy2048 Jul 2, 2022
3b54d02
del .idea
EricJoy2048 Jul 3, 2022
ffc7c26
del .idea
EricJoy2048 Jul 3, 2022
4cb4f28
remove no use license
EricJoy2048 Jul 4, 2022
3333af0
remove no use before and after method in test
EricJoy2048 Jul 4, 2022
3305c69
fix license; remove dependency
EricJoy2048 Jul 4, 2022
e1f5532
fix review
EricJoy2048 Jul 4, 2022
c0ffd05
fix review
EricJoy2048 Jul 4, 2022
39ce1a5
fix build order
EricJoy2048 Jul 4, 2022
321913e
fix license
EricJoy2048 Jul 5, 2022
426978c
fix license
EricJoy2048 Jul 5, 2022
5f7e1f5
fix review
EricJoy2048 Jul 5, 2022
a805066
fix review
EricJoy2048 Jul 5, 2022
2126246
fix review
EricJoy2048 Jul 5, 2022
e9ec2ba
fix review
EricJoy2048 Jul 5, 2022
9c7e3d0
fix review
EricJoy2048 Jul 5, 2022
3922864
fix review
EricJoy2048 Jul 5, 2022
5fcd2bf
fix review
EricJoy2048 Jul 5, 2022
a7dc64b
fix review
EricJoy2048 Jul 5, 2022
aa11454
fix review
EricJoy2048 Jul 5, 2022
7d99bd3
merge from dev
EricJoy2048 Jul 6, 2022
beeb57f
add code-analysys timeout to 120
EricJoy2048 Jul 6, 2022
a20f4ff
retry ci
EricJoy2048 Jul 6, 2022
67c1e21
update license and remove no use jar from LICENSE file
EricJoy2048 Jul 6, 2022
07b68bc
retry ci
EricJoy2048 Jul 6, 2022
b0434e7
Merge branch 'dev' into add_file_sink_v2
Hisoka-X Jul 7, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 8 additions & 4 deletions .github/workflows/backend.yml
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ jobs:
java: [ '8', '11' ]
os: [ 'ubuntu-latest', 'windows-latest' ]
runs-on: ${{ matrix.os }}
timeout-minutes: 50
timeout-minutes: 80
steps:
- uses: actions/checkout@v3
with:
Expand All @@ -115,7 +115,7 @@ jobs:
name: Dependency licenses
needs: [ sanity-check ]
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 40
steps:
- uses: actions/checkout@v3
with:
Expand Down Expand Up @@ -155,7 +155,9 @@ jobs:
cache: 'maven'
- name: Run Unit tests
run: |
./mvnw -T 2C -B clean verify -D"maven.test.skip"=false -D"checkstyle.skip"=true -D"scalastyle.skip"=true -D"license.skipAddThirdParty"=true --no-snapshot-updates
./mvnw -B -T 1C clean verify -D"maven.test.skip"=false -D"checkstyle.skip"=true -D"scalastyle.skip"=true -D"license.skipAddThirdParty"=true --no-snapshot-updates
env:
MAVEN_OPTS: -Xmx2048m

integration-test:
name: Integration Test
Expand All @@ -176,4 +178,6 @@ jobs:
cache: 'maven'
- name: Run Integration tests
run: |
./mvnw -T 2C -B verify -DskipUT=true -DskipIT=false -D"checkstyle.skip"=true -D"scalastyle.skip"=true -D"license.skipAddThirdParty"=true --no-snapshot-updates
./mvnw -T 1C -B verify -DskipUT=true -DskipIT=false -D"checkstyle.skip"=true -D"scalastyle.skip"=true -D"license.skipAddThirdParty"=true --no-snapshot-updates
env:
MAVEN_OPTS: -Xmx2048m
1 change: 1 addition & 0 deletions .github/workflows/code-analysys.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ on:
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 120
steps:
- uses: actions/checkout@v2
with:
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/docker.yml
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ jobs:
check:
name: Spark
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 60
steps:
- uses: actions/checkout@v2
- name: Set up JDK 1.8
Expand Down
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ target/
# Intellij Idea files
.idea/
*.iml
.idea/*

.DS_Store

Expand Down Expand Up @@ -40,4 +41,4 @@ Test.scala
test.conf
log4j.properties
spark-warehouse
*.flattened-pom.xml
*.flattened-pom.xml
2 changes: 2 additions & 0 deletions plugin-mapping.properties
Original file line number Diff line number Diff line change
Expand Up @@ -102,3 +102,5 @@ seatunnel.sink.Clickhouse = connector-clickhouse
seatunnel.sink.ClickhouseFile = connector-clickhouse
seatunnel.source.Jdbc = connector-jdbc
seatunnel.sink.Jdbc = connector-jdbc
seatunnel.sink.HdfsFile = connector-file-hadoop
seatunnel.sink.LocalFile = connector-file-local
15 changes: 13 additions & 2 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,7 @@
<slf4j.version>1.7.25</slf4j.version>
<guava.version>19.0</guava.version>
<auto-service.version>1.0.1</auto-service.version>
<powermock.version>2.0.9</powermock.version>
<hadoop2.version>2.6.5</hadoop2.version>
<hadoop3.version>3.0.0</hadoop3.version>
<seatunnel.shade.package>org.apache.seatunnel.shade</seatunnel.shade.package>
Expand Down Expand Up @@ -659,12 +660,23 @@
<version>${guava.version}</version>
</dependency>

<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-module-junit4</artifactId>
<version>${powermock.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito2</artifactId>
<version>${powermock.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.github.jsonzou</groupId>
<artifactId>jmockdata</artifactId>
<version>${jmockdata.version}</version>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
Expand Down Expand Up @@ -1277,7 +1289,6 @@
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
</plugin>

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-failsafe-plugin</artifactId>
Expand Down
12 changes: 11 additions & 1 deletion seatunnel-connectors-v2-dist/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
limitations under the License.

-->

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
Expand Down Expand Up @@ -80,9 +81,18 @@
<artifactId>connector-hive</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>connector-file-hadoop</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>connector-file-local</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>


<build>
<plugins>
<plugin>
Expand Down
72 changes: 72 additions & 0 deletions seatunnel-connectors-v2/connector-file/connector-file-base/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--

Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>connector-file</artifactId>
<groupId>org.apache.seatunnel</groupId>
<version>${revision}</version>
</parent>
<modelVersion>4.0.0</modelVersion>

<artifactId>connector-file-base</artifactId>

<dependencies>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>seatunnel-api</artifactId>
<version>${project.version}</version>
</dependency>

<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>seatunnel-core-base</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>

<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-module-junit4</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito2</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.seatunnel.connectors.seatunnel.file.config;

import static com.google.common.base.Preconditions.checkNotNull;

import org.apache.seatunnel.shade.com.typesafe.config.Config;

import lombok.Data;
import lombok.NonNull;
import org.apache.commons.lang3.StringUtils;

import java.io.Serializable;
import java.util.Locale;

@Data
public class AbstractTextFileConfig implements DelimiterConfig, CompressConfig, Serializable {
private static final long serialVersionUID = 1L;

protected String compressCodec;

protected String fieldDelimiter = String.valueOf('\001');

protected String rowDelimiter = "\n";

protected String path;
protected String fileNameExpression;
protected FileFormat fileFormat = FileFormat.TEXT;

public AbstractTextFileConfig(@NonNull Config config) {
checkNotNull(config.getString(Constant.PATH));

if (config.hasPath(Constant.COMPRESS_CODEC)) {
throw new RuntimeException("compress not support now");
}

if (config.hasPath(Constant.FIELD_DELIMITER) && !StringUtils.isBlank(config.getString(Constant.FIELD_DELIMITER))) {
this.fieldDelimiter = config.getString(Constant.FIELD_DELIMITER);
}

if (config.hasPath(Constant.ROW_DELIMITER) && !StringUtils.isBlank(config.getString(Constant.ROW_DELIMITER))) {
this.rowDelimiter = config.getString(Constant.ROW_DELIMITER);
}

if (config.hasPath(Constant.PATH) && !StringUtils.isBlank(config.getString(Constant.PATH))) {
this.path = config.getString(Constant.PATH);
}

if (config.hasPath(Constant.FILE_NAME_EXPRESSION) && !StringUtils.isBlank(config.getString(Constant.FILE_NAME_EXPRESSION))) {
this.fileNameExpression = config.getString(Constant.FILE_NAME_EXPRESSION);
}

if (config.hasPath(Constant.FILE_FORMAT) && !StringUtils.isBlank(config.getString(Constant.FILE_FORMAT))) {
this.fileFormat = FileFormat.valueOf(config.getString(Constant.FILE_FORMAT).toUpperCase(Locale.ROOT));
}
}

protected AbstractTextFileConfig() {
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.seatunnel.connectors.seatunnel.file.config;

public interface CompressConfig {
String getCompressCodec();
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.seatunnel.connectors.seatunnel.file.config;

public class Constant {
public static final String SEATUNNEL = "seatunnel";
public static final String NON_PARTITION = "NON_PARTITION";
public static final String TRANSACTION_ID_SPLIT = "_";
public static final String TRANSACTION_EXPRESSION = "transactionId";

public static final String SAVE_MODE = "save_mode";
public static final String COMPRESS_CODEC = "compress_codec";

public static final String PATH = "path";
public static final String FIELD_DELIMITER = "field_delimiter";
public static final String ROW_DELIMITER = "row_delimiter";
public static final String PARTITION_BY = "partition_by";
public static final String PARTITION_DIR_EXPRESSION = "partition_dir_expression";
public static final String IS_PARTITION_FIELD_WRITE_IN_FILE = "is_partition_field_write_in_file";
public static final String TMP_PATH = "tmp_path";
public static final String FILE_NAME_EXPRESSION = "file_name_expression";
public static final String FILE_FORMAT = "file_format";
public static final String SINK_COLUMNS = "sink_columns";
public static final String FILENAME_TIME_FORMAT = "filename_time_format";
public static final String IS_ENABLE_TRANSACTION = "is_enable_transaction";
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.seatunnel.connectors.seatunnel.file.config;

public interface DelimiterConfig {
String getFieldDelimiter();

String getRowDelimiter();
}
Loading