Add seatunnel datatype and convert origin value into seatunnel data type (#1797)
* Add seatunnel datatype
1 parent
4639ba1
commit c340795
Showing 127 changed files with 1,776 additions and 627 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
--- | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Intro to config file | ||
|
||
In SeaTunnel, the Config file is the most important piece: it is where users define their own data | ||
synchronization requirements and get the most out of SeaTunnel. This section explains how to | ||
write the Config file. | ||
|
||
## Example | ||
|
||
Before you read on, you can find config file | ||
examples [here](https://github.com/apache/incubator-seatunnel/tree/dev/config) and in the `config` | ||
directory of the distribution package. | ||
|
||
## Config file structure | ||
|
||
The Config file will be similar to the one below. | ||
|
||
```hocon | ||
env { | ||
  execution.parallelism = 1 | ||
} | ||
source { | ||
  FakeSource { | ||
    result_table_name = "fake" | ||
    field_name = "name,age" | ||
  } | ||
} | ||
transform { | ||
  sql { | ||
    sql = "select name,age from fake" | ||
  } | ||
} | ||
sink { | ||
  Clickhouse { | ||
    host = "clickhouse:8123" | ||
    database = "default" | ||
    table = "seatunnel_console" | ||
    fields = ["name"] | ||
    username = "default" | ||
    password = "" | ||
  } | ||
} | ||
``` | ||
|
||
As you can see, the Config file contains several sections: env, source, transform, sink. Different modules | ||
have different functions. After you understand these modules, you will understand how SeaTunnel works. | ||
|
||
### env | ||
|
||
Used to set optional engine parameters. Whichever engine you run on (Spark or Flink), its | ||
optional parameters should be filled in here. | ||
|
||
<!-- TODO add supported env parameters --> | ||
|
||
### source | ||
|
||
source defines where SeaTunnel fetches data from; the fetched data is then used in the next step. | ||
Multiple sources can be defined at the same time. For the currently supported sources, | ||
see [Source of SeaTunnel](../connector/source). Each source has its own specific parameters that define how to | ||
fetch data. SeaTunnel also extracts the parameters that every source uses, such as | ||
the `result_table_name` parameter, which names the data generated by the current | ||
source so that later modules can refer to it. | ||
|
||
### transform | ||
|
||
Once we have a data source, we may need to process the data further, which is what the transform module is for. | ||
Note the word 'may': the transform can also be left empty, so that data goes | ||
directly from source to sink, as shown below. | ||
|
||
```hocon | ||
transform { | ||
  // nothing here | ||
} | ||
``` | ||
|
||
Like source, each transform has its own specific parameters. | ||
For the currently supported transforms, see [Transform of SeaTunnel](../transform). | ||
|
||
### sink | ||
|
||
Our purpose with SeaTunnel is to synchronize data from one place to another, so it is critical to define how | ||
and where data is written. With the sink module provided by SeaTunnel, you can complete this operation quickly | ||
and efficiently. Sink and source are very similar; the difference is that a source reads data and a sink | ||
writes it. See the [supported sinks](../connector/sink). | ||
|
||
### Other | ||
|
||
When multiple sources and multiple sinks are defined, which data does each sink read, and | ||
which data does each transform read? This is decided by two key configurations, `result_table_name` and | ||
`source_table_name`. Each source module is configured with a `result_table_name` that names the | ||
data it generates, and transform and sink modules use `source_table_name` to | ||
refer to that name, indicating which data they want to read for processing. A | ||
transform, as an intermediate processing module, can use both `result_table_name` and `source_table_name` | ||
at the same time. In the example Config above, however, not every module is | ||
configured with these two parameters. SeaTunnel has a default convention: if these two | ||
parameters are not configured, the data generated by the last module of the previous node is used. | ||
This is much more convenient when there is only one source. | ||
|
||
## What's More | ||
|
||
For details of this configuration format, please | ||
see [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md). |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# UUID | ||
|
||
## Description | ||
|
||
Generate a universally unique identifier on a specified field. | ||
|
||
:::tip | ||
|
||
This transform is **ONLY** supported by Spark. | ||
|
||
::: | ||
|
||
## Options | ||
|
||
| name | type | required | default value | | ||
| -------------- | ------ | -------- | ------------- | | ||
| fields | string | yes | - | | ||
| prefix | string | no | - | | ||
| secure | boolean| no | false | | ||
|
||
### fields [string] | ||
|
||
The name of the field to generate. | ||
|
||
### prefix [string] | ||
|
||
The prefix string constant to prepend to each generated UUID. | ||
|
||
### secure [boolean] | ||
|
||
Whether to use the cryptographically secure algorithm, which can be comparatively slow. | ||
The non-secure algorithm uses a secure random seed but is otherwise deterministic. | ||
|
||
### common options [string] | ||
|
||
Common parameters of transform plugins; please refer to [Transform Plugin](common-options.mdx) for details. | ||
|
||
## Examples | ||
|
||
```bash | ||
UUID { | ||
  fields = "u" | ||
  prefix = "uuid-" | ||
  secure = true | ||
} | ||
``` | ||
|
||
Use `UUID` as a UDF in SQL. | ||
|
||
```bash | ||
UUID { | ||
  fields = "u" | ||
  prefix = "uuid-" | ||
  secure = true | ||
} | ||
|
||
# Use the uuid function (confirm that the fake table exists) | ||
sql { | ||
  sql = "select * from (select raw_message, UUID() as info_row from fake) t1" | ||
} | ||
``` |
23 changes: 23 additions & 0 deletions
seatunnel-api/src/main/java/org/apache/seatunnel/api/table/type/BasicType.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
package org.apache.seatunnel.api.table.type; | ||
|
||
public class BasicType<T> implements DataType<T> { | ||
|
||
    private final Class<T> typeClass; | ||
|
||
    public BasicType(Class<T> typeClass) { | ||
        if (typeClass == null) { | ||
            throw new IllegalArgumentException("typeClass cannot be null"); | ||
        } | ||
        this.typeClass = typeClass; | ||
    } | ||
|
||
    @Override | ||
    public boolean isBasicType() { | ||
        return true; | ||
    } | ||
|
||
    @Override | ||
    public Class<T> getTypeClass() { | ||
        return this.typeClass; | ||
    } | ||
} |
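
A minimal usage sketch for the class above, not taken from the commit. The `DataType` interface is not shown in this part of the diff, so the version here is a hypothetical stand-in containing only the two methods that `BasicType` overrides. `BasicTypeDemo` and its `accepts` helper are also illustrative names.

```java
// Hypothetical stand-in for DataType<T>: only the two methods BasicType overrides.
interface DataType<T> {
    boolean isBasicType();

    Class<T> getTypeClass();
}

// Same shape as BasicType in the diff, repeated so the sketch compiles on its own.
class BasicType<T> implements DataType<T> {
    private final Class<T> typeClass;

    BasicType(Class<T> typeClass) {
        if (typeClass == null) {
            throw new IllegalArgumentException("typeClass cannot be null");
        }
        this.typeClass = typeClass;
    }

    @Override
    public boolean isBasicType() {
        return true;
    }

    @Override
    public Class<T> getTypeClass() {
        return this.typeClass;
    }
}

class BasicTypeDemo {
    // Illustrative helper: checks whether a raw value is an instance of the wrapped class.
    static boolean accepts(DataType<?> type, Object value) {
        return type.getTypeClass().isInstance(value);
    }

    public static void main(String[] args) {
        DataType<String> stringType = new BasicType<>(String.class);
        System.out.println(stringType.isBasicType());
        System.out.println(accepts(stringType, "seatunnel"));
        System.out.println(accepts(stringType, 42));
    }
}
```

Carrying the `Class<T>` token lets code that converts an origin value check or cast it at runtime, which generics alone cannot do after erasure.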
26 changes: 26 additions & 0 deletions
seatunnel-api/src/main/java/org/apache/seatunnel/api/table/type/BooleanType.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.seatunnel.api.table.type; | ||
|
||
public class BooleanType extends BasicType<Boolean> { | ||
    private static final BooleanType INSTANCE = new BooleanType(Boolean.class); | ||
|
||
    private BooleanType(Class<Boolean> typeClass) { | ||
        super(typeClass); | ||
    } | ||
} |
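
In the file as shown, `INSTANCE` is private, the constructor is private, and no accessor is visible, so callers cannot reach the shared instance. The sketch below assumes a hypothetical `getInstance()` method to illustrate how such a singleton type could be handed out. Neither that method nor the reduced `BasicType` stand-in is part of the commit.

```java
// Minimal stand-in for BasicType<T>, reduced to what this sketch needs.
class BasicType<T> {
    private final Class<T> typeClass;

    BasicType(Class<T> typeClass) {
        this.typeClass = typeClass;
    }

    public Class<T> getTypeClass() {
        return typeClass;
    }
}

class BooleanType extends BasicType<Boolean> {
    private static final BooleanType INSTANCE = new BooleanType(Boolean.class);

    private BooleanType(Class<Boolean> typeClass) {
        super(typeClass);
    }

    // Hypothetical accessor (an assumption, not in the diff): exposes the shared instance.
    public static BooleanType getInstance() {
        return INSTANCE;
    }
}

class BooleanTypeDemo {
    public static void main(String[] args) {
        BooleanType a = BooleanType.getInstance();
        BooleanType b = BooleanType.getInstance();
        // Both references point at the same object, so identity comparison is safe.
        System.out.println(a == b);
        System.out.println(a.getTypeClass().getSimpleName());
    }
}
```

With a single shared instance, type descriptors can be compared by reference and no extra objects are allocated per field.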