[Connector-V2] [Hive connector] file list should not contain '_SUCCESS' file #2235

Closed
3 tasks done
TyrantLucifer opened this issue Jul 21, 2022 · 1 comment · Fixed by #2236
@TyrantLucifer
Member

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

When the Hive connector scans HDFS directories and adds the files it finds to its file list, it does not filter out the '_SUCCESS' marker file that Spark usually generates. Including this file in the list causes the task to fail.
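To illustrate the kind of filtering being asked for, here is a minimal sketch in plain Java. It is not the patch from #2236 and does not use the connector's or Hadoop's classes: the class name `SuccessFileFilter`, the method `filterDataFiles`, and the example paths are all hypothetical, and paths are modeled as plain strings. The marker is presumably a problem because it is not a data file in the table's format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of SeaTunnel.
public class SuccessFileFilter {

    // Name of the job-completion marker file mentioned in this issue.
    static final String SUCCESS_MARKER = "_SUCCESS";

    /** Returns the last path segment, e.g. "part-0" for "/t/part-0". */
    static String fileName(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    /** Keeps every scanned path whose file name is not exactly "_SUCCESS". */
    static List<String> filterDataFiles(List<String> scannedPaths) {
        List<String> dataFiles = new ArrayList<>();
        for (String path : scannedPaths) {
            if (!SUCCESS_MARKER.equals(fileName(path))) {
                dataFiles.add(path);
            }
        }
        return dataFiles;
    }

    public static void main(String[] args) {
        // Assumed example listing of a table directory (paths are made up).
        List<String> scanned = Arrays.asList(
                "/warehouse/t/part-00000",
                "/warehouse/t/_SUCCESS",
                "/warehouse/t/part-00001");
        System.out.println(filterDataFiles(scanned));
    }
}
```

The comparison is made on the file name rather than on the whole path, so a directory or file that merely contains the text `_SUCCESS` elsewhere in its path is not dropped by accident.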


SeaTunnel Version

dev

SeaTunnel Config

none

Running Command

none

Error Exception

none

Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

TyrantLucifer added a commit to TyrantLucifer/incubator-seatunnel that referenced this issue Jul 21, 2022
@TyrantLucifer
Member Author

PR: #2236. @Hisoka-X @CalvinKirs, please review. Thank you!

@CalvinKirs CalvinKirs linked a pull request Jul 22, 2022 that will close this issue
3 tasks
CalvinKirs added a commit that referenced this issue Jul 22, 2022
* [Bug][connector-hive] filter '_SUCCESS' file in file list (#2235) (#2236)

* StateT of SeaTunnelSource should extend `Serializable` (#2214)

* [Improvement][core] StateT of SeaTunnelSource should extend `Serializable`
,so that `org.apache.seatunnel.api.source.SeaTunnelSource.getEnumeratorStateSerializer` can support a default implementation.
This will be useful to each SeaTunnelSource subclass implementation.

* repetitive dependency

repetitive dependency

* [Improvement][connector-v2] postgre jar should be contained in container like mysql-java, so it should be  provided, not compile

* [Improvement][connector-v2] remove the code block in the implementation class to keep code clean.

* [Improvement][connector-v2] remove unused import

* [Improvement][connector-v2] modify import order

Co-authored-by: bjyflihongyu <lihongyuinfo@jd.com>

* [Feat][UI] Add the table in the user manage. (#2234)

* Merge dev to st-engine branch

Co-authored-by: l-shen <lijieliang@cmss.chinamobile.com>
Co-authored-by: root <l-shen@localhost.localdomain>
Co-authored-by: Xiao Zhao <49054376+zhaomin1423@users.noreply.github.com>
Co-authored-by: Eric <gaojun2048@gmail.com>
Co-authored-by: songjianet <1778651752@qq.com>
Co-authored-by: sandyfog <154525105@qq.com>
Co-authored-by: TyrantLucifer <TyrantLucifer@gmail.com>
Co-authored-by: Zongwen Li <zongwen.li.tech@gmail.com>
Co-authored-by: superzhang0929 <45145852+superzhang0929@users.noreply.github.com>
Co-authored-by: Kerwin <37063904+zhuangchong@users.noreply.github.com>
Co-authored-by: gaara <85996062+gaaraG@users.noreply.github.com>
Co-authored-by: lvlv <40759793+lvlv-feifei@users.noreply.github.com>
Co-authored-by: Hisoka <fanjiaeminem@qq.com>
Co-authored-by: Jiajie Zhong <zhongjiajie955@gmail.com>
Co-authored-by: Jared Li <lhyundeadsoul@gmail.com>
Co-authored-by: bjyflihongyu <lihongyuinfo@jd.com>
TyrantLucifer added a commit to TyrantLucifer/incubator-seatunnel that referenced this issue Sep 18, 2022
2 participants