[BUG] TiSpark recent release is slower than before #2105

birdstorm · 2021-09-08T07:41:29Z

The slowdown has been traced to a change in recent TiSpark releases: the default value of PARTITION_PER_SPLIT was changed from 10 to 1, which increases the number of Spark tasks.
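
To illustrate why the smaller default raises the task count, here is a rough sketch. It assumes a simplified model in which region-level scan tasks are grouped into Spark partitions of at most PARTITION_PER_SPLIT each, ignoring any per-host grouping. The function name and the region count of 1,000 are hypothetical and chosen only for illustration; this is not TiSpark's actual code.

```python
import math


def estimated_spark_tasks(region_tasks: int, partition_per_split: int) -> int:
    """Estimate the Spark task count under a simplified grouping model.

    Assumption: up to `partition_per_split` region tasks share one Spark
    partition, so the task count is the ceiling of the ratio.
    """
    if partition_per_split < 1:
        raise ValueError("partition_per_split must be >= 1")
    return math.ceil(region_tasks / partition_per_split)


regions = 1000  # hypothetical number of region tasks for one table scan

old_default = estimated_spark_tasks(regions, 10)  # previous default
new_default = estimated_spark_tasks(regions, 1)   # default in affected releases

print(f"PARTITION_PER_SPLIT=10 -> {old_default} tasks")
print(f"PARTITION_PER_SPLIT=1  -> {new_default} tasks")
```

Under this model the same scan schedules about ten times as many Spark tasks, so per-task scheduling overhead grows accordingly.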

Some related problems:

- ScanRequest receives a slower response from TiKV when scanning metadata.
  - Cause: scanning is not concurrent.
- Memory usage increased.
  - Cause: the memory usage of ColumnVector should be optimized.

Affected versions: v2.3.14 to v2.3.16, v2.4.1

crabo · 2022-05-11T06:55:01Z

"Memory usage": Please also check DAGIterator.process(). The underlying gRPC always throws OutOfDirectMemoryError, even in unpooled mode. A table scan of 10 million rows in ETL requires roughly 5 GB of off-heap memory, which is a big waste.
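
For scale, the figures quoted above can be turned into a per-row estimate. This is only back-of-the-envelope arithmetic on the reported numbers (10 million rows, roughly 5 GB off-heap). Whether "GB" means decimal or binary gigabytes is not stated, so both are shown; the helper name is hypothetical.

```python
ROWS = 10_000_000  # rows in the reported table scan


def bytes_per_row(total_bytes: int, rows: int) -> float:
    """Average off-heap bytes attributed to each scanned row."""
    return total_bytes / rows


decimal_gb = 5 * 10**9  # assuming 1 GB = 10^9 bytes
binary_gib = 5 * 2**30  # assuming 1 GB = 2^30 bytes

print(f"decimal: {bytes_per_row(decimal_gb, ROWS):.1f} bytes/row")
print(f"binary:  {bytes_per_row(binary_gib, ROWS):.1f} bytes/row")
```

Either way the report implies on the order of 500 bytes of off-heap memory per row.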

birdstorm added the type/bug label Sep 8, 2021

birdstorm self-assigned this Sep 8, 2021

shiyuhang0 added type/enhancement and removed type/bug labels Apr 27, 2022
