VeloxRuntimeError when reading parquet file with only meta data #770

Closed
zhixingheyi-tian opened this issue Dec 27, 2022 · 1 comment
Labels: bug (Something isn't working), velox backend (works for Velox backend)

Comments

Contributor

zhixingheyi-tian commented Dec 27, 2022

Describe the bug
This issue was exposed by the following test:

  GlutenDataFrameJoinSuite:
   "SPARK-24690 enables star schema detection even if CBO disabled"

Error:

Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 3) (sr250 executor driver): java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (365 vs. 365)
Retriable: False
Expression: footerLength + 12 < fileLength_
Function: loadFileMetaData
File: ../../velox/dwio/parquet/reader/ParquetReader.cpp
Line: 64
Stack trace: frames # 0 through # 16, all empty in the report
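
The failing expression in the error can be read against the Parquet file layout: a file starts with the 4-byte magic "PAR1" and ends with the footer metadata, a 4-byte footer length, and another 4-byte "PAR1", which is most likely where the constant 12 comes from. If "(365 vs. 365)" reports footerLength + 12 against fileLength_, the footer would be 353 bytes and the file would hold no data bytes at all. The sketch below mirrors the quoted check next to a relaxed variant; the function names are hypothetical, and the relaxed comparison is only an illustration, not necessarily the change made in the actual fix.

```cpp
#include <cstdint>

// Bytes in a Parquet file that are neither row data nor footer metadata:
// 4-byte leading magic "PAR1", 4-byte footer length, 4-byte trailing magic.
constexpr std::uint64_t kParquetFramingBytes = 12;

// Mirrors the expression quoted in the error message. It rejects a file
// whose footer plus framing fills the whole file (no data bytes).
bool footerFitsStrict(std::uint64_t footerLength, std::uint64_t fileLength) {
  return footerLength + kParquetFramingBytes < fileLength;
}

// Hypothetical relaxed variant: it also accepts a file with zero data bytes,
// while still rejecting a footer that cannot fit in the file.
bool footerFitsRelaxed(std::uint64_t footerLength, std::uint64_t fileLength) {
  return footerLength + kParquetFramingBytes <= fileLength;
}
```

With the inferred values, footerFitsStrict(353, 365) is false while footerFitsRelaxed(353, 365) is true, which matches the reported failure on a metadata-only file.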

To Reproduce

val path = "/tmp/data/"
spark.range(1).selectExpr("id AS c", "id AS f")
  .write.mode("overwrite").parquet(s"$path")
spark.read.parquet(s"$path").show()

Analysis

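This happens because Spark writes two Parquet files: one contains a single row of data, and the other contains no row data, only metadata.
The Velox Parquet reader fails when it parses the second file. The reported values (365 vs. 365) suggest that the footer plus the 12 bytes of Parquet framing (leading magic, footer length, trailing magic) account for the entire file, so the strict less-than check in loadFileMetaData rejects it.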
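
To confirm which of the written files is the metadata-only one, the trailer of each file can be inspected directly. The following is a minimal sketch, assuming an unencrypted Parquet file whose last 8 bytes are a little-endian 4-byte footer length followed by the magic "PAR1". The names ParquetTrailer, readTrailer, and hasNoDataBytes are hypothetical helpers written for this illustration and are not part of Velox or Gluten.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <optional>
#include <string>

// Sizes recovered from the end of a Parquet file.
struct ParquetTrailer {
  std::uint64_t fileLength;    // total file size in bytes
  std::uint32_t footerLength;  // size of the serialized footer metadata
};

// Reads the last 8 bytes of the file: a little-endian footer length followed
// by the magic "PAR1". Returns std::nullopt if the file cannot be opened, is
// shorter than the 12 framing bytes, or lacks the trailing magic.
std::optional<ParquetTrailer> readTrailer(const std::string& path) {
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) {
    return std::nullopt;
  }
  const std::streamoff size = in.tellg();
  if (size < 12) {
    return std::nullopt;
  }
  unsigned char tail[8];
  in.seekg(-8, std::ios::end);
  if (!in.read(reinterpret_cast<char*>(tail), sizeof(tail))) {
    return std::nullopt;
  }
  if (std::memcmp(tail + 4, "PAR1", 4) != 0) {
    return std::nullopt;
  }
  // Assemble the footer length byte by byte so host endianness is irrelevant.
  const std::uint32_t footerLength =
      static_cast<std::uint32_t>(tail[0]) |
      static_cast<std::uint32_t>(tail[1]) << 8 |
      static_cast<std::uint32_t>(tail[2]) << 16 |
      static_cast<std::uint32_t>(tail[3]) << 24;
  return ParquetTrailer{static_cast<std::uint64_t>(size), footerLength};
}

// True when footer plus framing fill the whole file, i.e. no data bytes.
bool hasNoDataBytes(const ParquetTrailer& trailer) {
  return static_cast<std::uint64_t>(trailer.footerLength) + 12 ==
         trailer.fileLength;
}
```

Running readTrailer over each part file under /tmp/data/ and passing the result to hasNoDataBytes should single out the file that trips the reader, provided the assumptions above hold.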

@zhixingheyi-tian zhixingheyi-tian added the bug Something isn't working label Dec 27, 2022
@zhixingheyi-tian zhixingheyi-tian changed the title VeloxRuntimeError when read parquet file with only meta data VeloxRuntimeError when reading parquet file with only meta data Dec 27, 2022
@zhixingheyi-tian
Contributor Author

This is now fixed by oap-project/velox#105.

facebook-github-bot pushed a commit to facebookincubator/velox that referenced this issue Jan 9, 2023
…3605)

Summary:
Issue is exposed from apache/incubator-gluten#770

Pull Request resolved: #3605

Reviewed By: mbasmanova

Differential Revision: D42386816

Pulled By: Yuhta

fbshipit-source-id: 3f266cf46068b9ab540ea0e04d9f2fc70f5022a5
@weiting-chen weiting-chen added the velox backend works for Velox backend label Apr 2, 2023