Support for dictionary encoded INT96 timestamp in parquet files #4680

Closed
wants to merge 3 commits into from

rui-mo
Collaborator

@rui-mo rui-mo commented Apr 20, 2023

Add a timestamp reader for the Parquet file format that reads
dictionary-encoded INT96 timestamps. The Hive configs kReadTimestampUnit and
kReadTimestampUnitSession are added to control the precision used when
reading timestamps from files.
Parquet documentation for INT96:
https://github.com/apache/parquet-format/pull/49/files#diff-0e877db0daf579f98a11e5e113b29250a2dcae3decb1e83a88db1e6f092bee96R149-R157
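
For context, here is a minimal sketch of how a single INT96 timestamp can be decoded. It assumes the commonly used layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day, both little-endian) and a little-endian host. `SimpleTimestamp` and `decodeInt96` are hypothetical names used only for illustration; they are not the Velox types or functions touched by this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical plain struct for illustration; not the Velox Timestamp class.
struct SimpleTimestamp {
  int64_t seconds; // Seconds since 1970-01-01 00:00:00 UTC.
  uint64_t nanos;  // Nanoseconds within the second, 0..999'999'999.
};

constexpr int64_t kJulianDayOfUnixEpoch = 2440588;
constexpr int64_t kSecondsPerDay = 86400;
constexpr int64_t kNanosPerSecond = 1'000'000'000;

// Decodes one 12-byte INT96 value.
// Assumption: the host is little-endian, so memcpy yields the stored values.
SimpleTimestamp decodeInt96(const uint8_t* bytes) {
  int64_t nanosOfDay;
  int32_t julianDay;
  std::memcpy(&nanosOfDay, bytes, sizeof(nanosOfDay));
  std::memcpy(&julianDay, bytes + 8, sizeof(julianDay));
  assert(nanosOfDay >= 0 && nanosOfDay < kSecondsPerDay * kNanosPerSecond);

  const int64_t seconds =
      (julianDay - kJulianDayOfUnixEpoch) * kSecondsPerDay +
      nanosOfDay / kNanosPerSecond;
  return {seconds, static_cast<uint64_t>(nanosOfDay % kNanosPerSecond)};
}
```

Because the stored precision is always nanoseconds, any coarser unit requested by the engine has to be applied after decoding.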

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 20, 2023
Collaborator

@majetideepak majetideepak left a comment

@rui-mo the implementation looks good to me. Left a couple of comments. Can you add a Parquet File with timestamp type to velox/dwio/parquet/tests/examples and add a test?

Contributor

@Yuhta Yuhta left a comment

Please add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp

You probably need to make the writer generate INT96 in writeToMemory.

@majetideepak
Collaborator

majetideepak commented Apr 21, 2023

You probably need to make the writer generate INT96 in writeToMemory.

@Yuhta I doubt the Arrow Bridge supports the int96 type, but it is worth checking. The alternative is to check in a file.
Arrow Bridge has a similar issue with Parquet Decimal types backed by int64.

@Yuhta
Contributor

Yuhta commented Apr 21, 2023

@majetideepak Using a fixed file gives less coverage, but if the writer is not working then we have to do it this way for now. Either way we should make sure the result is correct with or without filters.

@rui-mo
Collaborator Author

rui-mo commented Apr 23, 2023

@majetideepak @Yuhta Thanks for your review! Your comments are well received, and I'm working on them.

@rui-mo rui-mo force-pushed the wip_ts_reader branch 3 times, most recently from f5c944e to a8dee34 Compare April 28, 2023 06:15
@rui-mo
Collaborator Author

rui-mo commented Apr 28, 2023

Please add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp

You probably need to make the writer generate INT96 in writeToMemory.

@Yuhta I also tried that. Setting enable_deprecated_int96_timestamps makes the Arrow writer generate INT96. Since int128_t is used in the Timestamp reader for now, the decoder calls readInt128(), but little endian is not supported there currently (see IntDecoder.h). I will continue to work on this after the type issue is decided.

@rui-mo rui-mo marked this pull request as draft May 8, 2023 05:35
@rui-mo rui-mo force-pushed the wip_ts_reader branch 6 times, most recently from 53d9408 to 1fe1694 Compare May 18, 2023 07:03
@rui-mo
Collaborator Author

rui-mo commented May 18, 2023

I don't think using int128_t will work here: the valueSize_ is different, so you will end up reading a different part of the data and may even read out of bounds.

Hi @Yuhta, I spent more time on Int96Timestamp type support, but it is not easy to make it work end to end.
We also ran more tests on the int128_t workaround and found that it works for a pure scan. As posted in this PR, int96 in Parquet is converted to the Velox Timestamp type (which is 16 bytes long) in PageReader (see link), and only numValues * sizeof(Int96Timestamp) bytes of data are read in PageReader.

Below is the stack of the current timestamp scan.

 0# facebook::velox::parquet::PageReader::prepareDictionary(facebook::velox::parquet::thrift::PageHeader const&) in ./velox_dwio_parquet_table_scan_test
 1# facebook::velox::parquet::PageReader::seekToPage(long) in ./velox_dwio_parquet_table_scan_test
 2# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in ./velox_dwio_parquet_table_scan_test
 3# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true> >(facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true>&) in ./velox_dwio_parquet_table_scan_test

Could you explain more about the possible risks? Thank you.
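
To make the dictionary conversion described above concrete, here is a rough sketch of widening packed 12-byte INT96 entries into 16-byte values inside one buffer. It assumes the buffer was allocated for the 16-byte results while only `numValues * 12` bytes were filled from the page, and that the host is little-endian. `Wide` and `expandDictionaryInPlace` are hypothetical names; this is not the code in PageReader.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 16-byte in-memory value standing in for a 16-byte timestamp.
struct Wide {
  int64_t nanosOfDay;
  int64_t julianDay;
};
static_assert(sizeof(Wide) == 16, "Wide must be exactly 16 bytes");

constexpr size_t kInt96Size = 12;

// Expands numValues packed 12-byte entries at the front of 'buffer' into
// 16-byte entries in place. Iterating from the last entry to the first
// guarantees that a 16-byte write never clobbers a 12-byte entry that has
// not been read yet, because entry i ends at or before byte 16 * i.
void expandDictionaryInPlace(std::vector<uint8_t>& buffer, size_t numValues) {
  assert(buffer.size() >= numValues * sizeof(Wide));
  for (size_t i = numValues; i-- > 0;) {
    int64_t nanosOfDay;
    int32_t julianDay;
    std::memcpy(&nanosOfDay, buffer.data() + i * kInt96Size, 8);
    std::memcpy(&julianDay, buffer.data() + i * kInt96Size + 8, 4);
    const Wide wide{nanosOfDay, julianDay};
    std::memcpy(buffer.data() + i * sizeof(Wide), &wide, sizeof(Wide));
  }
}
```

The sketch only covers the dictionary path, where the number of entries is known up front. Flat (plain-encoded) values read through a 16-byte decoder would use the wrong stride.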

@Yuhta Yuhta self-requested a review May 18, 2023 15:25
Contributor

@Yuhta Yuhta left a comment

So the assumption here is it is always dictionary-encoded? If this assumption holds all the time, we can probably go this way. It's only a problem when we want to apply a filter on the column of flat values.

Make sure to beef up your E2E filter tests with filters on some primary keys (int64 is fine), and also put timestamp in complex types (array, map, struct) in addition to top-level column.

@rui-mo
Collaborator Author

rui-mo commented May 23, 2023

@Yuhta Thanks for your reply.

So the assumption here is it is always dictionary-encoded? If this assumption holds all the time, we can probably go this way. It's only a problem when we want to apply a filter on the column of flat values.

I understand the gap here. I guess RLEV1 and Plain encoding are also possible because the column encoding can be set when the Parquet file is written, but we only tested Parquet files generated with the default configs.

Make sure to beef up your E2E filter tests with filters on some primary keys (int64 is fine), and also put timestamp in complex types (array, map, struct) in addition to top-level column.

Got it, will do.

@rui-mo
Collaborator Author

rui-mo commented Jun 27, 2024

@bikramSingh91 Could you help import and merge this PR? Thanks!

@@ -99,13 +99,11 @@ PlanBuilder& PlanBuilder::tableScan(
     const RowTypePtr& dataColumns,
     const std::unordered_map<
         std::string,
-        std::shared_ptr<connector::ColumnHandle>>& assignments,
-    bool isFilterPushdownEnabled) {
+        std::shared_ptr<connector::ColumnHandle>>& assignments) {
Collaborator

Is removing isFilterPushdownEnabled parameter related to the Timestamp reader?

Collaborator Author

Thanks for your review. Adding the isFilterPushdownEnabled parameter was a temporary change made before filter pushdown of Timestamp was supported. Now that Filter.h supports it, this change has been removed from this PR.

@@ -272,6 +272,16 @@ bool HiveConfig::s3UseProxyFromEnv() const {
return config_->get<bool>(kS3UseProxyFromEnv, false);
}

uint8_t HiveConfig::readTimestampUnit(const Config* session) const {
Collaborator

The unit should be read from the Parquet logical type for this column, not set by the user as a config property. See https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp

Contributor

@Yuhta Yuhta Jul 5, 2024

This corresponds to the timestamp unit that the compute engine can handle (usually milliseconds for Presto and possibly other values for Spark); it is not related to the type in the file. The reader should use the coarser of the two.
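
A minimal sketch of "use the coarser of the two", assuming each unit is represented by its number of fractional-second digits (3, 6, or 9). The enum and function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>

// Assumption: a unit is identified by its number of fractional-second digits.
enum class Unit : uint8_t { kMilli = 3, kMicro = 6, kNano = 9 };

// Returns whichever unit keeps fewer fractional digits, i.e. the coarser one.
constexpr Unit coarserUnit(Unit fileUnit, Unit engineUnit) {
  return static_cast<uint8_t>(fileUnit) < static_cast<uint8_t>(engineUnit)
      ? fileUnit
      : engineUnit;
}
```

For INT96 the file side is always nanoseconds, so the result is simply the engine's unit. The comparison only matters for int64 timestamps, whose unit varies by file.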

Collaborator Author

@yingsu00 In a Parquet file, the unit of int96 is fixed because it is made up of days and nanos, unlike int64-timestamp, which can have different units. We parse days and nanos from Parquet, while the compute engine may need a different timestamp unit, e.g. Presto needs milli while Spark needs micro. This config allows us to adjust the timestamp precision according to the user's requirement.

Without this change, the filter result could become incorrect. For example, for a Spark filter a == 2000-09-12 22:36:29.000000, if a is stored with nanosecond precision in Velox and a is 2000-09-12 22:36:29.000000111, Velox returns false but Spark needs true because it only cares about the microsecond digits.

Therefore, we need to truncate the value, and this logic is also needed for the int64-timestamp reader. Does this make sense? Thanks.

Reference for Int96 in Parquet: https://github.com/apache/parquet-format/pull/49/files#diff-0e877db0daf579f98a11e5e113b29250a2dcae3decb1e83a88db1e6f092bee96R149-R150
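
The filter example above can be sketched as follows. This is an illustration under stated assumptions, not the reader's actual code: `SimpleTimestamp` and `truncateTo` are hypothetical, and the unit is passed as a number of fractional-second digits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain struct for illustration; not the Velox Timestamp class.
struct SimpleTimestamp {
  int64_t seconds;
  uint64_t nanos; // Always in 0..999'999'999.
};

inline bool operator==(const SimpleTimestamp& a, const SimpleTimestamp& b) {
  return a.seconds == b.seconds && a.nanos == b.nanos;
}

// Drops all fractional digits beyond 'fractionalDigits' (3 = milli,
// 6 = micro, 9 = nano). Only the nanos field needs adjusting because it is
// always non-negative.
SimpleTimestamp truncateTo(SimpleTimestamp ts, uint8_t fractionalDigits) {
  assert(fractionalDigits <= 9);
  uint64_t step = 1;
  for (int i = fractionalDigits; i < 9; ++i) {
    step *= 10;
  }
  ts.nanos = ts.nanos / step * step;
  return ts;
}
```

With 968798189 seconds standing for 2000-09-12 22:36:29 UTC, a stored value with 111 nanoseconds does not equal the filter literal until it is truncated to 6 digits, which matches the behavior Spark expects in the example.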

Collaborator

Thanks @rui-mo for explaining. Sorry, I didn't check the INT96 Timestamp spec. Just approved this PR.

Collaborator Author

Thank you for helping review this PR.

@chliang71
Contributor

We have ported this PR internally and so far it is running fine. Thanks for working on this @rui-mo! We did encounter one issue, though, related to IntDecoder reading int128.

Since int128_t is used in the Timestamp reader for now, the decoder calls readInt128(), but little endian is not supported there currently (see IntDecoder.h). I will continue to work on this after the type issue is decided.

Any quick insights on what needs to be done here? That is, if the data file uses INT96 (12 bytes), would readInt128() read 16 bytes? If so, will the reader need to realign the bytes accordingly, and also use little endian?
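
As a rough illustration of the concern in this question, the sketch below reads plain-encoded INT96 values with a 12-byte stride and assembles the fields byte by byte, so it does not depend on the host byte order. All names are hypothetical; this is not IntDecoder, and it is not the plain-encoding draft mentioned later in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

struct Int96Parts {
  int64_t nanosOfDay;
  int32_t julianDay;
};

constexpr size_t kInt96Size = 12;

// Assembles 'width' bytes stored in little-endian order, regardless of the
// host byte order.
inline uint64_t loadLittleEndian(const uint8_t* bytes, size_t width) {
  assert(width <= 8);
  uint64_t value = 0;
  for (size_t i = 0; i < width; ++i) {
    value |= static_cast<uint64_t>(bytes[i]) << (8 * i);
  }
  return value;
}

// Reads the value at 'index' from a packed buffer of 12-byte entries.
// A 16-byte read at the same position would spill into the next entry, or
// past the end of the buffer for the last entry.
Int96Parts readPlainInt96(
    const uint8_t* data,
    size_t sizeInBytes,
    size_t index) {
  const size_t offset = index * kInt96Size;
  if (offset + kInt96Size > sizeInBytes) {
    throw std::out_of_range("INT96 value past end of buffer");
  }
  return {
      static_cast<int64_t>(loadLittleEndian(data + offset, 8)),
      static_cast<int32_t>(
          static_cast<uint32_t>(loadLittleEndian(data + offset + 8, 4)))};
}
```

For a buffer holding two values (24 bytes), a 16-byte read of the second value would need bytes 12 through 27, which is 4 bytes past the end. That is the out-of-bounds risk raised earlier in the review.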

@rui-mo
Collaborator Author

rui-mo commented Jul 9, 2024

We do encounter one issue though related IntDecoder reading int128.

@chliang71 Thanks for your feedback. I assume this issue is on plain-encoded timestamp reading, while this PR focuses on dictionary-encoding. There is a draft on plain-encoding support oap-project@533bb9e by @mskapilks, which may go into a separate PR after this one.

@mskapilks

I can raise the follow-up PR for that change once this is done.

@rui-mo rui-mo mentioned this pull request Jul 17, 2024
@facebook-github-bot
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mbasmanova
Contributor

@rui-mo @Yuhta Folks, is anything blocking this PR from being merged?

@rui-mo
Collaborator Author

rui-mo commented Jul 18, 2024

@mbasmanova I assume there have been some discussions on whether to merge this one or #8325 first; see #8325 (comment) and #8325 (comment). If possible, we would like to merge this one first, as it is ready.
cc: @yingsu00 @mskapilks

@Yuhta
Contributor

Yuhta commented Jul 18, 2024

@rui-mo There is an assertion failure in a unit test:

Note: Google Test filter = E2EFilterTest.timestampDictionary
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from E2EFilterTest
[ RUN      ] E2EFilterTest.timestampDictionary

terminate called after throwing an instance of 'facebook::velox::VeloxUserError'
  what():  Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: (11646767826930344353 vs. 999999999) Timestamp nanos out of range
Retriable: False
Expression: nanos <= kMaxNanos
Function: Timestamp
File: buck-out/v2/gen/fbcode/5ce5662abd58612b/velox/type/__velox_timestamp__/buck-headers/velox/type/Timestamp.h
Line: 113
Stack trace:
Stack trace has been disabled. Use --velox_exception_user_stacktrace_enabled=true to enable it.

*** Aborted at 1721281199 (Unix time, try 'date -d @1721281199') ***
*** Signal 6 (SIGABRT) (0x75590001bb2f) received by PID 113455 (pthread TID 0x7fc9d1292d80) (linux TID 113455) (maybe from PID 113455, UID 30041) (code: -6), stack trace: ***
    @ 000000000000fd47 folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:453
    @ 000000000000e4c1 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:474
    @ 000000000004455f (unknown)
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:8
                       -> /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c
    @ 000000000009c993 __GI___pthread_kill
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/nptl/pthread_kill.c:46
    @ 00000000000444ac __GI_raise
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/posix/raise.c:26
    @ 000000000002c432 __GI_abort
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/stdlib/abort.c:79
    @ 00000000000a3fd4 __gnu_cxx::__verbose_terminate_handler()
                       /home/engshare/third-party2/libgcc/11.x/src/gcc-11.x/x86_64-facebook-linux/libstdc++-v3/libsupc++/../../.././libstdc++-v3/libsupc++/vterminate.cc:95
    @ 00000000000a1b39 __cxxabiv1::__terminate(void (*)())
                       /home/engshare/third-party2/libgcc/11.x/src/gcc-11.x/x86_64-facebook-linux/libstdc++-v3/libsupc++/../../.././libstdc++-v3/libsupc++/eh_terminate.cc:48
    @ 00000000000a1ba4 std::terminate()
                       /home/engshare/third-party2/libgcc/11.x/src/gcc-11.x/x86_64-facebook-linux/libstdc++-v3/libsupc++/../../.././libstdc++-v3/libsupc++/eh_terminate.cc:58
    @ 00000000000a1e6f __cxa_throw
                       /home/engshare/third-party2/libgcc/11.x/src/gcc-11.x/x86_64-facebook-linux/libstdc++-v3/libsupc++/../../.././libstdc++-v3/libsupc++/eh_throw.cc:95
    @ 000000000001464f void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxUserError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
                       fbcode/velox/common/base/Exceptions.h:75
                       -> ./fbcode/velox/common/base/Exceptions.cpp
    @ 000000000285ca2f facebook::velox::Timestamp::Timestamp(long, unsigned long)
                       fbcode/velox/type/Timestamp.h:113
                       -> ./fbcode/velox/dwio/parquet/reader/ParquetColumnReader.cpp
    @ 000000000285a0f3 facebook::velox::parquet::TimestampColumnReader::getValues(folly::Range<int const*>, std::shared_ptr<facebook::velox::BaseVector>*)
                       fbcode/velox/dwio/parquet/reader/TimestampColumnReader.h:61
                       -> ./fbcode/velox/dwio/parquet/reader/ParquetColumnReader.cpp
    @ 0000000000f1db0b facebook::velox::dwio::common::SelectiveStructColumnReaderBase::getValues(folly::Range<int const*>, std::shared_ptr<facebook::velox::BaseVector>*)
                       ./fbcode/velox/dwio/common/SelectiveStructColumnReader.cpp:397
    @ 0000000000f1a066 facebook::velox::dwio::common::SelectiveStructColumnReaderBase::next(unsigned long, std::shared_ptr<facebook::velox::BaseVector>&, facebook::velox::dwio::common::Mutation const*)
                       ./fbcode/velox/dwio/common/SelectiveStructColumnReader.cpp:127
    @ 0000000002916615 facebook::velox::parquet::ParquetRowReader::Impl::next(unsigned long, std::shared_ptr<facebook::velox::BaseVector>&, facebook::velox::dwio::common::Mutation const*)
                       ./fbcode/velox/dwio/parquet/reader/ParquetReader.cpp:851
    @ 00000000028ccef8 facebook::velox::parquet::ParquetRowReader::next(unsigned long, std::shared_ptr<facebook::velox::BaseVector>&, facebook::velox::dwio::common::Mutation const*)
                       ./fbcode/velox/dwio/parquet/reader/ParquetReader.cpp:948
    @ 0000000000099dc9 facebook::velox::dwio::common::E2EFilterTestBase::readWithFilter(std::shared_ptr<facebook::velox::common::ScanSpec>, facebook::velox::dwio::common::MutationSpec const&, std::vector<std::shared_ptr<facebook::velox::RowVector>, std::allocator<std::shared_ptr<facebook::velox::RowVector> > > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long&, bool, bool)
                       ./fbcode/velox/dwio/common/tests/utils/E2EFilterTestBase.cpp:197
    @ 00000000000d2338 facebook::velox::dwio::common::E2EFilterTestBase::testFilterSpecs(std::vector<std::shared_ptr<facebook::velox::RowVector>, std::allocator<std::shared_ptr<facebook::velox::RowVector> > > const&, std::vector<facebook::velox::dwio::common::FilterSpec, std::allocator<facebook::velox::dwio::common::FilterSpec> > const&)
                       ./fbcode/velox/dwio/common/tests/utils/E2EFilterTestBase.cpp:306
    @ 00000000000d4d23 facebook::velox::dwio::common::E2EFilterTestBase::testNoRowGroupSkip(std::vector<std::shared_ptr<facebook::velox::RowVector>, std::allocator<std::shared_ptr<facebook::velox::RowVector> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int)
                       ./fbcode/velox/dwio/common/tests/utils/E2EFilterTestBase.cpp:336
    @ 00000000000de35e facebook::velox::dwio::common::E2EFilterTestBase::testScenario(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int)
                       ./fbcode/velox/dwio/common/tests/utils/E2EFilterTestBase.cpp:417
    @ 0000000000370b18 E2EFilterTest::testWithTypes(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int)
                       ./fbcode/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp:44
    @ 00000000003398c7 E2EFilterTest_timestampDictionary_Test::TestBody()
                       ./fbcode/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp:263
    @ 0000000000123f7e void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)
                       fbsource/src/gtest.cc:2675
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 0000000000123804 testing::Test::Run()
                       fbsource/src/gtest.cc:2692
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 000000000012943f testing::TestInfo::Run()
                       fbsource/src/gtest.cc:2841
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 00000000001313f6 testing::TestSuite::Run()
                       fbsource/src/gtest.cc:3020
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 000000000016cd5b testing::internal::UnitTestImpl::RunAllTests()
                       fbsource/src/gtest.cc:5925
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 000000000016bdbb bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)
                       fbsource/src/gtest.cc:2675
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 000000000016b2f9 testing::UnitTest::Run()
                       fbsource/src/gtest.cc:5489
                       -> ./third-party/googletest/1.14.0/googletest/googletest/src/gtest-all.cc
    @ 00000000004c9820 RUN_ALL_TESTS()
                       fbsource/gtest/gtest.h:2317
                       -> ./fbcode/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp
    @ 00000000004c96ec main
                       ./fbcode/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp:720
    @ 000000000002c656 __libc_start_call_main
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58
                       -> /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86/libc-start.c
    @ 000000000002c717 __libc_start_main_alias_2
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:409
                       -> /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86/libc-start.c
    @ 000000000032c160 _start
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:116

Test was never completed. The test process might have crashed.
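
For readers unfamiliar with the failing check, here is a small sketch of the invariant named in the error message (`nanos <= kMaxNanos`, with the limit 999999999 taken from the log). The struct and functions are hypothetical stand-ins, not the Velox Timestamp class, and this is not the fix that was applied in the PR.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical plain struct for illustration; not the Velox Timestamp class.
struct SimpleTimestamp {
  int64_t seconds;
  uint64_t nanos;
};

// Upper bound reported in the error message above.
constexpr uint64_t kMaxNanos = 999'999'999;

constexpr bool nanosInRange(uint64_t nanos) {
  return nanos <= kMaxNanos;
}

// Rejects values that cannot be a nanosecond fraction of one second.
SimpleTimestamp makeTimestamp(int64_t seconds, uint64_t nanos) {
  if (!nanosInRange(nanos)) {
    throw std::invalid_argument("Timestamp nanos out of range");
  }
  return {seconds, nanos};
}
```

The reported value 11646767826930344353 is far outside this range. One plausible reading is that bytes which were never a decoded timestamp reached the constructor, but the log alone does not confirm the cause.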

@rui-mo
Collaborator Author

rui-mo commented Jul 19, 2024

@rui-mo There is an assertion failure in a unit test:

@Yuhta Thanks for the catch. I reproduced it locally in debug mode and fixed it with this change: https://github.com/facebookincubator/velox/pull/4680/files#diff-ae87451c1577f3b47d2863187de8bf30c7351484d39537419016487cc7b2f71cR49-R51. Would you take another look? Thank you.

@facebook-github-bot
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@Yuhta merged this pull request in facd967.

Conbench analyzed the 1 benchmark run on commit facd967a.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@rui-mo
Collaborator Author

rui-mo commented Jul 22, 2024

Thank you all for helping review this PR.
