ARROW-13554: [C++] Remove deprecated Scanner::Scan #11991

westonpace
Member

No description provided.

@@ -2239,10 +2233,6 @@ cdef class Scanner(_Weakrefable):
        use_threads : bool, default True
            If enabled, then maximum parallelism will be used determined by
            the number of available CPU cores.
        use_async : bool, default False
Member

It seems that in Python this keyword was never actually deprecated (neither in the docs nor via a deprecation warning)?
Therefore, we could temporarily keep this keyword in the signature (it would do nothing except raise a warning if the user specifies it, or something like that).

Member Author

I added the parameters back in. PTAL. One thing I wasn't sure of...

Everywhere the parameter was declared it was a bint, so I couldn't change the default to None, which means there was no good way to tell whether the user specified it.

So I just made the default True and emit the warning if the user specifies False. However, a user who calls dataset.to_table(use_async=True) will not get the warning.

Alternatively, I could change the Cython type from bint to object. Then I should be able to warn on both use_async=True and use_async=False. Would this possibly have any backwards compatibility ramifications?
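The limitation described in this comment can be reproduced in plain Python. This is a minimal sketch, not the pyarrow code: to_table_bool_default is a hypothetical stand-in function, a plain bool default stands in for the Cython bint parameter, and the warning category and message are assumptions chosen for illustration.

```python
import warnings


def to_table_bool_default(use_async=True):
    """Hypothetical stand-in for a method whose flag defaults to True."""
    # With a boolean default, an explicit use_async=True looks exactly like
    # the caller omitting the argument, so only False can trigger a warning.
    if not use_async:
        warnings.warn("use_async is deprecated", DeprecationWarning, stacklevel=2)
    return "table"


def count_warnings(**kwargs):
    """Call the stand-in and return how many warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        to_table_bool_default(**kwargs)
    return len(caught)
```

Calling count_warnings(use_async=True) and count_warnings() both report no warning here, which is the gap the comment points out.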

Member Author

This was bugging me, so I went ahead and played around with it. I don't see any consequences if I change bint use_async=True to object use_async=None, and that allows me to emit the deprecation warning on both use_async=True and use_async=False (i.e. emit a warning if the user uses the flag in any way).
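The None-default approach described in this comment can be sketched in plain Python. This is an illustration under stated assumptions rather than the code in the PR: to_table_none_default is a hypothetical stand-in, and the warning category and message are made up for the example.

```python
import warnings


def to_table_none_default(use_async=None):
    """Hypothetical stand-in; None means the caller did not pass the flag."""
    if use_async is not None:
        # Any explicit value, True or False, reaches this branch.
        warnings.warn("use_async is deprecated", DeprecationWarning, stacklevel=2)
    return "table"


def warned(**kwargs):
    """Call the stand-in and return the messages of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        to_table_none_default(**kwargs)
    return [str(w.message) for w in caught]
```

Because None is distinct from both True and False, the check on None separates "not passed" from "passed explicitly", which a boolean default cannot do.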

Member

I also can't think of any concern with changing bint to object (the other way around would limit what you can pass, but here it just broadens what you can pass, so it should not be a compatibility issue).

westonpace force-pushed the feature/ARROW-13554--unscan-sync-scanner branch 2 times, most recently from 582ae32 to 189dce2 on January 4, 2022 23:48
westonpace force-pushed the feature/ARROW-13554--unscan-sync-scanner branch from 0b3397d to 4fd7b1e on January 6, 2022 01:54
westonpace marked this pull request as ready for review on January 6, 2022 09:51
@westonpace
Member Author

The only remaining failure is a Java failure. Unfortunately, the test failure is missing line numbers, so I'm not sure exactly which assert is failing (it expects 2, but I can't find any assert expecting 2 in the test case). I'll try to reproduce it locally. This is ready for review in the meantime.

Member

@lidavidm lidavidm left a comment

Did we file a JIRA for removing the deprecated flags in 8.0.0?

cpp/src/arrow/dataset/scanner.h (outdated, resolved)
cpp/src/arrow/dataset/scanner_test.cc (resolved)
r/R/dataset-scan.R (resolved)
c_glib/arrow-dataset-glib/scanner.cpp (outdated, resolved)
c_glib/arrow-dataset-glib/scanner.cpp (resolved)
Member

@lidavidm lidavidm left a comment

LGTM. I left a couple of final minor comments.

cpp/src/arrow/dataset/scanner.cc (resolved)

  for (size_t i = 0; i < exprs.size(); ++i) {
    if (auto ref = exprs[i].field_ref()) {
      if (!ref->name()) return NestedFieldRefsNotImplemented();
Member

I'll rebase #11704 when this lands.

cpp/src/arrow/dataset/scanner.h (resolved)
@westonpace
Member Author

> Did we file a JIRA for removing the deprecated flags in 8.0.0?

I just created ARROW-15283

westonpace force-pushed the feature/ARROW-13554--unscan-sync-scanner branch from da58e06 to 3629ed1 on January 7, 2022 19:12
@ursabot

ursabot commented Jan 10, 2022

Benchmark runs are scheduled for baseline = 43bc33b and contender = 2e8b836. 2e8b836 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️5.38% ⬆️1.79%] ursa-i9-9960x
[Finished ⬇️0.31% ⬆️0.04%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@kou
Member

kou commented Jan 11, 2022

This might break test-skyhook-integration (cc: @JayjeetAtGithub):

https://github.com/ursacomputing/crossbow/runs/4771425719?check_suite_focus=true

 [141/863] Building CXX object src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/client/file_skyhook.cc.o
FAILED: src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/client/file_skyhook.cc.o 
/usr/bin/ccache /usr/bin/c++  -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_JEMALLOC -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_NO_DEPRECATED_API -DARROW_WITH_RE2 -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -Isrc -I/arrow/cpp/src -I/arrow/cpp/src/generated -isystem /arrow/cpp/thirdparty/flatbuffers/include -isystem protobuf_ep-install/include -isystem jemalloc_ep-prefix/src -isystem mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7 -isystem googletest_ep-prefix/include -isystem xsimd_ep/src/xsimd_ep-install/include -isystem google_cloud_cpp_ep-install/include -isystem absl_ep-install/include -isystem /arrow/cpp/thirdparty/hadoop/include -isystem orc_ep-install/include -isystem awssdk_ep-install/include -Wno-noexcept-type  -fdiagnostics-color=always -ggdb -O0  -Wall -Wno-conversion -Wno-deprecated-declarations -Wno-sign-conversion -Wunused-result -Werror -fno-semantic-interposition -msse4.2  -g -fPIC   -std=c++11 -MD -MT src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/client/file_skyhook.cc.o -MF src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/client/file_skyhook.cc.o.d -o src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/client/file_skyhook.cc.o -c /arrow/cpp/src/skyhook/client/file_skyhook.cc
In file included from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/arrow/cpp/src/skyhook/client/file_skyhook.h:84:51: error: 'arrow::Result<arrow::Iterator<std::shared_ptr<arrow::dataset::ScanTask> > > skyhook::SkyhookFileFormat::ScanFile(const std::shared_ptr<arrow::dataset::ScanOptions>&, const std::shared_ptr<arrow::dataset::FileFragment>&) const' marked 'override', but does not override
   84 |   arrow::Result<arrow::dataset::ScanTaskIterator> ScanFile(
      |                                                   ^~~~~~~~
/arrow/cpp/src/skyhook/client/file_skyhook.cc:30:48: error: invalid use of incomplete type 'class arrow::dataset::ScanTask'
   30 | class SkyhookScanTask : public arrow::dataset::ScanTask {
      |                                                ^~~~~~~~
In file included from /arrow/cpp/src/arrow/dataset/partition.h:31,
                 from /arrow/cpp/src/arrow/dataset/discovery.h:30,
                 from /arrow/cpp/src/arrow/dataset/file_parquet.h:28,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.h:20,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/arrow/cpp/src/arrow/dataset/type_fwd.h:99:7: note: forward declaration of 'class arrow::dataset::ScanTask'
   99 | class ScanTask;
      |       ^~~~~~~~
/arrow/cpp/src/skyhook/client/file_skyhook.cc:44:45: error: 'arrow::Result<arrow::Iterator<std::shared_ptr<arrow::RecordBatch> > > skyhook::SkyhookScanTask::Execute()' marked 'override', but does not override
   44 |   arrow::Result<arrow::RecordBatchIterator> Execute() override {
      |                                             ^~~~~~~
/arrow/cpp/src/skyhook/client/file_skyhook.cc: In constructor 'skyhook::SkyhookScanTask::SkyhookScanTask(std::shared_ptr<arrow::dataset::ScanOptions>, std::shared_ptr<arrow::dataset::Fragment>, arrow::dataset::FileSource, std::shared_ptr<skyhook::SkyhookDirectObjectAccess>, skyhook::SkyhookFileType::type, arrow::compute::Expression)':
/arrow/cpp/src/skyhook/client/file_skyhook.cc:38:9: error: class 'skyhook::SkyhookScanTask' does not have any field named 'ScanTask'
   38 |       : ScanTask(std::move(options), std::move(fragment)),
      |         ^~~~~~~~
/arrow/cpp/src/skyhook/client/file_skyhook.cc: In member function 'arrow::Result<arrow::Iterator<std::shared_ptr<arrow::RecordBatch> > > skyhook::SkyhookScanTask::Execute()':
/arrow/cpp/src/skyhook/client/file_skyhook.cc:51:29: error: 'options_' was not declared in this scope; did you mean 'optind'?
   51 |     req.filter_expression = options_->filter;
      |                             ^~~~~~~~
      |                             optind
/arrow/cpp/src/skyhook/client/file_skyhook.cc: In member function 'arrow::Result<arrow::Iterator<std::shared_ptr<arrow::dataset::ScanTask> > > skyhook::SkyhookFileFormat::Impl::ScanFile(const std::shared_ptr<arrow::dataset::ScanOptions>&, const std::shared_ptr<arrow::dataset::FileFragment>&) const':
/arrow/cpp/src/skyhook/client/file_skyhook.cc:115:88: error: no matching function for call to 'std::vector<std::shared_ptr<arrow::dataset::ScanTask> >::vector(<brace-enclosed initializer list>)'
  115 |         options, file, file->source(), doa_, file_format, file->partition_expression())};
      |                                                                                        ^
In file included from /usr/include/c++/9/vector:67,
                 from /arrow/cpp/src/arrow/array/array_base.h:24,
                 from /arrow/cpp/src/arrow/array.h:37,
                 from /arrow/cpp/src/arrow/api.h:22,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.h:19,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/usr/include/c++/9/bits/stl_vector.h:650:2: note: candidate: 'template<class _InputIterator, class> std::vector<_Tp, _Alloc>::vector(_InputIterator, _InputIterator, const allocator_type&)'
  650 |  vector(_InputIterator __first, _InputIterator __last,
      |  ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:650:2: note:   template argument deduction/substitution failed:
/arrow/cpp/src/skyhook/client/file_skyhook.cc:115:88: note:   candidate expects 3 arguments, 1 provided
  115 |         options, file, file->source(), doa_, file_format, file->partition_expression())};
      |                                                                                        ^
In file included from /usr/include/c++/9/vector:67,
                 from /arrow/cpp/src/arrow/array/array_base.h:24,
                 from /arrow/cpp/src/arrow/array.h:37,
                 from /arrow/cpp/src/arrow/api.h:22,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.h:19,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/usr/include/c++/9/bits/stl_vector.h:622:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::initializer_list<_Tp>, const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  622 |       vector(initializer_list<value_type> __l,
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:622:43: note:   no known conversion for argument 1 from 'std::shared_ptr<skyhook::SkyhookScanTask>' to 'std::initializer_list<std::shared_ptr<arrow::dataset::ScanTask> >'
  622 |       vector(initializer_list<value_type> __l,
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
/usr/include/c++/9/bits/stl_vector.h:604:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  604 |       vector(vector&& __rv, const allocator_type& __m)
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:604:7: note:   candidate expects 2 arguments, 1 provided
/usr/include/c++/9/bits/stl_vector.h:586:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::false_type) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::false_type = std::integral_constant<bool, false>]'
  586 |       vector(vector&& __rv, const allocator_type& __m, false_type)
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:586:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/9/bits/stl_vector.h:582:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::true_type) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::true_type = std::integral_constant<bool, true>]'
  582 |       vector(vector&& __rv, const allocator_type& __m, true_type) noexcept
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:582:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/9/bits/stl_vector.h:572:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&, const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  572 |       vector(const vector& __x, const allocator_type& __a)
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:572:7: note:   candidate expects 2 arguments, 1 provided
/usr/include/c++/9/bits/stl_vector.h:569:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  569 |       vector(vector&&) noexcept = default;
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:569:14: note:   no known conversion for argument 1 from 'std::shared_ptr<skyhook::SkyhookScanTask>' to 'std::vector<std::shared_ptr<arrow::dataset::ScanTask> >&&'
  569 |       vector(vector&&) noexcept = default;
      |              ^~~~~~~~
/usr/include/c++/9/bits/stl_vector.h:550:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  550 |       vector(const vector& __x)
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:550:28: note:   no known conversion for argument 1 from 'std::shared_ptr<skyhook::SkyhookScanTask>' to 'const std::vector<std::shared_ptr<arrow::dataset::ScanTask> >&'
  550 |       vector(const vector& __x)
      |              ~~~~~~~~~~~~~~^~~
/usr/include/c++/9/bits/stl_vector.h:519:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>::size_type, const value_type&, const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::size_type = long unsigned int; std::vector<_Tp, _Alloc>::value_type = std::shared_ptr<arrow::dataset::ScanTask>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  519 |       vector(size_type __n, const value_type& __value,
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:519:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/9/bits/stl_vector.h:507:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>::size_type, const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::size_type = long unsigned int; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  507 |       vector(size_type __n, const allocator_type& __a = allocator_type())
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:507:24: note:   no known conversion for argument 1 from 'std::shared_ptr<skyhook::SkyhookScanTask>' to 'std::vector<std::shared_ptr<arrow::dataset::ScanTask> >::size_type' {aka 'long unsigned int'}
  507 |       vector(size_type __n, const allocator_type& __a = allocator_type())
      |              ~~~~~~~~~~^~~
/usr/include/c++/9/bits/stl_vector.h:494:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector(const allocator_type&) [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  494 |       vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:494:36: note:   no known conversion for argument 1 from 'std::shared_ptr<skyhook::SkyhookScanTask>' to 'const allocator_type&' {aka 'const std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >&'}
  494 |       vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
      |              ~~~~~~~~~~~~~~~~~~~~~~^~~
/usr/include/c++/9/bits/stl_vector.h:484:7: note: candidate: 'std::vector<_Tp, _Alloc>::vector() [with _Tp = std::shared_ptr<arrow::dataset::ScanTask>; _Alloc = std::allocator<std::shared_ptr<arrow::dataset::ScanTask> >]'
  484 |       vector() = default;
      |       ^~~~~~
/usr/include/c++/9/bits/stl_vector.h:484:7: note:   candidate expects 0 arguments, 1 provided
In file included from /usr/include/x86_64-linux-gnu/c++/9/bits/c++allocator.h:33,
                 from /usr/include/c++/9/bits/allocator.h:46,
                 from /usr/include/c++/9/memory:63,
                 from /arrow/cpp/src/arrow/array/array_base.h:22,
                 from /arrow/cpp/src/arrow/array.h:37,
                 from /arrow/cpp/src/arrow/api.h:22,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.h:19,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/usr/include/c++/9/ext/new_allocator.h: In instantiation of 'void __gnu_cxx::new_allocator<_Tp>::construct(_Up*, _Args&& ...) [with _Up = skyhook::SkyhookFileFormat; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; _Tp = skyhook::SkyhookFileFormat]':
/usr/include/c++/9/bits/alloc_traits.h:482:2:   required from 'static void std::allocator_traits<std::allocator<_CharT> >::construct(std::allocator_traits<std::allocator<_CharT> >::allocator_type&, _Up*, _Args&& ...) [with _Up = skyhook::SkyhookFileFormat; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; _Tp = skyhook::SkyhookFileFormat; std::allocator_traits<std::allocator<_CharT> >::allocator_type = std::allocator<skyhook::SkyhookFileFormat>]'
/usr/include/c++/9/bits/shared_ptr_base.h:548:39:   required from 'std::_Sp_counted_ptr_inplace<_Tp, _Alloc, _Lp>::_Sp_counted_ptr_inplace(_Alloc, _Args&& ...) [with _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; _Tp = skyhook::SkyhookFileFormat; _Alloc = std::allocator<skyhook::SkyhookFileFormat>; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]'
/usr/include/c++/9/bits/shared_ptr_base.h:679:16:   required from 'std::__shared_count<_Lp>::__shared_count(_Tp*&, std::_Sp_alloc_shared_tag<_Alloc>, _Args&& ...) [with _Tp = skyhook::SkyhookFileFormat; _Alloc = std::allocator<skyhook::SkyhookFileFormat>; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]'
/usr/include/c++/9/bits/shared_ptr_base.h:1344:71:   required from 'std::__shared_ptr<_Tp, _Lp>::__shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<skyhook::SkyhookFileFormat>; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; _Tp = skyhook::SkyhookFileFormat; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]'
/usr/include/c++/9/bits/shared_ptr.h:359:59:   required from 'std::shared_ptr<_Tp>::shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<skyhook::SkyhookFileFormat>; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}; _Tp = skyhook::SkyhookFileFormat]'
/usr/include/c++/9/bits/shared_ptr.h:701:14:   required from 'std::shared_ptr<_Tp> std::allocate_shared(const _Alloc&, _Args&& ...) [with _Tp = skyhook::SkyhookFileFormat; _Alloc = std::allocator<skyhook::SkyhookFileFormat>; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}]'
/usr/include/c++/9/bits/shared_ptr.h:717:39:   required from 'std::shared_ptr<_Tp> std::make_shared(_Args&& ...) [with _Tp = skyhook::SkyhookFileFormat; _Args = {std::shared_ptr<skyhook::RadosConnCtx>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}]'
/arrow/cpp/src/skyhook/client/file_skyhook.cc:144:81:   required from here
/usr/include/c++/9/ext/new_allocator.h:145:20: error: invalid new-expression of abstract class type 'skyhook::SkyhookFileFormat'
  145 |  noexcept(noexcept(::new((void *)__p)
      |                    ^~~~~~~~~~~~~~~~~~
  146 |        _Up(std::forward<_Args>(__args)...)))
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/arrow/cpp/src/skyhook/client/file_skyhook.h:55:7: note:   because the following virtual functions are pure within 'skyhook::SkyhookFileFormat':
   55 | class SkyhookFileFormat : public arrow::dataset::FileFormat {
      |       ^~~~~~~~~~~~~~~~~
In file included from /arrow/cpp/src/arrow/dataset/file_parquet.h:29,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.h:20,
                 from /arrow/cpp/src/skyhook/client/file_skyhook.cc:17:
/arrow/cpp/src/arrow/dataset/file_base.h:150:40: note: 	'virtual arrow::Result<std::function<arrow::Future<std::shared_ptr<arrow::RecordBatch> >()> > arrow::dataset::FileFormat::ScanBatchesAsync(const std::shared_ptr<arrow::dataset::ScanOptions>&, const std::shared_ptr<arrow::dataset::FileFragment>&) const'
  150 |   virtual Result<RecordBatchGenerator> ScanBatchesAsync(
      |                                        ^~~~~~~~~~~~~~~~
[142/863] Building CXX object src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/protocol/rados_protocol.cc.o
FAILED: src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/protocol/rados_protocol.cc.o 
/usr/bin/ccache /usr/bin/c++  -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_JEMALLOC -DARROW_JEMALLOC_INCLUDE_DIR="" -DARROW_MIMALLOC -DARROW_NO_DEPRECATED_API -DARROW_WITH_RE2 -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -Isrc -I/arrow/cpp/src -I/arrow/cpp/src/generated -isystem /arrow/cpp/thirdparty/flatbuffers/include -isystem protobuf_ep-install/include -isystem jemalloc_ep-prefix/src -isystem mimalloc_ep/src/mimalloc_ep/include/mimalloc-1.7 -isystem googletest_ep-prefix/include -isystem xsimd_ep/src/xsimd_ep-install/include -isystem google_cloud_cpp_ep-install/include -isystem absl_ep-install/include -isystem /arrow/cpp/thirdparty/hadoop/include -isystem orc_ep-install/include -isystem awssdk_ep-install/include -Wno-noexcept-type  -fdiagnostics-color=always -ggdb -O0  -Wall -Wno-conversion -Wno-deprecated-declarations -Wno-sign-conversion -Wunused-result -Werror -fno-semantic-interposition -msse4.2  -g -fPIC   -std=c++11 -MD -MT src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/protocol/rados_protocol.cc.o -MF src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/protocol/rados_protocol.cc.o.d -o src/skyhook/CMakeFiles/arrow_skyhook_client_objlib.dir/protocol/rados_protocol.cc.o -c /arrow/cpp/src/skyhook/protocol/rados_protocol.cc
In file included from /arrow/cpp/src/skyhook/protocol/rados_protocol.h:24,
                 from /arrow/cpp/src/skyhook/protocol/rados_protocol.cc:17:
/arrow/cpp/src/skyhook/client/file_skyhook.h:84:51: error: 'arrow::Result<arrow::Iterator<std::shared_ptr<arrow::dataset::ScanTask> > > skyhook::SkyhookFileFormat::ScanFile(const std::shared_ptr<arrow::dataset::ScanOptions>&, const std::shared_ptr<arrow::dataset::FileFragment>&) const' marked 'override', but does not override
   84 |   arrow::Result<arrow::dataset::ScanTaskIterator> ScanFile(
      |                                                   ^~~~~~~~

@lidavidm
Member

It does, @kou. See #12123, which should fix that.
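One of the errors in the quoted log reports that SkyhookFileFormat has become abstract: it still overrides ScanFile, which the base class no longer declares, and it does not implement the pure virtual ScanBatchesAsync. The same kind of failure can be sketched in Python with abc, as an analogy only. The class and method names below are hypothetical Python stand-ins, not Arrow or Skyhook APIs, and Python reports the problem when the class is instantiated rather than at compile time.

```python
from abc import ABC, abstractmethod


class FileFormat(ABC):
    """Stand-in base class whose only required hook is the async scan."""

    @abstractmethod
    def scan_batches_async(self, options, fragment):
        ...


class StaleFormat(FileFormat):
    """Still implements only a hook the base class no longer declares."""

    def scan_file(self, options, fragment):
        return iter([])


class UpdatedFormat(FileFormat):
    """Implements the hook the base class actually requires."""

    def scan_batches_async(self, options, fragment):
        return iter([])


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it is still abstract."""
    try:
        cls()
        return True
    except TypeError:
        return False
```

In this sketch StaleFormat cannot be instantiated until it provides scan_batches_async, which mirrors why the subclass in the log has to be updated to the new interface.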

@kou
Member

kou commented Jan 11, 2022

Oh, sorry. I hadn't seen it yet.
