-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dictionary encoded array #228
Add dictionary encoded array #228
Conversation
{ | ||
|
||
template <typename KeysArray, typename ValuesArrayBitmapRange> | ||
class dictionary_bitmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class is required because of the implementation of bitmap_begin()
in the CRTP base class which assumes that each array class has a bitmap attribute:
template <class D>
auto array_crtp_base<D>::bitmap_begin() const -> const_bitmap_iterator
{
return derived_cast().get_bitmap().begin();
}
With an alternative implementation not assuming the existence of a bitmap
member, but only a way to iterate over a bitmap (either "pysical" or "logical"), such as
template <class D>
auto array_crtp_base<D>::bitmap_begin() const -> const_bitmap_iterator
{
return derived_cast().bitmap_begin_impl();
}
where bitmap_begin_impl
is defined as return m_bitmap.begin()
in the classes with a bitmap attribute, and as return validity_iterator(...)
in the dictionary_encoded_layout , we should be able to remove this class and the validity_iterator
class, and get something really close to the previous implementation.
This requires defining boilerplate code for bitmap_begin_impl
(and bitmap_end_impl
) in all array classes but dicitonary_encoded_layout, but that can be done with a macro, and refactored inside an intermediate CRTP later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
constexpr size_t bitmap_buffer_index = 0; | ||
SPARROW_ASSERT_TRUE(arrow_proxy.buffers().size() > bitmap_buffer_index); | ||
const auto bitmap_size = arrow_proxy.length() + arrow_proxy.offset(); | ||
return bitmap_type(arrow_proxy.buffers()[bitmap_buffer_index].data(), bitmap_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implemetation is common to all the array classes but the dictionary encoded layout. I think it could be factorize into a free function somewhere (probably in the array_base.hpp
file, inside the detail
namespace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are more layouts with a special bitmap btw.
The run-end-encoded and the sparse and dense union also need a special bitmap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
ca5d541
to
29e41c5
Compare
No description provided.