[FEA] JNI support get map value for all value types and for all basic key types #9109

Closed
revans2 opened this issue Aug 24, 2021 · 5 comments · Fixed by #10380
Labels: feature request (New feature or request), Java (Affects Java cuDF API), Spark (Functionality that helps Spark RAPIDS)

Comments

@revans2
Contributor

revans2 commented Aug 24, 2021

Is your feature request related to a problem? Please describe.
We would like to be able to support getting map values on more than just string-to-string maps. cuDF does not officially support maps, and in JNI we have a standalone special API just for string-to-string get map values. This request is to update that existing code, unless we can add some list API operations to support it more generally.

Describe the solution you'd like
Update the existing JNI cuDF code for get map values to be more generic. Longer term, support more list operations so we can do this correctly.

Describe alternatives you've considered
It would be great to have a list_position, which would return the position of the first element in a list that is equal to a value provided as either a scalar or a column vector.

We could then build map get from this: pull the keys out of the map and call list_position on them, then pull out the values list column and call extract_list_element on it with the result of list_position.
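
The composition described above can be sketched on the host with plain standard-library containers. This is only a model of the idea, not libcudf code: `list_position` and `extract_list_element` below are hypothetical host-side stand-ins (the names follow the comment above, the signatures are assumptions), a list column is modeled as a vector of rows, and a null is modeled as `std::nullopt`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Host-side model (assumption): a list column is a vector of rows,
// and each row is a vector of elements.
template <typename T>
using list_column = std::vector<std::vector<T>>;

// Hypothetical stand-in for the proposed list_position: for each row, the index
// of the first element equal to `key`, or nullopt if the row does not contain it.
template <typename K>
std::vector<std::optional<std::size_t>> list_position(list_column<K> const& lists, K const& key)
{
  std::vector<std::optional<std::size_t>> positions;
  positions.reserve(lists.size());
  for (auto const& row : lists) {
    std::optional<std::size_t> found;
    for (std::size_t i = 0; i < row.size(); ++i) {
      if (row[i] == key) {
        found = i;
        break;
      }
    }
    positions.push_back(found);
  }
  return positions;
}

// Hypothetical stand-in for extract_list_element taking one index per row.
// A missing or out-of-range index yields a null (nullopt) for that row.
template <typename V>
std::vector<std::optional<V>> extract_list_element(
  list_column<V> const& lists, std::vector<std::optional<std::size_t>> const& positions)
{
  assert(lists.size() == positions.size());  // one index per row
  std::vector<std::optional<V>> out;
  out.reserve(lists.size());
  for (std::size_t r = 0; r < lists.size(); ++r) {
    auto const& pos = positions[r];
    if (pos.has_value() && *pos < lists[r].size()) {
      out.push_back(lists[r][*pos]);
    } else {
      out.push_back(std::nullopt);
    }
  }
  return out;
}

// Map get = list_position on the keys, then extract_list_element on the values.
template <typename K, typename V>
std::vector<std::optional<V>> map_get(list_column<K> const& keys,
                                      list_column<V> const& values,
                                      K const& key)
{
  return extract_list_element(values, list_position(keys, key));
}
```

With keys `[[1,2], [1,3], [2,3,4]]` and values `[[10,20], [100,300], [2000,3000,4000]]`, `map_get(keys, values, 1)` gives 10, 100, and a null for the third row, which has no key 1.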

@revans2 revans2 added feature request New feature or request Needs Triage Need team to review and classify labels Aug 24, 2021
@sameerz sameerz changed the title [FEA] JNI support get map value for ally value types and for all basic key types [FEA] JNI support get map value for all value types and for all basic key types Aug 24, 2021
@beckernick beckernick added libcudf Affects libcudf (C++/CUDA) code. Java Affects Java cuDF API. and removed Needs Triage Need team to review and classify labels Aug 26, 2021
@beckernick beckernick removed the libcudf Affects libcudf (C++/CUDA) code. label Aug 26, 2021
@mythrocks mythrocks self-assigned this Aug 30, 2021
@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@revans2
Contributor Author

revans2 commented Nov 16, 2021

We are exploring doing this a different way, but for now we still want it.

@revans2
Contributor Author

revans2 commented Dec 21, 2021

Same comment as before

@sameerz sameerz added the Spark Functionality that helps Spark RAPIDS label Jan 12, 2022
@mythrocks
Contributor

mythrocks commented Jan 12, 2022

I've bumped this to 22.04 for now.
Most of the machinery required to replace the current map_lookup kernel is in place.
I'll post a separate PR for the map_view when viable. Once that's available, we can begin the assembly.

rapids-bot bot pushed a commit that referenced this issue Mar 14, 2022
…0380)

Fixes #9109.

This commit adds a `map` abstraction over a `column_view` of type `LIST<STRUCT<K,V>>`, where `K` and `V` are key and value types. A list column of structs with two members may thus be viewed as a `map` column. 

`maps_column_view` is to a `LIST<STRUCT<K,V>>` column what `lists_column_view` is to a `LIST` column.

The `maps_column_view` abstraction provides methods to fetch lists of keys and values (as `LIST<K>` and `LIST<V>` respectively). It also provides map lookup methods to find the values corresponding to a specified key, for each row in the "map" column.

E.g.
```c++
auto input_column = get_list_of_structs_col();
// input_column == [ {1:10, 2:20}, {1:100, 3:300}, {2:2000, 3:3000, 4:4000} ];

auto maps_view = cudf::jni::maps_column_view{input_column->view()};
auto keys = maps_view.keys();     // keys   == [ {1,2},   {1,3},      {2,3,4} ];
auto values = maps_view.values(); // values == [ {10,20}, {100, 300}, {2000, 3000, 4000} ];

auto lookup_1 = maps_view.get_values_for( *make_numeric_scalar(1) );
// lookup_1 == [ 10, 100, null ];  one value per row, null where key 1 is absent
```
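
To make the semantics of the example concrete, here is a minimal host-side model of the same abstraction using only the standard library. It is not the `cudf::jni::maps_column_view` implementation: the `host_maps_view` type, its member signatures, and the use of `std::optional` for nulls are assumptions made for illustration. It returns the value of the first matching key in each row, which is also an assumption, since the commit message does not say how duplicate keys are handled.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Assumed host-side stand-in for a LIST<STRUCT<K,V>> column:
// one vector of (key, value) pairs per row.
template <typename K, typename V>
struct host_maps_view {
  std::vector<std::vector<std::pair<K, V>>> rows;

  // Keys of every row, analogous to a LIST<K> column.
  std::vector<std::vector<K>> keys() const
  {
    std::vector<std::vector<K>> out;
    for (auto const& row : rows) {
      std::vector<K> ks;
      for (auto const& kv : row) { ks.push_back(kv.first); }
      out.push_back(std::move(ks));
    }
    return out;
  }

  // Values of every row, analogous to a LIST<V> column.
  std::vector<std::vector<V>> values() const
  {
    std::vector<std::vector<V>> out;
    for (auto const& row : rows) {
      std::vector<V> vs;
      for (auto const& kv : row) { vs.push_back(kv.second); }
      out.push_back(std::move(vs));
    }
    return out;
  }

  // One result per row: the value for `key`, or nullopt if the row lacks it.
  // Picks the first match in a row (an assumption, see the note above).
  std::vector<std::optional<V>> get_values_for(K const& key) const
  {
    std::vector<std::optional<V>> out;
    for (auto const& row : rows) {
      std::optional<V> hit;
      for (auto const& kv : row) {
        if (kv.first == key) {
          hit = kv.second;
          break;
        }
      }
      out.push_back(hit);
    }
    assert(out.size() == rows.size());  // exactly one result per input row
    return out;
  }
};
```

Filling `rows` with the three maps from the example and calling `get_values_for(1)` yields 10, 100, and an empty optional, matching `lookup_1`.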

This abstraction should help replace the Java/JNI `map_lookup` and `map_contains` kernels, which only handle `MAP<STRING, STRING>`.

Authors:
  - MithunR (https://github.com/mythrocks)

Approvers:
  - Jason Lowe (https://github.com/jlowe)
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Nghia Truong (https://github.com/ttnghia)
  - Jake Hemstad (https://github.com/jrhemstad)

URL: #10380