Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10804: [Rust] Removed some unsafe code from the parquet crate #8829

Closed
wants to merge 4 commits into from
Closed

ARROW-10804: [Rust] Removed some unsafe code from the parquet crate #8829

wants to merge 4 commits into from

Conversation

jorgecarleitao
Copy link
Member

@jorgecarleitao jorgecarleitao commented Dec 4, 2020

This PR removes an unsafe code by its safe counterpart.

@github-actions
Copy link

github-actions bot commented Dec 4, 2020

@jorgecarleitao jorgecarleitao deleted the remove_unsafe branch December 4, 2020 07:41
@jorgecarleitao jorgecarleitao restored the remove_unsafe branch December 4, 2020 07:41
@jorgecarleitao jorgecarleitao reopened this Dec 4, 2020
@jorgecarleitao
Copy link
Member Author

@jhorstmann and @Dandandan , I notice that you are quite experienced with these types of problems by your PRs.

I am trying to remove this FatPtr with a safe counter part, which is blocking the build of #8796 by making this code safe and not use pointers from the buffers, but I am struggling a lot here (as you can see by the red on the CI 😭), and I am a bit blocked. I would be grateful for your help here.

def_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
rep_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
values_buf.to_slice_mut(),
def_levels,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error originates from here, the error shows that it is moved here. A clone likely "solves" the issue? Usually taking the value by & in the function also solves this as it will stop it from taking ownership?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have some time tomorrow to check out the code to see what can be done about it otherwise

@Dandandan
Copy link
Contributor

@jorgecarleitao Did not find an easy fix yet. What works, though ugly, is using the self.def_levels again in the two if branches.

image

@jorgecarleitao
Copy link
Member Author

Did not find an easy fix yet. What works, though ugly, is using the self.def_levels again in the two if branches.

image

Thanks for the help. I tried that, but it is causing a double mutable borrow on my code. Maybe I am doing something wrong. Could you push to the change to a branch in your repo and share it with me?

@jorgecarleitao
Copy link
Member Author

@Dandandan , your tip helped. Thanks a lot! It compiles now. The semantics are still wrong and I need to figure out why.

@Dandandan
Copy link
Contributor

@jorgecarleitao great to hear! I saw it depends at least on changing the value of values_written but hope it helps in finding a solution 👍

@codecov-io
Copy link

codecov-io commented Dec 12, 2020

Codecov Report

Merging #8829 (55974f6) into master (0c8b990) will increase coverage by 0.00%.
The diff coverage is 88.88%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8829   +/-   ##
=======================================
  Coverage   76.77%   76.77%           
=======================================
  Files         181      181           
  Lines       41009    40990   -19     
=======================================
- Hits        31485    31472   -13     
+ Misses       9524     9518    -6     
Impacted Files Coverage Δ
rust/parquet/src/arrow/record_reader.rs 96.25% <88.88%> (+1.71%) ⬆️
rust/parquet/src/encodings/encoding.rs 92.99% <0.00%> (-0.18%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c8b990...55974f6. Read the comment docs.

@jorgecarleitao jorgecarleitao changed the title ARROW-10804: Removed unsafe from parquet, causing a compilation error. ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail Dec 12, 2020
@jorgecarleitao
Copy link
Member Author

jorgecarleitao commented Dec 12, 2020

@nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.

The background of this change is that FatPtr is using Buffer::capacity and Buffer::raw_data() which is brittle and difficult to reason because a pointer is only valid within a container's len, not capacity. This is also leading to the error that the rust compiler safeguard us against when borrowing a mutable and immutable reference at the same time.

This is blocking #8796

I verified that in all 4 tests the content passed to ColumnReaderImpl::read_batch is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other invariant beyond the ones that we are telling the compiler about through the typing and lifetime system.

@nevi-me
Copy link
Contributor

nevi-me commented Dec 12, 2020

@nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.

Hey Jorge, I've been out of town, got back this morning. I'm going to catch up on PRs this weekend. I'll have a look at this first.

@nevi-me
Copy link
Contributor

nevi-me commented Dec 12, 2020

Hi @jorgecarleitao, I also spent most of today looking into this :(

The problem was a bit vs byte issue on consume_record_data and consume_rep_levels, so what was mostly needed was a fresh pair of eyes. I opened https://github.com/jorgecarleitao/arrow/pull/22. I didn't address the rustfmt lint, so you could see what I changed.

I verified that in all 4 tests the content passed to ColumnReaderImpl::read_batch is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other invariant beyond the ones that we are telling the compiler about through the typing and lifetime system.

This was hidden away by the way that we compare arrays. Because we only print the first 10 & last 10 values of arrays, the issue wasn't visible, but comparing ArrayData surfaced the issue with the failing tests. I had to change the test utilities that generate data to return sequential values instead of randomly generated that. Then I was able to see why consume_record_data was filling the first 12 and last 12 slots with data (then the part that we don't print was just 0s).

I also changed the code slightly to rather initialise the buffers with let new_buffer = MutableBuffer::new(0) because we always perform an allocation on new_buffer.resize(num_bytes) as we never get the initial allocation right anyways.

Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback

rust/parquet/src/arrow/record_reader.rs Outdated Show resolved Hide resolved
rust/parquet/src/arrow/record_reader.rs Outdated Show resolved Hide resolved
@jorgecarleitao jorgecarleitao changed the title ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail ARROW-10804: Removed some unsafe code from the parquet crate Dec 12, 2020
@jorgecarleitao jorgecarleitao marked this pull request as ready for review December 13, 2020 06:34
@alamb
Copy link
Contributor

alamb commented Dec 13, 2020

FWIW I am not sure but this PR may conflict with #8698

@nevi-me
Copy link
Contributor

nevi-me commented Dec 13, 2020

@alamb they fortunately don't seem to conflict

Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kou kou changed the title ARROW-10804: Removed some unsafe code from the parquet crate ARROW-10804: [Rust] Removed some unsafe code from the parquet crate Dec 13, 2020
@nevi-me nevi-me requested a review from sunchao December 15, 2020 13:06
@alamb
Copy link
Contributor

alamb commented Dec 15, 2020

I just merged this code to master locally to test the effects of #8698 -- I am happy to report, as @nevi-me said, the tests still pass just fine

alamb pushed a commit that referenced this pull request Dec 16, 2020
I would like to propose that we outline and enforce guidelines on the arrow crate implementation with respect to the usage of `unsafe`.

The background of this proposal are PRs #8645 and #8829. In both cases, while addressing an unrelated issue, they hit undefined behavior (UB) due to an incorrect usage of `unsafe` in the code base. This UB was very time-consuming to identify and debug: combined, they accounted for more than 12hs of my time.

Safety against undefined behavior is the core premise of the Rust language. In many cases, the maintenance burden (time to find and fix bugs) does not justify the performance improvements and the decrease in motivation in handling them (they are just painful due to how difficult they are to debug). In particular, IMO those 12 hours would have been better spent in other parts of the code if `unsafe` would have not been used in the first place, which would have been likely translated in faster code or more features.

There are situations where `unsafe` is necessary, and the guidelines outline these cases. However, I also see many uses of `unsafe` that are not necessary nor properly documented.

The goal of these guidelines is to motivate contributors of the Rust implementation to be conscious about the maintenance cost of `unsafe`, and outline specific necessary conditions for any new `unsafe` to be introduced in the code base.

Closes #8901 from jorgecarleitao/arrow_unsafe

Lead-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
@jorgecarleitao jorgecarleitao deleted the remove_unsafe branch January 28, 2021 18:16
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
This PR removes an unsafe code by its safe counterpart.

Closes apache#8829 from jorgecarleitao/remove_unsafe

Lead-authored-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Neville Dipale <nevilledips@gmail.com>
Signed-off-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
I would like to propose that we outline and enforce guidelines on the arrow crate implementation with respect to the usage of `unsafe`.

The background of this proposal are PRs apache#8645 and apache#8829. In both cases, while addressing an unrelated issue, they hit undefined behavior (UB) due to an incorrect usage of `unsafe` in the code base. This UB was very time-consuming to identify and debug: combined, they accounted for more than 12hs of my time.

Safety against undefined behavior is the core premise of the Rust language. In many cases, the maintenance burden (time to find and fix bugs) does not justify the performance improvements and the decrease in motivation in handling them (they are just painful due to how difficult they are to debug). In particular, IMO those 12 hours would have been better spent in other parts of the code if `unsafe` would have not been used in the first place, which would have been likely translated in faster code or more features.

There are situations where `unsafe` is necessary, and the guidelines outline these cases. However, I also see many uses of `unsafe` that are not necessary nor properly documented.

The goal of these guidelines is to motivate contributors of the Rust implementation to be conscious about the maintenance cost of `unsafe`, and outline specific necessary conditions for any new `unsafe` to be introduced in the code base.

Closes apache#8901 from jorgecarleitao/arrow_unsafe

Lead-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants