Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdenv/setup.sh: fix consumeEntire in bash5 #108101

Closed

Conversation

matthewbauer
Copy link
Member

Unfortunately, bash5 has changed how read -N 0 works, meaning that
consumeEntire no longer works as expected. Compare:

$ nix shell github:nixos/nixpkgs\?ref=nixos-20.03#bash -c bash -c 'read -N 0 contents < /bin/sh; echo $?'
1
$ nix shell github:nixos/nixpkgs\?ref=nixos-20.03#bash_5 -c bash -c 'read -N 0 contents < /bin/sh; echo $?'
0

Bash 5 no longer reaches EOF when nchars=0. I could not find this
documented anywhere.

The best substitute I can find is to provide the max read nchars
value (2147483647). This appears to have the same effect as nchars=0
in practice.

Unfortunately, bash5 has changed how read -N 0 works, meaning that
consumeEntire no longer works as expected. Compare:

$ nix shell github:nixos/nixpkgs\?ref=nixos-20.03#bash -c bash -c 'read -N 0 contents < /bin/sh; echo $?'
1
$ nix shell github:nixos/nixpkgs\?ref=nixos-20.03#bash_5 -c bash -c 'read -N 0 contents < /bin/sh; echo $?'
0

Bash 5 no longer reaches EOF when nchars=0. I could not find this
documented anywhere.

The best substitute I can find is to provide the max read nchars
value (2147483647). This appears to have the same effect as nchars=0
in practice.
@matthewbauer matthewbauer changed the base branch from master to staging January 1, 2021 06:15
@ofborg ofborg bot added the 6.topic: stdenv Standard environment label Jan 1, 2021
@abathur
Copy link
Member

abathur commented Jan 1, 2021

Not that it really adds much here, but I looked this up in the bash source:

@@ -711,7 +711,7 @@ substituteStream() {

consumeEntire() {
# read returns non-0 on EOF, so we want read to fail
if IFS='' read -r -N 0 $1; then
if IFS='' read -r -N 2147483647 $1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment on this magic number perhaps?

@abathur
Copy link
Member

abathur commented Jan 1, 2021

I wonder (but have not extensively tested) if you can do something like read -d $'\x04' (source)? Not that you wouldn't still need to comment on a magical value... :)

Comment on lines +715 to 716
if ! IFS='' read -r -d $'\x04' $1; then
echo "consumeEntire(): ERROR: Input null bytes, won't process" >&2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t actually reject null bytes as claimed, nor does it preserve them, since Bash is not capable of storing a null byte in a variable. How about -d ''?

@happysalada
Copy link
Contributor

Please forgive me for coming a little late to the party.
I just gave a go at upgrading to bash5 and ran into this particular issue.
I was trying to understand the reason of existence of consumeEntire wouldn't something like $(<$input) work as well ?
(my understanding is that it just wants to read the content of the file. I don't understand what the problem with null bytes could come from)
I'm going to test running it tonight to see if it changes much.

The other solution that was mentioned in this thread is using the -d ''. I'm not sure how it affects the null byte problem since I don't understand it in the first place.

I understand if you are busy. No need to feel any pressure to reply to this.

Thanks for documenting this in the first place! (I was going insane through the changelogs trying to find why the behavior changed).

@andersk
Copy link
Contributor

andersk commented Aug 2, 2021

@happysalada "$(<"$input")" would strip trailing whitespace (we want to preserve it), and would drop everything after a null byte (we want to at least detect this).

@happysalada
Copy link
Contributor

Thank you for the explanation!
I'm trying out the solution you suggested with the -d ''

The idea behind the nullbyte thing is that perhaps there are hooks provided by users which might contain a null byte ?
Is that correct?
On a first look it seemed to me that we are only getting the input from files that we know about, I couldn't figure out in which case there could be a null byte.

@andersk
Copy link
Contributor

andersk commented Aug 2, 2021

@happysalada This is used in the implementation of substitute, substituteInPlace, substituteAll, and substituteAllInPlace, which in turn are used in the implementation of thousands of packages in nixpkgs and many packages outside nixpkgs. Although it is a mistake to use these on binary files, we want to detect this mistake, not just allow it to silently corrupt your files.

@happysalada happysalada mentioned this pull request Aug 2, 2021
11 tasks
@happysalada
Copy link
Contributor

@andersk Thank you for the clarification!
So the intent is to fail on a binary, it wasn't clear to me just from the function name.
Would something like isutf8 from moreutils be more appropriate ?
I'm not sure it's a great idea to add another dependency, but if it make the intent clearer it might be a good idea.

also if anybody is interested, I'm trying the bash upgrade in this PR #131681
(I'm not saying it will succeed :-) but I want to try)

@andersk
Copy link
Contributor

andersk commented Aug 3, 2021

Would something like isutf8 from moreutils be more appropriate ?

Pure Bash is better, both for performance, and for minimizing unnecessary rebuilds—we wouldn’t want every moreutils upgrade to trigger a full rebuild of every package. And we only care about null bytes, which Bash is unable to handle correctly; we don’t care about invalid UTF-8, which it doesn’t have any particular issues with.

@happysalada
Copy link
Contributor

@andersk Thank you for your answer. It makes sense

Just a small update on this. The bash_5 update is stuck with an error that doesn't make sense to me (one of the nix tests failing). I'm going to try to get some more information on it.
If I fail to do that in a timely manner, I will batch this fix with other proposed shell improvements and submit it to the hydra team for testing, in the hope of getting this merged before the bash_5 upgrade.

@happysalada
Copy link
Contributor

This was just merged to staging in #133495
Thank you again for making this PR and those who reviewed it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants