-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow all characters in runfile source and target paths #23331
base: master
Are you sure you want to change the base?
Conversation
d4842e1
to
8a03d1b
Compare
@lberki I assigned you as reviewer for your general knowledge of runfiles, but feel free to redirect if needed. |
CC @meteorcloudy for Windows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this is great!
@@ -166,18 +166,6 @@ private DetailedExitCode checkCwdInWorkspace(EventHandler eventHandler) { | |||
} | |||
|
|||
Path workspacePath = workspace.getWorkspace(); | |||
// TODO(kchodorow): Remove this once spaces are supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the runfiles path the only reason we didn't support space in workspace path? /cc @lberki
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see missing escaping in the test setup shell script as one other source of issues. When the rest of this PR and the general approach look good to you, I could run Bazel's test suite under a path with a space to shake out these issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, @fmeum 's approach is reasonable; spaces in workspace paths is the kind of thing that could break in unexpected places. Yes, we should properly quote everywhere, but that doesn't mean that we do if the functionality is untested.
Hi @meteorcloudy, Since i can see that this PR has been approved, Please let us know whether gTech team should proceed with importing it. Thanks! |
@fmeum: is the aim to get this into Bazel 8.0.0? |
In principle, adding the functionality to encode spaces in runfiles manifests is fine. I'm not sure, though, how I feel about this particular one because it's tailored to spaces and only spaces. How about implementing a scheme that could be more easily extended? Spitballing, something like this: if a line in the runfiles manifest starts with a space, the algorithm to parse the line is as follows: the separator is between the two paths is a space, end of the line is a newline, any other character is a file name. If a file name contains a space, a newline or a backslash, these need to be escaped as This is easy to implement, provably supports everything is just as incompatible with the existing system and looks more conventional than the encoding proposed in this pull request. @fmeum WDYT? It looks like if we go through the pain of adding syntax to runfiles manifests, we could as well make it future-proof. I could also be convinced to make them some subset of JSON so that parsing them is possible with standard libraries but also with random logic so that one doesn't need to add additional dependencies. |
It lacks one important property, but we can also fix that: With this scheme in its current form, the Bash runfiles setup snippet as well as third-party runfiles libraries would need to be modified to continue to support manifest entries in which the rlocationpath contains no spaces, but the target path does (common on Windows). We can fix that by not escaping spaces in the target paths and declaring that the first space on a line that doesn't start with a space is the separator. We would then only add a space at the beginning of a line if either the rlocationpath contains a space or newline or if the target path contains a newline. At that point it's similar to the scheme in this PR, it just uses escape characters instead of length encoding. I agree that escaping is more conventional and probably not much harder to implement in most languages (it does change the length of the strings vs. just taking a substring, but that seems okay). @lberki What do you think, should I go ahead and implement the modified version of your scheme? |
605c2b5
to
34d1f84
Compare
I implemented the scheme without support for newlines for now. They could always be supported in a follow-up, which is much simpler with the new scheme (just introduce a new escape character everywhere). |
9a141b6
to
0d7a3cc
Compare
The modified scheme looks reasonable. I'm travelling today so I can't review the pull request but will do so as soon as I can. The main question in my mind is whether we do some sort of custom encoding (like this one) or go with a standard one. Of the standard ones, JSON seems to be the most palatable given the use case (simple list of key/value pairs intended to be parseable without any dependencies), but I'm not sure if it's worth the hassle and if it is, we are probably better off having an additional file called |
6384db4
to
ebf84cd
Compare
@lberki This is ready for review again, I fixed all test cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat improbably, I actually found a bug!
# with a space and spaces, newlines, and backslashes have to be escaped as | ||
# \s, \n, and \b. | ||
if [[ "$1" == *" "* || "$1" == *$'\n'* ]]; then | ||
local search_prefix=" $(echo -n "$1" | sed 's/\\/\\b/g; s/ /\\s/g')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you forgot to also escape newlines here (but you check whether there are newlines in the runfiles path in the line just above?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newlines are escaped in the line below. Escaping newlines with sed
requires commands other than s
, which I found to be too hard to read. I could replace the sed
usage with Bash variable substitution entirely, if you prefer that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. I overlooked that. You're good to go.
echo >&2 "INFO[runfiles.bash]: rlocation($1): looking for prefix ($prefix)" | ||
fi | ||
if [[ "$prefix" == *" "* || "$prefix" == *$'\n'* ]]; then | ||
search_prefix=" $(echo -n "$prefix" | sed 's/\\/\\b/g; s/ /\\s/g')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also escape newlines here
# with a space and spaces, newlines, and backslashes have to be escaped as | ||
# \s, \n, and \b. | ||
if [[ "$1" == *" "* || "$1" == *$'\n'* ]]; then | ||
local search_prefix=" $(echo -n "$1" | sed 's/\\/\\b/g; s/ /\\s/g')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. I overlooked that. You're good to go.
This is a bit of a doozy -- Google disallows the usage of |
8775fde
to
818e9af
Compare
@Wyverald Done |
818e9af
to
b621e65
Compare
b621e65
to
4629915
Compare
Adds support for spaces and newlines in source and target paths of runfiles symlinks to
build-runfiles
as well as to the Bash, C++, and Java runfiles libraries (the Python runfiles library has moved out of the repo).If a symlink has spaces or newlines in its source path, it is prefixed with a space in the manifest and spaces, newlines, and backslashes in the source path are escaped with
\s
,\n
, and\b
respectively. This scheme has been chosen as it has the following properties:build-runfiles
, but were usable on Windows and supported by the official runfiles libraries. This also ensures that the initialization snippet for the Bash runfiles library doesn't need to change, even when used on Unix with a space in the output base path.Fixes #4327
RELNOTES: Bazel now supports all characters in the rlocation and target paths of runfiles and can be run from workspaces with a space in their full path.