Incorrect normalization of POSIX paths with exactly two leading slashes #51345
Comments
The same specification says in Section 4.13:
However, to the best of my knowledge, only very few POSIX-like platforms take advantage of this ancient edge case (Cygwin, IBM z/OS, ...). There are also some applications that treat such paths differently than the underlying operating system (e.g., Blender). |
The case is not ancient, and it is especially relevant to Windows enterprise-level development, where these POSIX paths represent UNC paths. The case is indeed supported correctly in Cygwin and similar tools, while it is true that tools under Unix for the most part ignore it (notably, with the exception of bash): which is also partly due to the fact that initial versions of the POSIX spec were quite cryptic on that specific point. [P.S. Which is also the reason why lots of references online are to "4.13 Pathname Resolution", but conformance to "3.271 Pathname", i.e. to the very format, is what our issue is about.] But now the spec is clear as well as the use case, and I'd propose that either 1) we fix normalization in 'path/posix', or 2) a note is added to the docs to say that the implementation is conformant with POSIX except on that point. -- Where the latter, IMO, would only make sense for backward compatibility, otherwise the fix itself is rather simple. As to the need to take some action here, please consider this scenario: I am a developer, my app is required to support only POSIX paths (in config files and internally), and now I need either a link to an issue here or a link to the docs to tell my customer that, at least for the time being, the system has that specific limitation (and that they'll have to either map UNC paths to local drives to make it work, or add coming up with a conformant implementation to the backlog...). |
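As a rough illustration of option 1, the existing normalization could be wrapped so that exactly two leading slashes survive. This is only a sketch and not the Node.js implementation: the helper name `normalizeKeepDoubleSlash` is hypothetical, and it assumes the current `normalize` collapses the leading slashes as reported in this issue.

```javascript
const { posix } = require('node:path');

// Hypothetical helper, not part of Node.js: keeps exactly two leading
// slashes, which POSIX allows to be interpreted in an
// implementation-defined manner.
function normalizeKeepDoubleSlash(p) {
  const normalized = posix.normalize(p);
  // "Exactly two" means the path starts with '//' but not with '///'.
  const hasExactlyTwo = p.startsWith('//') && !p.startsWith('///');
  // Restore the second slash only if normalize() dropped it.
  if (hasExactlyTwo && !normalized.startsWith('//')) {
    return '/' + normalized;
  }
  return normalized;
}

console.log(normalizeKeepDoubleSlash('//alpha/beta'));
console.log(normalizeKeepDoubleSlash('///alpha/beta'));
```

Three or more leading slashes are still collapsed to one, so only the two-slash case changes behavior.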
My bad, I didn't realize Cygwin was still present in production environments - as far as I can tell, its popularity has been decreasing rapidly for many years. On top of that, Windows UNC paths have sadly always been a mess... Anyway, I guess not changing WDYT @nodejs/path @nodejs/platform-zos? |
This issue does not need Cygwin or else to occur, and I have even offered a concrete scenario. And have you read that UNC is deprecated or obsolete? A mess is non-conformance! Then I must stress again that the issue here is conformance to the POSIX *path format*, not conformance of some tools / conformance of path resolution -- not even of 'node:path/posix' at that, at least not insofar as I suppose once
I don't think I/we need anything more to support the present case, but if you still think those two references are relevant, please at least provide links and a hint as to why. |
I never said that the issue is limited to Cygwin, nor that UNC is deprecated or obsolete. I even agreed that changing the behavior of Node.js appears to be in line with the POSIX standard. (However, given that Node.js has likely exhibited this behavior for a long, long time and I don't recall similar bug reports in recent years, I'd assume that this is a rather rare issue on modern systems — which doesn't mean that it shouldn't be addressed, but it probably makes it low-priority for most folks.)
I notified relevant teams that might have additional insights or potentially even an interest in implementing the suggested change, but since you seem displeased with that, please feel free to open a pull request yourself. |
You seem displeased, I am fine with that. |
IMO it makes sense to do both, but in the other order: first a PR to document the limitation that can be backported to current release lines, and after that a second one that makes the implementation compliant with the spec can land as semver-major.
PRs welcome! |
Given I am not particularly familiar with node's history and statistics, the only thing I am still a bit worried about is backward compatibility, but if you guys think that is not an issue, I'd definitely second the approach you have described.
At the moment I happen not to have much time, but if this is still open and unassigned by the end of the month, I will gladly try and give a hand... |
The |
Hi guys, I am having second thoughts on this fix, as meanwhile I am finding more non-conformities:
And I'd think the backward-compatibility impact becomes more and more significant, also considering that:
In light of the above, thinking overall a PR and how extensive the changes seem to be, and mainly for concerns of backward compatibility, I'm rather thinking of the following approach:
I am in fact already getting into doing 1) (and I'd make sure I share it under a suitable license): anyway it starts with formalizing the specification, which is an initial step I'd think is needed in any case. But it might take a while... Your thoughts? |
Each change would need to be discussed separately IMO; if there are more bugs, they can be fixed later or documented as known deviations from the spec. If you send a PR to fix the double slash bug, it will likely get accepted. If you prefer to invest your time in a rewrite, that's fine too, but consider that it's less likely to get accepted because, as you said, backward compatibility will be a concern. |
Honestly, I'd advise against that approach, in this specific case (I think it makes things more difficult, not less, especially transition-wise), as well as more generally (see this post of mine for some reasons): but I understand this is not the place to discuss methodological issues, nor would I actually have any problem with a piecemeal approach on your side. In any case, I'd start by writing down a formal specification of the protocol we are implementing, in the simplest scenario to be kept as a comment in the code. Also, to be clear and upfront, my personal task is to have a "conformant path module", not just a single issue fixed. Moreover, once I do it, I must do it in the context of "verified software" (with Coq specifically): so my personal plan is to start by formalizing the spec to then, eventually, generate from it the JavaScript... Whether any of that might be of interest to the 'node' project is not for me to say (though I'd hope so, at least in the long term: that some components come from a "verified pipeline"). In any case, I will be sharing my work to the extent possible: e.g. I will share here (if the case is still open) the formal specification of the protocol in question (which, to my present understanding, is "POSIX Pathnames plus support for Windows paths"). |
I think For now, I think we should document that the |
PR-URL: nodejs#51513 Refs: nodejs#51345 Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com> Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Version
v20.10.0
Platform
Microsoft Windows NT 10.0.19045.0 x64
Subsystem
lib/path.js ("node:path/posix")
What steps will reproduce the bug?
When I normalize a POSIX path with 'node:path/posix', if the path contains exactly two leading slashes, I'd expect the two leading slashes NOT to be collapsed to one, as per the POSIX spec (see [1]), but the opposite appears to happen.
Here is an example with CMD on Win 10:
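The original CMD session is not reproduced here. A minimal script along these lines should show the same behavior; the variable names are illustrative, and the logged value is the one reported in this issue for v20.10.0:

```javascript
// Load the POSIX flavor of the path module explicitly, so the result
// does not depend on the host platform (Windows in this report).
const path = require('node:path/posix');

const input = '//alpha/beta';
const result = path.normalize(input);

// Reported in this issue: '/alpha/beta' (leading slashes collapsed),
// whereas the reporter expects '//alpha/beta'.
console.log(result);
```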
How often does it reproduce? Is there a required condition?
Always.
What is the expected behavior? Why is that the expected behavior?
Expected result is '//alpha/beta', i.e. the two leading slashes are NOT collapsed to one.
What do you see instead?
Actual result is '/alpha/beta', where the two leading slashes have been collapsed to one.
Additional information
[1] See POSIX.1-2017 / Base Definitions / 3.271 Pathname, which states: