-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
listdir() doesn't work with non-trivial symlinks #57981
Comments
Note how _getfinalpathname works and calling listdir on the final path name also works: >>> os.symlink(".\\test", "Lib\\bar")
>>> os.listdir("Lib\\bar")[:4]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NotADirectoryError: [Error 267] The directory name is invalid: 'Lib\\bar\\*.*'
>>> nt._getfinalpathname("Lib\\bar")
'\\\\?\\C:\\t\\pathlib\\Lib\\test'
>>> os.listdir(nt._getfinalpathname("Lib\\bar"))[:4]
['185test.db', 'audiotest.au', 'autotest.py', 'badcert.pem'] Calling listdir on the non-final extended path doesn't work: >>> os.listdir('\\\\?\\C:\\t\\pathlib\\Lib\\bar')[:4]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NotADirectoryError: [Error 267] The directory name is invalid: '\\\\?\\C:\\t\\pa
thlib\\Lib\\bar\\*.*' But open() works: >>> open('Lib\\bar\\regrtest.py')
<_io.TextIOWrapper name='Lib\\bar\\regrtest.py' mode='r' encoding='cp1252'> |
I think this is because "Lib\\bar" is NOT being created as a directory symlink, but rather as a regular one. As such, the documentation for symlink where it states the optional |
Confirmed (on latest py33 build): >>> os.listdir('Lib\\bar')[:4]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NotADirectoryError: [Error 267] The directory name is invalid: 'Lib\\bar\\*.*'
[61658 refs] ... after manually deleting the file-symlink >>> os.symlink('.\\test', 'Lib\\bar', target_is_directory=True)
[61658 refs]
>>> os.listdir('Lib\\bar')[:4]
['185test.db', 'audiotest.au', 'autotest.py', 'badcert.pem']
[61666 refs] |
Ha! I didn't even know about that option. Thanks for noticing.
I don't know if auto-detection is a good idea. Of a course, from a Unix |
I agree that it's probably not a good idea. Windows users should be aware of symlink limitation on the platform to begin with, IMHO. Besides, keeping the detection behaviour in light of this defect would be adding unnecessary complexity to the code. Currently, the To make it aware of its path relativity, in the context of the target, is much harder than just changing the documentation. ;-) |
Ah, sorry, I had misunderstood your comment. Indeed, os.symlink |
Another issue with the detection scheme is that it's not atomic: there can be a race condition between the GetFileAttributesExW() and CreateSymbolicLinkW() calls. I propose applying the following patch to 3.2 and 3.3 (+ doc fix, not included in the patch). |
Looks good to me. |
New changeset 839fa289e226 by Antoine Pitrou in branch '3.2': New changeset a7406565ef1c by Antoine Pitrou in branch 'default': |
Should be fixed now! |
I apologize I missed this issue when it arose, so my comments come late. The reason for inferring the directory status of the targets was so that use of os.symlink(link, target) would behave much the same on Unix as on Windows for the most common use cases. By requiring the directory status to be supplied when calling os.symlink, it makes the function barely portable (it becomes only portable for symlinks to files, not directories). As was indicated by Larry Hastings in bpo-14917, there's an expectation that Python could easily detect the directory status of the target. There was apparently a bug in the earlier detection code for directories, which doesn't exist in the reference implementation (done in ctypes): PS C:\Users\jaraco\projects\jaraco.windows> mkdir -p Lib/test/foo
PS C:\Users\jaraco\projects\jaraco.windows> python
Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from jaraco.windows import filesystem as fs
>>> fs.symlink('.\\test', 'Lib/bar')
>>> import os
>>> os.listdir('Lib/bar')
['foo'] I suspect this bug crept in as we worked through the various challenges with directory detection at that low level of Python. As you can see, the reference implementation is clean an straightforward and correct. In my opinion, it would be worthwhile restoring the directory detection and creating tests to capture cases where detection fails, rather than removing the implementation. I've been using the ctypes implementation for some time, and having automatic directory detection is a huge benefit to portability and simplicity of use. It was pointed out that there is a race condition, and theoretically, that is true, but I believe this not to be a problem because the automatic detection is a best-effort. It's used to guess the best possible directory status for the symlink. If there's a directory there one millisecond, then the target is then removed, and the symlink is created as a directory symlink, that's most probably still what the user would have wanted. As a heavy user of symlinks in Windows, I would strongly prefer inference of directory status. Of course, I can always continue to use the ctypes implementation in my environments, but I imagine that other potential users would feel the same as I do and would appreciate a more portable implementation. Without the directory feature, many uses of os.symlink are not portable and will fail (with ugly results) on Windows. With the directory detection feature, most (if not all) will simply work as expected. Removal of this feature is a regression of the intended functionality, and I'd like it to be considered to be restored. |
Le jeudi 26 juillet 2012 à 19:12 +0000, Jason R. Coombs a écrit :
The target_is_directory argument is supposed to be "supported" (i.e. I'm not against restoring detection, but it should be non-buggy and |
I've started work on restoring the directory detection in my bitbucket repo (https://bitbucket.org/jaraco/cpython-issue13772). I have added a regression test to capture the basic failure (where the link is not created in the current working directory) as well as a fix. The fix uses the Microsoft Shell Lightweight Utility Functions which has one benefit of being tested and robust, but has two disadvantages, namely:
Interestingly, it does detect the '/' character as a separator - it just doesn't treat it as one when stripping the trailing path. Given these disadvantages, I'm inclined to write custom functions to support the directory detection. Any suggestions are appreciated. |
I tried the create-patch function, but it didn't work against bitbucket (produced an empty diff). So I'm attaching a diff explicitly. |
Please disregard the patch 8bf88b31ebb7. Antoine, Brian, or anyone else - would you please review the attached patch (b744517b90bc.diff)? It adds a test for creating directory symlinks that are non-local (don't depend on the current directory for reference), backs out Antoine's changes, and replaces them with directory detection that fixes this issue while maintaining directory detection. I'm not super-confident about the implementation of the path-manipulation functions, and I would prefer to use the Python implementations of the path-manipulation (dirname and join) instead. If there are any suggestions in this regard, I'd appreciate them. That said, the tests now pass for me on Windows (and I don't see why the tests wouldn't also pass on Linux). |
From an implementation standpoint, I would indeed prefer the path handling From a principle standpoint, I'm not sure it's a good idea for os.symlink() Also, since in general you always know whether you're making a link to a |
Thanks for the suggestions.
Is there an example of how a function like win32_symlink could call into ntpath.dirname()? I believe the reason we did not invoke ntpath was due to a possible race condition where the Python interpreter might not be fully started before os.symlink is invoked. Perhaps there are mitigating factors, but I'm not familiar enough with the Python internals to know what's safe and what is not (or even how to invoke that behavior).
I agree there exists a small possibility of a race condition here. I believe this race condition is of little concern to the vast majority of use-cases. For those use cases where a race condition might be a factor, the target_is_directory parameter may be supplied by that code to overcome the condition. Furthermore, since the Windows APIs don't provide a way to match the symlink creation with its target, it's not possible to avoid this race condition.
For the vast majority of use cases, target detection will behave much like symlinks do in Unix, which allows os.symlink to be used portably without specifically accounting for Windows behavior. I'm not as concerned about Windows programmers using Python (who might not mind specifying the directory-ness of their targets), but Python programs written by Unix developers running on Windows. In the latter use-case, it's much harder to have those programs adopt an additional parameter to support Windows. I could imagine some library maintainers rejecting a patch that includes target_is_directory just because it's not elegant and seems unnecessary from a Unix programmer's perspective. For that use case and in order to support the general expectation that os.symlink should provide the most uniform behavior it can, the directory detection is highly valuable. Indeed, both your initial usage of the function and Larry Hastings' initial reaction betray the expectation that os.symlink should create a directory-type symlink when the target is a directory. I don't see how it's not a huge plus to support a developer's expectations when possible. Personally, I would prefer to not have the parameter at all. Unfortunately, to support cases where one might need to create directory symlinks to non-existent targets, the parameter is necessary. However, I'd like to limit its required usage to only the cases where it's necessary.
I'd like to focus on restoring detection, which was vetted and approved in the original implementation. Obviously, I can't guarantee that it doesn't have additional bugs. Provably-correct code is expensive and way outside the scope of this effort. As you could see from your patch, there were tests capturing the desired behavior. That test has been restored. Additionally, this patch includes a new test to capture the case of a non-local link (a behavior that was never tested for Unix symlinks). |
Since there's been no additional criticism or discussion on this matter, I'll plan to commit the proposed patch to restore target directory detection. Since the functionality was removed in a bugfix release, it should be acceptable to restore it in a bugfix release. I know 3.2 no longer gets non-security releases, so I'll port the patch to 3.3 and apply it to 3.3 and default. I'll plan to do this tomorrow unless there's further discussion. |
Ugh. So it seems the 3.3 implementation of win_symlink, which is now been renamed to posix_symlink, is now much more complicated. Not only does it handle symlink arguments on Unix platforms, providing abstraction for a couple of underlying system calls, but the Windows implementation also supports once again wide and narrow characters for the path, which means that the directory detection functions I created are no longer suitable as they also need wide and narrow versions. |
New changeset 29a2557d693e by Jason R. Coombs in branch '3.3': |
New changeset f431cd0edd85 by Victor Stinner in branch 'default': |
New changeset c351591f1f63 by Victor Stinner in branch 'default': |
@jason R. Coombs: Can you please review my following commit? It looks like you made a copy/paste failure in _check_dirA() :-)
|
@victor Stinner Thanks for fixing this. You're absolutely right. |
New changeset e024236ea253 by Victor Stinner in branch 'default': New changeset d9f3ea27f826 by Victor Stinner in branch 'default': |
New changeset c8212fca8747 by Victor Stinner in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: