IOError: [Errno 22] when running files with non-ascii characters. #206

Closed
karthiknadig opened this issue May 29, 2019 · 6 comments
Labels
bug Something isn't working

Comments

@karthiknadig
Member

Environment data

  • PTVSD version: master
  • OS and version: Windows
  • Python version (& distribution if applicable, e.g. Anaconda): 2.7.16
  • Using VS Code or Visual Studio:

Additional info:
File System encoding: mbcs
Default encoding: ascii
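
For reference, the two encodings listed above can be read from the `sys` module. This is a minimal sketch; `encoding_info` is a hypothetical helper name, and the values it prints depend on the interpreter and platform (the issue reports `mbcs` and `ascii` for Python 2.7 on Windows).

```python
import sys


def encoding_info():
    """Return the interpreter's filesystem and default string encodings."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "default": sys.getdefaultencoding(),
    }


info = encoding_info()
print("File System encoding: %s" % info["filesystem"])
print("Default encoding: %s" % info["default"])
```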

Actual behavior

This error is shown:

Traceback (most recent call last):
  File "c:/Users/kanadig/.vscode/extensions/ms-python.python-2019.5.17059/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "C:\GIT\ptvsd\src\ptvsd\__main__.py", line 434, in main
    run()
  File "C:\GIT\ptvsd\src\ptvsd\__main__.py", line 312, in run_file
    runpy.run_path(target, run_name='__main__')
  File "C:\Python27\lib\runpy.py", line 251, in run_path
    code = _get_code_from_file(path_name)
  File "C:\Python27\lib\runpy.py", line 227, in _get_code_from_file
    with open(fname, "rb") as f:
IOError: [Errno 22] invalid mode ('rb') or filename: 'c:\\GIT\\pyscratch2\\??\\experiment.py'

Expected behavior

Should launch the file, as in Python 3.*

Steps to reproduce:

  1. Create a file with path C:\测试\experiment.py
  2. Open the directory in VS Code
  3. Start debugging.
@int19h int19h transferred this issue from microsoft/ptvsd May 4, 2020
@int19h
Contributor

int19h commented May 4, 2020

Recent fixes to encoding issues and argument quoting might have fixed this.

@int19h int19h added the bug Something isn't working label Jun 19, 2020
@fabioz
Collaborator

fabioz commented Nov 12, 2020

I can still reproduce this.

@fabioz fabioz self-assigned this Nov 27, 2020
@fabioz
Collaborator

fabioz commented Nov 27, 2020

I'll start to take a look at this.

@fabioz
Collaborator

fabioz commented Nov 28, 2020

I investigated it a bit more and it seems that Python 2.7 itself is broken in this regard.

Further info:

In Python 2.7 it's not even possible to use subprocess.Popen with a unicode path. For instance, running the code below produces the following output:

  File "W:\pydev.debugger\check\snippet5.py", line 27, in <module>
    subprocess.Popen([sys.executable, code_to_debug])
  File "C:\bin\Miniconda\envs\py27_tests\lib\subprocess.py", line 394, in __init__
    errread, errwrite)
  File "C:\bin\Miniconda\envs\py27_tests\lib\subprocess.py", line 644, in _execute_child
    startupinfo)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 68-69: ordinal not in range(128)

---
C:\bin\Miniconda\envs\py27_tests\python.exe: can't open file 'W:\pydev.debugger\check\??\experiment.py': [Errno 22] Invalid argument

Code:

# coding: utf-8
import os.path
import subprocess
import sys

unicode_chars = u"测试"

tmpdir = os.path.abspath(os.path.dirname(__file__))

directory = os.path.join(tmpdir.decode('utf-8'), unicode_chars)

try:
    # Note: accepts unicode on Python 2.
    os.makedirs(directory)
except OSError:
    pass  # The directory most likely exists already.

code_to_debug = os.path.join(directory, u"experiment.py")
with open(code_to_debug, "w") as stream:
    stream.write(
        """
print('launched')
"""
    )

try:
    subprocess.Popen([sys.executable, code_to_debug])
except:
    import traceback;traceback.print_exc()

sys.stderr.write('\n---\n')
try:
    subprocess.Popen([sys.executable, code_to_debug.encode(sys.getfilesystemencoding())])
except:
    print('Unable to launch as mbcs')
    import traceback;traceback.print_exc()

This happens because the Win32 APIs that Python 2.7 uses to open files and start processes aren't Unicode-compatible. There are workarounds, such as calling CreateProcessW directly (for instance: https://gist.github.com/vaab/2ad7051fc193167f15f85ef573e54eb9).
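
As a rough illustration of the CreateProcessW workaround, the sketch below calls the wide-character API through ctypes. It is not the code from the linked gist. `launch_with_create_process_w` is a hypothetical helper, it only works on Windows, and it assumes the child should inherit no handles and use default startup settings.

```python
import subprocess
import sys


def launch_with_create_process_w(args):
    """Start a process through CreateProcessW and return its process id.

    Hypothetical helper for illustration; Windows only, minimal error handling.
    """
    if sys.platform != "win32":
        raise NotImplementedError("CreateProcessW is only available on Windows")

    import ctypes
    from ctypes import wintypes

    class STARTUPINFOW(ctypes.Structure):
        _fields_ = [
            ("cb", wintypes.DWORD),
            ("lpReserved", wintypes.LPWSTR),
            ("lpDesktop", wintypes.LPWSTR),
            ("lpTitle", wintypes.LPWSTR),
            ("dwX", wintypes.DWORD),
            ("dwY", wintypes.DWORD),
            ("dwXSize", wintypes.DWORD),
            ("dwYSize", wintypes.DWORD),
            ("dwXCountChars", wintypes.DWORD),
            ("dwYCountChars", wintypes.DWORD),
            ("dwFillAttribute", wintypes.DWORD),
            ("dwFlags", wintypes.DWORD),
            ("wShowWindow", wintypes.WORD),
            ("cbReserved2", wintypes.WORD),
            ("lpReserved2", ctypes.POINTER(ctypes.c_ubyte)),
            ("hStdInput", wintypes.HANDLE),
            ("hStdOutput", wintypes.HANDLE),
            ("hStdError", wintypes.HANDLE),
        ]

    class PROCESS_INFORMATION(ctypes.Structure):
        _fields_ = [
            ("hProcess", wintypes.HANDLE),
            ("hThread", wintypes.HANDLE),
            ("dwProcessId", wintypes.DWORD),
            ("dwThreadId", wintypes.DWORD),
        ]

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateProcessW.argtypes = [
        wintypes.LPCWSTR, wintypes.LPWSTR, ctypes.c_void_p, ctypes.c_void_p,
        wintypes.BOOL, wintypes.DWORD, ctypes.c_void_p, wintypes.LPCWSTR,
        ctypes.POINTER(STARTUPINFOW), ctypes.POINTER(PROCESS_INFORMATION),
    ]
    kernel32.CreateProcessW.restype = wintypes.BOOL
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]

    startup_info = STARTUPINFOW()
    startup_info.cb = ctypes.sizeof(STARTUPINFOW)
    process_info = PROCESS_INFORMATION()
    # CreateProcessW may modify the command line in place, so pass a writable buffer.
    command_line = ctypes.create_unicode_buffer(subprocess.list2cmdline(args))

    ok = kernel32.CreateProcessW(
        None, command_line, None, None, False, 0, None, None,
        ctypes.byref(startup_info), ctypes.byref(process_info)
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())

    pid = process_info.dwProcessId
    # The handles are not needed here, so release them right away.
    kernel32.CloseHandle(process_info.hThread)
    kernel32.CloseHandle(process_info.hProcess)
    return pid
```

A call such as launch_with_create_process_w([sys.executable, code_to_debug]) would start the child without going through the ANSI code page. The sketch does not wait for the child or redirect its streams, which a real replacement for subprocess.Popen would need to do.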

So, out of the box, Python 2.7 cannot launch a file whose path contains Unicode characters that the machine's current code page cannot represent.

Now, what should be possible is launching paths whose Unicode characters are representable on the current machine. For instance, setting unicode_chars = u"á" in the code above does work on my machine, where the default locale encoding is cp1252, as long as the Popen args are encoded with the filesystem encoding.
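
To make the distinction concrete, here is a minimal sketch of encoding a path for Popen while refusing paths that the target encoding cannot represent. `encode_path_for_launch` is a hypothetical helper, and `cp1252` is passed explicitly so the example behaves the same on any machine; on Windows one would pass `sys.getfilesystemencoding()` instead. The round-trip check is there because an encoder may substitute `?` rather than raise, which matches the `??` seen in the tracebacks.

```python
# -*- coding: utf-8 -*-


def encode_path_for_launch(path, encoding):
    """Encode a unicode path, rejecting it if any character would be lost."""
    encoded = path.encode(encoding, "replace")
    # Unrepresentable characters become '?', so the round trip no longer matches.
    if encoded.decode(encoding, "replace") != path:
        raise ValueError("path is not representable in %s" % encoding)
    return encoded


# Representable in cp1252: the encoded bytes can be handed to Popen.
print(repr(encode_path_for_launch(u"C:\\á\\experiment.py", "cp1252")))

# Not representable in cp1252: rejected instead of silently turning into '??'.
try:
    encode_path_for_launch(u"C:\\测试\\experiment.py", "cp1252")
except ValueError as exc:
    print("rejected: %s" % exc)
```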

So, I'm working on fixing this use case (but not the general case of arbitrary Unicode characters).

@int19h
Contributor

int19h commented Dec 3, 2020

Yep, that's the expectation in general for anything Win32 and Python 2.7 related that deals with filenames or the console; best case, with everything set up correctly, it should support whatever the "non-Unicode language" is in Windows settings.

I'm not entirely sure, but it might actually be possible to get full Unicode support in 2.7 by setting the locale to UTF-8, since it's an option in Win10 - and maybe even prior to that, via chcp 65001.
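
As a small illustration of why the active code page matters, the sketch below checks whether the folder name from the repro steps can be encoded under a few code pages. `representable` is a hypothetical helper; cp1252 (Western European) and cp936 (Simplified Chinese) stand in for two possible "non-Unicode language" settings. It only shows which encodings can hold the characters and does not verify how Python 2.7 behaves under a UTF-8 system locale.

```python
# -*- coding: utf-8 -*-


def representable(text, code_page):
    """Return True if every character of text can be encoded in code_page."""
    try:
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


folder = u"测试"
for code_page in ("cp1252", "cp936", "utf-8"):
    print("%s: %s" % (code_page, representable(folder, code_page)))
```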

@fabioz
Collaborator

fabioz commented Dec 3, 2020

As a note, I've been able to make the debugger work, but making the test for it is taking a bit longer than I thought.

fabioz added a commit to fabioz/debugpy that referenced this issue Dec 4, 2020
fabioz added a commit to fabioz/debugpy that referenced this issue Dec 4, 2020
fabioz added a commit to fabioz/debugpy that referenced this issue Dec 4, 2020
fabioz added a commit to fabioz/debugpy that referenced this issue Dec 10, 2020
@fabioz fabioz closed this as completed in f9b54cd Dec 10, 2020