Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-16575: Add checks for unions passed by value to functions. #16799

Merged
merged 1 commit into from
Oct 31, 2019
Merged

bpo-16575: Add checks for unions passed by value to functions. #16799

merged 1 commit into from
Oct 31, 2019

Conversation

vsajip
Copy link
Member

@vsajip vsajip commented Oct 15, 2019

@vsajip
Copy link
Member Author

vsajip commented Oct 15, 2019

Branch was pushed to buildbots - there were no test_ctypes failures in the corresponding test runs.

@vsajip
Copy link
Member Author

vsajip commented Oct 26, 2019

As the buildbots were happy and there seem to be no takers for reviews, I plan to merge this in the next day or two. Please review if you have any concerns.

@vsajip vsajip merged commit 79d4ed1 into python:master Oct 31, 2019
@vsajip vsajip deleted the fix-16575 branch October 31, 2019 08:03
@miss-islington
Copy link
Contributor

Thanks @vsajip for the PR 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @vsajip for the PR 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @vsajip, I could not cleanly backport this to 3.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 79d4ed102a5069c6cebaed2627cb1645637f0429 3.7

@miss-islington
Copy link
Contributor

Sorry @vsajip, I had trouble checking out the 3.8 backport branch.
Please backport using cherry_picker on command line.
cherry_picker 79d4ed102a5069c6cebaed2627cb1645637f0429 3.8

@bedevere-bot
Copy link

GH-17016 is a backport of this pull request to the 3.8 branch.

@bedevere-bot
Copy link

GH-17017 is a backport of this pull request to the 3.7 branch.

vsajip added a commit that referenced this pull request Oct 31, 2019
vsajip added a commit that referenced this pull request Oct 31, 2019
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
@Alan-R
Copy link

Alan-R commented Dec 31, 2019

The changes associated with this PR caused most non-trivial programs that use ctypesgen to fail. I'm not quite sure why this is, but it's pretty clearly the result. See ctypesgen/ctypesgen#77 for more details.

@kdschlosser
Copy link

kdschlosser commented Jan 8, 2020

This problem is more wide spread.

There are a plethora of Windows API functions that have structures with unions contained within passed to them. This change has caused a good portion of the Windows API to not be accessible using ctypes. it has also rendered comtypes pretty much useless.= comtypes is a popular library for accessing COM interfaces. The VARIANT structure which gets used a lot with COM interfaces contains unions and it gets passed to many functions/methods in the Windows API directly. Now ending up with a traceback.

There should have been solutions provided before this bugfix was ever added. and a warning should have been thrown instead of a traceback for a couple of minor version bumps before fully putting in place the fix. This would provide time needed in order to update code so things would not just break all of a sudden.

But seeing as how you have gone unanswered for 7 days, I am thinking that nothing is going to get done about it.. when it becomes more of a widespread issue as people start to use the newer version of Python there will be more people making noise about it.

what bothers me the most is this "bug" has been here for a really long time.. I have only come up with the single issue report of a problem and that was 4 years ago when it was made. maybe it is not a bug that should have been fixed as it did not cause widespread issues like the fix is going to cause

@Alan-R
Copy link

Alan-R commented Jan 8, 2020

It's my understanding that this simply rejected some things that didn't work. It's quite apparent that it's flagged a lot of things that do work. Ctypesgen is by far the best way to use ctypes, and it's rendered it useless.

@kdschlosser
Copy link

same with comtypes for accessing Windows COM interfaces.
I would say the patch needs to be removed. there are more people having issues now then before this patch was applied.
If you look at the issue opened in their issue tracker there was not a single test performed using any other OS except Linux... I do not know what OS you are running but this sure has made a mess of accessing the Windows API using ctypes. my application relies heavily on accessing the Windows API.. Now it looks like an extension module is going to need to be compiled for all of those pieces.. I wonder if simply using cython on the code would solve the issue.

@vsajip
Copy link
Member Author

vsajip commented Jan 9, 2020

If you look at the issue opened in their issue tracker there was not a single test performed using any other OS except Linux

Where did you look? Tests were run on multiple OSs, not just Linux. There's a whole fleet of buildbots which exercise all the tests.

But seeing as how you have gone unanswered for 7 days, I am thinking that nothing is going to get done about it

We're pretty much all volunteers working on Python, and we've been in a holiday period in large parts of the world. We work on this stuff as and when we can ... if you look at the Python issue, it points to a real problem with how libffi handles (or rather, doesn't handle) unions in structs. Only passing by value should be affected, not passing pointers to structs containing unions - as most OS APIs use. Perhaps if a Python issue could be opened with more specific information, that would help to identify specific problems - rather than just "remove the patch", which was put in for a valid reason.

There are a plethora of Windows API functions that have structures with unions contained within passed to them.

Since there's a plethora, it should be possible for you to produce a small problem script which fails, but shouldn't, and which is easy to reason about. That would certainly increase the chance of identifying what the problem is. ISTM that it should be less work than converting a project to Cython.

@kdschlosser
Copy link

yeah a plethora. really simple to test. install comtypes and access a windows COM object that uses the VARIANT structure.. There is a bunch of them..

Any function that take a structure vy value and that struture has a Union in it is not going to work any more. as well as passing a structure by value that has a bitfield correct???

Here this may put it into a better perspective...
https://github.com/kdschlosser/pyWinAPI

That is not a library that "runs" It is a reference. that is 750,000 lines of the 6 million lines of code that make up the Windows SDK. go poking through and see how many structures there are with bitfields and Unions. It is just going to give you a cope of how much of the Windows API does this and I don't even have 1/6th of the code ported. Now I know that all of them do not get passed by value to a function. But say maybe 10% of them do. a very large chunk of the Windows API is now broken..

Asd fgar as the tests are concerned.. I was talking about tests for validating the bug.. every single test that was done to validate there was in fact an issue was only done on a Linux OS and always a Union with exactly 11 fields and every single field was a ctypes.c_double. No one ever tested a "real world" scenario.. the VARIANT Structure is one of the most used structures in Windows. It is massive in size and no one ever had an issue with it.. millions of calls with this thing being passed and no problems. no errors..

I have attached a code example of what the VARIANT structure is.. It's close to 900 lines for a single structure.
variant.zip

If you look at that repo I posted open the file um/oaidl_h.py.. in there you will see but only a very small number of functions/methods that make use of the structure. some will be by value.. You would have to search the project to locate all of them and again It is not completely coded. and that is only 1 structure there are many more that are going to have the same issue.

Also do you bot tests specifically access the Windows API at all?? do they create COM objects and run tests specifically to check ctypes functionality in a real world setting?? or does it run the tests against a dll that is purpose built for making sure ctypes is working properly.. Because by definition your tests passed like they should. code was added to reject unions being passed by value. I am fairly sure no one looked at or even asked "what is this going to break in peoples programs??" What uses ctypes very heavily and might pass unions as values....

The validity of the bug was not tested properly. There is a very large number of Python users that use Windows.. Probably the majority since Windows does have 79.24% of the market share. but not one single issues reported by a windows user for this problem... Not a single test to see if the problem existed on Windows..

This is troubleshooting 101. this is the first thing that gets done. see if the problem exists across the board or is it specific to an OS or an architecture... I am not sure what happened with this bug, everyone is usually on the ball and a great deal of effort gets put into verifying the existence of a bug and dialing in what it affects and what is the actual cause. This process is a huge pain in the @^$ to do. I give you guys all the credit for doing it.. Now I know that there can still be issues even if this process was done..

I do not think anyone is going to care really that this was changed.. where the issue lies is there is no way to get around it.. None that has been documented, there have been several people that have been looking at finding a solution with none to be had thus far..

@vsajip
Copy link
Member Author

vsajip commented Jan 10, 2020

Also do you bot tests specifically access the Windows API at all?? do they create COM objects and run tests specifically to check ctypes functionality in a real world setting?

No. It's a volunteer project. No one gets paid to do any of this, so they scratch their own itches. Many people who run into a problem post a specific small script which demonstrates the problem, to help volunteers with limited time to home in on the problem. This is not the same as saying "there are lots of examples out there, pick one".

None that has been documented, there have been several people that have been looking at finding a solution with none to be had thus far

If someone posts to the issue a Python script using ctypes and demonstrating a specific Windows call which can't be accessed now when it should be, and could be before the check was added, then I can take a look at it.

@Alan-R
Copy link

Alan-R commented Jan 10, 2020

It might be good to include a test that uses ctypesgen as an integration test. When I get this next Assimilation release out in a week or two, I'll try and create a mini-test that reproduces the problem and one that is based off the more complex output of ctypesgen. Unfortunately, this bug limits my next release to only work with Python 3.6 :-(.

But what is clear is that the patch you intended to catch only one specific case that didn't work, prohibits a huge number of cases that worked just fine before. Those cases didn't work before, and they don't work now - a very minimal payoff.

Another project that hasn't complained yet (but likely will) is the Grass GIS project. They are based on a fork of ctypesgen. But the behavior of ctypesgen Strings goes back to the oldest versions of ctypesgen.

Whether you have a simple case that demonstrates the problem or not, it's really clear that this patch does more harm than good, and should be rolled back.

@vsajip
Copy link
Member Author

vsajip commented Jan 10, 2020

I'd like to see if there's something in the failing cases that would allow me to modify my change such that the original problems are stopped but the now-failing-though-seemingly-valid cases are let through. This breakage of packages using ctypesgen is regrettable, and was of course not intended. Is it really the case that this is a really widespread problem, but no-one can come up with a simple example which should work but doesn't?

@Alan-R
Copy link

Alan-R commented Jan 10, 2020

The reason why you're asking us to make a simple test case is because it takes time to come up with a good test case. I get that. Until I get this Assimilation open source release out, I just don't have the 2-4 hours it would take for me to produce such a really nice, simple, high-quality test case. If you'd broken all the 3.x releases, I wouldn't have any choice but to stop and do it right now. But I don't have to fix it right now, so it'll have to wait for a few more days. My urgency is that the Assimilation project is currently based on Python 2.7. So, that's gotta get fixed now.

I understand about volunteering for open source projects. I've been doing open source work since 1999, and have founded and led a few projects - one of which was quite successful.

@kdschlosser
Copy link

Here is a library that is encountering problems with this change.

https://github.com/pywinauto/pywinauto

I have asked one of the developers to chime in here as they may be able to provide an example..
My use case is the same as theirs with using comtypes and accessing Windows COM objects. It would take me a while to come up with a simple example. My use case would also require Windows to be running in order to run the example. I do not know if this is something that is available to you.

@kdschlosser
Copy link

kdschlosser commented Jan 10, 2020

Now I know this has nothing to do with the problem at hand.. But I need to express my disapproval/disappointment from this comment

No. It's a volunteer project. No one gets paid to do any of this, so they scratch their own itches.

The projects I work on are also volunteer. The above comment DOES NOT apply. If we support an OS that OS gets tested

79.25% of all computers in the world run Windows. Windows users are what comprise the majority of Python users I am sure of that.. this is with or without them having knowledge of running python due to being compiled into an application using py2exe or the like.

To not have tests that automatically get run to make sure things work properly on Windows is bonkers, to also not verify the validity of a reported bug using a Windows machine if the test can be run on a Windows machine is also bonkers.. Windows users = Python's largest number of users.. To not have any consideration for them in terms of making sure things are running properly.. I can't even wrap my head around why this is not being done.

Python gets used buy fortune 500 companies and the single user alike. To have a statement like the above does not instill faith in using Python. Here are some interpretations of that statement
"We do not take into consideration how changes made will effect the companies and the people that use Python",
It also can be read as
"bah.. I don't fell like testing it. I am just going to add it to a release and see what happens..".

Now that is simply mind blowing.. WOW.

I must have made a mistake in thinking that Python was built/maintained with better ideals and standards then what Microsoft uses.. What a horrible letdown...

also being a volunteer and not getting paid has not a thing to do with it. It's PRIDE it's your pride and the pride of the others that volunteer, You do get paid. you get paid when a user expresses their gratitude. that feeling you get inside from that taking place is why you do it. You are not volunteering because of the kindness in your heart.. trust you me.. No one does. There is always something gotten, there is always a payment made. It may not be in a physical form, but that does not make it any less of a payment.

I know this feeling. I deal with the above day in and day out.. I head up an open source project that has close to 200,000 users. Anyone who volunteers or helps out has to follow only 2 "rules"

  1. The users are most important component. How a change is going to effect them is the top priority.
  2. If you lose sight of rule 1 then you need to spend more time interacting with the users.

I believe that those rules should be used by any project. Open Source or Paid.. does not matter. I believe that fundamentally the volunteers do not want to release bad code. Purpose for doing gets replaced with complacency and unwillingness to put in 100% to make sure it is as close as they can get to being bug free code. Sometimes a break is needed.. other times a reminder of the people effected is needed. Burnout is a real thing. It can and will cause poor decision making to occur.

If any of the python developers have lost sight of why they are volunteering then they are not interacting with the "little" people enough.. the users.. not the companies but the single person that is having a problem. these are the people that will express the kind of gratitude that brings the reason why back into focus. The reason to do their very best and to make sure that it is done the right way.

@vasily-v-ryabov
Copy link

Hi @vsajip the example is as simple as possible (x86_64, Windows 10 version 1903, Python 3.8.1 64-bit):

import comtypes.client
comtypes.client.GetModule('UIAutomationCore.dll')

The traceback:

Traceback (most recent call last):
  File "C:\Python38_1_64\lib\ctypes\__init__.py", line 123, in WINFUNCTYPE
    return _win_functype_cache[(restype, argtypes, flags)]
KeyError: (<class 'ctypes.HRESULT'>, (<class 'comtypes.automation.tagVARIANT'>, <class 'comtypes.LP_POINTER(IDispatch)'>), 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python38_1_64\lib\site-packages\comtypes\client\_generate.py", line 110, in GetModule
    mod = _CreateWrapper(tlib, pathname)
  File "C:\Python38_1_64\lib\site-packages\comtypes\client\_generate.py", line 184, in _CreateWrapper
    mod = _my_import(fullname)
  File "C:\Python38_1_64\lib\site-packages\comtypes\client\_generate.py", line 24, in _my_import
    return __import__(fullname, globals(), locals(), ['DUMMY'])
  File "C:\Python38_1_64\lib\site-packages\comtypes\gen\_944DE083_8FB8_45CF_BCB7_C477ACB2F897_0_1_0.py", line 1433, in <module>
    IAccessible._methods_ = [
  File "C:\Python38_1_64\lib\site-packages\comtypes\__init__.py", line 329, in __setattr__
    self._make_methods(value)
  File "C:\Python38_1_64\lib\site-packages\comtypes\__init__.py", line 698, in _make_methods
    prototype = WINFUNCTYPE(restype, *argtypes)
  File "C:\Python38_1_64\lib\ctypes\__init__.py", line 125, in WINFUNCTYPE
    class WinFunctionType(_CFuncPtr):
TypeError: item 1 in _argtypes_ passes a union by value, which is unsupported.

comtypes generates some wrapper code around mentioned UIAutomationCore.dll in the cache folder: C:\Python38_1_64\lib\site-packages\comtypes\gen. Making the example more simple (I mean eliminating comtypes as a dependency) would take some time.

comtypes declaration of VARIANT is here:
https://github.com/enthought/comtypes/blob/master/comtypes/automation.py#L141

Microsoft documentation for VARIANT:
https://docs.microsoft.com/en-us/windows/win32/api/oaidl/ns-oaidl-variant

P.S. Agree with @kdschlosser . We're all volunteers working on open source projects. But I wouldn't pay further attention to this topic and suggest to better concentrate on the technical problem investigation. It's normal situation when some fix has broken a lot of end user codes. CPython is just much more popular than any of Python libraries. So I can understand @vsajip too. Sometimes it's hard to predict real impact of bugfix even if you're testing it pretty deep way. So let's check "what could be done" rather than "what was wrong".

@airelil
Copy link

airelil commented Jan 11, 2020

If it can be helpful, these are other reports about the same problem, but on Linux:
waveform80/picamera#604
hcpl/xkbgroup#15 and enkore/i3pystatus#771

@remdragon
Copy link

Can't even get to first base with ADODB.Recordset:

    rs = comtypes.client.CreateObject ( 'ADODB.Recordset' )
  File "C:\Python38-32\lib\site-packages\comtypes\client\__init__.py", line 250, in CreateObject
    return _manage(obj, clsid, interface=interface)
  File "C:\Python38-32\lib\site-packages\comtypes\client\__init__.py", line 188, in _manage
    obj = GetBestInterface(obj)
  File "C:\Python38-32\lib\site-packages\comtypes\client\__init__.py", line 110, in GetBestInterface
    mod = GetModule(tlib)
  File "C:\Python38-32\lib\site-packages\comtypes\client\_generate.py", line 110, in GetModule
    mod = _CreateWrapper(tlib, pathname)
  File "C:\Python38-32\lib\site-packages\comtypes\client\_generate.py", line 184, in _CreateWrapper
    mod = _my_import(fullname)
  File "C:\Python38-32\lib\site-packages\comtypes\client\_generate.py", line 24, in _my_import
    return __import__(fullname, globals(), locals(), ['DUMMY'])
  File "C:\Python38-32\lib\site-packages\comtypes\gen\_B691E011_1797_432E_907A_4D8C69339129_0_6_1.py", line 65, in <module>
    Fields15_Deprecated._methods_ = [
  File "C:\Python38-32\lib\site-packages\comtypes\__init__.py", line 329, in __setattr__
    self._make_methods(value)
  File "C:\Python38-32\lib\site-packages\comtypes\__init__.py", line 698, in _make_methods
    prototype = WINFUNCTYPE(restype, *argtypes)
  File "C:\Python38-32\lib\ctypes\__init__.py", line 125, in WINFUNCTYPE
    class WinFunctionType(_CFuncPtr):
TypeError: item 1 in _argtypes_ passes a union by value, which is unsupported.
C:\>python
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

C:\>pip freeze
altgraph==0.17
bcrypt==3.1.7
beautifulsoup4==4.8.2
bs4==0.0.1
certifi==2019.11.28
cffi==1.13.2
chardet==3.0.4
colorama==0.4.3
comtypes==1.1.7
coverage==5.0.3
cryptography==2.8
Cython==0.29.14
Deprecated==1.2.7
deprecation==2.0.7
dnspython==1.16.0
future==0.18.2
idna==2.8
mock==3.0.5
mypy==0.761
mypy-extensions==0.4.3
numpy==1.18.1
packaging==20.1
paramiko==2.7.1
pefile==2019.4.18
Pillow==7.0.0
py==1.8.1
pycparser==2.19
pyflakes==2.1.1
PyInstaller==3.6
pyinstrument==3.1.0
pyinstrument-cext==0.2.2
PyMySQL==0.9.3
PyNaCl==1.3.0
pyodbc==4.0.28
pyparsing==2.4.6
python-dateutil==2.8.1
pytz==2019.3
pyvmomi==6.7.3
pywin32==227
pywin32-ctypes==0.2.0
requests==2.8.1
requests-mock==1.7.0
six==1.14.0
soupsieve==1.9.5
termcolor==1.1.0
typed-ast==1.4.1
typing-extensions==3.7.4.1
tzlocal==2.0.0
urllib3==1.25.8
wrapt==1.11.2
wxPython==4.0.7.post2

@vsajip
Copy link
Member Author

vsajip commented Feb 13, 2020

Well, this change has now been reverted.

@Alan-R
Copy link

Alan-R commented Apr 9, 2020

What versions of python have the fixed (reverted/correct) version of this code?

@vsajip
Copy link
Member Author

vsajip commented Apr 11, 2020

The reverting change was committed on Jan 12, 2020. Python versions 3.7.7 (released Mar 10, 2020) and 3.8.2 (released Feb 24, 2020) should have the reversion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip news type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants