Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Wheel pip regressions #656

Merged
merged 14 commits into from
Jul 10, 2019
Merged

[Fix] Wheel pip regressions #656

merged 14 commits into from
Jul 10, 2019

Conversation

lambdaclan
Copy link
Contributor

@lambdaclan lambdaclan commented Jun 28, 2019

Description

With the latest release of rez 2.33.0 (2019-06-26) and the merge of #602 wheel distribution based installations have now first class support in rez. Furthermore also added is the update of distlib to version 0.29 (#654 ) and a version conversion method pip to rez compatible based on PEP440. These features have introduced a few hard to detect regressions (mostly on Windows). This PR aims to fix these issues and any others that pop up as more feedback is being received from the users who detect them.

Relates:

Remarks

  • I will keep editing this PR as more issues are brought to my attention.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Issues

1. Executable scripts (bin files) are not properly set as tools in package.py (fixed)

This is Windows specific issue. The reason for this regression is the the RECORD file generated during a wheel distribution installation lists the paths to the installed paths using UNIX paths. This causes the bin scripts to not be properly processed on Windows based systems because the check for bin scripts
incorporates os.sep which on Windows evaluates to \\. Although the package is correctly installed, the tools list in the package.py does not include any of the bin scripts but other incorrect source files (different issue see below).

References:

Installing pep8

bin

2. The check whether or not a file is an executable is insufficient (fixed)

This is another Windows related issue. In attempt to simplify the code the bin script handling was slightly adjusted but this has caused a regression on Windows marking many irrelevant files as
executables hence placing them under the tools list in the generated package.py. This is now handled by adding an extra check ensuring that the file marked as a tool resides in the bin directory as well.

References:

Installing PyInstaller

image

3. The pip to rez version conversion can cause issues for non canonical/local package versions (needs discussion)

We need to decide which parts of the canonical public version and local version segments are handled by rez and how to convert them accordingly. The new version of rez has added a strict check where a canonical public version or local version is converted to a rez compatible version as follows:

The canonical public version should follow the scheme below.

[N!]N(.N)*[{a|b|rc}N][.postN][.devN]

Epoch segment: N! - skipped rez does not use version epochs
Release segment: N(.N)* used as is
Pre-release segment: {a|b|rc}N converted to lowercase
Post-release segment: .postN converted to lowercase
Development release segment: .devN converted to lowercase

The local version should follow the scheme below.

<public version identifier>[+<local version label>]

The version is parsed using the REGEX provided by PEP440 this means that is a different word is used for pre, post,dev a version such as 1.6.8.mv3 will fail because it is not a valid public or local version but 1.6.8+mv3 will work.

Notice that local version use the + sign that has a different meaning in rez, this can potentially cause issues as now package vray3.6+ will be considered a local version.

Example:

A quick look at the version example 1.6.8.mv3 running the canonical check

is_canonical("1.6.8")

True

is_canonical("1.6.8.mv3")

False

I was not aware that non canonical versions can exist but this can potentially cause a lot of issues and needs to be handled. My implementation closely follows PEP440 as I had in mind mostly the python packages from PyPI.

a = _regex.match("1.6.8")
a
<_sre.SRE_Match object at 0x7fc8694022c0>
a.groupdict()
{'pre': None, 'dev_l': None, 'dev_n': None, 'post_l': None, 'pre_n': None, 'pre_l': None, 'dev': None, 'epoch': None, 'post_n1': None, 'post_n2': None, 'release': '1.6.8', 'post': None, 'local': None}
a = _regex.match("1.6.8.mv3")
a.groupdict()
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'NoneType' object has no attribute 'groupdict'

Converting it to local....

a = _regex.match("1.6.8+mv3")
a.groupdict()
{'pre': None, 'dev_l': None, 'dev_n': None, 'post_l': None, 'pre_n': None, 'pre_l': None, 'dev': None, 'epoch': None, 'post_n1': None, 'post_n2': None, 'release': '1.6.8', 'post': None, 'local': 'mv3'}

However is still not canonical (that is because is local)

is_canonical("1.6.8+mv3")
False

There are many ways to tackle this and I am assuming it might pose an issue for other studios who do not follow the PEP440 scheme.

  1. Issue warning when version is not canonical or local and stop install?

  2. Detect invalid version and adjust it to work - going against PEP440?

  3. Convert invalid version to canonical/local? We will need a ruleset for this... what is acceptable and what is not.

4. New distlib version incompatibilities (fixed)

This is a a rarer issue that popped up currently with only one known package Arrow that seems to trigger. I did not spend that much time looking into it just yet but hopefully is nothing too serious. Check below for new info.

Sample stacktrace:

rez.vendor.distlib.DistlibException: Ill-formed name/version string: 'backports.functools-lru-cache (>=1.2.1) ; python_version=='2.7)'

I believe it might be an issue with the distribution.name_and_version function with the backports.functools-lru-cache dependency (special case by the looks of it).

Update:

By the looks of it the error has nothing to do with the new version of distlib and was very likely present in older versions of rez. The issue occurs during the dependency retrieval. We can see below that
the backports.functools-lru-cache package contains some extra information when compared to the other dependencies.

requirements:  [u'six (>=1.5)']
requirements:  [u'python-dateutil']
requirements:  [u"backports.functools-lru-cache (>=1.2.1) ; python_version=='2.7'"]

The package for the backports.functools-lru-cache which becomes the input of distlibs parse_name_and_version seems to be also including the python_version dependency.

arrow (0.14.2)
<_sre.SRE_Match object at 0x7f1150e8c828> OK
python-dateutil (2.8.0)
<_sre.SRE_Match object at 0x7f1150e8c828> OK
backports.functools-lru-cache (>=1.2.1) ; python_version==2.7
None OUPS??

This dependency includes the python version dependency along with its own version that causes
the REGEX match to fail. Furthermore the provided version is a version range rather than just a version when compared to other packages, not sure if that is also going to be a problem.

image

Getting rid of the python version bit

image

A quick look at the wheel's distinfo METADATA file reveals the case not currently handled by rez

image

In fact the Core metadata specifications page clearly states the following with regards to the use of
semicolons in the requirements list:

image

I will refrain from editing the vendored distlib file to handle this so we will need to change what gets sent to the parse name version function by adding a check or something like split at ; I am not sure how often this happens...

5. pip >= 19 check can cause issues with rez binded pip (fixed)

This is also a Windows issue. Currently running rez-bind pip will create a broken package unless you run it with admin privileges (see #659). This has to do with symlinking pip, and as you can't do that on non-administrative accounts there are things missing from the package rendering it in a broken state. This can potentially cause the pip version check to fail because the pip exe will be set to the broken package.

rez-bind pip on Windows causes an invalid pip package to be generated. I am not sure where the
mysterious 1.5.6 version comes from but is definitively not the same one as the one installed on the system.

pip_bind_fail

Now the version check correctly picks up the wrong version and raises an error but it actually hides the fact that the system pip version might be >= 19, it also hides the original exception

pip_bind_fail2

Saying that, the VersionError exception might prompt the user towards the right direction but is still
confusing nonetheless. I believe since the version check does pick up the wrong version (whether
the package is broken or not) it fulfils its purpose.

On a side note, the version check was using a too broad exception catch that would cause the original raise to be silently skipped even it was raised. I added a small fix to avoid it.

How Has This Been Tested?

The fixes for these issues have been tested with various packages using the following:

Test Configuration:

  • Python version: 2.7.15, 2.7.16
  • OS: Linux, Windows 10
  • Toolchain: rez 2.33, pip 18.1, 19.1.1, distlib 0.2.9.post0

Status:

  • Fix Windows executable scripts not being detected
  • Fix executable files check not being sufficient
  • Fix distlib exception when dealing with environment markers
  • Fix Windows pip symlinking (rez bind) version check
  • Fix version handling (canonical, local, custom)

The wheel RECORD file seems to always list the paths of the installed
file in UNIX form which means when checking for bin scripts on Windows
through os.sep ("\\") the executables are not handled correctly.
Normalize the path to fix the issue.

Relates: ef14e91
The current check if a file is executable or not seems to be causing
some issues on Windows platform adding non executable files to the
tools list in the package.py. Add a further check that ensures the
executable script file is located in the bin folder.

Relates: ef14e91
@lambdaclan lambdaclan changed the title Fix/wheel pip regressions [WIP] Fix wheel pip regressions Jun 28, 2019
@bfloch
Copy link
Contributor

bfloch commented Jun 28, 2019

I saw the described issues with PySide. The proposed PR has fixed them in my testing.

 The Core metadata specifications page mentions that a wheel's
 METADA file Require-Dist field can contain an environment marker after
 a semicolon. This means that the requirement is only needed in the
 specified conditions. Handle this marker to correctly retrieve
 package version and name.

 Relates: https://packaging.python.org/specifications/core-metadata/#requires-dist-multiple-use
@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Jul 2, 2019

@lambdaclan If we want to handle all possibilities with versions, we might want to look at the packaging pip package, which is an official lib to handle versions of pip packages. That would catch the python version thing (called a marker). See https://packaging.pypa.io/en/latest/requirements/#usage. I think it would help us. What do you think?

Note that I also think we should not ignore the markers. In the case of backports.functools-lru-cache, they use a marker to specify the python version to avoid having to install the package if in python 3 (since there is now an lru cache in functools in python 3.4+). I think we should do the same as pip and ignore the dependency in python is greater then 2.7.

@lambdaclan
Copy link
Contributor Author

lambdaclan commented Jul 2, 2019

Hello JC, thank you very much for your input as always. Indeed that package looks really interesting and the fact that is official makes it even better, thank you for bringing it to my attention. I am assuming it will need to be added as a vendored package? Either way I will take a closer look and see if and how we can use it to manage versions.

Edit:

Note that I also think we should not ignore the markers. In the case of backports.functools-lru-cache, they use a marker to specify the python version to avoid having to install the package

Agreed but I believe this is handled by pip itself. The fact that the backports dep gets installed is because of that marker. What I am doing is just removing the marker before sending it to the parse name version method. In the case of Python3 no semicolons will be included in fact Python 3 will not even pick up backports so it will not be handled at all.

@JeanChristopheMorinPerso
Copy link
Member

I am assuming it will need to be added as a vendored package?

Yes it would effectively require us to add packaging into our vendored libraries. packaging depends on attrs, pyparsing and six. Of the three, we already have 2 and attrs will not bring any other dependencies. So at the end, it's two libraries to add, which seems right to me if packaging removes some complexity in rez IMO.

Of course this is in the case where it does all we want.

The pip version check was using a too broad exception catch that would
cause the original raise to be silently skipped even it was raised
based on a successful version match (<19). Adjust exception handling
to avoid this issue.
@lambdaclan lambdaclan changed the title [WIP] Fix wheel pip regressions [Fix] wheel pip regressions Jul 4, 2019
@lambdaclan lambdaclan changed the title [Fix] wheel pip regressions [Fix] Wheel pip regressions Jul 4, 2019
@nerdvegas
Copy link
Contributor

Interesting case with the markers. Could we not just ignore the markers however, and then verify that each package requirement is present in the temporary pip install, and remove it if it's not? So in other words, in the arrow case above, we would first assume that backports.functools-lru-cache is a requirement, but then remove it after pip does its thing, sees that we're installing with python 3 (for argument's sake), and then doesn't install it?

Wrt adding packaging vendor lib, I'm not quite convinced that we need it yet, but if we do start seeing more complexity then yeah I'm not adverse to adding it.

src/rez/pip.py Outdated
@@ -102,6 +102,9 @@ def get_distribution_name(pip_name):
version = version.replace("==", "")
name = get_distribution_name(name)
except DistlibException:
# check if package contains erroneous additional environment info and remove it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erroneous? Do you meant extraneous?

I would add a link to the distlib docs on markers to clarify what's happening here.

@nerdvegas
Copy link
Contributor

Actually, given that we have that big regex for pip versions already, imo that would be reason enough to use packaging already, assuming that it parses out this info for you (which it looks like it does. I don't see an example for epoch but I assume it's probably there).

src/rez/pip.py Outdated
@@ -411,7 +416,7 @@ def pip_install_package(source_name, pip_version=None, python_version=None,
destination_file = os.path.relpath(source_file, stagingdir)
exe = False

if is_exe(source_file):
if is_exe(source_file) and destination_file.startswith("bin"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"bin" + os.sep?

@lambdaclan
Copy link
Contributor Author

lambdaclan commented Jul 5, 2019

Switch to packaging library

conversions

image

legacy

image

strict

image

PEP440 defines the standard version scheme for Python package. Use
the packaging library to work with such versions without having to
rely on complex regex parsing.

Relates: https://packaging.pypa.io/en/latest/, https://www.python.org/dev/peps/pep-0440/#appendix-b-parsing-version-strings-with-regular-expressions
The packaging library depends on pyparsing lib  >=2.0.2. Currently
2.0.1 is installed. The currently installed pydot version 1.0.28
depends on pyparsing 2.0.1. In order to install packaging lib
update pydot which will in turn allow us to update pyparsing as well.

Relates: https://github.com/pypa/packaging/blob/5ef37d364bfb18461da28fd667e4d538ba67255b/setup.py#L51
@nerdvegas nerdvegas merged commit de37814 into AcademySoftwareFoundation:master Jul 10, 2019
@nerdvegas
Copy link
Contributor

Thanks, looking good.

@lambdaclan lambdaclan deleted the fix/wheel-pip-regressions branch April 6, 2020 06:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants