
C files generated by Cython set tp_print to 0: PyTypeObject.tp_print removed #81431

Closed
vstinner opened this issue Jun 12, 2019 · 69 comments
Labels
3.8 (EOL) end of life build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Member

BPO 37250
Nosy @ncoghlan, @scoder, @encukou, @ambv, @jdemeyer, @zooba, @willingc
PRs
  • bpo-37250: define tp_print for backwards compatibility #14009
  • [3.8] bpo-37250: put back tp_print for backwards compatibility #14193
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2019-07-01.13:40:34.626>
    created_at = <Date 2019-06-12.13:08:30.505>
    labels = ['interpreter-core', 'build', '3.8']
    title = 'C files generated by Cython set tp_print to 0: PyTypeObject.tp_print removed'
    updated_at = <Date 2019-07-01.13:40:34.626>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2019-07-01.13:40:34.626>
    actor = 'ncoghlan'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-07-01.13:40:34.626>
    closer = 'ncoghlan'
    components = ['Interpreter Core']
    creation = <Date 2019-06-12.13:08:30.505>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37250
    keywords = ['patch']
    message_count = 69.0
    messages = ['345337', '345340', '345341', '345342', '345343', '345344', '345345', '345346', '345347', '345348', '345349', '345350', '345351', '345352', '345353', '345355', '345356', '345357', '345359', '345382', '345383', '345402', '345405', '345409', '345459', '345591', '345965', '345966', '345969', '345988', '345989', '345990', '345992', '345994', '345996', '345998', '346013', '346064', '346067', '346069', '346075', '346083', '346084', '346085', '346087', '346099', '346100', '346105', '346123', '346125', '346128', '346153', '346161', '346163', '346177', '346185', '346190', '346194', '346213', '346218', '346220', '346222', '346223', '346238', '346373', '346388', '346424', '346438', '346507']
    nosy_count = 7.0
    nosy_names = ['ncoghlan', 'scoder', 'petr.viktorin', 'lukasz.langa', 'jdemeyer', 'steve.dower', 'willingc']
    pr_nums = ['14009', '14193']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'compile error'
    url = 'https://bugs.python.org/issue37250'
    versions = ['Python 3.8']

    @vstinner
    Member Author

    Copy of Stefan Behnel's msg345305 in bpo-37221:
    """
    Note that PyCode_New() is not the only change in 3.8 beta1 that breaks Cython generated code. The renaming of "tp_print" to "tp_vectorcall" is equally disruptive, because Cython has (or had) a work-around for CPython (mis-)behaviour that reset the field explicitly to NULL after calling PyType_Ready(), which could set it arbitrarily without good reason.

    So, either revert that field renaming, too, or ignore Cython generated modules for the reasoning about the change in this ticket.

    I'm fine with keeping things as they are now in beta-1, but we could obviously adapt to whatever beta-2 wants to change again.
    """

    There are 2 problems:

    • source compatibility
    • ABI compatibility

    The following change removed PyTypeObject.tp_print and replaced it with PyTypeObject.tp_vectorcall_offset:

    commit aacc77f
    Author: Jeroen Demeyer <J.Demeyer@UGent.be>
    Date: Wed May 29 20:31:52 2019 +0200

    bpo-36974: implement PEP-590 (GH-13185)
    
    
    Co-authored-by: Jeroen Demeyer <J.Demeyer@UGent.be>
    Co-authored-by: Mark Shannon <mark@hotpy.org>
    

    == ABI compatibility ==

    In terms of ABI, it means that C extensions using static types ("static PyTypeObject mytype = { ... };") are broken by this change.

    Honestly, I'm not sure if we ever provided any forward compatibility for static types. bpo-32388 removed "cross-version binary compatibility" on purpose in Python 3.8. It's an on-going topic, see also my notes about ABI and PyTypeObject:
    https://pythoncapi.readthedocs.io/type_object.html

    Maybe we can simply ignore this part of the problem.
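
    The ABI point above can be illustrated with a deliberately simplified mock (these structs are illustrative stand-ins, not the real CPython headers; `OldType`/`NewType` are hypothetical names): the renamed slot sits at the same byte offset in both layouts, so a binary compiled against the 3.7 layout that stores a function pointer there is silently reinterpreted by the 3.8 layout as an integer offset.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    typedef int (*printfunc)(void *, void *, int);

    typedef struct {              /* mock of the layout up to Python 3.7 */
        const char *tp_name;
        printfunc tp_print;       /* function-pointer slot */
    } OldType;

    typedef struct {              /* mock of the Python 3.8 layout */
        const char *tp_name;
        ptrdiff_t tp_vectorcall_offset;  /* integer slot, same position */
    } NewType;

    int main(void) {
        /* The slot occupies the same position in both layouts... */
        assert(offsetof(OldType, tp_print) ==
               offsetof(NewType, tp_vectorcall_offset));
        /* ...so an extension compiled against the old layout that writes a
         * function pointer there would be read back by the new layout as a
         * (meaningless) vectorcall offset. */
        printf("same offset, different meaning\n");
        return 0;
    }
    ```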

    == Source compatibility ==

    Cython generates C code setting tp_print explicitly to NULL.

    To avoid depending on Cython at installation, most (if not all) projects using Cython include C files generated by Cython in files they distribute (like tarballs). Because of that, the commit aacc77f requires all these projects to regenerate their C files using Cython.

    In Fedora, we fixed many Python packages to always re-run Cython to regenerate all C files. But Fedora is just one way to distribute packages; it doesn't solve the problem of files distributed on PyPI, nor of other Linux distributions, for example.

    Jeroen Demeyer proposed PR 14009 to fix the source compatibility:

       #define tp_print tp_vectorcall
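
    The effect of that one-line shim can be sketched with a self-contained mock (the struct and the `vectorcallfunc` typedef here are simplified stand-ins, not the real CPython definitions): any legacy code that names `tp_print` silently targets `tp_vectorcall`, which is a pointer field, so both `= NULL` and `= 0` remain valid assignments.

    ```c
    #include <assert.h>
    #include <stdio.h>

    typedef void *(*vectorcallfunc)(void *, void *const *, size_t, void *);

    typedef struct {
        long tp_vectorcall_offset;    /* the slot that replaced tp_print */
        vectorcallfunc tp_vectorcall;
    } MockType;

    /* The proposed compatibility shim from PR 14009. */
    #define tp_print tp_vectorcall

    int main(void) {
        MockType t = {0};
        t.tp_print = 0;   /* old Cython-generated line: now sets tp_vectorcall */
        assert(t.tp_vectorcall == NULL);
        printf("tp_print assignment redirected\n");
        return 0;
    }
    ```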

    @vstinner vstinner added 3.8 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 12, 2019
    @vstinner vstinner changed the title C files generated by Cython set tp_print to NULL C files generated by Cython set tp_print to NULL: PyTypeObject.tp_print removed, replaced with tp_vectorcall_offset Jun 12, 2019
    @vstinner
    Member Author

    "printfunc tp_print;" has been replaced with "Py_ssize_t tp_vectorcall_offset;": the type was basically a pointer and has been replaced with an integer.

    With "#define tp_print tp_vectorcall", "type->tp_print = NULL;" becomes "type->tp_vectorcall = NULL;".

    If -Werror is used, "type->tp_vectorcall = NULL;" would fail with a compilation error, since NULL is not exactly an integer, no? I'm not sure that it's an issue, I'm just thinking aloud.

    @jdemeyer
    Contributor

    Actually, Cython generates code setting tp_print to 0 (not NULL).

    @vstinner
    Member Author

    inherit_slots() uses:

            /* Inherit tp_vectorcall_offset only if tp_call is not overridden */
            if (!type->tp_call) {
                COPYSLOT(tp_vectorcall_offset);
            }

    PyType_Ready() contains the following assertion:

        /* Consistency checks for PEP 590:
         * - Py_TPFLAGS_METHOD_DESCRIPTOR requires tp_descr_get
         * - _Py_TPFLAGS_HAVE_VECTORCALL requires tp_call and
         *   tp_vectorcall_offset > 0
         * To avoid mistakes, we require this before inheriting.
         */
        if (type->tp_flags & Py_TPFLAGS_METHOD_DESCRIPTOR) {
            _PyObject_ASSERT((PyObject *)type, type->tp_descr_get != NULL);
        }
        if (type->tp_flags & _Py_TPFLAGS_HAVE_VECTORCALL) {
            _PyObject_ASSERT((PyObject *)type, type->tp_vectorcall_offset > 0);
            _PyObject_ASSERT((PyObject *)type, type->tp_call != NULL);
        }

    I understand that tp_vectorcall_offset=0 is fine and the expected value for a type which doesn't have the flag _Py_TPFLAGS_HAVE_VECTORCALL.
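
    The consistency check quoted above can be mirrored in a self-contained sketch (mock flag value and mock struct, not CPython's): the flag demands `tp_call` and a positive offset, and without the flag an offset of 0 passes.

    ```c
    #include <assert.h>
    #include <stdio.h>

    #define MOCK_TPFLAGS_HAVE_VECTORCALL (1UL << 11)  /* hypothetical bit */

    typedef struct {
        unsigned long tp_flags;
        long tp_vectorcall_offset;
        void *tp_call;
    } MockType;

    /* Mirrors the PyType_Ready assertion: the flag requires tp_call and a
     * positive offset; without the flag, offset 0 is consistent. */
    static int vectorcall_consistent(const MockType *t) {
        if (t->tp_flags & MOCK_TPFLAGS_HAVE_VECTORCALL) {
            return t->tp_vectorcall_offset > 0 && t->tp_call != NULL;
        }
        return 1;
    }

    int main(void) {
        MockType plain = {0, 0, NULL};   /* no flag, offset 0: fine */
        MockType bad = {MOCK_TPFLAGS_HAVE_VECTORCALL, 0, NULL};
        assert(vectorcall_consistent(&plain));
        assert(!vectorcall_consistent(&bad));
        printf("checks behave as described\n");
        return 0;
    }
    ```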

    @vstinner
    Member Author

    IMHO the simplest and safest option for this issue is to revert the tp_print removal and move tp_vectorcall_offset to the end of PyTypeObject.

    Is PEP-590 fully public in Python 3.8? It seems like _Py_TPFLAGS_HAVE_VECTORCALL at least is private.

    Maybe we can attempt to remove tp_print again in Python 3.9, once Cython handles the tp_print removal? Or even keep it in Python 3.9?

    Python 3 had unused tp_print for 10 years and nobody complained. There are a few PyTypeObject instances in a Python process. A single pointer isn't a giant waste of memory.

    @jdemeyer
    Contributor

    If -Werror is used, "type->tp_vectorcall = NULL;" would fail with a compilation error, since NULL is not exactly an integer, no?

    No because tp_vectorcall is a function pointer. You're confusing with tp_vectorcall_offset which is an integer.

    @vstinner
    Member Author

    No because tp_vectorcall is a function pointer. You're confusing with tp_vectorcall_offset which is an integer.

    Oh. I didn't notice that your PR makes tp_print an alias to tp_vectorcall rather than tp_vectorcall_offset. Ok.

    @vstinner
    Member Author

    If we decide to not reintroduce tp_print, tp_print removal must be documented at:
    https://docs.python.org/dev/whatsnew/3.8.html#changes-in-the-c-api

    @jdemeyer
    Contributor

    I understand that tp_vectorcall_offset=0 is fine and the expected value for a type which doesn't have the flag _Py_TPFLAGS_HAVE_VECTORCALL.

    Not necessarily. There are subtleties involved when subclassing: there are cases where tp_vectorcall_offset needs to be non-zero despite _Py_TPFLAGS_HAVE_VECTORCALL not being set. See also PR 13858.

    @vstinner
    Member Author

    Cython issue: cython/cython#2976

    @jdemeyer
    Contributor

    IMHO the simplest and safest option for this issue is to revert the tp_print removal and move tp_vectorcall_offset to the end of PyTypeObject.

    That's a drastic solution to a rather simple problem. PR 14009 would fix Cython just fine.

    @vstinner
    Member Author

    me:

    I understand that tp_vectorcall_offset=0 is fine and the expected value for a type which doesn't have the flag _Py_TPFLAGS_HAVE_VECTORCALL.

    Jeroen Demeyer:

    Not necessarily. There are subtleties involved when subclassing: there are cases where tp_vectorcall_offset needs to be non-zero despite _Py_TPFLAGS_HAVE_VECTORCALL not being set. See also PR 13858.

    Cython generates C code which looks like:

    if (PyType_Ready(type) < 0) { ... handle error ... }
    type->tp_print = 0;

    This code can cause subtle and annoying issues if PR 13858 is merged. So that's another argument in favor of reintroducing tp_print in Python 3.8.

    --

    Cython has already been fixed to no longer set tp_print to 0 on Python 3.8:

    cython/cython@f10a0a3

    But again, this problem is not about correctness, but more about practical backward compatibility issues (see my first message).
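
    A hedged sketch of the kind of fix involved (the exact content of commit cython/cython@f10a0a3 may differ; the struct and macro setup here are mocks): guard the legacy assignment behind a version check so it is compiled out on Python 3.8 and later, where the slot no longer exists.

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Mock version macro; in real code this comes from Python's headers. */
    #ifndef PY_VERSION_HEX
    #define PY_VERSION_HEX 0x03080000   /* pretend we're building on 3.8 */
    #endif

    typedef struct { void *tp_print_mock; } MockType;

    static void fixup(MockType *type) {
    #if PY_VERSION_HEX < 0x03080000
        type->tp_print_mock = 0;   /* legacy reset, only on older Pythons */
    #else
        (void)type;                /* 3.8+: slot is gone; do nothing */
    #endif
    }

    int main(void) {
        MockType t = { (void *)&t };   /* sentinel to show it's untouched */
        fixup(&t);
        assert(t.tp_print_mock == (void *)&t);  /* reset was compiled out */
        printf("legacy reset compiled out\n");
        return 0;
    }
    ```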

    @vstinner
    Member Author

    cython/cython@f10a0a3

    Note for myself: this fix is already part of Cython 0.29.10, released on June 2 (10 days ago).

    @jdemeyer
    Contributor

    This code can cause subtle and annoying issues if PR 13858 is merged.

    I don't see how this is related to PR 13858 at all. Please explain.

    @jdemeyer
    Contributor

    If we decide to not reintroduce tp_print, tp_print removal must be documented at:
    https://docs.python.org/dev/whatsnew/3.8.html#changes-in-the-c-api

    I'll add that to PR 13844.

    @jdemeyer
    Contributor

    This code can cause subtle and annoying issues if PR 13858 is merged.

    Or maybe you're confusing tp_vectorcall_offset and tp_vectorcall again?

    @vstinner
    Member Author

    Or maybe you're confusing tp_vectorcall_offset and tp_vectorcall again?

    The two are related, no?

    Honestly, reintroducing tp_print looks simple and safe enough. I'm in favor of doing that. Is there any drawback? (as I wrote, I don't think that the size of PyTypeObject really matters in practice)

    @jdemeyer
    Contributor

    Honestly, reintroducing tp_print looks simple and safe enough. Is there any drawback?

    1. The one-time effort to change the code and documentation back.

    2. Stefan Behnel wanted to use tp_print for backporting vectorcall in Cython to earlier Python versions.

    3. Stable ABI incompatibility (the complaint that you incorrectly added to PR 13858 would become correct).

    @jdemeyer
    Contributor

    The two are related, no?

    Related in the same way that tp_dictoffset and tp_dict are related (i.e. not that much).

    @zooba
    Member

    zooba commented Jun 12, 2019

    "tp_print" has been marked as reserved for all of Python 3. To me, that means deprecated and do not use. (But perhaps that ought to have been properly defined in the docs?)

    Cython should not be using this field directly. If all they're doing is setting it to NULL, they can equally easily zero out the entire structure and ignore it without changing behavior on any Python 3.x.

    I have no problem changing the names of deprecated/reserved fields in PyTypeObject between releases. Source compatibility guarantees do not extend that far.
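
    Steve's "zero out the entire structure" point rests on a C guarantee worth spelling out (illustrated with a mock struct, not the real PyTypeObject): a static initializer zero-fills every member not explicitly named, so reserved slots need never be touched at all.

    ```c
    #include <assert.h>
    #include <stdio.h>

    typedef struct {
        const char *tp_name;
        void *tp_reserved;   /* stand-in for a reserved slot like tp_print */
        long tp_basicsize;
    } MockType;

    /* Only tp_name is given; C guarantees the remaining members start zero. */
    static MockType t = { .tp_name = "mymod.MyType" };

    int main(void) {
        assert(t.tp_reserved == NULL);
        assert(t.tp_basicsize == 0);
        printf("reserved slot is already zero\n");
        return 0;
    }
    ```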

    @zooba
    Member

    zooba commented Jun 12, 2019

    Jeroen Demeyer proposed PR 14009 to fix the source compatibility:

    #define tp_print tp_vectorcall

    This is a terrible idea, FWIW. Please don't do this.

    @scoder
    Contributor

    scoder commented Jun 12, 2019

    they can equally easily zero out the entire structure and ignore it without changing behavior on any Python 3.x.

    Any solution that we apply in Cython will require users to regenerate their .c sources with a new Cython version in order to make it compile in Py3.8. The main decision point in this ticket is: should they need to or not? Everything else is just minor technicalities. (I didn't bring up this topic, but the question is up on the table now.)

    I have no problem changing the names of deprecated/reserved fields in PyTypeObject between releases. Source compatibility guarantees do not extend that far.

    Fair enough, and I'm ok with letting CPython move forward cleanly and break things that are easily fixed on user side.

    My point is that it makes no sense to justify bpo-37221 with the goal of not breaking Cython modules, when at the same time another change (the one discussed here) has already broken them. Either we find a striking justification for bpo-37221 that *excludes* Cython generated modules, or we take back *both* changes and restore full source code compatibility. Everything in between is just causing useless code churn and annoyance on all sides.

    @scoder
    Contributor

    scoder commented Jun 12, 2019

    (I forgot to state the obvious third option, which is: don't do anything and leave everything as it is now in beta-1.)

    @zooba
    Member

    zooba commented Jun 12, 2019

    Any solution that we apply in Cython will require users to regenerate their .c sources with a new Cython version in order to make it compile in Py3.8. The main decision point in this ticket is: should they need to or not?

    Considering Cython had a bug when it generated those sources (directly using a reserved field), then yes, users should have to regenerate with a fixed version.

    My point is that it makes no sense to justify bpo-37221 with the goal of not breaking Cython modules, when at the same time another change (the one discussed here) has already broken them.

    I agree. Here's my distinction:

    bpo-37221 breaks any user who correctly followed the documentation: https://docs.python.org/3.7/c-api/code.html#c.PyCode_New

    This issue breaks users who did not follow the documentation: https://docs.python.org/3.7/c-api/typeobj.html#c.PyTypeObject.tp_print

    @jdemeyer
    Contributor

    This is a terrible idea, FWIW. Please don't do this.

    Why not? It's a simple pragmatic solution that fixes an actual problem at essentially no cost to CPython.

    You're right of course that Cython shouldn't set tp_print to 0 on Python 3. But I don't think that's a relevant argument.

    @scoder
    Contributor

    scoder commented Jun 14, 2019

    I agree with Steve that broadly redefining a global name in a widely used header file is not a good idea. You never know what names users have in their code. Yes, you can work around it by undef-ing it again, but honestly, in that case, both sides are hacks.

    @jdemeyer
    Contributor

    IMHO the simplest and safest option for this issue is to revert the tp_print removal and move tp_vectorcall_offset to the end of PyTypeObject.

    I just realized that there is a much simpler solution: add back tp_print at the *end* of the PyTypeObject structure. This will fix the Cython problem and is even less likely to break stuff. I'm doing that in PR 14193.
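
    Why appending at the end is low-risk can be shown with mock layouts (illustrative structs, not the real headers): every existing member keeps its offset, so code compiled against the beta-1 layout still reads and writes the right bytes; the only cost is that the struct grows.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    typedef struct {              /* mock of the 3.8b1 layout */
        const char *tp_name;
        long tp_vectorcall_offset;
    } TypeB1;

    typedef struct {              /* mock of PR 14193: tp_print appended */
        const char *tp_name;
        long tp_vectorcall_offset;
        void *tp_print;           /* legacy slot, now at the tail */
    } TypeB2;

    int main(void) {
        /* Existing members keep their offsets. */
        assert(offsetof(TypeB1, tp_name) == offsetof(TypeB2, tp_name));
        assert(offsetof(TypeB1, tp_vectorcall_offset) ==
               offsetof(TypeB2, tp_vectorcall_offset));
        /* Cost: the struct grows by (at least) one pointer. */
        assert(sizeof(TypeB2) > sizeof(TypeB1));
        printf("offsets preserved; struct grew\n");
        return 0;
    }
    ```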

    @vstinner
    Member Author

    If we should put in the workaround for 3.8, how does that make it okay to remove in 3.9?

    Cython has been modified to no longer set tp_print. The idea is just to give more time (one Python major release) to maintainers of C extensions to regenerate their Cython .c files.

    @zooba
    Member

    zooba commented Jun 19, 2019

    a class compiled with 3.7 could subclass a class compiled with 3.8 which uses vectorcall.

    This isn't a supported scenario though. The only way to mix extension modules compiled against different Python versions is to use the limited ABI, which does not include PyTypeObject. And PyType_FromSpec is going to allocate a PyTypeObject according to the version of Python that is running, not the one it was compiled against. So even if someone could force this scenario, it's not one we have to worry about.

    The idea is just to give more time (one Python major release) to maintainers of C extensions to regenerate their Cython .c files.

    If the beta cycle isn't long enough for this, then we need a longer beta cycle, since that's the whole point of it. Projects that can't make at least one new release with updated Cython in time for Python 3.8 are presumably completely unmaintained - it's not like the 3.8 release is a huge surprise - and if completely unmaintained they aren't going to get a new release for 3.9 either. So either we fix them permanently by never removing tp_print, or we give them an entire beta release period to update their code for the new version of Python.

    There's a discussion at https://discuss.python.org/t/pep-596-python-3-9-release-schedule-doubling-the-release-cadence/1828 about updating the release cycle, and a couple of PEPs being written. That would be a good place to suggest that our current beta release is not long enough for packages to adapt to new releases.

    @jdemeyer
    Contributor

    This isn't a supported scenario though.

    Of course it's not! I was trying to tell you that already in msg345969 but you insisted in msg345988 that we should support that anyway.

    @jdemeyer
    Contributor

    The fact that tp_vectorcall_offset replaces tp_print was a very conscious decision that we made already when designing the PEPs (both PEP-580 and PEP-590). We shouldn't just throw that away without a good reason.

    So far I haven't seen any good reason, only an unsubstantiated claim that it is supposedly more backwards compatible to put tp_vectorcall_offset at the end.

    @encukou
    Member

    encukou commented Jun 20, 2019

    I agree with Steve that renaming tp_print to tp_vectorcall_offset is well within our rights.

    For reference, the rationale for the replacing is the first section of:
    https://mail.python.org/pipermail/python-dev/2018-June/153927.html
    I hope Cython's situation here has not changed. (If it has, we should bring tp_print back, and add tp_vectorcall_offset to the end.)

    Now, I see the question of bringing tp_print back as a *practical* compatibility matter, a workaround for the combination of two unfortunate issues:

    • Cython sets tp_print in Python 3 (which is a bug, but honestly, the word "reserved" could very well be interpreted as "unused" rather than "don't touch this" -- and even the post linked above, and by extension PEP-590, implies that interpretation!)
    • Packages hard-code Cython-generated files (which is a workaround for a long-standing setuptools limitation that is only recently being addressed).

    If we bring tp_print back, it would be because *we shouldn't punish users* even if we can. Nobody did anything wrong, apart from a quite understandable bug/misunderstanding.

    Note that 3rd party projects *do not* have the whole beta cycle to adapt. Until NumPy works, its dependents can't start adapting. (Unless adapting is done without PyPI packages, like the Fedora rebuild where we now "re-Cythonize" everything.)
    On the other hand, 3rd party projects don't really *need* to support 3.8.0 as soon as it's released. It's just that if they don't support the betas and RCs, we don't get feedback for them in time.

    So I think the question is: Is (the rest of) the beta cycle enough for enough of the ecosystem to release new wheels, or should we give them the whole 3.9 cycle?

    I believe 3.8 beta2 is enough, and we don't need to bring tp_print back.

    @zooba
    Member

    zooba commented Jun 20, 2019

    I was trying to tell you that already in msg345969 but you insisted in msg345988 that we should support that anyway.

    I made a general note that "extremely unlikely" isn't unlikely enough to be an argument.

    You'll notice I didn't quote any of the rest of your specific scenario, because I wasn't trying to address it. And in any case, you left out the cross-version part in that description, so you did actually describe a supported case.

    (I'm not going to argue this particular point any further. If I can't help you learn the ways we do decision making on a project like CPython then I'll let someone else try.)

    @jdemeyer
    Contributor

    Can we then bring back the discussion to the core of this issue: should we restore tp_print or not in PyTypeObject?

    Note that the PR for the other similar issue bpo-37221 has been accepted for merging.

    @zooba
    Member

    zooba commented Jun 20, 2019

    Note that the PR for the other similar issue bpo-37221 has been accepted for merging.

    I've already explained why that issue is different (it breaks a compatibility guarantee; this one changes something that was explicitly deprecated).

    Petr's question seems to be the right one that we have not yet answered:

    Is (the rest of) the beta cycle enough for enough of the ecosystem to release new wheels, or should we give them the whole 3.9 cycle?

    (And since the 3.8 RM has not been here to discuss release cycle stuff, let's invite him in)

    @jdemeyer
    Contributor

    If we're in doubt, we should just apply PR 14193. It's essentially a one-line patch that fixes actual problems experienced by users and which doesn't break anything. Why should we *not* apply it?

    I know that we're not obliged to fix anything, but that is beside the point given that a fix exists.

    @willingc
    Contributor

    Petr, Thanks for the thoughtful summary.

    If I am understanding the many messages in this and the other related issue and by looking at this with a scientific Python hat on (not as the Steering Council), I'm inclined to favor Nick's approach if possible:

    1. In 3.8.0b2, move tp_print to the end of the structure (rather than removing it entirely) to make the "type->tp_print" in the old Cython generated code harmless rather than a compilation error
    2. In 3.9.0a1, remove it entirely again

    I agree that we shouldn't punish our users. IMHO Cython should get the benefit of a compromise, though not a long term guarantee, related to tp_print since Cython addresses a very real need for performance in scientific code.

    While Cython may be able to respond within the beta period, scientific projects that depend on it may lag longer as rebuilds and releases will need to happen on PyPI, conda, and conda-forge. Release versioning is critical in the science world as we depend on it for scientific research reproducibility. While we may not come up with an ideal solution for everyone, let's try to find something acceptable for Cython and the Core Python devs/users.

    @zooba
    Member

    zooba commented Jun 20, 2019

    While Cython may be able to respond within the beta period, scientific projects that depend on it may lag longer as rebuilds and releases will need to happen on PyPI, conda, and conda-forge. Release versioning is critical in the science world as we depend on it for scientific research reproducibility. While we may not come up with an ideal solution for everyone, let's try to find something acceptable for Cython and the Core Python devs/users.

    Cython has already responded and made a new release, so the problem would be whether the scientific projects will release new sdists some time in the next four months (they will almost all have to do new wheels for 3.8 too).

    Is four months really an unreasonable timeline? I thought most of these packages released more frequently than CPython.

    @willingc
    Contributor

    Is four months really an unreasonable timeline? I thought most of these packages released more frequently than CPython.

    I can't really predict with any accuracy what the release cycle is for most scientific projects. Anywhere from 1 month to 9 months probably covers 90% of the actively maintained scientific projects. So is four months unreasonable? Not entirely unreasonable IMHO. Six months seems a safer estimate. What makes it difficult to predict is that many of the scientific projects do not have rigid release cycles and push out releases when a specific feature or bug fix warrants it.

    Perhaps it would be helpful for someone to recap the current options on the table for moving forward and the most significant risk to Cython, Projects using Cython, or CPython by selecting each option.

    Thanks.

    @jdemeyer
    Contributor

    Perhaps it would be helpful for someone to recap the current options on the table for moving forward and the most significant risk to Cython, Projects using Cython, or CPython by selecting each option.

    There are three options:

    (1) Status-quo (keep everything as is in 3.8b1): this breaks any Cython project which has not been recythonized with the latest Cython. For example, it is (together with bpo-37221 which is planned to be fixed) the reason that "pip install numpy" doesn't work with 3.8b1. Typically, projects run Cython at packaging time, so this needs a new source release of every single Cython-using package.

    (2) Apply PR 14193: the only disadvantage that I can think of is that types would get 8 bytes larger. We should stop discussing and just do this.

    (3) Put back tp_print the way it was in 3.7: same as (2), except that it goes against PEP-590 as putting tp_vectorcall_offset in place of tp_print was a design decision made in that PEP. It would break ABI (any static type compiled with 3.7 would break on 3.8) and it prevents Cython from backporting vectorcall to 3.7. It's also a larger change compared to 3.8b1, so it's more work and more risky. In my personal opinion, the worst of the three options.

    Then there is the further question whether we should apply this only on 3.8 or possibly also later versions. But here the consensus seems to be 3.8 only.

    @jdemeyer
    Contributor

    Is four months really an unreasonable timeline? I thought most of these packages released more frequently than CPython.

    I'm mostly worried about dependency chains. Right now, any project which depends on numpy cannot even test on 3.8b1 because numpy doesn't build.

    @encukou
    Member

    encukou commented Jun 21, 2019

    scientific projects that depend on it may lag longer as rebuilds and releases will need to happen on PyPI, conda, and conda-forge.

    Keep in mind that binary artifacts (wheels, Conda/distro packages) should not be affected. They need to be re-built for Python 3.8 *either way*, because the ABI changed.

    The only artifacts that putting tp_print back in helps are sdists that include pre-generated Cython output, plus possibly projects that pin Cython to an older version.

    Specifically, for any platform without a C compiler installed, users aren't helped by putting tp_print back in.

    So:

    (1) Status-quo (keep everything as is in 3.8b1):

    • binary/compiled distributions need to be re-built (*1)
    • sdists need to be re-generated with an up-to-date Cython (*2)
    • projects that pin Cython need to update it.
      + pure source distributions (where compiling involves running Cython) works as before (*3)
      + type objects do not waste memory
      + Cython can implement vectorcall optimizations for past Python versions (before 3.8)

    (2) Apply PR 14193:

    • binary/compiled distributions still need to be re-built (*1)
      + sdists DO NOT need to be re-generated (*2)
      + projects that pin Cython DO NOT need to update it.
      + pure source distributions (where compiling involves running Cython) works as before (*3)
    • type objects get 8 bytes larger
      + Cython can implement vectorcall optimizations for past Python versions (before 3.8)

    (3) Put back tp_print the way it was in 3.7:

    • binary/compiled distributions still need to be re-built (*1)
      + sdists DO NOT need to be re-generated (*2)
      + projects that pin Cython DO NOT need to update it.
      + pure source distributions (where compiling involves running Cython) works as before (*3)
    • type objects get 8 bytes larger
    • Cython can NOT implement vectorcall optimizations for past (pre-3.8) Python versions

    *1: applicable to most wheel consumers, conda users, etc.
    *2: applicable to sdist consumers with a compiler
    *3: should be applicable to Conda/Fedora/Debian/... packagers (not so much end users)

    Is four months really an unreasonable timeline? I thought most of these packages released more frequently than CPython.

    Note that maintainers of these packages need to take an action (build wheels) anyway to support Python 3.8 (on compiler-less platforms). They need to test, and ideally even read "Porting to Python 3.8" and adjust the code.
    Is re-generating Cython output, and bumping the version, too much of an *added* burden here?

    If a project doesn't release 3.8 wheels by Python 3.8.0rc1, we don't get feedback on it from PyPI users (but Conda & Linux distros should be unaffected, if they run Cython on build).

    @zooba
    Member

    zooba commented Jun 21, 2019

    FWIW, I agree totally with Petr's analysis.

    Jeroen's analysis ignores some points, has technical inaccuracies (nobody can use a static type compiled for 3.7 with 3.8), and is based on precedents that do not apply to this situation. (All of which I've mentioned above, but I assume nobody is reading the whole repetitive thread anymore.)

    @jdemeyer
    Contributor

    based on precedents that do not apply to this situation

    I'm curious what you mean here...

    @ncoghlan
    Contributor

    It isn't the actively maintained projects that publish wheel files that I'm worried about fixing - it's the sdist only projects that would otherwise only need recompilation to work on the new Python version.

    With this fixed, and the PyCode_New -> PyCode_New + PyCode_WithPosArgs fix from bpo-37221 merged, then that long tail will get another few years of life, and will start emitting deprecation warnings for the part that's actually going to go away in Python 3.9 (i.e. the tp_print slot).

    My opinion would be different if tp_print had actually been emitting compile time warnings for the past 10 years, but it hasn't been - we just haven't been using it for anything.

    @zooba

    zooba commented Jun 21, 2019

    I'm curious what you mean here...

    msg345409

    @zooba

    zooba commented Jun 21, 2019

    With this fixed, and the PyCode_New -> PyCode_New + PyCode_WithPosArgs fix from bpo-37221 merged, then that long tail will get another few years of life, and will start emitting deprecation warnings for the part that's actually going to go away in Python 3.9 (i.e. the tp_print slot).

    By definition (as unmaintained packages), that's a few more years until death. Nobody is looking at their compile time warnings (that would require maintenance ;) ), so nothing will happen until it actually breaks. Keeping the field for one more version makes less sense than keeping it forever.

    @willingc

    Thanks Petr for the wonderful summary.

    Of the pros/cons that you mention (if I am understanding correctly), I think that the issue comes down to sdist consumers and regeneration.

    1. Will sdist consumers need to regenerate an sdist?
    2. How many sdist consumers are likely to be impacted?
    3. If impacted, how difficult would regeneration be for a scientist who is moderately comfortable with compilation?
    4. Is there any guidance or information that can be given to sdist consumers?

    Do we have any good metrics/insights on Questions 2 and 3?

    ---

    While PyTypeObject size, updating a pinned version, and vectorcall optimization for prior versions are considerations, I believe they are secondary:

    • PyTypeObject size: a smaller struct is nice to have, but probably not a concern of the sdist consumer or general user.
    • Updating a pinned Cython version: the general expectation is that a pinned version works within the environment it was pinned for; there should be no expectation that it keeps working when other parts of the environment change.
    • Vectorcall optimization extending to prior versions: perhaps nice to have, but more an added benefit than an implicit contract or expectation.

    @encukou

    encukou commented Jun 24, 2019

    1. Will sdist consumers need to regenerate an sdist?

    They would need to compile from the source, e.g. clone from Git and make an sdist (or install) from that.

    2. How many sdist consumers are likely to be impacted?

    I don't know. Depends on how fast people migrate to Python 3.8.

    3. If impacted, how difficult would regeneration be for a scientist who is moderately comfortable with compilation?

    See (1.)

    4. Is there any guidance or information that can be given to sdist consumers?

    "Raise the issue with the sdist maintainers" is probably their best bet.

    ---

    Realistically, I expect those people to stay on Python 3.7 (or earlier) until their dependencies are updated to, and also tested with, 3.8.

    I've seen anecdotal evidence of this hitting people who want to use real-world projects to benchmark Python 3.8+: https://mail.python.org/archives/list/python-dev@python.org/message/TF2JI7G5ZMCGUMM3AWNSCQDYVFNRPMQ4/

    If we "stop discussing and just [apply PR 14193]", people won't be blocked from trying out 3.8, even if sdist maintainers don't care about 3.8 yet.

    @ncoghlan

    There's also the question of complying with our own deprecation policy, as the approval to ditch tp_print API compatibility in PEP 590 relied on the fact that it had been deprecated since Python 3.0.

    However, it turns out the associated programmatic deprecation had never actually been implemented, which means it is now appropriate to provide that deprecation warning belatedly in Python 3.8, and then upgrade to full removal in Python 3.9.

    That way the folks building affected sdists from source will get the additional warnings to tell them "Hey, you have a looming maintainability problem to solve (and a year or two to solve it)", rather than the current situation, which is "You have a dependency management crisis *now*, and until you solve it, you can't even start testing on Python 3.8, since you won't even be able to build your dependencies, let alone install them".

    @scoder

    scoder commented Jun 24, 2019

    Thank you, Petr, for bringing this discussion back on a solid basis, actually twice already.

    And sorry for causing this problem in the first place. The "tp_print = 0" was necessary to fix misbehaviour in Py2, and we should have restricted it to Py2 back then (that line has been there since late 2013).

    I think keeping some kind of a "tp_print" named field in PyTypeObject for now is the best we can do. I like Jeroen's PR (thanks for taking care of so many things at a time, Jeroen!), and I think we should just go with it.

    I'm happy to release a new point release for Cython, as soon as it's clear what CPython will do, and how Cython can detect the change to adapt to it.

    @willingc

    Thank you Petr. Looking at your response, the full discussion here, and the fact that we don't have an easy way to put a number on who would be impacted, I agree that merging the PR seems the best course of action.

    Thanks Stefan for additional follow up with Cython too.

    @encukou

    encukou commented Jun 25, 2019

    New changeset d917cfe by Petr Viktorin (Jeroen Demeyer) in branch '3.8':
    [3.8] bpo-37250: put back tp_print for backwards compatibility (GH-14193)
    d917cfe

    @ncoghlan ncoghlan closed this as completed Jul 1, 2019
    @ncoghlan ncoghlan added the build The build process and cross-build label Jul 1, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022