-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEP 657: use 0-indexed offsets #2022
Conversation
CC: @ammaraskar |
pep-0657.rst
Outdated
while offsets bigger than these values will be treated as missing (value of 0). | ||
all AST nodes. The output of the public APIs (``co_positions`` and ``PyCode_Addr2Location``) | ||
that deal with these attributes use 0-indexed offsets (just like the AST nodes) and designates | ||
``-1`` as the return code for information not available, but the underlying implementation is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also more consistent with the PyCode_Addr2Line
, since it also returns -1
on error but for the sake of consistency we used to convert -1
to 0
on PyCode_Addr2Location
. Now that cast won't be needed.
7796281
to
db72bf3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with Pablo's suggestion incorporated.
091adcf
to
cfdc826
Compare
The merged version looks good to me. |
Whoa, don’t we use 1-based offsets in SyntaxError attributes? |
There's a bit of discussion around that here: python/cpython#26958 (comment) I think the consensus ended up being that |
Hm, possibly, but we chose 1-based because editors like vim and emacs interpret column numbers as 1-based. Where else do we use 0-based offsets te refer to source locations? (In Python code. Of course in C code it’s 0-based all the way down. :-) |
In the AST, all offsets are 0 based and that's one of the primary motivations as we want to be as close as those as possible:
so one of the advantages of 0-based is that every tool that currently deals with these can be used without modifications to deal with the new positions. We asked around some authors and seems that everyone frefers 0-based positions (so they can also do |
I not sure exact what "In Python code." is meant to say, but both Python and tk (and hence tkinter and IDLE) are 0-based for both indexing and slicing. |
What I meant is when you print column offsets for end-user tools like editors to use, 1-based seems to be the norm. OTOH when you are using this to index into an array of lines, 0-based is more convenient and seems conventional. (Everybody agrees that line numbers start at 1 though. :-) I'm not yet sure whether PEP 657 is more faced towards end users and their editors, or towards developers of Python libraries that will need to slice and dice the source code. I do observe that in traceback.py the TraecbackException object has a documented offset field that's 1-based (since it is copied from SyntaxError). I guess there will be a few traps no matter what we decide... |
Oh, i see what you mean. Notice that we are not showing the numbers at the moment (and we don't plan to) so users will not see "error in column x". These numbers are mainly focused for tools to use them to grab the correct piece of source code that is associated with an instruction. If we ever show these numbers, we can easily make them 1-based only for the display.
Right, after asking around seems that there is an opportunity to avoid a slightly bigger subset of tracks for libraries consuming these numbers. So this decision would prioritize reusing existing code and may give some small surprises when developing new code. I think is a good balance but I totally understand if someone sees the other way :) |
Okay, sounds good! |
I see the essence of PEP 657 as making the full AST position info available on the code object as iterable co_positions. bpo-43950 translates the position info, within the existing traceback.py traceback construction method, into a new caret line added under the code line. bpo-44569, just opened, will factor the per-frame formatting code into a separate method that users can replace. Users can then handle the position info as they wish while customizing frame formatting, all without giving up other traceback features (like truncating tracebacks resulting from runaway recursion). IDLE, for instance. will omit the caret line and instead include the position info so that Shell can highlight the error slice within the code line. |
For the background of this change, see the discussion on: python/cpython#26958 (comment)