ast: Different FormattedValue expressions have same col_offset information #81639

Closed
WeijarZ mannequin opened this issue Jul 1, 2019 · 10 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@WeijarZ
Mannequin

WeijarZ mannequin commented Jul 1, 2019

BPO 37458
Nosy @ericvsmith, @lysnikolaou, @isidentical

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/ericvsmith'
closed_at = None
created_at = <Date 2019-07-01.02:10:41.834>
labels = ['interpreter-core', 'type-bug', '3.10']
title = 'ast: Different FormattedValue expressions have same col_offset information'
updated_at = <Date 2020-07-04.23:02:28.764>
user = 'https://bugs.python.org/WeijarZ'

bugs.python.org fields:

activity = <Date 2020-07-04.23:02:28.764>
actor = 'eric.smith'
assignee = 'eric.smith'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2019-07-01.02:10:41.834>
creator = 'Weijar Z'
dependencies = []
files = []
hgrepos = []
issue_num = 37458
keywords = []
message_count = 5.0
messages = ['346950', '346981', '372989', '372991', '373002']
nosy_count = 4.0
nosy_names = ['eric.smith', 'lys.nikolaou', 'BTaskaya', 'Weijar Z']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue37458'
versions = ['Python 3.10']

@WeijarZ
Mannequin Author

WeijarZ mannequin commented Jul 1, 2019

This expression:

f"{x}{x}{y}"

produces the following AST:

{
  "$node": "Module",
  "body": [
    {
      "$node": "Expr",
      "value": {
        "$node": "JoinedStr",
        "values": [
          {
            "$node": "FormattedValue",
            "value": {
              "$node": "Name",
              "id": "x",
              "ctx": {
                "$node": "Load"
              },
              "lineno": 1,
              "col_offset": 3
            },
            "conversion": -1,
            "format_spec": null,
            "lineno": 1,
            "col_offset": 0
          },
          {
            "$node": "FormattedValue",
            "value": {
              "$node": "Name",
              "id": "x",
              "ctx": {
                "$node": "Load"
              },
              "lineno": 1,
              "col_offset": 3
            },
            "conversion": -1,
            "format_spec": null,
            "lineno": 1,
            "col_offset": 0
          },
          {
            "$node": "FormattedValue",
            "value": {
              "$node": "Name",
              "id": "y",
              "ctx": {
                "$node": "Load"
              },
              "lineno": 1,
              "col_offset": 9
            },
            "conversion": -1,
            "format_spec": null,
            "lineno": 1,
            "col_offset": 0
          }
        ],
        "lineno": 1,
        "col_offset": 0
      },
      "lineno": 1,
      "col_offset": 0
    }
  ]
}

Both 'x' Name nodes report the same col_offset (3), even though they sit at different columns in the source. Is this a bug?
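
A minimal sketch for reproducing the report with the standard `ast` module, without the JSON dump. The variable names `source` and `name_offsets` are illustrative only. No expected output is shown because the offsets depend on the Python version: on the affected versions described above both `x` entries report column 3, while later comments in this thread indicate the values differ on fixed versions.

```python
import ast

source = 'f"{x}{x}{y}"'
tree = ast.parse(source)

# Collect (identifier, col_offset) for every Name node inside the f-string.
# ast.walk visits nodes breadth-first, so the names come out in source order here.
name_offsets = [
    (node.id, node.col_offset)
    for node in ast.walk(tree)
    if isinstance(node, ast.Name)
]

for name, col in name_offsets:
    print(f"{name!r} reported at column {col}")
```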

@WeijarZ WeijarZ mannequin added stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 (EOL) end of life type-bug An unexpected behavior, bug, or error labels Jul 1, 2019
@ericvsmith
Member

I'm working on overhauling how these are calculated. But it's complex, and is taking a while.

@ericvsmith ericvsmith self-assigned this Jul 1, 2019
@ericvsmith
Member

I still see this problem with 3.10, which I thought might have fixed this.

@lys.nikolaou: any ideas on this?

@ericvsmith ericvsmith added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.10 only security fixes and removed stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 (EOL) end of life labels Jul 4, 2020
@lysnikolaou
Member

> I still see this problem with 3.10, which I thought might have fixed this.

Nope, that's still true in 3.10.

> I'm working on overhauling how these are calculated. But it's complex, and is taking a while.

In short, the FormattedValue nodes all have exactly the same lineno and col_offset values, which are also identical to those of the enclosing JoinedStr node. These values are the lineno and col_offset of the first STRING token and the end_lineno and end_col_offset of the last STRING token (note that the grammar accepts a STRING+ node).

> any ideas on this?

Moving f-string parsing to the parser, which we've been talking about lately, would solve this. But this will probably take a while, since I currently do not have the time. It's probably going to be a good project for this coming fall.

Another alternative, in case we don't want to wait until then, would be for the handwritten f-string parser to keep its own lineno and col_offset, so that they can be used when the FormattedValue nodes are created. This would probably also require some effort though, so I'm not sure we want to do it before we really know if we're gonna proceed with the "moving f-string parsing to PEG" project.
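
To illustrate the idea of a scanner that tracks its own column while walking the f-string, here is a toy Python sketch. It is not CPython's handwritten parser (which is written in C), and `field_columns` is a hypothetical helper. It assumes a single-line literal whose replacement fields contain no nested braces, format specs, conversions, or string literals.

```python
def field_columns(literal):
    """Return (expression_text, column) for each replacement field.

    Toy scanner under the simplifying assumptions stated above; the column
    is the 0-based offset of the expression's first character in `literal`.
    """
    fields = []
    col = 0
    while col < len(literal):
        ch = literal[col]
        if ch in "{}" and literal[col:col + 2] == ch * 2:
            col += 2  # escaped brace ('{{' or '}}'): not a field
            continue
        if ch == "{":
            end = literal.index("}", col)  # closing brace of this field
            fields.append((literal[col + 1:end], col + 1))
            col = end + 1
            continue
        col += 1
    return fields


print(field_columns('f"{x}{x}{y}"'))
# [('x', 3), ('x', 6), ('y', 9)]
```

Because the scanner carries its own running column, the second `x` gets column 6 rather than repeating the first field's position.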

@ericvsmith
Member

I think waiting until we decide what to do with the parser makes sense. This problem has been around for a while, and while it's unfortunate, I don't think it's worth heroic measures to fix.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@alexmojaki

This was fixed in #27729, right?

@lysnikolaou
Member

I'm not sure whether it's been completely fixed, though there have definitely been improvements since the issue was opened. We're working on moving f-string parsing into the PEG parser, which will certainly help close this once and for all.

@alexmojaki

The PR I linked includes fixing FormattedValues which contain identical-looking expressions, which I think is the problem being described here. The specific example is fixed for me in 3.9.7 and 3.10.

Are there any remaining problems you know of relating to incorrect lineno/col_offset in AST nodes?

@youknowone
Contributor

It does not seem to be fixed, at least as of 3.11:

Python 3.11.1 (main, Feb  4 2023, 11:11:18) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
>>> import ast
>>> expr = ast.parse('''\n\n\n\n\nf"Warning: '{fieldname}' should be a list, got type '{typename}'"''').body[0]
>>> expr.value.values
[<ast.Constant object at 0x1004e9c90>, <ast.FormattedValue object at 0x1004ebc70>, <ast.Constant object at 0x1004ea2c0>, <ast.FormattedValue object at 0x1004eb3a0>, <ast.Constant object at 0x1004eb190>]
>>> def l(n): return ((n.lineno, n.col_offset), (n.end_lineno, n.end_col_offset))
... 
>>> for v in expr.value.values: print(l(v))
... 
((6, 0), (6, 65))
((6, 0), (6, 65))
((6, 0), (6, 65))
((6, 0), (6, 65))
((6, 0), (6, 65))

@lysnikolaou
Copy link
Member

This has been fixed in 3.12 after the PEP 701 implementation (#102855) was merged. Closing.
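
A small check one could run to confirm the behavior on a given interpreter. This is a sketch: the version guard assumes, per the comment above, that 3.12 and later give each FormattedValue its own position, and `offsets` is an illustrative name.

```python
import ast
import sys

# Parse the f-string from the original report as a single expression.
joined = ast.parse('f"{x}{x}{y}"', mode="eval").body

# Column reported for each replacement field in the JoinedStr.
offsets = [
    value.col_offset
    for value in joined.values
    if isinstance(value, ast.FormattedValue)
]
print(sys.version_info[:2], offsets)

if sys.version_info >= (3, 12):
    # Assumption from this thread: after PEP 701 every field has its own column.
    assert len(set(offsets)) == len(offsets)
```

On older interpreters the script only prints the offsets, which makes it easy to compare versions side by side.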

jacobtylerwalls added a commit to jacobtylerwalls/astroid that referenced this issue Jun 22, 2023
jacobtylerwalls added a commit to jacobtylerwalls/astroid that referenced this issue Jun 23, 2023
jacobtylerwalls added a commit to jacobtylerwalls/astroid that referenced this issue Jun 25, 2023
jacobtylerwalls added a commit to pylint-dev/astroid that referenced this issue Jun 27, 2023