Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace old format string interpolation with f-strings #4400

Merged
merged 1 commit into from
Sep 30, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Sep 25, 2020

Fixes #4394

Since Python 3.5 is no longer supported, format string interpolations
can now be replaced by f-strings, introduced in Python 3.6, which are
more readable, require less characters and are more efficient.

Note that pylint issues a warning when using f-strings for log
messages, just as it does for format interpolated strings. The reasoning
is that this is slightly inefficient as the strings are always
interpolated even if the log is discarded, but also by not passing the
formatting parameters as arguments, the available metadata is reduced.
I feel these inefficiencies are premature optimizations as they are
really minimal and don't weigh up against the improved readability and
maintainability of using f-strings. That is why the pylint config is
update to ignore the warning logging-fstring-interpolation which
replaces logging-format-interpolation that was ignored before.

@codecov
Copy link

codecov bot commented Sep 25, 2020

Codecov Report

Merging #4400 into develop will decrease coverage by 0.02%.
The diff coverage is 45.68%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4400      +/-   ##
===========================================
- Coverage    79.24%   79.22%   -0.01%     
===========================================
  Files          475      475              
  Lines        34826    34831       +5     
===========================================
- Hits         27594    27593       -1     
- Misses        7232     7238       +6     
Flag Coverage Δ
#django 73.07% <43.14%> (+0.01%) ⬆️
#sqlalchemy 72.30% <43.21%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...jsite/db/migrations/0015_invalidating_node_hash.py 100.00% <ø> (ø)
...backends/djsite/db/migrations/0024_dblog_update.py 93.50% <0.00%> (ø)
...site/db/migrations/0032_remove_legacy_workflows.py 61.54% <0.00%> (ø)
...migrations/0037_attributes_extras_settings_json.py 86.85% <0.00%> (ø)
...a/backends/djsite/db/migrations/0039_reset_hash.py 100.00% <ø> (ø)
aiida/backends/djsite/db/models.py 81.29% <0.00%> (ø)
aiida/backends/djsite/settings.py 87.50% <0.00%> (ø)
aiida/backends/sqlalchemy/manager.py 78.00% <0.00%> (ø)
...migrations/versions/041a79fc615f_dblog_cleaning.py 91.21% <0.00%> (ø)
...ns/12536798d4d3_trajectory_symbols_to_attribute.py 95.66% <ø> (ø)
... and 231 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4544bc4...0698e19. Read the comment docs.

@sphuber
Copy link
Contributor Author

sphuber commented Sep 26, 2020

Who dares review this PR 😄 @greschd @ltalirz ?

@ltalirz
Copy link
Member

ltalirz commented Sep 26, 2020

First of all, thanks @sphuber for taking on the effort to clean this up!

Who dares review this PR 😄 @greschd @ltalirz ?

Hehe ;-) To me, the question is more how to best review this PR.
Could you perhaps describe how you did this - was this all done manually, did you use some search & replace operations, some refactoring tool, ...?
I wonder whether going through all of this line by line is the best way to spot potential issues...

P.S. This is just out of curiosity: Do you know why test coverage decreased (very mildly), despite f-strings being less verbose?

@sphuber
Copy link
Contributor Author

sphuber commented Sep 26, 2020

Hehe ;-) To me, the question is more how to best review this PR.
Could you perhaps describe how you did this - was this all done manually, did you use some search & replace operations, some refactoring tool, ...?

Sure, I sure as hell didn't do this manually but used the flynt tool. I first ran with default settings, meaning it will perform the safest transformation. I checked the changes manually and did some minor corrections and pushed to have the tests run. When all tests passed, I ran the tool again, but this time with the options --transform-concats --agressive -ll 120. Here also I checked the changes manually and did some minor corrections. Then I manually searched with regex ['"]\.format\( and fixed some cases manually. There are still quite a few remaining but didn't want to start addressing them all manually so I left them for now.

I wonder whether going through all of this line by line is the best way to spot potential issues...

Probably not. These kinds of changes are always a bit risky since we are relying purely on the tests but many of these lines may not actually be covered. pylint will give us some extra security but probably not a 100%. To be honest I am also not sure how much this can and should be reviewed.

P.S. This is just out of curiosity: Do you know why test coverage decreased (very mildly), despite f-strings being less verbose?

I have no idea, but honestly this happens all the time and we always just ignore it.

@sphuber sphuber force-pushed the fix/4394/f-strings branch 3 times, most recently from c3575aa to 00c8797 Compare September 29, 2020 15:38
@sphuber sphuber requested review from ltalirz and greschd September 29, 2020 17:47
@chrisjsewell
Copy link
Member

This is just out of curiosity: Do you know why test coverage decreased, despite f-strings being less verbose?

Simple, because there're fewer lines covered 😄, e.g.

- "AttributeError: '{}' is not a valid attribute of the object "
- "'{}'".format(attr, self.__class__.__name__)
+ f"AttributeError: '{attr}' is not a valid attribute of the object '{self.__class__.__name__}'"

So now <lines covered>/<total lines> is a smaller fraction

we always just ignore it.

You might not have to after #4413 😉

@chrisjsewell
Copy link
Member

For what its worth, I took a quick parse over the diff, and didn't notice any errors (my vision did start to blur a bit after the first hundred diffs though lol)

Note, flynt has a pre-commit hook: https://github.com/ikamensh/flynt#pre-commit-hook, should we consider adding it?

@sphuber
Copy link
Contributor Author

sphuber commented Sep 30, 2020

Note, flynt has a pre-commit hook: https://github.com/ikamensh/flynt#pre-commit-hook, should we consider adding it?

I thought about this yeah, but decided not to. But to be honest, I am not sure anymore why I thought it best not to add it. I have now added it. Let's see how that goes

@greschd
Copy link
Member

greschd commented Sep 30, 2020

TBH I don't see why always using f-strings should be enforced. Something like the following still makes sense in my book:

some_dict = {'a': 1, 'b': 2}
<...>
string = 'values: a={a}, b={b}'.format(**some_dict)

@chrisjsewell
Copy link
Member

Something like the following still makes sense in my book

Yeh but I don't think flynt would convert that?
anyway looks to be a weird problem with its CLI interaction with pre-commit?

@sphuber
Copy link
Contributor Author

sphuber commented Sep 30, 2020

TBH I don't see why always using f-strings should be enforced. Something like the following still makes sense in my book:

some_dict = {'a': 1, 'b': 2}
<...>
string = 'values: a={a}, b={b}'.format(**some_dict)

Flynt won't touch these though. There are indeed some cases where it is better to use implicit conversion with format.

Edit: just need to figure out how to call it in the pre-commit hook. Apparently pre-commit is now passing all files individually to flynt as arguments and it doesn't understand it.

@greschd
Copy link
Member

greschd commented Sep 30, 2020

Alright, in that case it's fine. Haven't tried flynt myself, so I'm commenting from a place of ignorance 🙃

@ltalirz
Copy link
Member

ltalirz commented Sep 30, 2020

This is just out of curiosity: Do you know why test coverage decreased, despite f-strings being less verbose?

Simple, because there're fewer lines covered 😄,

Right, my misunderstanding came from the assumption that what mattered was the number of uncovered lines, but we are actually triggering on the percentage of uncovered lines.
I guess it's a little more complicated than you suggest as well, since there will be less lines both in the uncovered and in the covered code. Assuming that coverage is counted correctly, the question becomes: is the density of lines saved higher in covered or uncovered code (a priori, I would probably have bet on the latter, since it's often error messages in Except blocks that go uncovered).

Naively, when making a PR what I actually care about is whether/how much I'm increasing the total number of uncovered lines.
The problem is that for new code this will be slightly nonzero in almost all cases, and so it is difficult to decide when to raise a warning flag.
At the moment, we're essentially saying: for new code, cover it at least as well as the average codebase (which is a moving target as our coverage has been slowly increasing over time).

@greschd
Copy link
Member

greschd commented Sep 30, 2020

Ideally (IMO), the coverage criteria would be:

  • for any new logic, a fixed coverage (e.g. 95%) needs to be reached
  • for slight modifications to existing code (such as here), coverage should remain the same, with some tolerance

Of course the problem is that there's no way of distinguishing between the two cases, so we just fall back on the second (less strict) requirement. But it's a red flag if the number of missed lines increases by a substantial number.

@ltalirz
Copy link
Member

ltalirz commented Sep 30, 2020

But it's a red flag if the number of missed lines increases by a substantial number.

Fully agree, and it would be nice to be able to trigger on this (the codecov.yml reference does not seem to offer this option (?)).

The first priority, however, is probably still to get the random fluctuations under control.
To stay with the the codecov report above, it actually reports an increase in the total number of missed lines as well:
image

You may have noticed that, some time ago, codecov started adding comments to missed lines in the "Files" tab, which is helpful. In this case, it makes clear that the report of these supposedly "uncovered lines" is really a bug, see e.g.
https://github.com/aiidateam/aiida-core/pull/4400/files#diff-31db4ebe16273a17841c21ee58d0d9baL184

Edit: It seems like the comments are added to all missed lines that are touched in the PR (whether they were missed before or not). I will need to figure out how to narrow them down to the 7 new misses that were reported or perhaps @sphuber remembers whether he added some extra lines somewhere?

.pre-commit-config.yaml Outdated Show resolved Hide resolved
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@sphuber sphuber force-pushed the fix/4394/f-strings branch 2 times, most recently from 5695149 to 6530471 Compare September 30, 2020 12:36
Since Python 3.5 is no longer supported, format string interpolations
can now be replaced by f-strings, introduced in Python 3.6, which are
more readable, require less characters and are more efficient.

Note that `pylint` issues a warning when using f-strings for log
messages, just as it does for format interpolated strings. The reasoning
is that this is slightly inefficient as the strings are always
interpolated even if the log is discarded, but also by not passing the
formatting parameters as arguments, the available metadata is reduced.
I feel these inefficiencies are premature optimizations as they are
really minimal and don't weigh up against the improved readability and
maintainability of using f-strings. That is why the `pylint` config is
update to ignore the warning `logging-fstring-interpolation` which
replaces `logging-format-interpolation` that was ignored before.

The majority of the conversions were done automatically with the linting
tool `flynt` which is also added as a pre-commit hook. It is added
before the `yapf` step because since `flynt` will touch formatting,
`yapf` will then get a chance to check it.
@sphuber sphuber merged commit 02248cf into aiidateam:develop Sep 30, 2020
@sphuber sphuber deleted the fix/4394/f-strings branch September 30, 2020 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace old format string interpolation with f-strings
4 participants