Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add black formatter with pre-commit hooks #1435

Conversation

oleksii-leonov
Copy link
Contributor

Proposal for #1295.

  • black formatter as community standard (~30k GitHub stars vs ~10k for the nearest competitor yapf).
  • pre-commit to enforce formatting locally.
  • Automatic "format on save" for VSCode/Codespaces.
  • GitHub action to run pre-commit on changed files for PRs. The action does not correct formatting automatically — it simply fails with a description of the reason.

This approach enforces formatting only on changed files and is gentle. But it will require contributors to fix the formatting of any existing files their further PRs will touch (so PRs would be a bit harder to review with multiple format-related changes).


Another option — with pre-commit run --all-files I could fix formatting for the project at once and add all those changes to this PR (~350 changed files, tests are green). This approach is a radical one, but it will simplify all further contributions.

@mstimberg
Copy link
Member

Hi @aleksejleonov Many apologies for the terribly late reply, and many thanks for the contribution 🙏 . I've been thinking about this for a while and agree that we should have an automatic code formatting solution in place, but I am still not quite sure what the best solution is.
I see the appeal of black's "all decisions are taken for you" approach, but unfortunately I do not agree with all of its decisions. My two main disagreements:

  1. Brian's unit system can lead to nicely readable values for quantities, but black makes them less readable by adding spaces everywhere:
# currently
g_Na = 100*nS/cm**2
V_r = -70*mV
V_th = V_r + 20*mV
# black-formatted
g_Na = 100 * nS / cm ** 2
V_r = -70 * mV
V_th = V_r + 20 * mV
  1. We recently had a big change to standardize our string handling (https://brian2.readthedocs.io/en/stable/developer/guidelines/style.html#string-formatting), and try to follow the “single quotes for data, double quotes for human-readable strings” convention. I don't really want to change this.

I still need to think about this a bit more, and try black on some example code. Point 1 is mostly an issue for examples/tutorials/etc., so maybe we could simply exclude everything outside of the brian2 package itself. For point 2, it seems that black's --skip-string-normalization option could do the trick, but I'd also want the merging/splitting over multiple lines feature which is currently only enabled with --preview – I am a bit afraid that we will get inconsistent code formatting with the next black update...

This approach enforces formatting only on changed files and is gentle. But it will require contributors to fix the formatting of any existing files their further PRs will touch (so PRs would be a bit harder to review with multiple format-related changes).

Another option — with pre-commit run --all-files I could fix formatting for the project at once and add all those changes to this PR (~350 changed files, tests are green). This approach is a radical one, but it will simplify all further contributions.

Even though it seems radical, I think the second option would actually be better, since I don't want to put any additional burden on contributors. This would also be consistent with the massive string-formatting commit we did earlier, and we already have a .git-blame-ignore-revs file that we can use to make sure this does not mess up history too much.

@mstimberg
Copy link
Member

Oh, and maybe we should also use something like isort/usort at the same time? We do state the usual conventions for imports in our docs (https://brian2.readthedocs.io/en/stable/developer/guidelines/style.html#general-recommendations), but of course in reality it is a bit of a mess...

@oleksii-leonov
Copy link
Contributor Author

Hi @mstimberg,

  1. Brian's unit system can lead to nicely readable values for quantities, but black makes them less readable by adding spaces everywhere.

Yes, I agree with you. Brian's unit system is a kind of DSL and is excellent for readability.

  • One of the options is to use # fmt: off and # fmt: on to disable auto-formatting for a specific block of code. It's usually done for table-like data representation in code when indentations are intentional.
  • Another option is to exclude the examples folder from Black's scope.
  1. We recently had a big change to standardize our string handling (https://brian2.readthedocs.io/en/stable/developer/guidelines/style.html#string-formatting), and try to follow the “single quotes for data, double quotes for human-readable strings” convention. I don't really want to change this.

That's an interesting question. From an aesthetic and semantic point of view, I agree with you.
But, from "easier for contributor" and "easier for maintainer" points of view, I prefer Black's approach.

Black (or any other formatter) can't reliably identify a string as "data" or "human-readable." So, the responsibility and burden for keeping the code style for string are moved to contributors and maintainers.

Black's approach (everything auto-formatted to double quotes) gives the following pros:

  • Fewer rules for contributors to learn (as Black fixes quote automatically).
  • The maintainer doesn't need to check the code style of the strings in PRs and spends time communicating back to contributors about code style rules.

I am taking part in a large codebase project with tens of actively contributing developers. For some time, I tried to keep some aesthetical code style rules different from the community standard formatter. But at some point, I gave up and switched to the community standard configuration. And it was a good decision — code reviews become faster and easier.

@oleksii-leonov
Copy link
Contributor Author

@mstimberg ,

Oh, and maybe we should also use something like isort/usort at the same time? We do state the usual conventions for imports in our docs (https://brian2.readthedocs.io/en/stable/developer/guidelines/style.html#general-recommendations), but of course in reality it is a bit of a mess...

Yes, I totally agree with using isort. Also, I would like to addpylint as a linter to pre-commit and Ci.
I believe, it's better to make such changes in smaller steps :)

This PRs covers pre-commit + GitHub Actions + black.
Adding isort after pre-commit + GitHub Actions in place is a trivial task that I would prefer to do in separate PRs.

@oleksii-leonov
Copy link
Contributor Author

I'd also want the merging/splitting over multiple lines feature which is currently only enabled with --preview – I am a bit afraid that we will get inconsistent code formatting with the next black update...

I am totally fine with --preview option.

  1. The particular version of black is fixed in pre-commit configuration. No implicit version drift in devcontainer or CI is possible. black behavior would be fixed until an explicit version update.
  2. In the case of the black version update, if some formatting behavior changes (rarely) — running black once over the whole codebase isn't a big issue.

@mstimberg
Copy link
Member

Thanks again @aleksejleonov for your comments and valuable insights. As you mentioned, there is a tension between what I like best, and what is the easiest for contributors. Since I write most of the code that gets committed these days, I feel that my taste in style matters somewhat, but of course the whole idea is that my preferences should matter less and less as more and more contributors join :)

So I think from my side, a good solution/compromise would be:

  • Go with black's string formatting preference, but use --preview to get the improved handling of multi-line strings (as you correctly mentioned, by pinning black's version, we shouldn't have any bad surprise – we could even check as part of our ci whether newer versions would change the source code in any way).
  • Exclude the examples (and probably everything else outside of the main brian2 package) – I feel that they are all very different style-wise in the more general sense (which is not necessarily a bad thing for examples), and normalizing things like whitespace wouldn't be that helpful. I guess black will only apply to .py files, so tutorials and documentation are excluded anyway.
  • As I explained above, I'd prefer a single massive syle-adjusting commit, together with .git-blame-ignore-revs – I think explaining to contributors that they have to fix formatting for code they did not write would be odd. Also, a sensible approach for contributors is to copy the style in the existing code base, again it would be a bit odd to explain them that outdated style is only ok for existing code.
  • oh, and of course we'd have to adapt our developer documentation/contribution guidelines

Yes, I totally agree with using isort. Also, I would like to addpylint as a linter to pre-commit and Ci. I believe, it's better to make such changes in smaller steps :)

This PRs covers pre-commit + GitHub Actions + black. Adding isort after pre-commit + GitHub Actions in place is a trivial task that I would prefer to do in separate PRs.

pylint and similar tools would be nice indeed, but will also need a bit more discussion. I agree that it would be better to keep this (and isort) separate. As you say, once the infrastructure is in place this will be rather straightforward.

Any thoughts on all this by others before this PR changes every single file in this repository 😆 ?

@thesamovar @bdevans @denisalevi

@thesamovar
Copy link
Member

I don't really see the advantage of a standardised code style. 🤷‍♂️

I'm happy for Marcel to make the call on this. Certainly, we don't want spaces enforced on examples because it would make them look horrible (particularly equation strings). But, for internal code it doesn't matter so much. Personally I don't like standardised spaces around operators, but I don't feel like my preferences about that should count for much any more. 😉

@bdevans
Copy link
Contributor

bdevans commented Nov 14, 2022

A couple of general comments. I actually prefer the spacing between numbers and their units - it makes it easier to parse for my eyes but I suppose it really amounts to what you're used to. Having said that, when it comes down to aesthetic preferences, I would break the tie by going with the Pythonic option... Also as you point out, the more community contributions the project receives, the more it makes sense to adopt such widely applied principles.

Thanks again @aleksejleonov for your comments and valuable insights. As you mentioned, there is a tension between what I like best, and what is the easiest for contributors. Since I write most of the code that gets committed these days, I feel that my taste in style matters somewhat, but of course the whole idea is that my preferences should matter less and less as more and more contributors join :)

Thanks indeed - this is a good initiative and it's nice to see the .devcontainer getting some love! 🥳

So I think from my side, a good solution/compromise would be:

  • Go with black's string formatting preference, but use --preview to get the improved handling of multi-line strings (as you correctly mentioned, by pinning black's version, we shouldn't have any bad surprise – we could even check as part of our ci whether newer versions would change the source code in any way).

I agree. I have no strong view on which auto formatter to use but using one is a good idea. 😄

  • Exclude the examples (and probably everything else outside of the main brian2 package) – I feel that they are all very different style-wise in the more general sense (which is not necessarily a bad thing for examples), and normalizing things like whitespace wouldn't be that helpful. I guess black will only apply to .py files, so tutorials and documentation are excluded anyway.

This seems like a reasonable compromise to me.

  • As I explained above, I'd prefer a single massive syle-adjusting commit, together with .git-blame-ignore-revs – I think explaining to contributors that they have to fix formatting for code they did not write would be odd. Also, a sensible approach for contributors is to copy the style in the existing code base, again it would be a bit odd to explain them that outdated style is only ok for existing code.

Agreed. Better to rip the plaster off and move on with our lives! 😉

  • oh, and of course we'd have to adapt our developer documentation/contribution guidelines

#TODO...

pylint and similar tools would be nice indeed, but will also need a bit more discussion. I agree that it would be better to keep this (and isort) separate. As you say, once the infrastructure is in place this will be rather straightforward.

Yes, I think these would be a good idea in the same spirit as using black but agree that a separate PR probably makes most sense. I would suggest adding to the list of follow-up PRs to add to .git-blame-ignore-revs a purge on strings using % and .format() so that we can consistently use f-strings.

Any thoughts on all this by others before this PR changes every single file in this repository 😆 ?

Do it! 😝

@mstimberg
Copy link
Member

@thesamovar:

I don't really see the advantage of a standardised code style. man_shrugging

I know what you mean, but the main advantage I am seeing is that there are some style-related things I do care about (say, I wouldn't want to have a func(kw = 3) instead of func(kw=3)), and it is simply easier to tell people to run black than to link to PEP8 or to make detailed comments about whitespace – this shouldn't be what code review is about.

Certainly, we don't want spaces enforced on examples because it would make them look horrible (particularly equation strings).

Equation strings wouldn't be affected in either case – the content of strings will never be affected.

@bdevans:

I would suggest adding to the list of follow-up PRs to add to .git-blame-ignore-revs a purge on strings using % and .format() so that we can consistently use f-strings.

I think we already did this with #1364 😊

@mstimberg
Copy link
Member

To make it a bit more concrete what black formatting would do: the two main changes would be

  1. use " everywhere (instead of sometimes using ' and sometimes ")
  2. use a different way to break long lines (e.g. function calls/definitions with many arguments)

For 2, here's a little example from our code:

def _nextclocks(self):
clocks_times_dt = [(c,
self._clock_variables[c][1][0],
self._clock_variables[c][2][0])
for c in self._clocks]
minclock, min_time, minclock_dt = min(clocks_times_dt, key=lambda k: k[1])
curclocks = {clock for clock, time, dt in clocks_times_dt if
(time == min_time or
abs(time - min_time)/min(minclock_dt, dt) < Clock.epsilon_dt)}
return minclock, curclocks

would be reformatted as:

    def _nextclocks(self):
        clocks_times_dt = [
            (c, self._clock_variables[c][1][0], self._clock_variables[c][2][0])
            for c in self._clocks
        ]
        minclock, min_time, minclock_dt = min(clocks_times_dt, key=lambda k: k[1])
        curclocks = {
            clock
            for clock, time, dt in clocks_times_dt
            if (
                time == min_time
                or abs(time - min_time) / min(minclock_dt, dt) < Clock.epsilon_dt
            )
        }
        return minclock, curclocks

There's one thing we still need to figure out: how to deal with the indentation in multi-line strings. For example:

prefs.register_preferences('core.network', 'Network preferences',
default_schedule=BrianPreference(
default=['start',
'groups',
'thresholds',
'synapses',
'resets',
'end',
],
docs="""
Default schedule used for networks that
don't specify a schedule.
"""
)
)

would be reformatted as:

prefs.register_preferences(
    "core.network",
    "Network preferences",
    default_schedule=BrianPreference(
        default=[
            "start",
            "groups",
            "thresholds",
            "synapses",
            "resets",
            "end",
        ],
        docs="""
                               Default schedule used for networks that
                               don't specify a schedule.
                               """,
    ),
)

which is clearly sub-optimal for the docs=''' part...

@denisalevi
Copy link
Member

denisalevi commented Nov 15, 2022

Bit late to the party, but here are my 2 cents:

  • I am all for consistent code that is easy to enforce / format during development
  • I have not strong feelings about the details of that. I just use black everywhere because I don't have to think about it and its bound to a shortcut 👍
  • I find the Brian2CUDA code horrible, its a mix of Brian code-style (from copying code over) and my coding preference, which changed over the years. I've been wanting to run black on that code for a while (Add automatic code formatting for Python code base and also generated standalone code brian2cuda#207), but only didn't do so because Brian2CUDA development always also interacts or is based on Brian2 code. So I would love it if I could run some kind of (however configured) formatter on Brian2CUDA once this is implemented 🎉

@oleksii-leonov
Copy link
Contributor Author

@mstimberg, I will slightly update this PR according to the discussion:

  • GitHub action will run black on all files in the brian2 package folder (instead of running only on changed files);
  • black with --preview option.

@oleksii-leonov oleksii-leonov force-pushed the ci/add-pre-commit-with-black-formatter-on-changed-files branch 3 times, most recently from fef98e8 to 2b1ea41 Compare November 16, 2022 06:39
@oleksii-leonov
Copy link
Contributor Author

oleksii-leonov commented Nov 16, 2022

@mstimberg, I have added pyproject.toml with a very basic black config:

[tool.black]
target-version = ['py37']
include = '^/brian2/.*\.pyi?$'
preview = true

The main idea is that the contributor could run either black . or pre-commit run --all-files and will get the same results without specifying parameters for black (--preview, --include, etc.).

Both black . and pre-commit run --all-files format only brian2 directory.

I don't add "editor.formatOnSave": true in .devcontainers, as there is no way to specify that only the brian2 folder should be formatted on saving in VSCode (need to use shortcut). Anyway, pre-commit will run black before each commit, so should not be an issue.

@oleksii-leonov oleksii-leonov force-pushed the ci/add-pre-commit-with-black-formatter-on-changed-files branch from 2b1ea41 to 60b7b3d Compare November 16, 2022 13:51
@mstimberg
Copy link
Member

@aleksejleonov Thanks for the update! I will have a closer look tomorrow, update the docs and if everything is working smoothly on my end, I'll merge the PR.

@mstimberg mstimberg force-pushed the ci/add-pre-commit-with-black-formatter-on-changed-files branch from 60b7b3d to d6a5dbd Compare November 17, 2022 16:34
@mstimberg
Copy link
Member

@aleksejleonov I hope it is not considered rude, but I just force-pushed into your branch. I wanted to split the commit into the part that updates the Github workflow, adds pre-commit, etc., and the actual code style changes to the source files. The main reason is that I would like git blame to show the former but not the latter. I also manually excluded a few things from formatting and fixed things like multi-line strings that looked odd due to their internal (mis)-alignment.
If you could have a quick look whether anything catches your eye, that would be great.

There are only two things open, before I can go ahead with the merge: 1) the developer documentation needs to be updated and 2) the CI to build our packages seems to be broken, I need to look into why (maybe because we now have a pyproject.toml but don't use it to build the package?). I hope to deal with this tomorrow.

Instead of installing them manually with CIBW_BEFORE_BUILD
Also add black badge to README, and fix an unrelated typo in release
notes

[ci skip]
Copy link
Member

@mstimberg mstimberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, that's a wrap. Thanks again @aleksejleonov 🙏 and everyone else who commented. I fixed the CI issues, updated the development docs, and verified that everything works on my side, including the VS Code devcontainter.

If no one has any last-minute comment/complaint, I'll go ahead with the merge.

@mstimberg mstimberg merged commit 834b819 into brian-team:master Nov 18, 2022
@oleksii-leonov oleksii-leonov mentioned this pull request Nov 19, 2022
@mstimberg mstimberg mentioned this pull request Jan 4, 2023
@mstimberg mstimberg mentioned this pull request Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants