Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Python linters/checkers stream their partitioned results? #13380

Open
Eric-Arellano opened this issue Oct 27, 2021 · 2 comments
Open

Should Python linters/checkers stream their partitioned results? #13380

Eric-Arellano opened this issue Oct 27, 2021 · 2 comments
Labels
backend: Python Python backend-related issues

Comments

@Eric-Arellano
Copy link
Contributor

For Python, we may return >1 result. For example, MyPy and Flake8 can partition by interpreter constraints. When that happens, we don't dump the results until all partitions have run. Instead, we could do something like #13379 (review) to stream results.

However, I expect >90% of users don't use this partition feature. It would be confusing to render the per-partition result when there is only one partition, given that we will also still render CheckResults and LintResults due to the code in check.py and lint.py. So, we should probably only stream "partitions" when they are actually used.

This question also applies to the option --lint-per-file-caching.

--

Concretely, we might want to remove the CheckResults and LintResults feature in favor of having to return a single CheckResult/LintResult a la #13379 (review). Let each plugin determine if it wants to stream. (Although, that would make support for --lint-per-file-caching harder to implement)

@stuhood
Copy link
Member

stuhood commented Oct 27, 2021

Concretely, we might want to remove the CheckResults and LintResults feature in favor of having to return a single CheckResult/LintResult a la #13379 (review). Let each plugin determine if it wants to stream. (Although, that would make support for --lint-per-file-caching harder to implement)

Ideally the various linters could be oblivious to the number of partitions that are created, and instead just operate on whatever they're given... that would hopefully mean they wouldn't need to care about how many instances there were. But I think that that would require making the partitioning constraints more declarative, so that lint/check could execute the partitioning before calling the tools.

This relates to a conversation from the other day: it would be great to be able to optimize partition sizes independently of the constraints (one instance of yapf is too few on a 64 core machine, while one per file is too many: https://pantsbuild.slack.com/archives/C046T6T9U/p1635271709151400), and removing most of that logic from individual linters would help enable that. Then we could play with partition sizes independently of constraints.

@stuhood
Copy link
Member

stuhood commented Jan 19, 2022

Relates to #13462.

But: yes: we probably should stream the results of individual partitions, in addition to cleaning them up / making them quieter by default, á la #14129.

@cognifloyd cognifloyd added the backend: Python Python backend-related issues label Mar 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Python Python backend-related issues
Projects
None yet
Development

No branches or pull requests

3 participants