Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rule groups and a core rules group #3142

Merged
merged 31 commits into from
May 4, 2022
Merged

Conversation

pwildenhain
Copy link
Member

Brief summary of the change made

Fixes #811

As I studied the problem more, it made more sense for rule groupings to be something controlled by the config, rather than by the rule itself (like rule tags or something like that). What this change now essentially amounts to is an opinionated default list of core rules, that users can modify if they wish to include more core rules. It also makes it easier to roll out SQLFluff to a team by only running core rules at first.

Are there any other side effects of this change that we should be aware of?

User can now define their own rule groups, though I'm not sure there's much utility in that, since they still need to write out all the rule names. I think a good addition to this feature would be "common sense" names for each of the rules (i.e L001 = "trailing-whitespace") which makes it easier for users to define things like that.

It also opens up the potential for different types of groups that the package maintainers might want to offer in the future (dbt-best-practice, safe-sql, etc.) for user convenience

If we have agreement on this, I'll document this new feature as well as the core rule list in the docs.

Pull Request checklist

  • Please confirm you have completed any of the necessary steps below.

  • Included test cases to demonstrate any code changes, which may be one or more of the following:

    • .yml rule test cases in test/fixtures/rules/std_rule_cases.
    • .sql/.yml parser test cases in test/fixtures/dialects (note YML files can be auto generated with tox -e generate-fixture-yml).
    • Full autofix test cases in test/fixtures/linter/autofix.
    • Other.
  • Added appropriate documentation for the change. (Not done yet)

  • Created GitHub issues for any relevant followup/future enhancements if appropriate.

@pwildenhain pwildenhain requested a review from a team April 21, 2022 16:33
@pwildenhain pwildenhain self-assigned this Apr 21, 2022
@pwildenhain pwildenhain requested review from nolanbconaway, tunetheweb, barrywhart and a team and removed request for a team and nolanbconaway April 21, 2022 16:33
@codecov
Copy link

codecov bot commented Apr 21, 2022

Codecov Report

Merging #3142 (0aebcc6) into main (43b7f8a) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #3142   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          167       167           
  Lines        12423     12502   +79     
=========================================
+ Hits         12423     12502   +79     
Impacted Files Coverage Δ
src/sqlfluff/core/rules/base.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L001.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L002.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L003.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L004.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L005.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L006.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L007.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L008.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L009.py 100.00% <100.00%> (ø)
... and 55 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 43b7f8a...0aebcc6. Read the comment docs.

@tunetheweb
Copy link
Member

As I studied the problem more, it made more sense for rule groupings to be something controlled by the config, rather than by the rule itself (like rule tags or something like that).

I still think it might be better in the rule, for a few reasons:

  1. Less likely to be missed when a new rule is added
  2. Can be in rule docs with a rule decorator

@pwildenhain
Copy link
Member Author

Totally open to changing this approach. How should we identify when a rule group has been added to the 'rules' or 'excludes_rules' config?

If it's a part of the rule, then we need to check for every rule if it's been included/excluded vs. just checking one time once the .sqlfluff config is loaded.

Am I seeing that right? Or am I missing something?

@tunetheweb
Copy link
Member

Yeah that would be a downside. It would presumably have to go through all rules and then check whether that rule is in the currently configured group, and skip if not. Would still be pretty quick though, but yeah maybe there's a better way to only do that once (maybe on the first query run?) and not have to do that for every SQL file.

@pwildenhain
Copy link
Member Author

(maybe on the first query run?) and not have to do that for every SQL file.

Yeah I agree there likely is a better way. I don't think the proposed way would work because different sql files can have different config values, if the user nests multiple .sqlfluff files across their directories

@barrywhart
Copy link
Member

Re: If the group membership was on the rules, we are already doing something similar with rule phases here: https://github.com/sqlfluff/sqlfluff/blob/main/src/sqlfluff/core/linter/linter.py#L514L516, i.e. looping over the rules, checking a property, deciding what to run.

@barrywhart
Copy link
Member

I don't see any doc changes in the PR -- were you planning to do that once there's agreement on the general approach?

@pwildenhain
Copy link
Member Author

I don't see any doc changes in the PR -- were you planning to do that once there's agreement on the general approach?

Correct -- it seems like you agree with @tunetheweb that this should be something that lives with the rules?

@barrywhart
Copy link
Member

I don't have a strong opinion where it should live, but wanted to share a note about how it could work.

I may have some thoughts to share later about how we could enforce assigning a new rule somewhere, even if it's in config.

@barrywhart
Copy link
Member

The thought I mentioned earlier: If we want to be sure that we consider each new rule's membership in some group, we could add a startup check that every rule belongs to at least (or exactly one) group. Then define groups "core", "noncore", and a few others if we want.

@tunetheweb
Copy link
Member

Might be better to have an all group that rules are all automatically in? And maybe that's the default?

@pwildenhain
Copy link
Member Author

Having an all group makes sense to me.

Do we all agree that we want users to use the feature like this:

$ sqlfluff my_file.sql --rules core

Or is there a different vision we have for how rule groupings will be used by the user?

@tunetheweb
Copy link
Member

That method of invoking it LGTM. Presume it would also work from .sqlfluff in the rules param?

@barrywhart
Copy link
Member

For a moment, I was worried that using --rules for both groups and individual rules could cause problems or confusion, but TBH I like it. People shouldn't have groups whose names start with L.

We should spend a little time thinking, documenting, and writing tests for what happens if someone mixes rules and groups in various ways, e.g.:

--exclude-rules L003 --rules core

Which wins? One possible answer: Always process --rules first, then --exclude. (Note that we won't know the order they appear on the command line, at least using click.)

@pwildenhain
Copy link
Member Author

Always process --rules first, then --exclude.

Agree that this makes the most sense, and we should make weird test cases that will mimic what users will most certainly do in the real world

@barrywhart
Copy link
Member

Agree that this makes the most sense, and we should make weird test cases that will mimic what users will most certainly do in the real world

Exactly. I don't necessarily care that they do it, but then they'll ask us to explain why it's not working, and that's hard work. 🤣

@pwildenhain pwildenhain mentioned this pull request Apr 22, 2022
1 task
docs/source/rules.rst Outdated Show resolved Hide resolved
@barrywhart
Copy link
Member

@pwildenhain: Where are we with this PR? Would be good to get it merged!

Above, you mentioned you were going to make a docs change. IIRC, overall the PR was looking pretty good to me.

@tunetheweb

@pwildenhain
Copy link
Member Author

Sure thing -- I can push the necessary changes tomorrow if we have agreement

docs/source/rules.rst Outdated Show resolved Hide resolved
docs/source/rules.rst Outdated Show resolved Hide resolved
Comment on lines +1107 to +1112
if r in valid_groups:
rules_in_group = [
rule
for rule, rule_dict in self._register.items()
if r in rule_dict["groups"]
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a question for a future PR: Does this support nested rule groups, e.g., maybe later we want to define the core group as the combination of several groups, e.g. whitespace, "making things explicit", "foobar". In that case, expansion of groups would need to be recursive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see that as a nice DRY alternative, rather than manually specifying the same rule in different groups over and over again. I agree though that we should leave it as a consideration for a future PR -- who knows, maybe no one will care about rule groups 😂

Copy link
Member

@barrywhart barrywhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small suggestions and questions. Overall LGTM.

Copy link
Member

@tunetheweb tunetheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small number of questions/suggestions.

docs/source/rules.rst Outdated Show resolved Hide resolved
docs/source/rules.rst Outdated Show resolved Hide resolved
Comment on lines +44 to 45
groups = ("all", "core")
config_keywords = ["tab_space_size"]
Copy link
Member

@tunetheweb tunetheweb May 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is groups a Tuple, and config_keywords a List?

Should groups be a List?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made groups a tuple because I wanted to make sure we couldn't modify it after the rule is instantiated.
If we think that's no a big concern, I can turn it into a list. If we do think it's a big concern, then we should turn config_keywords into a Tuple as well for consistency

pwildenhain and others added 3 commits May 3, 2022 12:43
Co-authored-by: Barry Pollard <barry_pollard@hotmail.com>
Co-authored-by: Barry Hart <barrywhart@yahoo.com>
Copy link
Member

@tunetheweb tunetheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but made one further suggestion to the wording.

docs/source/rules.rst Outdated Show resolved Hide resolved
pwildenhain and others added 2 commits May 3, 2022 19:53
Co-authored-by: Barry Pollard <barry_pollard@hotmail.com>
@pwildenhain pwildenhain merged commit 3e0d615 into main May 4, 2022
@pwildenhain pwildenhain deleted the add-core-rules-group branch May 4, 2022 01:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Group or tag the core rules to make bulk enabling/disabling of rules a possibility
5 participants