Allow rules to be skipped #66

Merged

diorge
Contributor

@diorge diorge commented Jun 27, 2024

Solves #48 (partially?)

Introduce ModelFilter, which allows Rule objects to be skipped by filtering on a Model. Filters are added in the same way as rules and can be configured in pyproject.toml.

Rules can now return an empty value, SkipRule(). When they do, the rule is not computed at all by the Scorer.
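
For illustration, here is a minimal, self-contained sketch of the filtering idea described above. `Model`, `ModelFilter`, `Rule`, `OnlySchemaX`, and `score` are simplified, hypothetical stand-ins written for this example and are not dbt-score's actual classes. The only names taken from this PR are `ModelFilter`, its `evaluate` method, and `should_evaluate`.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model."""
    name: str
    schema: str
    description: str = ""


class ModelFilter:
    """Decides whether a rule applies to a given model."""
    description = ""

    def evaluate(self, model: Model) -> bool:
        raise NotImplementedError


class OnlySchemaX(ModelFilter):
    description = "Applies a rule only to models in schema X."

    def evaluate(self, model: Model) -> bool:
        return model.schema.lower() == "x"


@dataclass
class Rule:
    """Hypothetical rule: a check function plus optional filters."""
    check: Callable[[Model], Optional[str]]  # returns a violation message or None
    model_filters: list = field(default_factory=list)

    def should_evaluate(self, model: Model) -> bool:
        # The rule applies only if every attached filter accepts the model.
        return all(f.evaluate(model) for f in self.model_filters)


def has_description(model: Model) -> Optional[str]:
    return None if model.description else "Model lacks a description."


def score(models, rules):
    """Evaluate each rule only on the models its filters let through."""
    results = {}
    for model in models:
        results[model.name] = {}
        for rule in rules:
            if rule.should_evaluate(model):
                results[model.name][rule.check.__name__] = rule.check(model)
    return results


rule = Rule(check=has_description, model_filters=[OnlySchemaX()])
models = [Model("orders", "X"), Model("staging_orders", "Y")]
results = score(models, [rule])
```

In this sketch the model in schema Y has no entry for the rule at all, so it is neither a pass nor a violation for that model.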
Contributor

@matthieucan matthieucan left a comment

Thank you for your contribution! 🙏

Left a comment for discussion on the overall design of the feature.

docs/create_rules.md (outdated, resolved)
@diorge
Contributor Author

diorge commented Jul 11, 2024

I'm currently working on updating this with the Skip feature as well. Just noting here that parametrized skips work well enough as long as they are used programmatically and not through the config file interface. I thought about having generic skips such as "skip_schema(schema)" and "keep_schema(schema)", but that would not work nicely with the config file.

On that note, Skip is not a great name because inverting it is also a valid use case. Any other name recommendation? RuleFilter?

@jochemvandooren
Contributor

> I'm currently working on updating this with the Skip feature as well. Just noting here that parametrized skips work well enough as long as they are used programmatically and not through the config file interface. I thought about having generic skips such as "skip_schema(schema)" and "keep_schema(schema)", but that would not work nicely with the config file.
>
> On that note, Skip is not a great name because inverting it is also a valid use case. Any other name recommendation? RuleFilter?

Great to hear @diorge! Yes I think it will be hard to configure these properly with the config files, let's keep it simple (for now)!

Could you maybe elaborate on "inverting it"? That's not entirely clear to me. Anyway: RuleFilter is also a good name to be honest, or maybe ModelFilter, as it's actually filtering out some models

@diorge
Contributor Author

diorge commented Jul 12, 2024

> Could you maybe elaborate on "inverting it"?

Both "apply only to schema X" and "apply everywhere except schema X" are valid filters. One skips X, and the other is the inverse operation.

Anyway, yeah, ModelFilter is a much better name for the feature, thanks for the input.
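
To make the "inverse" point concrete, here is a small sketch that treats filters as plain predicates, used programmatically rather than through a config file. `Model`, `only_schema`, and `invert` are hypothetical helpers invented for this example. They are not part of dbt-score.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model."""
    name: str
    schema: str


FilterFn = Callable[[Model], bool]


def only_schema(schema: str) -> FilterFn:
    """Build a parametrized filter that keeps only models in `schema`."""
    def _filter(model: Model) -> bool:
        return model.schema.lower() == schema.lower()
    return _filter


def invert(model_filter: FilterFn) -> FilterFn:
    """Turn 'apply only to X' into 'apply everywhere except X'."""
    def _inverted(model: Model) -> bool:
        return not model_filter(model)
    return _inverted


only_x = only_schema("x")
skip_x = invert(only_x)

in_x = Model("orders", "X")
in_y = Model("staging_orders", "y")
```

Both directions come from the same predicate, which is why a name describing filtering fits better than one describing skipping.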

Contributor

@jochemvandooren jochemvandooren left a comment

Amazing work 🚀 It works super well and code looks good. I have left a couple of suggestions and questions. Will take another look later this week!

src/dbt_score/model_filter.py (resolved)
src/dbt_score/model_filter.py (outdated, resolved)
src/dbt_score/rule.py (outdated, resolved)
docs/create_rules.md (outdated, resolved)
@diorge
Contributor Author

diorge commented Jul 30, 2024

My bad for not having installed the pre-commit hooks; the linter should be passing now.

@druzhinin-kirill
Contributor

Hey @diorge !

Thanks for this work, a really great feature 💪

I see there are now two mechanisms to disable evaluation for a model-rule pair. Is there any practical difference between them that justifies keeping both? To me, it feels like ModelFilter provides all the functionality needed to achieve skipping. It also keeps rule logic abstracted and seems more natural (verify whether the rule should be applied before applying it). Do you think the SkipRule() mechanism needs to be kept?

@diorge
Contributor Author

diorge commented Jul 31, 2024

@druzhinin-kirill this has been brought up in a conversation marked as resolved. SkipRule could be removed, but it makes some use cases easier to deal with. I'd argue that for most use cases, a simple return SkipRule() is simpler than creating a ModelFilter, and filters won't be reused much.

I don't feel strongly about keeping it, but I do think it's a nice addition that does not bloat the lib too much.
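
As a rough illustration of the two mechanisms being compared, the sketch below expresses the same "ignore schema Y" intent both ways. Everything here is a hypothetical stand-in written for this example: `SkipRule` is modelled as a bare sentinel class, and the `evaluate_with_*` helpers stand in for whatever the Scorer would do. None of it is dbt-score's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model."""
    name: str
    schema: str
    description: str = ""


@dataclass
class RuleViolation:
    message: str


class SkipRule:
    """Sentinel meaning 'this rule does not apply to this model'."""


# Style 1: the rule itself decides to skip.
def has_description_with_skip(model: Model) -> Union[RuleViolation, SkipRule, None]:
    if model.schema.lower() == "y":
        return SkipRule()
    if not model.description:
        return RuleViolation("Model lacks a description.")
    return None


# Style 2: the rule holds only the check; a separate filter decides applicability.
def skip_schema_y(model: Model) -> bool:
    return model.schema.lower() != "y"


def has_description(model: Model) -> Optional[RuleViolation]:
    if not model.description:
        return RuleViolation("Model lacks a description.")
    return None


def evaluate_with_skip(model: Model):
    result = has_description_with_skip(model)
    return "skipped" if isinstance(result, SkipRule) else result


def evaluate_with_filter(model: Model):
    if not skip_schema_y(model):
        return "skipped"
    return has_description(model)


models = [Model("a", "x"), Model("b", "Y"), Model("c", "x", "Documented.")]
outcomes_skip = [evaluate_with_skip(m) for m in models]
outcomes_filter = [evaluate_with_filter(m) for m in models]
```

Both styles give the same outcome per model. The difference is where the applicability logic lives: inside the rule body, or next to it in a separate filter.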

@druzhinin-kirill
Contributor

> @druzhinin-kirill this has been brought up in a conversation marked as resolved. SkipRule could be removed, but it makes some use cases easier to deal with. I'd argue that for most use cases, a simple return SkipRule() is simpler than creating a ModelFilter, and filters won't be reused much.
>
> I don't feel strongly about keeping it, but I do think it's a nice addition that does not bloat the lib too much.

I see, got your points 👍
However, IMO the rule logic should be separated from the filter. Rules should be left untouched if filters change. ModelFilter seems pretty powerful and clear from a design perspective, so following the Zen of Python:

> There should be one-- and preferably only one --obvious way to do it.

I would vote to keep filters only. They might have a little overhead compared to SkipRule in specific cases indeed, but still straightforward enough to understand and implement 💡

WDYT @jochemvandooren @matthieucan ?

@diorge
Contributor Author

diorge commented Aug 2, 2024

I'd compare the relationship of SkipRule and ModelFilter to that of tuple and list: they do the same thing in the end, but the use cases are different enough.

Anyway, if it is decided to remove the feature, I'd appreciate it if someone else could pick up this PR. I won't be able to work on it in the coming weeks.

@matthieucan
Contributor

> > @druzhinin-kirill this has been brought up in a conversation marked as resolved. SkipRule could be removed, but it makes some use cases easier to deal with. I'd argue that for most use cases, a simple return SkipRule() is simpler than creating a ModelFilter, and filters won't be reused much.
> > I don't feel strongly about keeping it, but I do think it's a nice addition that does not bloat the lib too much.
>
> I see, got your points 👍 However, IMO the rule logic should be separated from the filter. Rules should be left untouched if filters change. ModelFilter seems pretty powerful and clear from a design perspective, so following
>
> > There should be one-- and preferably only one --obvious way to do it.
>
> I would vote to keep filters only. They might have a little overhead compared to SkipRule in specific cases indeed, but still straightforward enough to understand and implement 💡
>
> WDYT @jochemvandooren @matthieucan ?

I agree with that sentiment. Separating rules (pure logic) and configuration (to whom does that logic apply, with which parameters) will keep things simple, and avoid some edge-cases that do not warrant the extra complexity IMO.

Comment on lines +129 to +132
class SkipSchemaY(ModelFilter):
    description = "Applies a rule to every schema but Y."
    def evaluate(self, model: Model) -> bool:
        return model.schema.lower() != 'y'
Contributor

Do we know use-cases where the class version (as opposed to the decorator version) would be necessary?

Contributor

Not yet, but no harm in having it as it does not really change implementation and it follows the same design of @rule.
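
To show why the two forms can share one implementation, here is a sketch of a decorator that wraps a plain function into a `ModelFilter` subclass. The `model_filter` decorator below is a simplified assumption made up for this example and may differ from how the PR actually implements it. `Model` and the `ModelFilter` base class are stand-ins as well.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model."""
    name: str
    schema: str


class ModelFilter:
    """Stand-in base class: subclasses implement evaluate()."""
    description: str = ""

    def evaluate(self, model: Model) -> bool:
        raise NotImplementedError


def model_filter(func: Callable[[Model], bool]) -> type:
    """Hypothetical decorator: build a ModelFilter subclass from a function."""
    def evaluate(self, model: Model) -> bool:
        return func(model)

    return type(
        func.__name__,
        (ModelFilter,),
        {"description": func.__doc__ or "", "evaluate": evaluate},
    )


# Class version, as in the docs snippet under review.
class SkipSchemaY(ModelFilter):
    description = "Applies a rule to every schema but Y."

    def evaluate(self, model: Model) -> bool:
        return model.schema.lower() != "y"


# Decorator version: the docstring becomes the description.
@model_filter
def skip_schema_y(model: Model) -> bool:
    """Applies a rule to every schema but Y."""
    return model.schema.lower() != "y"


in_x = Model("orders", "x")
in_y = Model("staging_orders", "Y")
```

Under this assumption the decorator is only a shorthand for the class, so supporting both adds little implementation cost.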

@jochemvandooren
Contributor

@diorge I made some changes to the PR! Mainly got rid of the SkipRule, everything else is the same. @druzhinin-kirill @matthieucan Please have another look so we can get this one merged 🙌

@@ -57,11 +57,11 @@ def evaluate(self) -> None:
             self.results[model] = {}
             for rule in rules:
                 try:
-                    result: RuleViolation | None = rule.evaluate(model, **rule.config)
+                    if rule.should_evaluate(model):  # Consider model filter(s).
Contributor

IIUC, if a filter skips a model, that is currently not visible to the user. With complex filters, it might be important to flag skips so users can check that the filter behaves as expected.

It might be interesting to either:

  • Emit a warning if a model is skipped (maybe too much noise).
  • Keep track of skipped models in a skipped list attribute. In the end, the Formatter can use this to print at least the number of skipped models.

My proposal is to add self.skips and track (Model, Filter) pairs.

I am fine with filing this as a separate feature request.
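
A rough sketch of what the `self.skips` proposal could look like, under stated assumptions: `Scorer`, `Rule`, `Model`, and `rejecting_filter` below are simplified stand-ins invented for this example, and the stand-in rule always passes. This is not the merged implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model (hashable, so usable as a dict key)."""
    name: str
    schema: str


class SkipSchemaY:
    """Stand-in filter: accepts every schema but Y."""

    def evaluate(self, model: Model) -> bool:
        return model.schema.lower() != "y"


@dataclass
class Rule:
    name: str
    model_filters: list = field(default_factory=list)

    def rejecting_filter(self, model: Model):
        """Return the first filter that rejects the model, or None."""
        for model_filter in self.model_filters:
            if not model_filter.evaluate(model):
                return model_filter
        return None

    def evaluate(self, model: Model):
        return None  # Stand-in: the rule always passes.


class Scorer:
    def __init__(self, models, rules):
        self.models = models
        self.rules = rules
        self.results = {}
        self.skips = []  # (model, rule name, filter class name) triples

    def evaluate(self) -> None:
        for model in self.models:
            self.results[model] = {}
            for rule in self.rules:
                rejected_by = rule.rejecting_filter(model)
                if rejected_by is not None:
                    # Record the skip instead of silently dropping the pair.
                    self.skips.append((model, rule.name, type(rejected_by).__name__))
                    continue
                self.results[model][rule.name] = rule.evaluate(model)


scorer = Scorer(
    models=[Model("orders", "x"), Model("tmp_orders", "Y")],
    rules=[Rule("has_description", [SkipSchemaY()])],
)
scorer.evaluate()
```

A formatter could then report `len(scorer.skips)`, or list the skipped pairs, without changing how results are scored.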

Contributor

Yes let's keep it for another PR, then we can already release this one 🚀

Contributor

@druzhinin-kirill druzhinin-kirill left a comment

Great work @diorge! Thanks a lot for bringing this to life 🚀

@jochemvandooren jochemvandooren merged commit af26d7b into PicnicSupermarket:master Aug 15, 2024
3 checks passed
@jochemvandooren
Contributor

@diorge Thanks a lot for your contribution! 🙌
