-
Notifications
You must be signed in to change notification settings - Fork 541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(gazelle): Add "python_test_file_pattern" directive #1819
Conversation
Add glob to generate.go Add glob dep
gazelle/python/generate.go
Outdated
func makeCompiledGlobs(globs []string) []glob.Glob { | ||
compiledGlobs := []glob.Glob{} | ||
for _, value := range globs { | ||
compiledGlob := glob.MustCompile(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might prefer a more helpful error message if the glob fails. Could we add an integration test? Maybe it is good enough already? Right now this code would panic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Failure to compile: error message added, test added
I had a similar thought last night: we also need to handle the case where the user has # gazelle:python_test_file_pattern
(with no value). Right now no files will be mapped to py_test
if there is no pattern. If we want to keep that behavior, I'll need to update the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also added a test that subpackages can override the value.
gazelle/python/generate.go
Outdated
func makeCompiledGlobs(globs []string) []glob.Glob { | ||
compiledGlobs := []glob.Glob{} | ||
for _, value := range globs { | ||
compiledGlob := glob.MustCompile(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Failure to compile: error message added, test added
I had a similar thought last night: we also need to handle the case where the user has # gazelle:python_test_file_pattern
(with no value). Right now no files will be mapped to py_test
if there is no pattern. If we want to keep that behavior, I'll need to update the docs.
gazelle/python/configure.go
Outdated
case pythonconfig.TestFilePattern: | ||
globStrings := strings.Split(strings.TrimSpace(d.Value), ",") | ||
config.SetTestFilePattern(globStrings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #1787 I added an option for resetting the directive to the rules_python default. Do we want to support the same for this directive?
// N.B.: pseudocode; haven't tested yet
case pythonconfig.TestFilePattern:
directiveArg := strings.TrimSpace(d.Value)
if directiveArg == "DEFAULT" {
directiveArg = pythonconfig.DefaultTestFilePatternString
}
globStrings := strings.Split(directiveArg, ",")
config.SetTestFilePattern(globStrings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency I would agree, but at the same time my brain is also telling me that we could have:
gazelle:python_test_file_pattern
to reset the config to defaults.
The empty pattern yielding no test targets is not super intuitive, IMHO.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The empty pattern yielding no test targets is not super intuitive, IMHO.
Agreed.
That said, I would argue that the empty pattern resetting to defaults is also not super intuitive. Some readers will see that and think "no test targets" while others will think "resets to default".
I like explicit things 😁. There's no ambiguity and it's easier to grok when returning to something N years down the road.
If we wanted to support yielding no test targets1 it should be something like gazelle:python_test_file_pattern NONE
or have a different directive altogether (gazelle:python_no_tests
or something).
Footnotes
-
I can't think of a reason why someone would want that, but that doesn't mean there aren't reasons ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passerby opinion: yielding no test targets can (and imho should) be achieved by ignoring files with gazelle:python_ignore_files
. That said, setting pattern to NONE should not be allowed (also, it's a bit confusing to set pattern to NONE
, some might thing there is NONE file or smths). gazelle:python_test_file_pattern
(blank) should also fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yielding no test targets can (and imho should) be achieved by ignoring files with
gazelle:python_ignore_files
I disagree. There are definitely cases where you want to yield no test targets but do want to yield lib and binary targets.
project/
+ BUILD.bazel
+ MODULE.bazel
+ src/myfoo/
+ BUILD.bazel
+ foo.py
+ foo_test.py
+ subdir/
+ BUILD.bazel
+ run_test.py # py_binary
+ example_test.py # py_library
Using gazelle:python_ignore_files
would mean that no targets are generated in subdir
above, when really we'd want only py_library and py_binary targets made.
There's also the case where test files are outside of the package, as described by PyPA Tutorial. In that case, you can be sure that all targets in src/mypackage
below are py_library
and py_binary
.
project/
+ BUILD.bazel
+ MODULE.bazel
+ src/mypackage/ # guaranteed to have no py_test targets
+ BUILD.bazel
+ run_electrical_test.py # py_binary
+ electrical_test.py # py_library
+ tests/ # any py_* targets
+ BUILD.bazel
+ foo_test.py # py_test
+ bar_test.py # py_test
+ test_utils.py # py_library
+ run_all_test.py # py_binary
also, it's a bit confusing to set pattern to
NONE
, some might thing there isNONE
file
A valid point. Perhaps using something that is unlikely to be a filename would be better? Maybe one of these or similar?
# gazelle:python_test_file_pattern %NONE%
# gazelle:python_test_file_pattern !!NO_FILES
# gazelle:python_test_file_pattern ~!~NO_PATTERN~!~
One issue is that linux filenames can be basically anything haha so it's not feasible to pick a string that can't be a filename.
If we do go with a more "special" string, I'd probably want to update the python_default_visibility
directive special strings to match.
gazelle:python_test_file_pattern
(blank) should also fail.
+1 to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with arguments around gazelle:python_ignore_files
.
How about...not having any magic names at all? I mean, by the time someone wants to reset the test file patterns, he/she already learned what default values are (while customising them in the first place), and if resetting the patterns is needed, it can be done with # gazelle:python_test_file_pattern *_test,test_*
.
Yet another idea is to have dedicated no-args directive for resetting to default, something like:
# gazelle:default_python_test_file_pattern
. However, it only makes sense if applied conventionally everywhere in the plugin imho. Otherwise, I'd vote for not having any special values and let users "reset" by explicitly setting python_test_file_pattern
to default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with no magic names at all and a "reset" by explicitly setting the value to defaults. gazelle:python_test_file_pattern *_test.py,test_*.py
That said, I'd still like to mention the two (minor) issues with no magic values:
- [can ignore] There's no way to defer to
rules_python
's defaults. For example, ifrules_python
changes the default from*_test.py,test_*.py
to something else, users would have to update their configs manually if they wanted to always use therules_python
default.- TBH this is probably a good thing because it results in less unexpected behavior.
- We're back to the
# gazelle:python_test_file_pattern
(no value) for preventing any files from mapping topy_test
. As mentioned previously:- this is not very intuitive.
- we said that "no value" should fail.
For (2), perhaps just making sure that's documented and tested is sufficient? Unless we implement another directive gazelle:python_no_test_files
or something (which I'm against - it seems to only complicate things).
have dedicated no-args directive for resetting to default ... it only makes sense if applied conventionally everywhere in the plugin
+1. I'm slightly unsure about having directives to reset to default (as you said, people can just set the default manually), but if we add them they definitely should be applied everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, the plugin currently has no switches for enabling/disabling parts of the generation process. Setting pattern to none sounds to me like an abuse of the directive to disable tests and I'd rather have an explicit python_no_test_files
so that there is a clear indication no tests are being processed for a) confusing fewer humans, and b) plugin itself, so that it can potentially deviate its processing depending on that fact (e.g. print debug/warning message etc).
This should be, however, out of context for this PR. I'd say, we should just disallow no-arg python_test_file_pattern
, provide no magic values and wrap-up the PR.
Have I already mentioned this is great work and nice improvement 🙂? Thank you! Looking forward to it. I have exactly the place where I will use it already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting pattern to none sounds to me like an abuse of the directive
I can see that and it's a good point.
I'd rather have an explicit
python_no_test_files
so that there is a clear indication no tests are being processed ... we should just disallow no-argpython_test_file_pattern
, provide no magic values and wrap-up the PR.
SGTM. I'd like to continue the discussion, so I've opened #1826.
this is great work and nice improvement
Thanks! I've been really enjoying being able to add more features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this SGTM, thanks for the discussion. I've been enjoying the read! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the test case for "no value causes error" and updated the docs for this directive. I think we're ready for re-review.
gazelle/go.mod
Outdated
@@ -9,6 +9,7 @@ require ( | |||
github.com/bmatcuk/doublestar v1.3.4 | |||
github.com/emirpasic/gods v1.18.1 | |||
github.com/ghodss/yaml v1.0.0 | |||
github.com/gobwas/glob v0.2.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be the most popular glob
package, though I don't see any github activity in 6 years. Other options were:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about doublestar.Match
? It is used by bazel-gazelle
and is already included in this go.mod:9
too.
import (
"fmt"
"github.com/bmatcuk/doublestar/v4"
)
func matchesAnyGlob(s string, globs []string) bool {
for _, g := range globs {
if ok, _ := doublestar.Match(g, s); ok {
return true
}
}
return false
}
func main() {
defaultGlob := []string{"*_test.py", "test_*.py"}
fmt.Println(matchesAnyGlob("test_my_target.py", defaultGlob))
fmt.Println(matchesAnyGlob("my_target_test.py", defaultGlob))
customGlob := []string{"*_pytest.py", "pytest_*.py"}
fmt.Println(matchesAnyGlob("pytest_upload.py", customGlob))
fmt.Println(matchesAnyGlob("download_pytest.py", customGlob))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for not adding another dep! Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed gobwas/glob
and migrated/upgraded to doublestar/v4.
gazelle/go.mod
Outdated
@@ -9,6 +9,7 @@ require ( | |||
github.com/bmatcuk/doublestar v1.3.4 | |||
github.com/emirpasic/gods v1.18.1 | |||
github.com/ghodss/yaml v1.0.0 | |||
github.com/gobwas/glob v0.2.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for not adding another dep! Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the great work!
gazelle/python/generate.go
Outdated
@@ -54,6 +54,19 @@ func GetActualKindName(kind string, args language.GenerateArgs) string { | |||
return kind | |||
} | |||
|
|||
func matchesAnyGlob(s string, globs []string) bool { | |||
for _, g := range globs { | |||
ok, err := doublestar.Match(g, s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am slightly concerned that we are compiling the glob here. Would it be possible to compile it elsewhere? When parsing the config would be another option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is used in larger repos I am worried that we perform a linear number of glob compilations as opposed to doing it once when loading the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We were doing that before1 with gobwas/glob
. From what I can tell, doublestar
doesn't have any kind of "compiled pattern" object that I can pass around instead.
We can validate patterns during config load, but we wouldn't be able to tell doublestart.Match
to not validate patterns and thus save some cycles. (Unless there's a way to access matchWithSeparator
? My understanding of go is that lowercase names aren't exported and are thus not accessible.)
I think we have two (three?) options:
- Go back to
gobwas/glob
and precompile patterns - Accept the perf hit.
- Some go ability that I'm not aware of?
Either way we should validate/compile patterns when loading the config so that we fail earlier. I've made that update. It means we validate one more time, but we're already validating (N_files * Patterns) times so (N*P)+1 is no big deal
Footnotes
-
Well, almost. It wasn't done when loading config, but at least it was outside of the
for _, f := range args.RegularFiles
loop. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the match's code there is nothing to "compile" in glob really, like we typically do with regex. This is common for globs to be matched directly as they are simple, do not need complicated parsing, creating underlying tree used for matching etc.
This [rules_python] repo already does doublestart.Match
in a loop here and so does bazel-gazelle here and it hasn't been an issue so far.
Validating earlier is a good improvement.
Unless there's a way to access matchWithSeparator? My understanding of go is that lowercase names aren't exported and are thus not accessible.
That is correct for go. I created a ticket bmatcuk/doublestar#92
…well, in addition to) during generate.GenerateRules
gazelle/python/generate.go
Outdated
@@ -54,6 +54,19 @@ func GetActualKindName(kind string, args language.GenerateArgs) string { | |||
return kind | |||
} | |||
|
|||
func matchesAnyGlob(s string, globs []string) bool { | |||
for _, g := range globs { | |||
ok, err := doublestar.Match(g, s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We were doing that before1 with gobwas/glob
. From what I can tell, doublestar
doesn't have any kind of "compiled pattern" object that I can pass around instead.
We can validate patterns during config load, but we wouldn't be able to tell doublestart.Match
to not validate patterns and thus save some cycles. (Unless there's a way to access matchWithSeparator
? My understanding of go is that lowercase names aren't exported and are thus not accessible.)
I think we have two (three?) options:
- Go back to
gobwas/glob
and precompile patterns - Accept the perf hit.
- Some go ability that I'm not aware of?
Either way we should validate/compile patterns when loading the config so that we fail earlier. I've made that update. It means we validate one more time, but we're already validating (N_files * Patterns) times so (N*P)+1 is no big deal
Footnotes
-
Well, almost. It wasn't done when loading config, but at least it was outside of the
for _, f := range args.RegularFiles
loop. ↩
gazelle/pythonconfig/pythonconfig.go
Outdated
func (c *Config) SetTestFilePattern(patterns []string) { | ||
for _, p := range patterns { | ||
if !doublestar.ValidatePattern(p) { | ||
log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
globstar
errors with path.ErrBadPAttern
but also states "users should not count on the returned error, doublestar.ErrBadPattern
, being equal to path.ErrBadPattern
" so I've opted to inject the string manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually... I could just do:
// We only care if the pattern is valid. Use `Match` instead of `ValidatePattern` (which
// just returns `bool`) so that we always use the error message provided by `doublestar`.
if _, err := doublestar.Match(p, ""); err != nil {
log.Fatalf("ERROR: Failed to compile glob '%v'. Error: %v\n", p, err)
}
...
Any preference between
(a) using ValidatePattern
and manually injecting the error string as I've already done or
(b) abusing Match
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the word "abusing" in your comment speaks for itself :) Looking at the Match
's code, err.ErrBadPattern
brings no more than just an error flag (no error message, no extra context etc.) so it is effectively does the same as having true/false. I'd keep it as is. Except maybe a nit regarding go error message style (see the comment in configure.go
).
log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p) | |
log.Fatalf("invalid glob pattern '%s'", p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 kept, aside from style nit
gazelle/pythonconfig/pythonconfig.go
Outdated
// Will exit 1 if the pattern is invalid. | ||
func (c *Config) SetTestFilePattern(patterns []string) { | ||
for _, p := range patterns { | ||
if !doublestar.ValidatePattern(p) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this check be better in configure.go
case pythonconfig.TestFilePattern:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say it would be slightly more aligned with existing code to do to in a case
indeed as there are already some invalid value
errors and parsing there while these stay simple setters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 moved.
Ah, thanks for checking. Then I am happy to merge everything! :)
…On 4 April 2024 17:38:05 GMT+09:00, Illia Ovchynnikov ***@***.***> wrote:
@wingsofovnia commented on this pull request.
> @@ -54,6 +54,19 @@ func GetActualKindName(kind string, args language.GenerateArgs) string {
return kind
}
+func matchesAnyGlob(s string, globs []string) bool {
+ for _, g := range globs {
+ ok, err := doublestar.Match(g, s)
Looking at the match's [code](https://github.com/bmatcuk/doublestar/blob/180028ba525d3116cc917ac3179e79428acfd89c/match.go#L74) there is nothing to "compile" in glob really, like we typically do with regex. This repo already does `double start.Match` in the loop [here](https://github.com/bazelbuild/rules_python/blob/b9f39bf09760ff75aedd64ce99c9573c1a97068f/gazelle/python/generate.go#L188) and so does bazel-gazelle [here](https://github.com/bazelbuild/bazel-gazelle/blob/1b331b6d9cb2d981580a8d3e2dfc551185790737/walk/config.go#L145) and it hasn't been an issue so far.
--
Reply to this email directly or view it on GitHub:
#1819 (comment)
You are receiving this because you modified the open/close state.
Message ID: ***@***.***>
|
gazelle/python/configure.go
Outdated
case pythonconfig.TestFilePattern: | ||
value := strings.TrimSpace(d.Value) | ||
if value == "" { | ||
log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there is no need to double severity in the message here as Fatal
will be already in the log. Also, go's error messages are conventionally lowercased since errors can be wrapped/nested (e.g. something bad happened: directive 'python...
). \n
is also unnecessary.
log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.") | |
log.Fatal("directive 'python_test_file_pattern' requires a value.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL, thanks! Updated.
As an FYI, the ERROR:
and \n
were included to maintain consistency with the other log.Fatal
messages:
$ rg 'log.Fatal.?\("'
manifest/generate/generate.go
90: log.Fatalf("ERROR: %v\n", err)
109: log.Fatalf("ERROR: %v\n", err)
pythonconfig/pythonconfig.go
444: log.Fatalf("ERROR: Failed to compile glob '%v'. Error: syntax error in pattern\n", p)
python/lifecycle.go
41: log.Fatalf("failed to write parser zip: %v", err)
47: log.Fatalf("cannot write %q: %v", helperPath, err)
python/generate.go
238: log.Fatalf("ERROR: %v\n", err)
319: log.Fatalf("ERROR: %v\n", err)
353: log.Fatalf("ERROR: %v\n", err)
384: log.Fatalf("ERROR: %v\n", err)
504: log.Fatalf("ERROR: %v\n", err)
python/configure.go
188: log.Fatal("ERROR: Directive 'python_test_file_pattern' requires a value.")
Prior to this PR, 7 of 9 calls included both ERROR:
and \n
.
Add the
python_test_file_pattern
directive. This directive allowsusers to configure what python files get mapped to the
py_test
rule.
The default behavior is unchanged: both
test_*
and*_test.py
files generate
py_test
targets if the directive is unset.The directive supports multiple glob patterns, separated by a comma.
Note: The original code used, effectively,
test_*
for one of thepatterns. This code uses
test_*.py
instead. These are equivalentbecause of the
.py
extension check prior to pattern matching.Fixes #1816.