-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compare each .gitignore found with an appropiate relative path #3338
Conversation
10dc09c
to
e4bae08
Compare
Converting to draft because I'm getting strange results locally |
This is PR is back up again and ready for review 👍 |
src/black/__init__.py
Outdated
@@ -660,14 +662,14 @@ def get_sources( | |||
elif p.is_dir(): | |||
if exclude is None: | |||
exclude = re_compile_maybe_verbose(DEFAULT_EXCLUDES) | |||
gitignore = get_gitignore(root) | |||
root_gitignore = get_gitignore(root) | |||
p_gitignore = get_gitignore(p) | |||
# No need to use p's gitignore if it is identical to root's gitignore | |||
# (i.e. root and p point to the same directory). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are more than two levels? For example, we're formatting a/b/c/
and a/.gitignore/
, a/b/.gitignore
, and a/b/c/.gitignore
all exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Right now, if the command is black a/b/c/
then a/b/.gitignore
is completely skipped while a/.gitignore
and a/b/c/.gitignore
are captured by root_gitignore
and p_gitignore
, respectively. What's the expected behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a separate issue/PR? If its an undefined behavior, then it should be documented properly I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are more than two levels? For example, we're formatting
a/b/c/
anda/.gitignore/
,a/b/.gitignore
, anda/b/c/.gitignore
all exist.
The current behavior seems to be that black a/b/c
only detects a/b/c/.gitignore
and completely ignores other gitignore files. This PR implements the same behavior
src/black/files.py
Outdated
@@ -198,7 +198,7 @@ def gen_python_files( | |||
extend_exclude: Optional[Pattern[str]], | |||
force_exclude: Optional[Pattern[str]], | |||
report: Report, | |||
gitignore: Optional[PathSpec], | |||
gitignore: Optional[Dict[Path, PathSpec]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would suggest renaming this to gitignore_dict
for clarity. Also, no reason to keep this Optional, we can just use an empty dict if there's no gitignores.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gitignore
can be None
if exclude
is not None
. In that case, no gitignore is considered at any level. I though that this is the expected behavior while reading the code. If get_sources
receives a value for exclude
, what's the expected behavior?
About renaming the variable: I agree 👍
50661f4
to
f97cccd
Compare
When a .gitignore file contains the special rule to ignore every subfolder content (`*/*`) and the file is located in a subfolder relative to where the command is executed (root), the rule is incorrectly applied and ignores every file at the same level of the .gitignore file. The reason for this is that the `gitignore` variable accumulates the rules found in each .gitignore while traversing files and directories recursively. This makes sense and, in general, works as expected. The problem is that the gitignore rules are applied using as the relative path from root to target directory as a reference. This is the cause of the bug. The implemented solution keeps track of every .gitignore file found while traversing the targets and the absolute location of each .gitignore file. Then, when matching files to the .gitignore rules, compare each set of rules with the appropiate relative path to the candidate target file. To make this possible, we changed the single `gitignore` object with a dictionary of similar objects, where the corresponding key is the absolute path to the folder that contains that .gitignore file. This required changing the signature of the `get_sources` function. Also, we introduce a `is_ignored` function that compares a file with every set of rules. Finally, some tests required an update to pass the gitignore object in the new format. Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
The test contains three cases: 1) when the .gitignore with the special rule to ignore every subfolder and its contents (*/*) is in the root, 2) when the file is inside a subfolder relative to root (nested), and 3) when the target folder contains the .gitignore and root is a parent folder of the target. In all of these cases, we compare the files that are visible by Black with a known list of paths containing the expected values. Before the fix introduced in the previous commit, these tests failed when the .gitignore file was nested (second case). Now, the test is passed for all cases. Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
Add entry about fixed bug and changes introduced: ignore files by considering the location of each .gitignore file and the relative path of each target Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
src/black/files.py
Outdated
if relative_path is None: | ||
break | ||
if _gitignore is not None and _gitignore.match_file(relative_path): | ||
report.path_ignored(child, "matches the .gitignore file content") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe include the (full) path to the .gitignore file that matched?
These changes are small improvements to improve code readability: rename a variable to a more descriptive name (from `exclude_is_None` to `using_default_exclude`), use a better syntax to include the type annotation for `gitignore` variable (from typing comment to Python-style typing annotation), and replace an if-else block with a single dictionary definition (in this case, we need to compare keys instead of values, meaning that the change works) Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
f257725
to
c09a816
Compare
The function to match a given path with every discovered .gitignore file does not need to be a nested function and can be a top-level function. The arguments did not change, but the naming of local variables was improved for readability. Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
Description
Fix #3315 : Fix incorrectly applied .gitignore rules by considering the .gitignore location and the relative path to the target file. Commit message of the commit that fixed the bug:
When a .gitignore file contains the special rule to ignore every subfolder content (
*/*
) and the file is located in a subfolderrelative to where the command is executed (root), the rule is incorrectly applied and ignores every file at the same level of the .gitignore file.
The reason for this is that the
gitignore
variable accumulates the rules found in each .gitignore while traversing files and directories recursively. This makes sense and, in general, works as expected. The problem is that the gitignore rules are applied using as the root of the target directory as a reference. This is the cause of the bug.The implemented solution keeps track of every .gitignore file found while traversing the targets and the location of each .gitignore. Then, when matching files to the .gitignore rules, compare each set of rules with the appropiate relative path to the candidate target file.
To make this possible, we changed the single
gitignore
object with a dictionary of similar objects, where the corresponding key is the path to the folder that contains that .gitignore file. This required changing the signature of theget_sources
function. Also, we introduce ais_ignored
function that compares a file with every set of rules. Finally, some tests required an update to pass the gitignore object in the new format.Checklist - did you ...
CHANGES.md
if necessary?Add new / update outdated documentation?