Fix the REUSE.toml performance regression (+ do a whole lot of refactoring) #1047

Merged 10 commits on Sep 26, 2024

Commits on Sep 18, 2024

  1. Remove bidirectional aggregation between VCSStrategy and Project

    Hooray! This had always been bad design.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    d9a2b51
  2. Refactor iter_files out of Project

    This is kind of annoying, but it is necessary for some work I want to do
    in global_licensing. Fortunately this was surprisingly easy; iter_files
    is the first function ever written for REUSE, so I expected that it
    would be more tangled up.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    9036904
  3. Factor out iter_files into its own module covered_files

    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    3b25418
  4. Create fixture subproject_repository

    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    a8a1f28
  5. Fix performance regression when searching for REUSE.toml

    Previously, the glob **/REUSE.toml would search _all_ directories,
    including big directories containing build artefacts that are otherwise
    ignored by VCS. This commit uses the same logic to find REUSE.toml as
    any other file is found. It's not super rapid, but it does a heap more
    filtering.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    f087082
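
The difference described in commit 5 can be sketched as follows. This is a minimal illustration, not the code from this pull request: `find_reuse_tomls_glob`, `find_reuse_tomls_pruned`, and the `is_ignored` callback are hypothetical names, and the callback stands in for whatever the VCS strategy reports as ignored.

```python
import os
from pathlib import Path
from typing import Callable, Iterator


def find_reuse_tomls_glob(root: Path) -> Iterator[Path]:
    """Naive search: descends into every directory, ignored or not."""
    yield from root.glob("**/REUSE.toml")


def find_reuse_tomls_pruned(
    root: Path, is_ignored: Callable[[Path], bool]
) -> Iterator[Path]:
    """Walk the tree, pruning ignored directories before descending.

    `is_ignored` is a hypothetical stand-in for a VCS ignore check.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        # Assigning to dirnames[:] in place stops os.walk from entering
        # the removed directories at all.
        dirnames[:] = [d for d in dirnames if not is_ignored(current / d)]
        candidate = current / "REUSE.toml"
        if "REUSE.toml" in filenames and not is_ignored(candidate):
            yield candidate
```

With `is_ignored=lambda p: p.name == "build"`, the pruned walk never enters a `build` directory, whereas the glob visits every file beneath it before any filtering could happen.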
  6. Convert Project into attrs class

    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    5651081
  7. Get errant source from exception instead of find_global_licensing

    In _main, find_global_licensing was called to find the file that
    contained some parsing error. With multiple REUSE.toml files, this
    could point at the wrong file.
    
    Instead of needlessly calling this function, the errant file is now an
    attribute on the exception.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    8280798
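
The idea in commit 7, letting the exception carry the path of the file that failed to parse, might look roughly like this. The class `TomlParseError`, its `source` attribute name, and the toy `parse_reuse_toml` check are assumptions made for illustration; they are not taken from the REUSE code base.

```python
from typing import Dict, Optional


class TomlParseError(Exception):
    """Hypothetical parsing error that remembers which file it came from."""

    def __init__(self, message: str, source: Optional[str] = None):
        super().__init__(message)
        self.source = source


def parse_reuse_toml(path: str, text: str) -> dict:
    """Toy stand-in for a parser: rejects text without a version key."""
    if "version" not in text:
        # The raiser knows exactly which file is at fault, so it records it.
        raise TomlParseError("missing 'version' key", source=path)
    return {"version": 1}


def describe_failure(files: Dict[str, str]) -> Optional[str]:
    """Return a message for the first file that fails to parse, if any."""
    for path, text in files.items():
        try:
            parse_reuse_toml(path, text)
        except TomlParseError as error:
            # No second lookup is needed to learn which file was at fault.
            return f"{error.source}: {error}"
    return None
```

Because the path travels with the exception, the caller does not have to search for the file again, and so cannot blame a different REUSE.toml than the one that actually failed.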
  8. Reduce calls to iter_files

    Previously this function would be called three times on a lint:
    
    - once by NestedReuseTOML.find_reuse_tomls in Project.find_global_licensing
    - once by NestedReuseTOML.find_reuse_tomls in NestedReuseTOML.from_file
    - once to lint all the files
    
    I now no longer use NestedReuseTOML.from_file. It's still not ideal to
    go over all files twice, but it's better than thrice.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    a8be9db
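
Commit 8 notes that two traversals are still not ideal. One possible way to get down to a single traversal, which this pull request does not claim to implement, is to consume the listing once and derive both results from it. `CountingWalker` and `split_listing` below are hypothetical helpers, and paths are plain `/`-separated strings to keep the sketch small.

```python
from typing import Iterable, Iterator, List, Tuple


class CountingWalker:
    """Hypothetical stand-in for a file iterator that counts traversals."""

    def __init__(self, paths: Iterable[str]):
        self.paths = list(paths)
        self.calls = 0

    def __call__(self) -> Iterator[str]:
        self.calls += 1
        return iter(self.paths)


def split_listing(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Consume one traversal, returning (REUSE.toml paths, all paths)."""
    reuse_tomls: List[str] = []
    everything: List[str] = []
    for path in paths:
        everything.append(path)
        # Compare only the final path component against the file name.
        if path.rsplit("/", 1)[-1] == "REUSE.toml":
            reuse_tomls.append(path)
    return reuse_tomls, everything
```

The trade-off is memory: the full listing is held in a list rather than streamed, which may or may not be acceptable for very large repositories.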
  9. Add change log entries

    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    455fad9
  10. Small cleanup

    This cleanup is the result of a rebase. It would be more effort to
    incorporate these fixes into their original commits.
    
    Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
    carmenbianca committed Sep 18, 2024
    9a237f4