Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hidden directories search performance decreased #1734

Closed
sometimesfood opened this issue Mar 22, 2015 · 10 comments
Closed

Hidden directories search performance decreased #1734

sometimesfood opened this issue Mar 22, 2015 · 10 comments

Comments

@sometimesfood
Copy link
Contributor

While commit 5df9ffb seems to have fixed #1656, it seems to have introduced a significant drop in performance.

I am using rubocop commit a1865b5 without a custom .rubocop.yml.

With bundle install --path vendor:

% time bundle exec rubocop    
Inspecting 75 files
[…]
bundle exec rubocop  9.69s user 0.31s system 98% cpu 10.131 total

If I install my gems to .bundle (bundle install --path .bundle), performance decreases sharply:

% time bundle exec rubocop
Inspecting 75 files
[…]
bundle exec rubocop  30.91s user 1.55s system 95% cpu 33.914 total

Sorry to keep bothering you about this, but as rubocop is becoming increasingly popular and many developers are relying on it, I feel that this could affect other projects as well. (My project has a relatively small number of dependencies – for a typical Rails project the effect of this change would probably be even more pronounced.)

@sometimesfood
Copy link
Contributor Author

If I add this to .rubocop.yml, performance is back to normal for both versions:

AllCops:
  Exclude:
    - '.bundle/**/*'
    - 'vendor/**/*'
% time bundle exec rubocop
Inspecting 75 files
[…]
bundle exec rubocop  9.98s user 0.32s system 97% cpu 10.523 total

Please let me know if you need more data to get to the root cause of this.

@bquorning
Copy link
Contributor

You say you get the performance decrease without a custom .rubocop.yml file. Do you get any deprecation warnings (“Warning: Deprecated pattern style …”)?

@sometimesfood
Copy link
Contributor Author

No, none whatsoever.

I'll post a copy&pasteable example to reproduce this tomorrow; should have done so earlier.

@bquorning
Copy link
Contributor

I’ve been able to reproduce the issue by, as you said, bundling to .bundle and removing any .rubocop.yml file.

I don’t quite see the same increase in runtime as you do, but that depends a lot on the number of hidden files and directories, I guess.

The root of the problem is that two different styles of directory patterns are supported for the Include/Exclude configuration, but the older style has been deprecated for almost a year (since v0.21.0 – #1018). Maybe it’s time to stop supporting the old syntax? cc @jonas054

@sometimesfood
Copy link
Contributor Author

Here's a copy & paste example:

mkdir rubocop-repro-1734
cd rubocop-repro-1734
cat > Gemfile <<EOF     
source 'https://rubygems.org'
gem 'rubocop', git: 'https://github.com/bbatsov/rubocop.git', ref: 'a1865b5'
gem 'rails'
EOF
bundle install --path .bundle
time bundle exec rubocop > /dev/null
mkdir vendor
mv .bundle/ruby vendor
bundle install --path vendor
time bundle exec rubocop    

Output for the first rubocop run (gems in .bundle):

bundle exec rubocop > /dev/null  1.17s user 0.24s system 97% cpu 1.446 total

Output for the second rubocop (gems in vendor):

bundle exec rubocop > /dev/null  3.20s user 0.35s system 97% cpu 3.623 total

Of course this increases dramatically for multiple Ruby versions, a larger number of Gems, a couple of bundle updates etc.

@jonas054
Copy link
Collaborator

@sometimesfood It's the other way around, right? Hidden directories are slower than excluded directories. At least that's what I get.

@bquorning I think it's time to remove support for the old pattern style, but it won't solve the whole problem. There are some optimizations we could do to speed up the (non-)matching of hidden directories. I'll submit a PR soon so we can discuss whether the added code is justified.

@sometimesfood
Copy link
Contributor Author

@sometimesfood It's the other way around, right? Hidden directories
are slower than excluded directories. At least that's what I get.

That's correct. Sorry, did not see @bquorning's earlier comment.

@jonas054
Copy link
Collaborator

Please take a look at the following
master...jonas054:1734_faster_hidden_dirs_search

The interesting thing is that removing the old style pattern matching gives no addition decrease in execution time when the other two optimizations have been applied. These are the results I got on my system. YMMV.

Change % decrease
Remove old style pattern matching 22
Optimize PathUtil#relative_path for common case 33
Optimize Config#file_to_include? for hidden dirs 30
Config#file_to_include? + PathUtil#relative_path 64
Config#file_to_include? + PathUtil#relative_path + remove old style 64

This is why I didn't make a pull request, because my suggestion is that we keep the old style pattern matching for now, as it's not relevant for this issue.

@bquorning
Copy link
Contributor

Those numbers sure look good. Great job @jonas054. And my argument for removing the old pattern style doesn't apply anymore.

I'm curious to see 2 more numbers:

  • This issue being about a performance regression, how do your changes perform in comparison with v0.29.1?
  • Is the performance significantly slower on a project without a lot of hidden files?

@sometimesfood
Copy link
Contributor Author

@jonas054 Thanks, your changes to PathUtil#relative_path and Config#file_to_include? significantly improve rubocop's performance for me.

Hidden directories (.bundle):

bundle exec rubocop  10.79s user 0.88s system 99% cpu 11.752 total

Excluded directories (vendor):

bundle exec rubocop  10.28s user 0.36s system 94% cpu 11.273 total

As you said, fd3c56e3 does not seem to have any measurable influence on performance.

@bquorning Directly comparing performance with 0.29.1 is not possible because #1656 caused hidden directories to always be searched. With 0.29.1, I experienced runtime in excess of 90s. The last version that worked correctly for me before that was 0.28. While 0.28 did not have #1656, it was a lot slower than the current version for me:

bundle exec rubocop  23.78s user 0.94s system 98% cpu 25.224 total

All in all I think the changes are a big improvement. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants