Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up #441

Closed
wants to merge 23 commits into from
Closed

Speed up #441

wants to merge 23 commits into from

Conversation

0jdxt
Copy link
Contributor

@0jdxt 0jdxt commented Oct 27, 2020

Whilst working on my other PR I found --tree to be quite slow compared to tree and so analysed cargo flamegraph results resulting in reducing unnecessary allocations and system calls, producing a significant speed gain for the --tree and -R options.

Original flamegraph: (--tree)

flamegraph-slow

After optimisations: (--tree)

flamegraph-fast

After optimisations: (-R)

flamegraph-r

Originally, the main bottleneck was system calls in order to retrieve uid and gid properties and the following processing into strings when the -R and --tree options do not need this information so now on unix, the information is lazy loaded.

Now, in the optimised --tree graph, we see the main bottleneck now is creating the rest of the Meta struct and that the sorting and display isn't too shabby. In the optimised -R graph, we see the displaying and sorting of information into a grid is the biggest bottleneck for this option. This would indicate for future optimisations, some information and processing may need to be lazy loaded for the Meta struct or perhaps simply the algorithms need optimising, on top of improving the grid/tree display performance.

Nevertheless, overall this PR has managed to make the following speed gains, tested only on my linux x86_64 machine with hyperfine, compared with native equivalents ls -R and tree:

~ 5 000 directories, 56 000 files, 27G

Command Mean [s] Min [s] Max [s] Relative
lsd --tree 2.374 ± 0.016 2.350 2.400 6.89 ± 0.15
lsd --tree (opt.) 0.537 ± 0.007 0.528 0.550 1.56 ± 0.04
tree 0.345 ± 0.007 0.338 0.360 1.00
Command Mean [s] Min [s] Max [s] Relative
lsd -R 2.495 ± 0.027 2.458 2.535 19.77 ± 0.33
lsd -R (opt.) 0.649 ± 0.011 0.639 0.668 5.14 ± 0.11
ls -R 0.126 ± 0.002 0.123 0.130 1.00

@0jdxt 0jdxt requested review from meain and Peltoche as code owners October 27, 2020 16:50
@codecov-io
Copy link

codecov-io commented Oct 27, 2020

Codecov Report

Merging #441 (3224f87) into master (7254816) will decrease coverage by 0.19%.
The diff coverage is 73.10%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
- Coverage   88.48%   88.29%   -0.20%     
==========================================
  Files          36       36              
  Lines        3535     3289     -246     
==========================================
- Hits         3128     2904     -224     
+ Misses        407      385      -22     
Impacted Files Coverage Δ
src/core.rs 0.00% <0.00%> (ø)
src/main.rs 9.09% <0.00%> (-0.91%) ⬇️
src/meta/permissions.rs 86.95% <ø> (+3.62%) ⬆️
src/meta/mod.rs 38.79% <29.62%> (-0.69%) ⬇️
src/meta/inode.rs 66.66% <40.00%> (-2.09%) ⬇️
src/flags.rs 91.66% <50.00%> (-8.34%) ⬇️
src/meta/symlink.rs 88.88% <63.63%> (+19.50%) ⬆️
src/display.rs 75.73% <68.88%> (+2.80%) ⬆️
src/meta/owner.rs 61.11% <75.00%> (-13.89%) ⬇️
src/meta/name.rs 94.14% <80.88%> (+0.36%) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7254816...3224f87. Read the comment docs.

Copy link
Member

@meain meain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR. I was not able to get this big a difference but a huge improvement however.

src/meta/name.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
src/meta/name.rs Outdated Show resolved Hide resolved
src/meta/symlink.rs Show resolved Hide resolved
Copy link
Contributor Author

@0jdxt 0jdxt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey man! you're right some changes weren't necessary, I reverted some changes and tidied some expressions in some places. This commit now speeds things up in 2 ways:

String allocation

  • Use Cow to the Path instead of an owned String in the Name struct, so allocating a new String only happens if and when needed.
  • Use to_ascii_lowercase instead of to_lowercase since it is faster and the keys we have in Icons only contain, and probably will always only contain, ascii letters.

Syscalls

On linux fetching the file info for all the files was quite slow, especially when the information isnt needed (like -R and --tree). The syscalls have been deferred to when needed in rendering the Meta entry.

src/meta/symlink.rs Show resolved Hide resolved
@data-man
Copy link

What do you think about jwalk?

@0jdxt
Copy link
Contributor Author

0jdxt commented Nov 13, 2020

What do you think about jwalk?

I tried to re-implement it with walkdir and it is ideal for walking directory trees whereas in lsd we are mostly listing the contents of a single given directory, not the tree unless -R or --tree. When I benchmarked, the original lsd implementation was faster which I assume is due to the overhead of setting up threads and preparing to walk the directories in walkdir even though typically we only need to get the entries in 1 directory.

If jwalk is anything like this I would presume it would have the same effect - just adding overhead. At the moment, the walking and filtering is rather efficient since we only descend into directories if needed. I think the most useful speed gains now would be the general workings of the Meta struct and its values (trying to avoid allocations), and the displaying of entries into a grid/tree.

0jdxt and others added 8 commits November 13, 2020 20:18
This commit makes some changes based on analysing the stack trace with
`cargo flamegraph`. All changes are within the creation of the `Meta`
struct since this takes up most of the processing time. Some heavy
operations, namely memory allocations and system calls, have been either reduced or
deferred until the information is needed, significantly speeding up the
`-R` and `--tree` options.

This commit also simplifies some code and makes the repo ignore
`Cargo.lock` since it is not needed for binary projects.
@0jdxt
Copy link
Contributor Author

0jdxt commented Nov 15, 2020

What do you think about jwalk?

Upon looking into it I will see if it helps specifically for these recursive descents I'll try it

@SuperSandro2000
Copy link
Contributor

@0jdxt Do you have planed to finish this PR and resolve the merge conflicts?

@0jdxt
Copy link
Contributor Author

0jdxt commented Mar 14, 2021

@0jdxt Do you have planed to finish this PR and resolve the merge conflicts?

Hi! I've been busy so feel free. It's just such a mess I feel i should be responsible for cleaning it up. Alas, this PR could be merged soon, I'm just a bit concerned on how I've changed the flow of the program a bit and the structure of the program is getting quite complex. Either way the repo needs a tidy up at least with some comments describing the flow of the program so its easier to follow, especially for those who want to contribute to the repo and is reading the code for the first time.

@meain
Copy link
Member

meain commented Mar 18, 2021

Hey @0jdxt , you will have to rebase on master instead of merging from master.

@0jdxt
Copy link
Contributor Author

0jdxt commented Mar 18, 2021

summary: Some code is merely tidied. Replaced String of Name with Cow, reducing unnecessary allocs. Meta keeps raw Metadata to avoid second syscall for the metadata during colouring/rendering. Keep Owner as u64 until rendered. Change icons to chars and use the faster FxHashMap, delegates separator flag to Name::render. Name::escape function was O(2n), reduced to O(n). Name::Ord now uses to_ascii_lowercase instead, increasing speed of comparisons.

Copy link
Member

@meain meain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, this has a great deal of code cleanup as well.
Just took a quick look at the changes.

src/display.rs Outdated Show resolved Hide resolved
src/icon.rs Outdated Show resolved Hide resolved
src/meta/name.rs Show resolved Hide resolved
tests/integration.rs Outdated Show resolved Hide resolved
@meain
Copy link
Member

meain commented Mar 25, 2021

Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.

@SuperSandro2000
Copy link
Contributor

Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.

You can click on force pushed to see an interdiff.

https://github.com/Peltoche/lsd/compare/04bb7141df9226a774cb1cba91977514d20b3625..0f3eb3e459267d78207e3cfa8eca8b23b81a3fb0

@0jdxt
Copy link
Contributor Author

0jdxt commented Mar 25, 2021

Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.

Hi sorry still not used to this PR stuff

@meain
Copy link
Member

meain commented Mar 28, 2021

@0jdxt Had to merge in #489 . You will have to rebase your PR on top of current master.
This looks mostly good to go otherwise.

0jdxt added 3 commits March 28, 2021 15:48
this pr reduces syscalls, especially repeated ones. We should only get
the metadata once per file. Reduce allocating extra data, especially
when not used.
@0jdxt
Copy link
Contributor Author

0jdxt commented Mar 28, 2021

@0jdxt Had to merge in #489 . You will have to rebase your PR on top of current master.
This looks mostly good to go otherwise.

Rebased! should be good to go

// Check through libc if stdout is a tty. Unix specific so not on windows.
// Determine color output availability (and initialize color output (for Windows 10))
#[cfg(not(target_os = "windows"))]
let tty_available = unsafe { libc::isatty(io::stdout().as_raw_fd()) == 1 };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was used to check if lsd is outputting to a terminal or being piped to something else. This behaviour is not available with the new changes.

For example if we pipe lsd to less, we used to convert it to single line and remove the colors. But now it does not do that.

Screenshot

src/display.rs Outdated
)
}

generate_counter!(DIR_COUNT, u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to use this here? I am thinking it would be better(simpler) to just have usize values created in fn tree.

src/display.rs Outdated
Comment on lines 40 to 43
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
Some(i) => i,
None => 0,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
Some(i) => i,
None => 0,
};
let index = flags.blocks.0.iter().position(|&b| b == Block::Name).unwrap_or(0);

@codecov-commenter
Copy link

Codecov Report

Merging #441 (38a6933) into master (b6bd1d3) will increase coverage by 0.44%.
The diff coverage is 63.06%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
+ Coverage   80.71%   81.16%   +0.44%     
==========================================
  Files          35       35              
  Lines        3449     3413      -36     
==========================================
- Hits         2784     2770      -14     
+ Misses        665      643      -22     
Impacted Files Coverage Δ
src/core.rs 0.00% <0.00%> (ø)
src/display.rs 25.26% <0.00%> (-0.23%) ⬇️
src/main.rs 10.00% <ø> (+10.00%) ⬆️
src/meta/permissions.rs 36.95% <ø> (+1.53%) ⬆️
src/meta/owner.rs 16.66% <15.38%> (-25.01%) ⬇️
src/meta/mod.rs 18.81% <18.18%> (+0.88%) ⬆️
src/meta/symlink.rs 90.24% <50.00%> (+20.43%) ⬆️
src/app.rs 92.15% <57.14%> (-3.93%) ⬇️
src/meta/name.rs 94.85% <75.55%> (+1.05%) ⬆️
src/meta/date.rs 98.79% <95.45%> (+5.53%) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6bd1d3...38a6933. Read the comment docs.

@zwpaper
Copy link
Member

zwpaper commented Oct 6, 2021

Hi @0jdxt, thanks so much for working for this massive PR!

I want to try to make this PR fire just now, but I found it mixed up with several functionality changes, and some of it break the original function.

for example:

  1. Cargo.lock should not be added in .gitignore, check explains here
  2. termsize which used in the PR checking whether it is tty failed in pipeline
    image

how about break down the PR into several ones which contains one functionality only and we can make sure it work as expected, also we can do some quicker reviews and merges.

@meain
Copy link
Member

meain commented Sep 12, 2022

I'm closing this PR for now as it has deviated quite a bit from master, but let me know if you are still interested in this and we can open it back up.

@meain meain closed this Sep 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants