Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort search results when using -X option if results are not many. Solve Issue #441 #513

Closed
wants to merge 4 commits into from

Conversation

MarcoIeni
Copy link
Contributor

@MarcoIeni MarcoIeni commented Nov 30, 2019

Open Source Saturday

This PR solves #441.

As suggested by @sharkdp in this comment we created a constant called SORT_THRESHOLD. If the found paths are < SORT_THRESHOLD, then the paths are sorted.

Feel free to propose changes if you have ideas to clean the code or to set a proper value for the SORT_THRESHOLD constant.

@MarcoIeni MarcoIeni marked this pull request as ready for review November 30, 2019 15:42
Copy link
Owner

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution!

I added a few comments.

@@ -27,6 +27,10 @@ use self::input::{basename, dirname, remove_extension};
pub use self::job::{batch, job};
use self::token::Token;

/// When `ExecutionMode` is `Batch` and number of results is less
/// than this value, results will be sorted
const SORT_THRESHOLD: usize = 100;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be similar to MAX_BUFFER_LENGTH from src/internal.mod. Could we reuse that constant?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle I'm slightly in favour of a specific constant, possibly with the same value. Anyway, you have a better picture of the whole code, so if you think that the semantic is similar enough then I'm OK with reusing MAX_BUFFER_LENGTH.

src/exec/mod.rs Outdated Show resolved Hide resolved
src/exec/mod.rs Outdated Show resolved Hide resolved
let mut first_paths = take(&mut paths, SORT_THRESHOLD);
has_path = !first_paths.is_empty();

if first_paths.len() < SORT_THRESHOLD {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be <= SORT_THRESHOLD?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, as it is now first_paths.len() <= SORT_THRESHOLD is always true. Maybe it's clearer if we refactor as

let mut first_paths = take(&mut paths, SORT_THRESHOLD + 1);

if first_paths.len() <= SORT_THRESHOLD {

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what the intention is?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate that you were concerned about performance and introduced the SORT_THRESHOLD, but I am not sure if this is really required. I would expect the sort time to be much smaller than the actual filesystem-search. I would therefore remove the limit and simplify the logic here. There is a limit to the number of paths for -X anyway (see #410), so I don't think this will cause any problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our motivating reason was this. If sorting time is not an issue, please close this pull request and we'll submit another with less noise.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry. I forgot about my "If we would not be restricted, it might make sense to only sort the results if there are less than some fixed number." comment.

I would like to start with an easy solution where we always sort the paths. If this turns out to be a problem (performance wise), we can go back to an approach like this.

Comment on lines +223 to +224
/// This function behaves exactly like the standard library `take` for iterators,
/// except that the original iterator is not consumed in the process.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that what Iterator::peekable() is for?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know there is no equivalent of this function in the standard library: could you elaborate on how you wanted to use peek?

@crash-g
Copy link

crash-g commented Dec 24, 2019

Thank you very much for your contribution!

Thanks to you for the careful review @sharkdp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants