-
-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sort search results when using -X option if results are not many. Solve Issue #441 #513
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your contribution!
I added a few comments.
@@ -27,6 +27,10 @@ use self::input::{basename, dirname, remove_extension}; | |||
pub use self::job::{batch, job}; | |||
use self::token::Token; | |||
|
|||
/// When `ExecutionMode` is `Batch` and number of results is less | |||
/// than this value, results will be sorted | |||
const SORT_THRESHOLD: usize = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be similar to MAX_BUFFER_LENGTH
from src/internal.mod
. Could we reuse that constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle I'm slightly in favour of a specific constant, possibly with the same value. Anyway, you have a better picture of the whole code, so if you think that the semantic is similar enough then I'm OK with reusing MAX_BUFFER_LENGTH
.
let mut first_paths = take(&mut paths, SORT_THRESHOLD); | ||
has_path = !first_paths.is_empty(); | ||
|
||
if first_paths.len() < SORT_THRESHOLD { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be <= SORT_THRESHOLD
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, as it is now first_paths.len() <= SORT_THRESHOLD
is always true. Maybe it's clearer if we refactor as
let mut first_paths = take(&mut paths, SORT_THRESHOLD + 1);
if first_paths.len() <= SORT_THRESHOLD {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand what the intention is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate that you were concerned about performance and introduced the SORT_THRESHOLD
, but I am not sure if this is really required. I would expect the sort time to be much smaller than the actual filesystem-search. I would therefore remove the limit and simplify the logic here. There is a limit to the number of paths for -X
anyway (see #410), so I don't think this will cause any problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our motivating reason was this. If sorting time is not an issue, please close this pull request and we'll submit another with less noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry. I forgot about my "If we would not be restricted, it might make sense to only sort the results if there are less than some fixed number." comment.
I would like to start with an easy solution where we always sort the paths. If this turns out to be a problem (performance wise), we can go back to an approach like this.
/// This function behaves exactly like the standard library `take` for iterators, | ||
/// except that the original iterator is not consumed in the process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that what Iterator::peekable()
is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know there is no equivalent of this function in the standard library: could you elaborate on how you wanted to use peek
?
Thanks to you for the careful review @sharkdp |
This PR solves #441.
As suggested by @sharkdp in this comment we created a constant called SORT_THRESHOLD. If the found paths are < SORT_THRESHOLD, then the paths are sorted.
Feel free to propose changes if you have ideas to clean the code or to set a proper value for the SORT_THRESHOLD constant.