-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add hyperlink support #2610
add hyperlink support #2610
Conversation
6600d46
to
d32f66b
Compare
I know this is a WIP but I took a quick glance and thought I'd mention it just in case: you removed the |
Oh, hmmm. The point about a non-match is a good one. I'll take a closer look. Thank you for catching that! |
Yes, I see, that's a good catch. Arguably other stuff in a |
a432325
to
17bba5d
Compare
This commit represents the initial work to get hyperlinks working and was submitted as part of PR #2483. Subsequent commits largely retain the functionality and structure of the hyperlink support added here, but rejigger some things around.
This does a variety of polishing. 1. Deprecate the tty methods in favor of std's IsTerminal trait. 2. Trim down un-needed dependencies. 3. Use bstr to implement escaping. 4. Various aesthetic polishing. I'm doing this as prep work before adding more to this crate. And as part of a general effort toward reducing ripgrep's dependencies.
This will enable us to query for the current system's hostname in both Unix and Windows environments. We could have pulled in the 'gethostname' crate for this, but: 1. I'm not a huge fan of micro-crates. 2. The 'gethostname' crate panics if an error occurs. (Which, to be fair, an error should never occur, but it seems plausible on borked systems? ripgrep runs in a lot of places, so I'd rather not take the chance of a panic bringing down ripgrep for an optional convenience feature.) 3. The 'gethostname' crate uses the 'windows-targets' crate from Microsoft. This is arguably the "right" thing to do, but ripgrep doesn't use them yet and they appear high-churn. So I just added a safe wrapper to do this to winapi-util[1] and then inlined the Unix version here. This brings in no extra dependencies and the routine is fallible so that callers can recover from potentially strange failures. [1]: BurntSushi/winapi-util#14
Like a previous commit did for the grep-cli crate, this does some polishing to the grep-printer crate. We aren't able to achieve as much as we did with grep-cli, but we at least eliminate all rust-analyzer lints and group imports in the way I've been doing recently. Next we'll start doing some more invasive changes.
I originally did not put PathPrinter into grep-printer because I considered it somewhat extraneous to what a "grep" program does, and also that its implementation was rather simple. But now with hyperlink support, its implementation has grown a smidge more complicated. And more importantly, its existence required exposing a lot more of the hyperlink guts. Without it, we can keep things like HyperlinkPath and HyperlinkSpan completely private. We can now also keep `PrinterPath` completely private as well. And this is a breaking change.
This essentially takes the work done in #2483 and does a bit of a facelift. A brief summary: * We reduce the hyperlink API we expose to just the format, a configuration and an environment. * We move buffer management into a hyperlink-specific interpolator. * We expand the documentation on --hyperlink-format. * We rewrite the hyperlink format parser to be a simple state machine with support for escaping '{{' and '}}'. * We remove the 'gethostname' dependency and instead insist on the caller to provide the hostname. (So grep-printer doesn't get it itself, but the application will.) Similarly for the WSL prefix. * Probably some other things. Overall, the general structure of #2483 was kept. The biggest change is probably requiring the caller to pass in things like a hostname instead of having the crate do it. I did this for a couple reasons: 1. I feel uncomfortable with code deep inside the printing logic reaching out into the environment to assume responsibility for retrieving the hostname. This feels more like an application-level responsibility. Arguably, path canonicalization falls into this same bucket, but it is more difficult to rip that out. (And we can do it in the future in a backwards compatible fashion I think.) 2. I wanted to permit end users to tell ripgrep about their system's hostname in their own way, e.g., by running a custom executable. I want this because I know at least for my own use cases, I sometimes log into systems using an SSH hostname that is distinct from the system's actual hostname (usually because the system is shared in some way or changing its hostname is not allowed/practical). I think that's about it. Closes #665, Closes #2483
17bba5d
to
0fb591c
Compare
May I bother you with a little question I was wondering about? I initially used You replaced those with explicit let line = values.line.unwrap_or(1).to_string();
dest.extend_from_slice(line.as_bytes()); I was wondering why? Is the optimizer able to remove the heap allocation, or does it simply not matter in this context and you prefer that code style? |
Great question! And feel free to ask more like it. I replaced them because I have a faint recollection that It's funny you bring it up, because I actually have a TODO (before the next release) to try and more carefully benchmark this choice. I wanted to bake off four different choices:
In particular, I want to run benchmarks both with the system allocator (glibc on my system) and with musl's allocator, which has been a bit slower in my experience. |
Thanks! 🙂 |
This is based on #2483 and keeps its general structure, but rejiggers some details. The biggest change is probably lifting the
gethostname
call from thegrep-printer
crate all the way up into ripgrep's core. We also add a--hostname-bin
flag that allows the end user to provide an executable program that prints the hostname ripgrep should use.Closes #665