-
Notifications
You must be signed in to change notification settings - Fork 892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(dist/features): ship tracing
and friends by default
#3803
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
I think we should use a basic console-based tracing-subscriber setup: pub(super) fn subscribe() -> tracing::subscriber::DefaultGuard {
let sub = tracing_subscriber::FmtSubscriber::builder()
.with_max_level(tracing::Level::TRACE)
.with_writer(|| TestWriter)
.finish();
tracing::subscriber::set_default(sub)
}
struct TestWriter;
impl Write for TestWriter {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
print!(
"{}",
str::from_utf8(buf).expect("tried to log invalid UTF-8")
);
Ok(buf.len())
}
fn flush(&mut self) -> io::Result<()> {
io::stdout().flush()
}
} This is what I've been using in Quinn for many years. Every test just starts by calling If we do this we can massively simplify the scaffolding for traces/logging:
|
I don't think this is true. We've not asked people to build with otel enabled that I'm remembering. The OS level traces we use to debug fundamental problems are from strace / truss and other similar tools. Otel / tracing! is not deployed widely enough within rustup to be a replacement for such things. |
Please don't - while the OS level debugging is vital, for doing investigations on performance, having a nice report with spans and the detailed call tree is very useful, and since we configure it off by default it has very little overhead to maintain or build with. Really only the all-features-build test matters. |
How many times have you used it in the past year? IMO while custom test macro + opentelemetry dependencies may not impose run-time overhead for downstream users, it does impose significant maintenance overhead that may not be warranted for the additional insight compared to just tracing-subscriber built-in (and maybe tokio-console level output). |
IMO the important point is that we should have a user-facing solution like "enable RUST_LOG=trace and give us the output of that", which seems like a decent method of getting better insight into problems that happen only in specific environments, which seems to be an important source of issues for rustup. |
I just checked and it looks like tokio-console doesn’t currently have a timeline view (tokio-rs/console#129), so I imagine For now I plan to:
More specifically, I imagine having multiple subscribers (tokio-rs/tracing#971) based on env vars and features:
Finally:
|
f50ac2f
to
2ff50f8
Compare
#3367 has been merged, would be good to rebase this! |
2142f7e
to
6d54875
Compare
This comment was marked as resolved.
This comment was marked as resolved.
I have been largely absent for the last year, so perhaps a better thing to ask is 'of the times you've been investigating performance, how often have you used detailed call graph with opentelemetry' - and the answer there is every single time. I depend on it.
I think that that is important too, for sure. |
e919914
to
a667805
Compare
@djc The new approach proposed in #3871 (review) has worked! This PR has been thus unblocked from test refactoring and I'll just need to do some rather simple tweaks (see #3803 (comment) for the list of remaining tasks) to get it ready 🎉 PS: I've also changed the signature of |
6489353
to
0cf0c6b
Compare
@rbtcollins @djc I've just finished all 3 cleanup tasks, unfortunately however the 3rd (the final) commit (0cf0c6b) has made our CI red. To make sure that only this commit was bad though, I've reverted it temporarily in 1b7662e, and indeed the CI is green again. I don't see how a mechanical change like 0cf0c6b can cause Rustup to stack overflow on Windows (https://github.com/rust-lang/rustup/actions/runs/9515433133/job/26229657328)... Am I missing something? |
@djc That's a good idea! I'll give it a try tomorrow. That being said, the overall |
Oh yes, I'm in favor of merging without it! |
50216ad 'reorder'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Warning
Since #3876,
RUST_LOG
has been replaced withRUSTUP_LOG
for configuring Rustup's internal logging system. The rest of the thread has usedRUST_LOG
due to historic reasons.Part of #3790.
Rationale
Currently, helping out the Rustup team by enabling local tracing is quite a tedious process (esp. for community contributors), requiring rebuilding Rustup from the exact commit with an extra feature,
otel
:rustup/doc/dev-guide/src/tracing.md
Lines 13 to 36 in 54dd3d0
After some experiment, it turned out that we actually can ship the tracing features by default without forcing the user to face OTEL connection errors on a daily basis.
To clarify, this does not mean Rustup is setting up a central (a.k.a. phone-home-style) telemetry mechanism, and we will keep the tracing disabled by default unless
RUST_LOG
has been explicitly set.Concerns
RUSTUP_DEBUG
in favor ofRUST_LOG=trace
? (Yes.)Should we remove(Not yet, see chore(dist/features): shipopentelemetry
while keepingtracing
(refactor(download): remove curl backend #3788 (comment))? If we should, theotel
feature should be renamed.tracing
and friends by default #3803 (comment).)