How sampler.type=remote works #832
Comments
a. ClientSamplingConfiguration says probabilistic 0.1, and CollectorSamplingConfiguration says probabilistic 0.2
b. ClientSamplingConfiguration says remote, and CollectorSamplingConfiguration says probabilistic 0.2
You mean persisted into storage?
Sorry, I'm still confused: when is the 0.1 used? Service -> agent, agent -> collector, or collector -> storage? Or all of them (in which case only 0.1 * 0.1 * 0.1 of traces would end up in the DB, right)? |
The sampling rate is only used at the service, so 0.1 of traces will be stored in the DB. |
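To illustrate the answer above, here is a minimal simulation sketch. It assumes a simplified pipeline in which the sampling decision is made once, in the service, and the agent and collector forward every span they receive. The function name `simulate_stored_fraction` is hypothetical and this is not Jaeger code.

```python
import random


def simulate_stored_fraction(num_traces, rate, seed=0):
    """Return the fraction of traces that reach storage.

    Assumption: only the service applies the sampling rate; the agent
    and collector hops forward whatever they are given.
    """
    rng = random.Random(seed)
    stored = 0
    for _ in range(num_traces):
        sampled_at_service = rng.random() < rate  # the only sampling decision
        forwarded_by_agent = sampled_at_service   # no second sampling here
        forwarded_by_collector = forwarded_by_agent  # nor here
        if forwarded_by_collector:
            stored += 1
    return stored / num_traces


fraction = simulate_stored_fraction(100_000, 0.1)
print(f"stored fraction: {fraction:.3f}")
```

The printed fraction stays close to 0.1 rather than 0.1 * 0.1 * 0.1, because the rate is applied at a single point.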
OK, so sampling only happens in the service, before the spans are sent out.
What is the flow? From the service's standpoint, is it pull or push? |
Services pull from the agent every minute; this is configurable: https://github.com/jaegertracing/jaeger-client-go/blob/master/config/config.go#L86 We haven't done this yet, but I've always wanted to do push. It's on my personal roadmap. |
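The pull flow described above can be sketched roughly as follows. This is only an illustration, not the client's actual implementation: `fetch` is a hypothetical callable standing in for the HTTP request to the agent, and the JSON shape (`probabilisticSampling.samplingRate`) is an assumption about the strategy document.

```python
import json
import time


def parse_strategy(payload):
    """Return the probabilistic sampling rate from a strategy JSON string, or None."""
    doc = json.loads(payload)
    prob = doc.get("probabilisticSampling")
    if prob is None:
        return None
    return float(prob["samplingRate"])


def poll(fetch, on_rate, interval_seconds=60.0, iterations=None, sleep=time.sleep):
    """Pull the strategy every `interval_seconds` and hand new rates to `on_rate`.

    `iterations=None` polls forever; a number bounds the loop (useful in tests).
    """
    done = 0
    while iterations is None or done < iterations:
        try:
            rate = parse_strategy(fetch())
            if rate is not None:
                on_rate(rate)
        except (OSError, ValueError, KeyError):
            pass  # pull failed or payload malformed: keep the current strategy
        done += 1
        if iterations is None or done < iterations:
            sleep(interval_seconds)


# Usage with a fake fetch function instead of a real agent:
def fake_fetch():
    return '{"strategyType": "PROBABILISTIC", "probabilisticSampling": {"samplingRate": 0.1}}'


received = []
poll(fake_fetch, received.append, iterations=2, sleep=lambda seconds: None)
```

In a real service the loop would run in a background thread, and `fetch` would query the agent's sampling endpoint.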
Can you also help me understand another sampling propagation question -- |
Sampling and generation of a span happens roughly at the same time. Context is always propagated between services (even if unsampled). |
If service B receives a request with a context saying something like {"span_a", "unsampled"}, B will still create a span as a child of "span_a" and keep propagating the context, but won't report the span. Is that correct? |
yes |
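A minimal sketch of the behaviour confirmed above: an unsampled context still yields a child span and is still propagated, but the span is never reported. The `SpanContext` class and the `start_child` and `finish` helpers are hypothetical, and the colon-separated header layout (trace ID, span ID, parent ID, flags, with bit 0x01 meaning "sampled") is assumed here for illustration.

```python
import secrets
from dataclasses import dataclass

SAMPLED_FLAG = 0x01  # assumption: lowest flag bit marks a sampled trace


@dataclass(frozen=True)
class SpanContext:
    trace_id: int
    span_id: int
    parent_id: int
    flags: int

    @property
    def sampled(self):
        return bool(self.flags & SAMPLED_FLAG)

    def to_header(self):
        """Serialize as trace-id:span-id:parent-id:flags (all hex)."""
        return f"{self.trace_id:x}:{self.span_id:x}:{self.parent_id:x}:{self.flags:x}"

    @classmethod
    def from_header(cls, value):
        trace_id, span_id, parent_id, flags = (int(p, 16) for p in value.split(":"))
        return cls(trace_id, span_id, parent_id, flags)


def start_child(parent):
    """Create a child span; the sampling decision is inherited, never re-made."""
    new_span_id = secrets.randbits(64) or 1  # avoid the zero ID
    return SpanContext(parent.trace_id, new_span_id, parent.span_id, parent.flags)


def finish(span, reported):
    """Report the span only if its trace was sampled upstream."""
    if span.sampled:
        reported.append(span)


# Service B receives an unsampled context from service A:
incoming = SpanContext.from_header("abc:1:0:0")
child = start_child(incoming)
reported = []
finish(child, reported)
outgoing_header = child.to_header()  # still sent to downstream services
```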
OK, so does this mean it's possible to trace every request by putting its span info into the log, even though we do sampling? |
I'm not sure I understand the question. Are you asking if span logs are always persisted even if we do sampling? |
I'm asking whether it's possible to use a logging facility like MDC (http://www.baeldung.com/mdc-in-log4j-2-logback) to log the trace ID for every single request, even if we do sampling. |
Yes, you can log the trace ID for every request, but since you're sampling, some logs will have trace IDs without a persisted trace. |
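As a rough MDC-like sketch of this idea, the snippet below attaches a trace ID to every log record whether or not the trace is sampled. It uses only the standard library; `current_trace_id`, `TraceIdFilter`, `build_logger`, and `handle_request` are hypothetical names, and in a real service the trace ID would come from the tracer's active span.

```python
import contextvars
import io
import logging

# Per-request storage for the trace ID (plays the role of MDC).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto each log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream):
    logger = logging.getLogger("traced-app")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, trace_id, sampled):
    """Log with the trace ID regardless of the sampling decision."""
    token = current_trace_id.set(trace_id)
    try:
        logger.info("handling request (sampled=%s)", sampled)
    finally:
        current_trace_id.reset(token)


buffer = io.StringIO()
log = build_logger(buffer)
handle_request(log, "abc123", sampled=False)  # logged, but no trace is persisted
handle_request(log, "def456", sampled=True)   # logged, and the trace is persisted
```

Log lines for unsampled requests carry a trace ID that cannot be looked up in the tracing backend, as noted above.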
This is a Go example: https://github.com/jaegertracing/jaeger/blob/master/examples/hotrod/pkg/log/spanlogger.go However, here we're doing more than just logging the trace ID; we're dual logging to both the log reporter and the span. |
Closing the issue; feel free to reopen it if you have more questions. |
The agent proxies the config request to the collector through which connection? The TChannel or gRPC, whichever one is connected? Thank you. |
Whichever one you configure on the agent. We recommend gRPC.
See #1718
Can you elaborate what can be improved in the docs? If you're using |
Thanks, now it's all coming together from the information in the different referenced GitHub issues.
For improvements to the doc, here are ideas:
Thanks. |
Does 'remote' sampling work with http-sender? In my AKS cluster setup, I haven't configured 'jaeger-agent'. |
Sampler has nothing to do with Sender, it's an independent component. It can work with both the agent and the collector. |
Thanks, Yuri, for the quick response. I really appreciate your help on this. I'm using the Jaeger K8s operator and have the following sampling strategy in the ConfigMap:
(We're using the monitoring namespace instead of observability.) The client application has the following properties:
Since these properties are defined in the application's properties file, I'm overriding them using k8s environment variables. I have set sampler.type to remote. As I don't know what value should be given to sampling-rate when sampler.type is set to remote, I set it to 1. With this, when I created the pod, every trace is being collected. I'm not sure why it is not honoring the remote configuration. Am I missing anything? |
The numeric value of 1 is treated as 100% default probability when the sampler cannot contact the backend. It's possible that in your deployment it cannot reach the backend and never gets the 0.1 probability. The sampler emits metrics about unsuccessful configuration pulls. |
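The fallback behaviour described above can be sketched like this. It is a simplified model, not the client's real sampler: `RemoteSampler`, `fetch_rate`, and `failed_pulls` are hypothetical names, and the failure counter stands in for the metrics the real sampler emits.

```python
class RemoteSampler:
    """Keeps an initial rate until a configuration pull succeeds."""

    def __init__(self, fetch_rate, initial_rate=1.0):
        self._fetch_rate = fetch_rate  # callable that returns the remote rate
        self.rate = initial_rate       # used while the backend is unreachable
        self.failed_pulls = 0          # stand-in for a failure metric

    def refresh(self):
        try:
            self.rate = float(self._fetch_rate())
        except (OSError, ValueError):
            # Backend unreachable or bad payload: keep the current rate.
            self.failed_pulls += 1


def unreachable_backend():
    raise ConnectionError("cannot reach sampling endpoint")


# With param=1 and an unreachable backend, every trace keeps being sampled:
stuck = RemoteSampler(unreachable_backend, initial_rate=1.0)
stuck.refresh()

# Once a pull succeeds, the remote rate replaces the initial one:
healthy = RemoteSampler(lambda: 0.1, initial_rate=1.0)
healthy.refresh()
```

A steadily growing failure count alongside a 100% sampling rate would point to a connectivity problem rather than a configuration that is being ignored.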
Dear yuri, |
Previously the agent used Thrift to retrieve sampling strategies from the collector. Now it uses protobuf, but the clients consume sampling strategies as JSON, and that JSON is still generated from Thrift. |
You mean the sampling strategies were previously sent from the collector to the agents via Thrift, but are now sent via protobuf + gRPC? I know clients get the sampling JSON over HTTP on port 5778, so what I care about is how the collector sends them to the agent. |
Collector to agent is gRPC. |
Thanks, yuri. |
I'm not sure this is completely correct, or else there is a bug in this code path. I'm setting the sampler type to remote and leaving the param unset, yet the param value is being set to 1 by default even when the remote strategy actually has a param of 0.5. Seems like a bug to me. |
" Remote (sampler.type=remote, which is also the default) sampler consults Jaeger agent for the appropriate sampling strategy to use in the current service. This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend, or even dynamically (see Adaptive Sampling). "
This is excerpted from the Jaeger docs and it looks pretty confusing to me. Can you help me understand it with the following questions?
1. "consults Jaeger agent for the appropriate sampling strategy": as far as I know, there are two places to configure the sampling rate: jaeger-client and jaeger-collector. What role does the Jaeger agent play here?
2. "This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend": does "a central configuration in Jaeger backend" mean jaeger-collector?
3. What if we use zipkin-client + Jaeger backend (jaeger-collector + jaeger-ui + storage)? In this case we don't have jaeger-agent running, so how does the "remote consulting" work?
4. A follow-up question on 3: without jaeger-agent, how is batching handled? Reading Zipkin's docs (https://zipkin.io/pages/architecture.html), I interpret "Transport" as the equivalent of "jaeger-agent". In the zipkin-client + Jaeger backend scenario, do we drop "Transport"?