Skip to content
This repository has been archived by the owner on May 23, 2024. It is now read-only.

SIGSEGV in jaeger-client-go/reporter.go #155

Closed
yqf3139 opened this issue May 31, 2017 · 7 comments
Closed

SIGSEGV in jaeger-client-go/reporter.go #155

yqf3139 opened this issue May 31, 2017 · 7 comments
Assignees

Comments

@yqf3139
Copy link

yqf3139 commented May 31, 2017

I was try to migrate from the zipkin backend to jaeger, so everything I need is to swap the tracer.

To init the tracer:

// Sample configuration for testing. Use constant sampling to sample every trace
// and enable LogSpan to log every span via configured Logger.
config := jaegercfg.Configuration{
	Sampler: &jaegercfg.SamplerConfig{
		Type:  jaeger.SamplerTypeConst,
		Param: 1,
	},
	Reporter: &jaegercfg.ReporterConfig{
	LogSpans: true,
		LocalAgentHostPort: "jaeger-all-in-one-agent.fission:5775",
	},
}
// Example logger and metrics factory. Use github.com/uber/jaeger-client-go/log
// and github.com/uber/jaeger-lib/metrics respectively to bind to real logging and metrics
// frameworks.
jLogger := jaegerlog.StdLogger
jMetricsFactory := metrics.NullFactory
// Initialize tracer with a logger and a metrics factory
closer, err := config.InitGlobalTracer(
	"router",
	jaegercfg.Logger(jLogger),
	jaegercfg.Metrics(jMetricsFactory),
)
if err != nil {
	log.Printf("Could not initialize jaeger tracer: %s", err.Error())
	return
}
defer closer.Close()

And then error occured:

2017-05-31T04:36:39.214604075Z panic: runtime error: invalid memory address or nil pointer dereference
2017-05-31T04:36:39.214734078Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x80492cc]
2017-05-31T04:36:39.214753293Z 
2017-05-31T04:36:39.214814925Z goroutine 16 [running]:
2017-05-31T04:36:39.214944289Z sync/atomic.AddUint64(0x19f3a424, 0xffffffff, 0xffffffff, 0x19b1c700, 0x1)
2017-05-31T04:36:39.215025990Z 	/home/yqf/bin/go/src/sync/atomic/asm_386.s:112 +0xc
2017-05-31T04:36:39.215083702Z github.com/fission/fission/vendor/github.com/uber/jaeger-client-go.(*remoteReporter).processQueue(0x19f3a400)
2017-05-31T04:36:39.215193739Z 	/home/yqf/dev/mygo/src/github.com/fission/fission/vendor/github.com/uber/jaeger-client-go/reporter.go:234 +0x107
2017-05-31T04:36:39.215252715Z created by github.com/fission/fission/vendor/github.com/uber/jaeger-client-go.NewRemoteReporter
2017-05-31T04:36:39.215361183Z 	/home/yqf/dev/mygo/src/github.com/fission/fission/vendor/github.com/uber/jaeger-client-go/reporter.go:200 

When the error occured, r.queueLength is 0.
Appreciate any helps.

@black-adder black-adder self-assigned this May 31, 2017
@black-adder
Copy link
Contributor

Looking at the stack trace you provided, I found this ticket: golang/go#13868

I need to investigate if it's related but it points to the same line of code that the stack trace referenced.

@yurishkuro
Copy link
Member

We need to move queueLength to the top of the remoteReporter struct, and leave a comment explaining why, with a link to that Go issue.

@yqf3139
Copy link
Author

yqf3139 commented May 31, 2017

Thanks.

I ran the jaeger image on a GCP machine which is 64 bit. Still cannot fully understand where the problem is if it is running on a 64 bit machine.

@black-adder
Copy link
Contributor

@yqf3139 before I put in the fix for this, do you mind applying the change locally and running your service again? I can't reproduce this issue on my end.

In jaeger-client-go/reporter.go change

type remoteReporter struct {
	reporterOptions
	sender       transport.Transport
	queue        chan *zipkincore.Span
	queueLength  int64 // signed because metric's gauge is signed
	queueDrained sync.WaitGroup
	flushSignal  chan *sync.WaitGroup
}

to

type remoteReporter struct {
        queueLength  int64 // signed because metric's gauge is signed
	reporterOptions
	sender       transport.Transport
	queue        chan *zipkincore.Span
	queueDrained sync.WaitGroup
	flushSignal  chan *sync.WaitGroup
}

@yqf3139
Copy link
Author

yqf3139 commented May 31, 2017

@black-adder I will have a try.

@yqf3139
Copy link
Author

yqf3139 commented May 31, 2017

@black-adder Thanks, problem solved by making the change. We use GOARCH=386 to build, so the problem is caused by the go issue.

@yqf3139 yqf3139 closed this as completed May 31, 2017
@tomersimis-zz
Copy link

Had the same problem and reordering the remoteReporter fields did work.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants