panic in docker module: interface conversion int64 uint64 #2027

Closed
andyceo opened this issue Nov 10, 2016 · 7 comments · Fixed by #2190

andyceo commented Nov 10, 2016

Ubuntu 16.04, Telegraf 1.1.0, installed from the official repo.

If total = true is set in the docker plugin, Telegraf fails with this message:

Nov 15 14:05:21 newhope telegraf[10797]: panic: interface conversion: interface is int64, not uint64
Nov 15 14:05:21 newhope telegraf[10797]: goroutine 109 [running]:
Nov 15 14:05:21 newhope telegraf[10797]: panic(0xf24700, 0xc420050940)
Nov 15 14:05:21 newhope telegraf[10797]:         /usr/local/go/src/runtime/panic.go:500 +0x1a1
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.gatherBlockIOMetrics(0xc4201d4000, 0x18dfd20, 0xc4202d9650, 0xc420288390, 0xecfbce2f1, 0xc419c899a3, 0x18acdc0, 0xc42010d800, 0x40, 0xd472062c93e10101)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:492 +0x2109
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.gatherContainerStats(0xc4201d4000, 0x18dfd20, 0xc4202d9650, 0xc420288390, 0xc42010d800, 0x40, 0x18d0101)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:385 +0x30b1
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).gatherContainer(0xc420196360, 0xc42010d800, 0x40, 0xc42010d840, 0x1, 0x4, 0xc4205b2d80, 0xc, 0xc4200f9090, 0x47, ...)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:264 +0x68f
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather.func1(0xc4205d2450, 0xc420196360, 0x18dfd20, 0xc4202d9650, 0xc42010d800, 0x40, 0xc42010d840, 0x1, 0x4, 0xc4205b2d80, ...)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:127 +0xc5
Nov 15 14:05:21 newhope telegraf[10797]: created by github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:132 +0x2dd
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Unit entered failed state.
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Failed with result 'exit-code'.
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Service hold-off time over, scheduling restart.
Nov 15 14:05:21 newhope systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
Nov 15 14:05:21 newhope systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.

Setting total = false fixes the problem.
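For readers unfamiliar with this class of panic, here is a minimal sketch of how Go produces the message in the trace. It is not the Telegraf source: the function `assertUint64`, the field name, and the value are hypothetical, chosen only to show that a bare type assertion to `uint64` panics when the interface actually holds an `int64`. The exact wording of the runtime message varies slightly between Go versions.

```go
package main

import "fmt"

// assertUint64 performs a bare type assertion, as a plugin might do on a
// field value, and returns the panic message (if any) instead of crashing.
// This helper is hypothetical and exists only for illustration.
func assertUint64(v interface{}) (n uint64, panicMsg string) {
	defer func() {
		if r := recover(); r != nil {
			panicMsg = fmt.Sprint(r)
		}
	}()
	n = v.(uint64) // panics when v holds any dynamic type other than uint64
	return n, ""
}

func main() {
	// Hypothetical field map: the value is stored as int64, not uint64.
	fields := map[string]interface{}{
		"io_service_bytes_recursive_read": int64(4096),
	}
	_, msg := assertUint64(fields["io_service_bytes_recursive_read"])
	fmt.Println(msg)
}
```

Without the `recover`, the same assertion would terminate the process, which matches the systemd restart loop seen in the log.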

@jwilder jwilder added bug unexpected problem or unintended behavior panic issue that results in panics from Telegraf labels Nov 11, 2016
Contributor

sparrc commented Nov 14, 2016

The panic lines appear to be truncated. Do you have the full traceback? I can't see the line numbers in this one.

@sparrc sparrc added this to the 1.2.0 milestone Nov 14, 2016
Contributor

sparrc commented Nov 14, 2016

Also, what is your Docker version?

Author

andyceo commented Nov 15, 2016

Docker version 1.12.3, build 6b644ec
I updated Telegraf to v1.1.1 (git: release-1.1.0 94de9dc); the error still exists (checked just now).

The full dump is:

Nov 15 14:05:21 newhope telegraf[10797]: panic: interface conversion: interface is int64, not uint64
Nov 15 14:05:21 newhope telegraf[10797]: goroutine 109 [running]:
Nov 15 14:05:21 newhope telegraf[10797]: panic(0xf24700, 0xc420050940)
Nov 15 14:05:21 newhope telegraf[10797]:         /usr/local/go/src/runtime/panic.go:500 +0x1a1
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.gatherBlockIOMetrics(0xc4201d4000, 0x18dfd20, 0xc4202d9650, 0xc420288390, 0xecfbce2f1, 0xc419c899a3, 0x18acdc0, 0xc42010d800, 0x40, 0xd472062c93e10101)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:492 +0x2109
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.gatherContainerStats(0xc4201d4000, 0x18dfd20, 0xc4202d9650, 0xc420288390, 0xc42010d800, 0x40, 0x18d0101)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:385 +0x30b1
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).gatherContainer(0xc420196360, 0xc42010d800, 0x40, 0xc42010d840, 0x1, 0x4, 0xc4205b2d80, 0xc, 0xc4200f9090, 0x47, ...)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:264 +0x68f
Nov 15 14:05:21 newhope telegraf[10797]: github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather.func1(0xc4205d2450, 0xc420196360, 0x18dfd20, 0xc4202d9650, 0xc42010d800, 0x40, 0xc42010d840, 0x1, 0x4, 0xc4205b2d80, ...)
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:127 +0xc5
Nov 15 14:05:21 newhope telegraf[10797]: created by github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather
Nov 15 14:05:21 newhope telegraf[10797]:         /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:132 +0x2dd
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Unit entered failed state.
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Failed with result 'exit-code'.
Nov 15 14:05:21 newhope systemd[1]: telegraf.service: Service hold-off time over, scheduling restart.
Nov 15 14:05:21 newhope systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
Nov 15 14:05:21 newhope systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.

I got this dump with the sudo journalctl -b command and copy-pasted the error messages related to Telegraf.

Contributor

sparrc commented Nov 15, 2016

Please try to find a way to get untruncated lines from journalctl. Some googling around suggests that journalctl -b --no-pager | less might work.

Author

andyceo commented Nov 15, 2016

@sparrc Thank you for the pointer! I updated the comment above, please check.

Update: I also updated the error message in the first post.


dbienapfl commented Dec 7, 2016

I am experiencing the same crash. Telegraf is running in an Ubuntu 14.4 Docker container, with /var/run mounted into a '--privileged' Docker container so I can read both base system metrics and '/var/run/docker.sock'. Full trace:

panic: interface conversion: interface is int64, not uint64

goroutine 42 [running]:
panic(0xf24700, 0xc420052980)
	/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/influxdata/telegraf/plugins/inputs/docker.gatherBlockIOMetrics(0xc4200f0000, 0x18dfd20, 0xc42000ccc0, 0xc42024da70, 0xecfda6da4, 0xc407be3aee, 0x18acdc0, 0xc42006f0c0, 0x40, 0x9289f8f245e60101)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:492 +0x2109
github.com/influxdata/telegraf/plugins/inputs/docker.gatherContainerStats(0xc4200f0000, 0x18dfd20, 0xc42000ccc0, 0xc42024da70, 0xc42006f0c0, 0x40, 0x18d0101)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:385 +0x30b1
github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).gatherContainer(0xc42006c2a0, 0xc42006f0c0, 0x40, 0xc42006f100, 0x1, 0x4, 0xc4204505a0, 0x13, 0xc420013ae0, 0x47, ...)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:264 +0x68f
github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather.func1(0xc4206ac040, 0xc42006c2a0, 0x18dfd20, 0xc42000ccc0, 0xc42006f0c0, 0x40, 0xc42006f100, 0x1, 0x4, 0xc4204505a0, ...)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:127 +0xc5
created by github.com/influxdata/telegraf/plugins/inputs/docker.(*Docker).Gather
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/docker/docker.go:132 +0x2dd

Contributor

sparrc commented Dec 21, 2016

I haven't been able to reproduce this, and looking at the code I don't see how it's possible, but I do have a change that should prevent it from happening. Before merging that I'd like to dig into it a bit more.

Do either of you have steps to reproduce? Does the panic happen immediately? How many containers are you running?
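As an illustration of the kind of change that could prevent this panic, the sketch below uses a type switch instead of a bare assertion, so that both `int64` and `uint64` values are accepted when summing totals. This is an assumption about one possible defensive approach, not the actual fix that was merged; the helper `toUint64` and the sample values are hypothetical.

```go
package main

import "fmt"

// toUint64 converts a field value to uint64 without panicking.
// It reports false for negative numbers and for unsupported types.
// Hypothetical helper, not taken from the Telegraf source.
func toUint64(v interface{}) (uint64, bool) {
	switch n := v.(type) {
	case uint64:
		return n, true
	case int64:
		if n < 0 {
			return 0, false
		}
		return uint64(n), true
	default:
		return 0, false
	}
}

func main() {
	totals := map[string]uint64{}
	// Mixed sample values: two usable integers and one unsupported type.
	samples := []interface{}{uint64(10), int64(5), "bogus"}
	for _, s := range samples {
		if n, ok := toUint64(s); ok {
			totals["read"] += n // unsupported values are skipped, not fatal
		}
	}
	fmt.Println(totals["read"]) // prints 15
}
```

The trade-off in this sketch is that unexpected values are silently skipped; a real plugin would probably want to log them so the underlying type mismatch stays visible.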

njwhite pushed a commit to njwhite/telegraf that referenced this issue Jan 31, 2017