[0.12.1-1] too many opened files #1188

Closed
deric opened this issue May 12, 2016 · 1 comment

deric commented May 12, 2016

When InfluxDB crashes, telegraf keeps too many connections open:

$ lsof | grep telegraf | wc -l
23690

which might lead to a situation where the host runs out of file descriptors. If the target DB were always available (which would be nice), this couldn't happen. But we don't live in an ideal world, and this behavior might crash other services on the host.

May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [diskio]: error getting disk io info: open /proc/diskstats: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [disk]: error getting disk usage info: open /etc/mtab: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [net]: error getting net io info: open /proc/net/dev: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 FATAL: Input [cpu] panicked: runtime error: index out of range, Stack:
May 12 12:44:20 psql02a telegraf[848]: goroutine 5673051 [running]:
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/agent.panicRecover(0xc82034a780)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:96 +0xa9
May 12 12:44:20 psql02a telegraf[848]: panic(0x10241a0, 0xc82000e050)
May 12 12:44:20 psql02a telegraf[848]: /usr/local/go/src/runtime/panic.go:426 +0x4e9
May 12 12:44:20 psql02a telegraf[848]: github.com/shirou/gopsutil/cpu.parseStatLine(0x11c43f8, 0x0, 0x1, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/shirou/gopsutil/cpu/cpu_linux.go:169 +0x9e4
May 12 12:44:20 psql02a telegraf[848]: github.com/shirou/gopsutil/cpu.CPUTimes(0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/shirou/gopsutil/cpu/cpu_linux.go:49 +0x2e3
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/plugins/inputs/system.(*systemPS).CPUTimes(0x1bda028, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/system/ps.go:45 +0x22e
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/plugins/inputs/system.(*CPUStats).Gather(0xc82034a720, 0x7f7f7ff59a30, 0xc830957a00, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/system/cpu.go:44 +0x8d
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/agent.(*Agent).gatherParallel.func1(0xc8309923e0, 0xc820204ba0, 0xc820028090, 0x0, 0xc82034a780)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:139 +0x42e
May 12 12:44:20 psql02a telegraf[848]: created by github.com/influxdata/telegraf/agent.(*Agent).gatherParallel
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:143 +0x442
May 12 12:44:20 psql02a telegraf[848]: goroutine 1 [semacquire]:
May 12 12:44:20 psql02a telegraf[848]: sync.runtime_Semacquire(0xc8309923ec)
May 12 12:44:20 psql02a telegraf[848]: /usr/local/go/src/runtime/sema.go:47 +0x26
May 12 12:44:20 psql02a telegraf[848]: sync.(*WaitGroup).Wait(0xc8309923e0)
May 12 12:44:20 psql02a telegraf[848]: /usr/local/go/src/sync/waitgroup.go:127 +0xb4
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/agent.(*Agent).gatherParallel(0xc820028090, 0xc820204ba0, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:150 +0x145
May 12 12:44:20 psql02a telegraf[848]: github.com/influxdata/telegraf/agent.(*Agent).Run(0xc820028090, 0xc820204900, 0x0, 0x0)
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:378 +0xa7e
May 12 12:44:20 psql02a telegraf[848]: main.main()
May 12 12:44:20 psql02a telegraf[848]: /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/cmd/t
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 PLEASE REPORT THIS PANIC ON GITHUB with stack trace, configuration, and OS information: https://github.com/influxdata/telegraf/issues/new
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [diskio]: error getting disk io info: open /proc/diskstats: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [processes]: open /proc: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [system]: open /proc/loadavg: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Error in input [kernel]: open /proc/stat: too many open files
May 12 12:44:20 psql02a telegraf[848]: 2016/05/12 12:44:20 Gathered metrics, (10s interval), from 10 inputs in 15.202445ms
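
The `index out of range` in `cpu.parseStatLine` looks like it comes from indexing into the fields of a line that is empty (the second argument in the trace, which would be the string length, is 0x0), presumably because reading /proc/stat failed first. That is an inference from the trace, not something confirmed against the gopsutil source. A minimal sketch of a length check that would turn this case into an error instead of a panic; `parseCPULine` is a hypothetical function, not gopsutil's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseCPULine parses the user and system tick counters from one "cpu" line
// of /proc/stat (layout: name user nice system idle ...).
// Hypothetical illustration; this is not gopsutil's parseStatLine.
func parseCPULine(line string) (name string, user, system uint64, err error) {
	fields := strings.Fields(line)
	// Check the length before indexing, so an empty or truncated line
	// produces an error rather than an index-out-of-range panic.
	if len(fields) < 4 {
		return "", 0, 0, errors.New("stat line has too few fields")
	}
	if !strings.HasPrefix(fields[0], "cpu") {
		return "", 0, 0, errors.New("not a cpu line")
	}
	user, err = strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return "", 0, 0, err
	}
	system, err = strconv.ParseUint(fields[3], 10, 64)
	if err != nil {
		return "", 0, 0, err
	}
	return fields[0], user, system, nil
}

func main() {
	// A well-formed line parses normally.
	name, user, system, err := parseCPULine("cpu0 100 0 50 900")
	fmt.Println(name, user, system, err)

	// An empty line, as when the read of /proc/stat failed, yields an error.
	_, _, _, err = parseCPULine("")
	fmt.Println("empty line:", err)
}
```

With a check like this the cpu input would report an ordinary error, like the other inputs in the log, when it hits the descriptor limit.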

sparrc commented May 18, 2016

Fixed in 0.13: #1058

sparrc closed this as completed May 18, 2016