No mem data on pfsense #3750

Closed
darox opened this issue Feb 4, 2018 · 45 comments
Labels
bug unexpected problem or unintended behavior

@darox

darox commented Feb 4, 2018

I'm running telegraf on my pfsense to query data and send it to an influxdb, but for some reason the mem data is not available in influxdb.

Is this a known bug or is mem not supported on pfsense?

@darox
Author

darox commented Feb 4, 2018

Log output:

2018-02-04T18:49:00Z E! Error in plugin [inputs.netstat]: error getting net connections info: exec: "lsof": executable file not found in $PATH
2018-02-04T18:49:00Z E! Error in plugin [inputs.mem]: error getting virtual memory info: cannot allocate memory

@danielnelson
Contributor

This should work on freebsd, though I do not have a pfsense machine currently. Perhaps you can increase the amount of memory available by modifying a system setting? You might be able to get a better answer at the InfluxData Community site or from the pfsense community.

@danielnelson danielnelson added the discussion Topics for discussion label Feb 5, 2018
@W0CHP

W0CHP commented Apr 3, 2018

FYI @danielnelson I am able to reproduce this issue on pristine installations of both FreeBSD-11.1-p8-RELEASE and pfSense-2.4.3 (11.1-p7-RELEASE). Both servers have 4GB RAM.

Oddly, Telegraf 1.4.x and 1.3.x on these machines do not exhibit this [inputs.mem] plugin issue.

@darox
Author

darox commented Apr 3, 2018

So the problem is that Telegraf is running on an old version?

@W0CHP

W0CHP commented Apr 3, 2018

@darox I am not sure what is causing the issue.

Telegraf 1.5.x on two amd64 FreeBSD-11.1-RELEASE machines exhibits the "error getting virtual memory info: cannot allocate memory" issue, so no memory data is written to InfluxDB.

Telegraf 1.4.x and 1.3.x on the same two machines do not exhibit this issue, and I can query memory data points just fine.

@danielnelson
Contributor

Can you test with the latest 1.6 release candidate?

https://community.influxdata.com/t/telegraf-v1-6-0-rc2/4601

@W0CHP

W0CHP commented Apr 9, 2018

@danielnelson Finally got 1.6.0-rc3 built.

Unfortunately I am still seeing the same issues on two machines (one is a pristine installation):

[2.4.3-RELEASE][root@foo]/root: 2018-04-09T21:54:16Z I! Starting Telegraf v1.6.0-rc3
2018-04-09T21:54:16Z I! Loaded outputs: influxdb
2018-04-09T21:54:16Z I! Loaded inputs: inputs.cpu inputs.diskio inputs.mem inputs.pf inputs.swap inputs.system inputs.disk inputs.kernel inputs.net inputs.processes
2018-04-09T21:54:16Z I! Tags enabled: host=redacted
2018-04-09T21:54:16Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"foo", Flush Interval:10s 
2018-04-09T21:54:20Z E! Error in plugin [inputs.mem]: error getting virtual memory info: cannot allocate memory
2018-04-09T21:54:30Z E! Error in plugin [inputs.mem]: error getting virtual memory info: cannot allocate memory

As a result, no [inputs.mem] plugin/input measurements are being written to InfluxDB.

@danielnelson danielnelson reopened this Apr 9, 2018
@danielnelson danielnelson added bug unexpected problem or unintended behavior and removed discussion Topics for discussion labels Apr 9, 2018
@danielnelson
Contributor

Since you are set up to compile, can you try running this program:

package main

import (
        "fmt"

        "github.com/shirou/gopsutil/mem"
)

func main() {
        stats, err := mem.VirtualMemory()
        if err != nil {
                fmt.Println(err)
        }
        fmt.Println(stats)
}

Just save it to your telegraf directory as mem.go and run with go run mem.go. I expect it will print the same error but this should be a good minimal program so we can open a bug report with gopsutil.

@danielnelson
Contributor

Actually, digging deeper I wonder if this could have been fixed in golang.org/x/sys/. Can you first try this change to the Godeps file:

diff --git a/Godeps b/Godeps
index 7c504a0b..0715fbd7 100644
--- a/Godeps
+++ b/Godeps
@@ -84,7 +84,7 @@ github.com/yuin/gopher-lua 66c871e454fcf10251c61bf8eff02d0978cae75a
 github.com/zensqlmonitor/go-mssqldb ffe5510c6fa5e15e6d983210ab501c815b56b363
 golang.org/x/crypto dc137beb6cce2043eb6b5f223ab8bf51c32459f4
 golang.org/x/net f2499483f923065a842d38eb4c7f1927e6fc6e6d
-golang.org/x/sys 739734461d1c916b6c72a63d7efda2b27edb369f
+golang.org/x/sys 76c138986e66b22cbf82122ac886457fb5226957
 golang.org/x/text 506f9d5c962f284575e88337e7d9296d27e729d3
 gopkg.in/asn1-ber.v1 4e86f4367175e39f69d9358a5f17b4dda270378d
 gopkg.in/fatih/pool.v2 6e328e67893eb46323ad06f0e92cb9536babbabc

After you change this run:

gdm restore
make telegraf

@W0CHP

W0CHP commented Apr 9, 2018

@danielnelson Cool that's a good idea. Here's the output:

[root@foo] telegraf-1.6.0-rc3/src/github.com/influxdata/telegraf]# go run mem.go
cannot allocate memory
<nil>

@W0CHP

W0CHP commented Apr 9, 2018

@danielnelson Gah must've replied at the same time. :) I will try that - thanks!

@W0CHP

W0CHP commented Apr 9, 2018

@danielnelson no change with patching the Godeps.

@danielnelson
Contributor

What do you get when you run sysctl -a?

@danielnelson
Contributor

danielnelson commented Apr 9, 2018

Maybe we need a Uint64? Try this out:

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	pageSize, err := unix.SysctlUint32("vm.stats.vm.v_page_size")
	if err != nil {
		fmt.Println(err)
	} else {
		fmt.Println(pageSize)
	}

	pageSize64, err := unix.SysctlUint64("vm.stats.vm.v_page_size")
	if err != nil {
		fmt.Println(err)
	} else {
		fmt.Println(pageSize64)
	}
}

@W0CHP

W0CHP commented Apr 10, 2018

Uint64:
# command-line-arguments
./sys.go:17:19: no new variables on left side of :=
./sys.go:17:19: cannot assign uint64 to pageSize (type uint32) in multiple assignment

sysctl -a output is attached:
sysctl.txt

@danielnelson
Contributor

I edited the program above (changed variable name of second pageSize to pageSize64).

@W0CHP

W0CHP commented Apr 10, 2018

@danielnelson

# go run sys.go
4096
input/output error
#

@danielnelson
Contributor

danielnelson commented Apr 10, 2018

Oh, didn't expect the first command to work... what about this:

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	pageSize, err := unix.SysctlUint32("vm.stats.vm.v_page_size")
	fmt.Printf("pageSize: %d %v\n", pageSize, err)
	pageCount, err := unix.SysctlUint32("vm.stats.vm.v_page_count")
	fmt.Printf("pageCount: %d %v\n", pageCount, err)
	free, err := unix.SysctlUint32("vm.stats.vm.v_free_count")
	fmt.Printf("free: %d %v\n", free, err)
	active, err := unix.SysctlUint32("vm.stats.vm.v_active_count")
	fmt.Printf("active: %d %v\n", active, err)
	inactive, err := unix.SysctlUint32("vm.stats.vm.v_inactive_count")
	fmt.Printf("inactive: %d %v\n", inactive, err)
	cached, err := unix.SysctlUint32("vm.stats.vm.v_cache_count")
	fmt.Printf("cached: %d %v\n", cached, err)
	buffers, err := unix.SysctlUint32("vfs.bufspace")
	fmt.Printf("buffers: %d %v\n", buffers, err)
	wired, err := unix.SysctlUint32("vm.stats.vm.v_wire_count")
	fmt.Printf("wired: %d %v\n", wired, err)
}

@W0CHP

W0CHP commented Apr 10, 2018

# go run sys.go 
# command-line-arguments
./sys.go:17:9: too many arguments to return
        have (nil, error)
        want ()
./sys.go:22:9: too many arguments to return
        have (nil, error)
        want ()
./sys.go:27:9: too many arguments to return
        have (nil, error)
        want ()
./sys.go:32:9: too many arguments to return
        have (nil, error)
        want ()
./sys.go:37:9: too many arguments to return
        have (nil, error)
        want ()
./sys.go:42:9: too many arguments to return
        have (nil, error)
        want ()

@danielnelson
Contributor

I updated the program, this is the joy of programming :)

@W0CHP

W0CHP commented Apr 10, 2018

@danielnelson Hahah indeed!

We have some results now:

pageSize: 4096 <nil>
pageCount: 941307 <nil>
free: 31370 <nil>
active: 45418 <nil>
inactive: 702947 <nil>
cached: 0 <nil>
buffers: 0 cannot allocate memory
wired: 134976 <nil>
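
The vm.stats.vm.* values in this output are page counts, not bytes, so they only become memory figures after multiplying by the page size. As a rough sketch of that arithmetic (the helper name pagesToBytes is made up for illustration and is not part of Telegraf or gopsutil), the numbers above convert like this:

```go
package main

import "fmt"

// pagesToBytes converts a page count, as reported by a vm.stats.vm.*
// sysctl, into bytes. The multiplication is done in uint64 so the
// product cannot overflow a 32-bit result.
// NOTE: hypothetical helper, named here only for illustration.
func pagesToBytes(pages, pageSize uint32) uint64 {
	return uint64(pages) * uint64(pageSize)
}

func main() {
	// Values copied from the output posted in this thread.
	const pageSize = 4096
	counts := []struct {
		name  string
		pages uint32
	}{
		{"total", 941307},
		{"free", 31370},
		{"active", 45418},
		{"inactive", 702947},
		{"wired", 134976},
	}
	for _, c := range counts {
		b := pagesToBytes(c.pages, pageSize)
		fmt.Printf("%-9s %12d bytes (%d MiB)\n", c.name, b, b/(1<<20))
	}
}
```

The vfs.bufspace value is different: it is already a byte count, which is one reason it may need a wider integer type than the page counters.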

@danielnelson
Contributor

This is interesting, we get this error on vfs.bufspace but the call immediately afterwards is successful. How constrained is the memory on this system? Google tells me that it might be vmstat -s on FreeBSD.

@W0CHP

W0CHP commented Apr 11, 2018

Memory is utilized well...

sysctl hw | egrep 'hw.(phys|user|real)':

hw.physmem: 4150489088
hw.usermem: 3821998080
hw.realmem: 4294967296

freecolor -m -o:

             total       used       free     shared    buffers     cached
Mem:          3827        499       3327          0          0          0
Swap:          764          0        764

https://raw.githubusercontent.com/ocochard/myscripts/master/FreeBSD/freebsd-memory.sh output:

SYSTEM MEMORY INFORMATION:
mem_wire:         328806400 (    313MB) [  8%] Wired: disabled for paging out
mem_active:  +    196136960 (    187MB) [  4%] Active: recently referenced
mem_inactive:+    508157952 (    484MB) [ 12%] Inactive: recently not referenced
mem_cache:   +            0 (      0MB) [  0%] Cached: almost avail. for allocation
mem_free:    +   2980216832 (   2842MB) [ 74%] Free: fully available for allocation
mem_gap_vm:  +         8192 (      0MB) [  0%] Memory gap: UNKNOWN
______________ ____________ ___________ ______
mem_all:     =   4013326336 (   3827MB) [100%] Total real memory managed
mem_gap_sys: +    137162752 (    130MB)        Memory gap: Kernel?!
______________ ____________ ___________
mem_phys:    =   4150489088 (   3958MB)        Total real memory available
mem_gap_hw:  +    144478208 (    137MB)        Memory gap: Segment Mappings?!
______________ ____________ ___________
mem_hw:      =   4294967296 (   4096MB)        Total real memory installed

SYSTEM MEMORY SUMMARY:
mem_used:         806592512 (    769MB) [ 18%] Logically used memory
mem_avail:   +   3488374784 (   3326MB) [ 81%] Logically available memory
______________ ____________ __________ _______
mem_total:   =   4294967296 (   4096MB) [100%] Logically total memory

vmstat -s:

187787043 cpu context switches
 45583366 device interrupts
 43935721 software interrupts
 35656805 traps
 50349331 system calls
       28 kernel threads created
   137472  fork() calls
    27001 vfork() calls
        0 rfork() calls
        0 swap pager pageins
        0 swap pager pages paged in
        0 swap pager pageouts
        0 swap pager pages paged out
     2562 vnode pager pageins
    20639 vnode pager pages paged in
    38220 vnode pager pageouts
    51648 vnode pager pages paged out
        0 page daemon wakeups
  4378977 pages examined by the page daemon
        0 clean page reclamation shortfalls
        0 pages reactivated by the page daemon
  8131029 copy-on-write faults
    43662 copy-on-write optimized faults
 24464661 zero fill pages zeroed
        0 zero fill pages prezeroed
    12274 intransit blocking page faults
 35551640 total VM faults taken
     2441 page faults requiring I/O
        0 pages affected by kernel thread creation
 26489458 pages affected by  fork()
  1666482 pages affected by vfork()
        0 pages affected by rfork()
 41373366 pages freed
        0 pages freed by daemon
 11882146 pages freed by exiting processes
    49440 pages active
   122624 pages inactive
        0 pages in the laundry queue
    80310 pages wired down
   727442 pages free
     4096 bytes per page
 14867178 total name lookups
          cache hits (93% pos + 5% neg) system 0% per-directory
          deletions 0%, falsehits 0%, toolong 0%

@danielnelson
Contributor

Is it always buffers that fails? Does anything change if it is the only item you collect?

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	buffers, err := unix.SysctlUint32("vfs.bufspace")
	fmt.Printf("buffers: %d %v\n", buffers, err)
}

@W0CHP

W0CHP commented Apr 11, 2018

Always the buffer that fails. Using your program above, to collect it alone, also fails:

go run sys-unix.go
buffers: 0 cannot allocate memory

@W0CHP

W0CHP commented Apr 11, 2018

@danielnelson Both are amd64, FreeBSD-11.1-RELEASE.

@danielnelson
Contributor

I just fired up an amd64 FreeBSD-11.1-RELEASE virtual machine and was unable to reproduce the error. Does anything change if you run the program as root?

@W0CHP

W0CHP commented Apr 11, 2018

Wow that's strange. One of these is a pristine installation, the other existing. Both throw the same buffers: 0 cannot allocate memory message when running the program as root:

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	buffers, err := unix.SysctlUint32("vfs.bufspace")
	fmt.Printf("buffers: %d %v\n", buffers, err)
}

@danielnelson
Contributor

Couple more things I would be interested in seeing. First, try using unix.SysctlUint64 on this item; we tried this above, but not with vfs.bufspace:

func main() {
	buffers, err := unix.SysctlUint64("vfs.bufspace")
	fmt.Printf("buffers: %d %v\n", buffers, err)
}

If that doesn't help, try this C program:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/sysctl.h>

int main (int argc, char *argv[]) {
        unsigned long oldp;
        size_t oldp_size = 8;
        if (sysctlbyname("vfs.bufspace", &oldp, &oldp_size, NULL, 0) != 0) {
                perror("sysctlbyname");
                return 1;
        }
        printf("%lu\n", oldp);
        return 0;
}

Run this like:

cc mem.c
./a.out

@W0CHP

W0CHP commented Apr 12, 2018

Output with unix.SysctlUint64("vfs.bufspace"):

buffers: 73027584 <nil>
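
This result is consistent with vfs.bufspace being an 8-byte value on these machines while the page counters are 4 bytes wide, so a fixed 4-byte read fails for it. One way to avoid hard-coding the width is to fetch the raw bytes (for example with unix.SysctlRaw from golang.org/x/sys/unix) and decode by length. The sketch below shows only the decoding step, with hand-built buffers standing in for real sysctl output. It assumes a little-endian host such as amd64, the name decodeSysctlInt is hypothetical, and it is not the change that went into gopsutil:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeSysctlInt interprets a raw sysctl value as an unsigned integer.
// It accepts 4-byte and 8-byte buffers and assumes little-endian byte
// order (true on amd64). Any other width is reported as an error.
// NOTE: hypothetical helper, named here only for illustration.
func decodeSysctlInt(raw []byte) (uint64, error) {
	switch len(raw) {
	case 4:
		return uint64(binary.LittleEndian.Uint32(raw)), nil
	case 8:
		return binary.LittleEndian.Uint64(raw), nil
	default:
		return 0, fmt.Errorf("unexpected sysctl value width: %d bytes", len(raw))
	}
}

func main() {
	// Stand-in buffers: the bufspace value from this thread encoded as
	// 8 bytes, and the page size encoded as 4 bytes.
	wide := make([]byte, 8)
	binary.LittleEndian.PutUint64(wide, 73027584)
	narrow := make([]byte, 4)
	binary.LittleEndian.PutUint32(narrow, 4096)

	for _, raw := range [][]byte{wide, narrow} {
		v, err := decodeSysctlInt(raw)
		fmt.Println(v, err)
	}
}
```

Decoding by length keeps a single code path working when a kernel version changes the width of a value.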

@danielnelson
Contributor

Nice, is that the same as sysctl vfs.bufspace?

@danielnelson
Contributor

I will bring in the update to gopsutil for the 1.7 release, but I'm not planning to do so at this time for 1.6 since it often introduces new issues. In the meantime you will need to compile a custom version to get this plugin working.

@danielnelson danielnelson added this to the 1.7.0 milestone Apr 13, 2018
@W0CHP

W0CHP commented Apr 13, 2018

@danielnelson makes perfect sense and I support that. I enjoyed working with you on this!

@W0CHP

W0CHP commented Apr 13, 2018

Update: 1.6.0-rc4 does work as intended when gopsutil is compiled in using @danielnelson's pull req. Very nice work and thanks again for your efforts, and for putting this in the 1.7.0 milestone!

[screenshot attached: 2018-04-13, 3:07 PM]

@danielnelson
Contributor

@W0CHP Couldn't have figured it out without your help, really appreciate it.

@danielnelson
Contributor

We noticed that the 2 fixes immediately after the gopsutil release we were using addressed issues in Telegraf, and so we decided to bring them into the 1.6 release branch. The fix for this issue will be available in the 1.6.1 release.

@W0CHP

W0CHP commented Apr 20, 2018

@danielnelson Fantastic news!
When 1.6.1 is released, I will build and test on both FreeBSD and pfSense platforms, validate them, and then submit the ports changes upstream to the respective projects.
Thanks!

@danielnelson
Contributor

That would be amazing, I would love to get Telegraf into the official package repos for all the different platforms.

danielnelson pushed a commit that referenced this issue Apr 20, 2018
@W0CHP

W0CHP commented Apr 20, 2018

@danielnelson FreeBSD indeed has the port and packages officially, as does pfSense, which tracks (FreeBSD and -ports).

Both projects lag behind new Telegraf releases/fixes (the latter project being the worse of the two); which is where I come in... :-)

I'm responsible for myriad UN*X machines, the orchestration and deployments of Telegraf to all of them, etc., as well as the TICK Stack instrumentation clusters monitoring everything. I need new versions as they're released kind of pronto, so I may as well keep submitting my changes to the project package maintainers; since part of my daily routine is tracking TICK Stack components for my deployment automation.

have a wonderful weekend! 😃

@girgen

girgen commented Apr 25, 2018

Hi! I manage the telegraf port for FreeBSD, and I update it as soon as I get information about a new release. Sadly, there is no established process for InfluxData to inform packagers like me when a new version is about to be released. Such a process is much needed. How can such a relationship be established?

@W0CHP

W0CHP commented Apr 25, 2018

Hi @girgen!

Thanks for committing my patches to FreeBSD-Ports yesterday so quickly! Unfortunately the port/package maintainer for Telegraf for pfSense has been silent for months, even with all of my submissions. :-(

Anyway...
Not sure if this helps, but I have IFTTT recipes that notify me of new releases for projects I watch on GH and other mediums. Release mailing lists are also handy. Anyway, I'll let @danielnelson chime in here.

Have a great day!

@danielnelson
Contributor

I think the method is to watch for release candidates using the GitHub Atom feed. We usually begin the rc process about 2 weeks before a minor version release (1.x.0). Patch releases (1.6.x) have less warning and are based on need.

I also try to keep the milestones updated to the expected release date.

Let me know if this works.
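
For packagers who want to automate this, a release feed can be filtered for release candidates with a few lines of Go. The sketch below parses an Atom document and keeps entries whose title contains "-rc". It assumes the feed is the one GitHub serves at a path like https://github.com/influxdata/telegraf/releases.atom and that release titles carry the version string. The sample feed embedded here is invented for illustration, and fetching the real feed is left out:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// feed and entry mirror only the small subset of an Atom document
// that this sketch needs.
type feed struct {
	Entries []entry `xml:"entry"`
}

type entry struct {
	Title string `xml:"title"`
}

// releaseCandidates returns the titles of feed entries that look like
// release candidates, that is, whose title contains "-rc".
// NOTE: hypothetical helper, named here only for illustration.
func releaseCandidates(atom []byte) ([]string, error) {
	var f feed
	if err := xml.Unmarshal(atom, &f); err != nil {
		return nil, err
	}
	var rcs []string
	for _, e := range f.Entries {
		if strings.Contains(strings.ToLower(e.Title), "-rc") {
			rcs = append(rcs, e.Title)
		}
	}
	return rcs, nil
}

// sampleFeed is made-up data standing in for a downloaded feed.
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>v1.6.0-rc4</title></entry>
  <entry><title>v1.5.3</title></entry>
</feed>`

func main() {
	rcs, err := releaseCandidates([]byte(sampleFeed))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, title := range rcs {
		fmt.Println("release candidate:", title)
	}
}
```

A cron job or CI task could run something like this against the downloaded feed and notify the port maintainer when a new title appears.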

@girgen

girgen commented Apr 25, 2018

Sorry to hear about pfSense inactivity. I have no direct relation to pfSense, but perhaps you can ask on some general pfSense ports channel / forum / mailing list where other porters hang?

I'll check out the github atom feed!
Cheers,
Palle

@W0CHP

W0CHP commented Apr 26, 2018

@girgen I finally got their attention, and my pull request is in review.

I will continue to work closely and hound the pfSense team, as it's FreeBSD-based, and is a very popular project.

Pleasure working with you gentlemen! :-)
