[Auditbeat] Login metricset #9327
Conversation
Pinging @elastic/secops
// Save new state to disk
if len(loginRecords) > 0 {
	err := ms.utmpReader.saveStateToDisk()
Mentioning it mostly for awareness: if I'm not wrong, Auditbeat's report.Event() doesn't guarantee at-least-once delivery. If Elasticsearch is down, for example, it will retry 3 times and then drop the event. This means that we can potentially lose login events, despite saving the state. I think we could solve this after the Filebeat refactoring, but I suggest documenting this as a limitation for now.
Hm, that's a shame. I guess this affects all metricsets, and we might lose process starts, socket connections, password changes, audit events, FIM events, etc.
Doesn't the internal queue keep events until they can be sent (or it overflows)? That would seem desirable.
It does, and one can also use spooling to disk to make the queue larger. But Filebeat offers an extra guarantee by waiting for the ACK and only afterwards updating the state on disk. This is not currently possible in Auditbeat. That's the difference I wanted to highlight.
Right, maybe at some point report.Event() could return false if it couldn't write to the queue or otherwise cannot guarantee eventually sending the event - so we can retry later.
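A minimal sketch of what that retry-on-failure idea could look like on the metricset side. This is hypothetical: it assumes a reporter whose Event() reliably returns false when an event cannot be handed off, which is not what the current interface promises, and reportAndSave/saveStateToDisk are illustrative names, not the PR's code.

```go
package login

// Hypothetical sketch only: assumes Event() returns false whenever an event
// cannot be handed off for delivery.

type event map[string]interface{}

type reporter interface {
	Event(event) bool
}

// reportAndSave keeps any events that could not be enqueued so they can be
// retried on the next Fetch, and only persists state once everything was accepted.
func reportAndSave(r reporter, events []event, saveStateToDisk func() error) []event {
	var pending []event
	for _, e := range events {
		if !r.Event(e) {
			pending = append(pending, e)
		}
	}
	if len(pending) == 0 {
		if err := saveStateToDisk(); err != nil {
			_ = err // a real implementation would log or report this
		}
	}
	return pending
}
```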
We should think through what the experience is on non-Linux systems. I'm thinking two things:
Is the first point already happening? I couldn't test because of the
That should be what's happening (in login_other.go). I've added some extra
It shouldn't - there's an if/else around it in the
I opened #9368 to address that - it compiles for me after applying that. I've changed to using
LGTM. Thanks for addressing the comments.
This will be a useful metricset. I left a few comments that need to be addressed now, and some others that are just suggestions. If you have any questions let me know.
if utAddrV6[1] != 0 || utAddrV6[2] != 0 || utAddrV6[3] != 0 {
	// IPv6
	b := make([]byte, 16)
	binary.LittleEndian.PutUint32(b[:4], utAddrV6[0])
What does this mean for big-endian architectures? Will this code work?
Hm, good point and I don't know. Do we support any big-endian systems?
I'm surprised Go doesn't have a binary.MachineEndian.
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

var MachineEndian = getByteOrder()

func getByteOrder() binary.ByteOrder {
	var b [2]byte
	*((*uint16)(unsafe.Pointer(&b[0]))) = 1
	if b[0] == 1 {
		return binary.LittleEndian
	}
	return binary.BigEndian
}

func main() {
	var b [4]byte
	MachineEndian.PutUint32(b[:], 0x12345678)
	fmt.Printf("%x %x %x %x\n", b[0], b[1], b[2], b[3])
}
Just revisiting this. I think it's not very likely that this would be used on a big-endian system that uses UTMP login records. In the interest of keeping the code simple I'd leave it as is; if there's a need to change things, we always can.
@adriansr getByteOrder() looks like a useful function that Go itself should have, indeed. Or maybe we should have it in a shared location within Beats so code like this can use it easily.
I think it's likely that someone will use it on big-endian. We've had several contributions to go-libaudit to make auditbeat work on big-endian.
Ok, I've added support for it. Thanks @adriansr for the code snippet!
// but will return -1 instead.
func lookupUsername(username string) int {
	if username != "" {
		user, err := user.Lookup(username)
This is probably fine for low volumes. But we have seen from the auditd module that this call is expensive. Some caching could be a useful future enhancement.
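As a rough illustration of that caching suggestion, a minimal sketch. The userCache type and its never-evict policy are assumptions; a real version might want expiry so that UID changes are eventually picked up.

```go
package login

import (
	"os/user"
	"strconv"
	"sync"
)

type userCache struct {
	mu     sync.Mutex
	byName map[string]int
}

func newUserCache() *userCache {
	return &userCache{byName: make(map[string]int)}
}

// lookupUID returns the UID for username, caching results so repeated logins
// by the same user do not hit the comparatively expensive user database.
func (c *userCache) lookupUID(username string) int {
	if username == "" {
		return -1
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if uid, ok := c.byName[username]; ok {
		return uid
	}
	uid := -1
	if u, err := user.Lookup(username); err == nil {
		if parsed, err := strconv.Atoi(u.Uid); err == nil {
			uid = parsed
		}
	}
	c.byName[username] = uid
	return uid
}
```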
	IP        *net.IP
	Timestamp time.Time
	Origin    string
}
Same: a long name, and many exported symbols that we might not need anywhere else.
LoginRecord is encoded and decoded by gob, so it needs to be exported, unfortunately.
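A small standalone example of why gob forces the exported fields: unexported struct fields are silently skipped, so a record with lowercase fields would round-trip incomplete. The field set here is illustrative, not the PR's actual LoginRecord.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"time"
)

type LoginRecord struct {
	Username  string
	Timestamp time.Time
	origin    string // unexported: gob will NOT encode this field
}

func main() {
	in := LoginRecord{Username: "alice", Timestamp: time.Now(), origin: "/var/log/wtmp"}

	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		panic(err)
	}

	var out LoginRecord
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("username=%q origin=%q\n", out.Username, out.origin) // origin is lost
}
```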
// FileRecord represents a UTMP file at a point in time.
type FileRecord struct {
	Inode Inode
In filebeat we also store the device id.
I think for now it's unlikely that UTMP files would be stored on separate file systems. I.e. it would mean different parts of /var/log are on different devices. I'm not even sure that's possible.
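If storing the device id ever did become necessary, a minimal sketch of how it could be read alongside the inode on Linux. The fields come from syscall.Stat_t; the inodeAndDevice helper name is illustrative.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// inodeAndDevice returns the inode and device id of a file, which together
// identify it uniquely even across file systems. Linux-only (syscall.Stat_t).
func inodeAndDevice(path string) (inode, device uint64, err error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	stat, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no Stat_t for %v", path)
	}
	return uint64(stat.Ino), uint64(stat.Dev), nil
}

func main() {
	inode, device, err := inodeAndDevice("/var/log/wtmp")
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Printf("inode=%d device=%d\n", inode, device)
}
```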
}

// ReadNew returns any new UTMP entries in any files matching the configured pattern.
func (r *UtmpFileReader) ReadNew() ([]LoginRecord, error) {
How big do we expect these files to be? Looks like this function will load all available logs into memory on first run.
So far I've assumed they're usually quite small, but that might not be true in all cases. The /etc/logrotate.conf on my Ubuntu VM rotates it monthly. On that VM, which gets a number of daily logins/logouts, they're <100K. On a real-world machine with a lot of logins I assume it could be more; I wonder how many records it would take to be significant.
Nevertheless, I guess we could use a channel between Fetch and ReadNew and output records as events as they come in.
This is more or less what Filebeat does as well: have one goroutine per file and forward event by event. That is, the module/input is more or less active all the time - not sure how well this fits the 'Fetch' semantics in the Metricbeat framework. But a bounded channel can indeed help keep memory usage in check. Given the complexity this introduces, I wonder if that's really a problem.
I've changed to using channels now, and storing offsets of the files (instead of re-reading until the last read record). So hopefully that will make it scalable even if the files are very large.
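For context, a rough sketch of the bounded-channel approach discussed above. LoginRecord, readNew, and readFile here are illustrative stand-ins, not the PR's actual API.

```go
package login

type LoginRecord struct {
	Username string
	Origin   string
}

// readNew streams parsed records into a channel instead of accumulating whole
// files in memory. The small buffer bounds memory usage; the consumer (Fetch)
// applies backpressure by reading at its own pace.
func readNew(paths []string, readFile func(path string, out chan<- LoginRecord) error) (<-chan LoginRecord, <-chan error) {
	records := make(chan LoginRecord, 64)
	errs := make(chan error, 1)
	go func() {
		defer close(records)
		defer close(errs)
		for _, path := range paths {
			if err := readFile(path, records); err != nil {
				errs <- err
				return
			}
		}
	}()
	return records, errs
}
```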
r.log.Warnf("Unexpectedly, the file %v did not contain the saved login record %v - reading whole file.",
	path, *lastKnownRecord)

return r.readAfter(path, nil)
Looks like we are ignoring all messages read so far if we end up here (e.g. we read while a copytruncate-style log rotation happens). Do we still want to publish the known records? Alternatively, we can just error and return here, relying on Fetch to reset the file offset and read the new contents the next time it is executed.
Good point - maybe I was a bit overzealous in trying to make sure we never miss an event here, and just returning is better.
However, I'm curious about copytruncate: as I understand it, the logfile is copied (e.g. cp /var/log/wtmp /var/log/wtmp.1) and then truncated. If that happens, I suspect the metricset as it is at the moment (with the file pattern being /var/log/wtmp* by default) would read all of /var/log/wtmp.1 (because it'll have a different inode) - even though it has already read all of those records. Does Filebeat handle this in some way?
But can wtmp be copy-truncated? I thought it's guaranteed to be rotated by rename/move. I wouldn't want to bring all the complexity of Filebeat's corner cases in here :)
Rename/move should be ok, but then we don't get into this code path.
Instead of having a tail-recursive call we could use a for loop, reset some state, and use continue to jump back to the beginning of the reader and not drop events. But then again, it's an edge case that normally should not occur.
The code is now using offsets, and if there is some problem jumping to the stored offset it will jump back to the beginning and re-read the file. This is maybe a bit overzealous, but at least for now I'd prefer it if events were sent twice rather than not at all. If there are any problems we can always change it.
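A minimal sketch of the loop-with-reset idea under the behaviour described here: if positioning at the stored offset fails, fall back to re-reading the whole file, possibly duplicating events rather than dropping them. Helper names are illustrative, not the PR's code.

```go
package login

import (
	"io"
	"os"
)

// readFromOffset reads path starting at offset; if seeking to the saved offset
// fails it resets to zero and re-reads the whole file instead of recursing.
func readFromOffset(path string, offset int64, parse func(io.Reader) error) error {
	for {
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		if offset > 0 {
			if _, err := f.Seek(offset, io.SeekStart); err != nil {
				// Could not jump to the saved offset: start over from the
				// beginning rather than dropping records.
				f.Close()
				offset = 0
				continue
			}
		}
		err = parse(f)
		f.Close()
		return err
	}
}
```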
LGTM, suggesting minor refactor, feel free to do it on a separate PR.
var byteOrder = getByteOrder()

func getByteOrder() binary.ByteOrder {
Turns out gosigar has a GetEndian() method for this.
func GetEndian() binary.ByteOrder {
Oh nice, thanks!
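Roughly, switching to gosigar's helper could look like the sketch below. The import path is an assumption - double-check where GetEndian() actually lives in gosigar - and putIPv6Words is an illustrative helper, not the PR's code.

```go
package login

import (
	"github.com/elastic/gosigar/sys" // assumed location of GetEndian()
)

var byteOrder = sys.GetEndian()

// putIPv6Words writes the four 32-bit words of a UTMP IPv6 address using the
// machine's native byte order.
func putIPv6Words(utAddrV6 [4]uint32) []byte {
	b := make([]byte, 16)
	for i, word := range utAddrV6 {
		byteOrder.PutUint32(b[i*4:(i+1)*4], word)
	}
	return b
}
```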
Adds the login metricset to the Auditbeat system module as the last of the six initial metricsets. It only works on Linux, and detects not just user logins and logouts, but also system boots and shutdowns. It works by reading the /var/log/wtmp and /var/log/btmp files (and rotated files) present on Linux systems. In reading a file, it is similar to Filebeat, except that UTMP is a binary format, so reading happens using a binary Go reader. (cherry picked from commit 1566e66)
This adds the login metricset to the Auditbeat system module. It's the last of the six initial metricsets. It only works on Linux, and detects not just user logins and logouts, but also system boots and shutdowns.

It works by reading the /var/log/wtmp and /var/log/btmp files (and rotated files) present on Linux systems. In reading a file, it is similar to Filebeat, except that UTMP is a binary format, so reading happens using a binary Go reader. See utmp(5) for the format of that file.

The logic is roughly as follows:

- login.utmp_file_pattern and login.btmp_file_pattern contain the patterns matching the wtmp (good logins, as well as system shutdowns and boots) and btmp (bad/failed logins) files and rotated files (if desired). The defaults are /var/log/wtmp* and /var/log/btmp*. These are expanded using filepath.Glob and the files are sorted lexicographically in reverse order (i.e. /var/log/wtmp.1 will come before /var/log/wtmp) so that we read older login records first - reading in order is required for matching login and logout records, see the next steps. (A short sketch of this expansion and ordering follows at the end of this description.)
- On every Fetch, it checks for new entries: any new files are read from the beginning, while known files are read from a saved offset. To that purpose, the last offset per file is saved and persisted to disk in beat.db. A new file is one that has an unknown inode, but files are also read completely if their newSize < oldSize for some reason (that should make it work with any potential inode reuse - very unlikely since this will never read a lot of files, but still possible).
- New entries are turned into login events (LoginRecord in the code). Boot and shutdown events are fairly straightforward; user login and logout events have to be matched using their tty - so there is a loginSessions map that stores logins to enrich the logouts, and is also persisted to disk.

Note: This dataset also introduces event.origin, containing the file the event came from, e.g. /var/log/wtmp.1. In other cases, it would be something like procfs or netlink. It's useful to know where information comes from, e.g. to know how reliable it is.
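As referenced in the first step above, a minimal standalone sketch of the pattern expansion and reverse lexicographic ordering. The filesToRead helper name is illustrative.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// filesToRead expands a glob pattern and sorts the matches in reverse
// lexicographic order, so rotated files (e.g. /var/log/wtmp.1) are read
// before the current file (/var/log/wtmp).
func filesToRead(pattern string) ([]string, error) {
	paths, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	sort.Sort(sort.Reverse(sort.StringSlice(paths)))
	return paths, nil
}

func main() {
	paths, err := filesToRead("/var/log/wtmp*")
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```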