
Add more info about flows to the doc #4197

Merged 4 commits on May 9, 2017

Contributor

@dedemorton dedemorton commented May 3, 2017

First cut of the content on flows.

I also moved up a couple of items in the TOC because they seemed important.

Please respond to the comments that I've added inline.


//REVIEWERS: Is it OK for me to show all these IP addresses (not sure what might be sensitive)? I don't worry about that as much as I should, but now that I know Mr. Robot is watching...perhaps I should be more careful. Do we have a better image to use? This one is actually from a v5.3.

image:./images/flows.png[]
Contributor

@dedemorton I am not able to find flows.png. Did you forget to add it to the PR?

Contributor Author

@monicasarbu whoops! just added it.

Here’s an example of a flow event sent by Packetbeat. See
<<exported-fields-flows_event>> for a description of each field.

//REVIEWERS: As always, I sit on the fence about whether to show the output event because it may change.
Contributor

Maybe we can include the content of the https://github.com/elastic/beats/blob/master/metricbeat/module/system/diskio/_meta/data.json. Unfortunately this exists only for the Metricbeat metricsets, and not for all Beats.


* IP/TCP
* UDP
* SSL
Contributor

I think SSL is not done yet. There is an open issue for it: #3604

* UDP
* SSL

//REVIEWER: Not sure this list is complete. Should it also say Application? Anything else?
Contributor

No for Application.

flow.

Packetbeat collects and reports statistics per flow on the following network
layers:
Member

When I hear network layers I think of the OSI model. Flows collect the mac address and vlan ID from layer 2 (data-link), the ip address from layer 3, and udp/tcp port info from layer 4 (transport).

It doesn't do anything at higher-level layers yet (like SSL).

Contributor Author

@andrewkroh Do you think users will want to know this level of detail? If not, then I can remove lines 16-21.

Member

@andrewkroh andrewkroh May 3, 2017

I think it's enough to say it collects info up to and including the transport layer. The details aren't that important since they can jump over to the exported field docs.

Here’s an example of a flow event sent by Packetbeat. See
<<exported-fields-flows_event>> for a description of each field.

//REVIEWERS: As always, I sit on the fence about whether to show the output event because it may change.
Member

I like seeing the output in docs.

But I would only show the contents of the _source field (the other stuff is specific to ES). To be consistent with some other example docs, change the hostname and name to host.example.com. You could swap out the IP addresses with ones reserved for documentation purposes (see IPv4 Address Blocks Reserved for Documentation).
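
As a rough illustration of the suggestion above, here is a minimal Python sketch that keeps only the `_source` content of a search hit, replaces the Beat hostname and name with host.example.com, and swaps IP addresses for ones from 192.0.2.0/24 (one of the IPv4 blocks reserved for documentation). The function name `sanitize_hit` and the field paths `source.ip` and `dest.ip` are assumptions made for this example and may not match the actual event layout.

```python
import copy
import ipaddress

# TEST-NET-1, one of the IPv4 blocks reserved for documentation.
DOC_NET = ipaddress.ip_network("192.0.2.0/24")


def sanitize_hit(hit, ip_paths=(("source", "ip"), ("dest", "ip"))):
    """Return a copy of hit["_source"] that is safe to show in docs.

    `ip_paths` lists (parent, field) pairs assumed to hold IP addresses;
    adjust it to the real event layout.
    """
    # Work on a deep copy so the original hit is left untouched.
    doc = copy.deepcopy(hit["_source"])

    # Replace the machine-specific hostname and name.
    beat = doc.get("beat", {})
    for key in ("hostname", "name"):
        if key in beat:
            beat[key] = "host.example.com"

    # Map each distinct real address to the next documentation address,
    # so the same host keeps the same placeholder throughout the event.
    mapping = {}
    for parent, field in ip_paths:
        node = doc.get(parent)
        if isinstance(node, dict) and field in node:
            original = node[field]
            if original not in mapping:
                mapping[original] = str(DOC_NET[len(mapping) + 1])
            node[field] = mapping[original]
    return doc


hit = {
    "_index": "packetbeat-2017.05.03",  # Elasticsearch metadata, dropped below
    "_source": {
        "beat": {"hostname": "My-MacBook-Pro.local", "name": "My-MacBook-Pro.local"},
        "source": {"ip": "10.1.2.3"},
        "dest": {"ip": "10.9.8.7"},
    },
}
clean = sanitize_hit(hit)
print(clean)
```

Mapping addresses in order of first appearance keeps the source and destination distinguishable in the published example without exposing the real values.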

For each flow, Packetbeat reports the number of packets and the total number of
bytes sent from the source to the destination. Each flow event also contains
information about the source and destination hosts, such as their IP address and
location. For bi-directional flows, Packetbeat reports statistics for the reverse
Member

Location is not provided by Beat, but we do have the field in our index template. We should explain how to get these fields populated with data using geoip lookups.

Contributor Author

@andrewkroh I've created a PR to track this work because I don't have time to do it right now. #4238

"name": "My-MacBook-Pro.local",
"version": "{stack-version}"
},
"connection_id": "AQAAAAAAAAA=",
Member

@andrewkroh andrewkroh May 3, 2017

I think the field documentation needs more details too, but we probably need urso to help us out. Some questions I had about the data:

  • What are connection_id and icmp_id, and where do they come from? How can I use them?
  • What data contributes to the flow_id? Under what conditions would a flow ID be reused (or when can collisions occur).
  • Clarify (and confirm) that net_bytes_total includes the total amount of bytes across all layers. Does this count retransmits?
  • Clarify if net_packets_total includes retransmits.
  • transport does not appear in the field docs. What are the possible values of the field?

Contributor Author

I've created a separate issue to track this list of issues: #4236

* IP/TCP
* UDP
* SSL

Member

@andrewkroh andrewkroh May 3, 2017

Somewhere in here we should explain the final flag: Packetbeat sends intermediate reports about each flow that it is tracking with final = false, and when the flow completes it sends one last event with final = true. Users who wish to aggregate sums of traffic will want to know this because they should be filtering on final:true (or using some other technique to get only the latest update from each flow).

And if period: -1 is in the config, then the intermediate reports are disabled.
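
To make the aggregation point concrete, here is a minimal Python sketch of filtering on the final flag before summing traffic. It assumes that intermediate reports carry running totals for the flow, which is what the filtering advice implies, and that the byte counter sits at `source.stats.net_bytes_total`. That field path and the function name `total_bytes_per_flow` are assumptions for illustration, not confirmed event structure.

```python
from collections import defaultdict


def total_bytes_per_flow(events):
    """Sum source-to-destination bytes, counting each flow only once."""
    totals = defaultdict(int)
    for event in events:
        # Intermediate reports (final == False) are assumed to carry
        # running totals, so adding them would count traffic repeatedly.
        if not event.get("final", False):
            continue
        # Assumed field path; check the exported fields for the real one.
        totals[event["flow_id"]] += event["source"]["stats"]["net_bytes_total"]
    return dict(totals)


# Hypothetical events: flow A reports twice, flow C has not finished yet.
events = [
    {"flow_id": "A", "final": False, "source": {"stats": {"net_bytes_total": 100}}},
    {"flow_id": "A", "final": True, "source": {"stats": {"net_bytes_total": 250}}},
    {"flow_id": "B", "final": True, "source": {"stats": {"net_bytes_total": 40}}},
    {"flow_id": "C", "final": False, "source": {"stats": {"net_bytes_total": 10}}},
]
totals = total_bytes_per_flow(events)
print(totals)
```

Note that filtering on final alone ignores flows that are still open (flow C here). Taking the latest report per flow_id instead would include them.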


Just wanted to point out that it is not period: -1; the correct setting is period: -1s.

It is also incorrect in the existing docs. I ran into this a couple weeks ago.

Member

@robcowart Thanks, and could you please open a new issue for this? Otherwise I'm afraid it might get lost in the noise. Plus DeDe is on vacation this week, so there's even more chance that she won't see it right away.

@dedemorton
Contributor Author

This PR is ready for a final review.

Member

@andrewkroh andrewkroh left a comment

I modified the JSON output a bit to drop the _source key and some other ES/Kibana specific fields.

@andrewkroh andrewkroh merged commit 1af914d into elastic:master May 9, 2017
dedemorton added a commit to dedemorton/beats that referenced this pull request May 17, 2017
* Add more info about Packetbeat flows to the docs.
* Add dashboard image.
ruflin pushed a commit that referenced this pull request May 18, 2017
* Add more info about Packetbeat flows to the docs.
* Add dashboard image.