Logtag.py is a simple tool that processes flow logs and aggregates counts by tag and by kind. Only Python >3.12 is supported, though there is every reason to expect it to work on any Python 3 release.
In the same directory as `logtag.py`, run `python logtag.py [tags] [log] [out]`.
The tags file must be a CSV with a header row, of the form:

    dstport,protocol,tag
    25,tcp,sv_P1
    68,udp,sv_P2
    23,tcp,sv_P1

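As a rough illustration of how a tags file of this shape can be read, here is a minimal sketch using the standard `csv` module. The function name `load_tags` is hypothetical, and lower-casing the protocol is an assumption made for the example; the canonical spelling actually used by the tool is whatever `_KEY_TABLE` in `lookup.py` says.

```python
import csv
import io


def load_tags(stream):
    """Build a {(dstport, protocol): tag} mapping from a CSV stream with a header row.

    Hypothetical helper, not part of logtag.py.
    """
    lookup = {}
    for row in csv.DictReader(stream):
        # Assumption: lower-case stands in for the canonical keyword spelling.
        key = (int(row["dstport"]), row["protocol"].strip().lower())
        lookup[key] = row["tag"].strip()
    return lookup


# The sample rows from above, with one keyword upper-cased to show case-insensitivity.
sample = io.StringIO("dstport,protocol,tag\n25,tcp,sv_P1\n68,UDP,sv_P2\n23,tcp,sv_P1\n")
tags = load_tags(sample)
print(tags[(68, "udp")])  # sv_P2
```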
The protocol column consists of case-insensitive IANA keywords. If in doubt about the spelling or description of a keyword, consult `lookup.py`, which is the final source of truth: its `_KEY_TABLE` table lists every usable keyword and its canonical spelling.
The log file must be a version 2 flow log, as described here. Columns may be separated by any number of spaces or tabs; this is a backwards-compatible relaxation of the specification that makes the logs easier to read by eye. Blank lines and incomplete rows are not permitted.
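The whitespace rule can be sketched with `str.split()`, which with no arguments splits on any run of spaces or tabs. This sketch assumes the 14-field default layout of AWS VPC flow logs version 2, with `dstport` and `protocol` at positions 6 and 7 (counting from zero); the text above does not confirm that layout. `split_row` is a hypothetical helper and the log line is made-up sample data.

```python
def split_row(line, expected_fields=14):
    """Split one flow log row on runs of spaces or tabs and check its width.

    Hypothetical helper; expected_fields=14 is an assumption about the format.
    """
    fields = line.split()  # no argument: any run of whitespace is one separator
    if len(fields) != expected_fields:
        raise ValueError(f"expected {expected_fields} fields, got {len(fields)}")
    return fields


# Made-up row mixing tabs and multiple spaces between columns.
line = "2 123456789012\teni-0a1b2c3d   10.0.1.201 198.51.100.2 443\t49153 6 25 20000 1620140761 1620140794 ACCEPT  OK"
row = split_row(line)
dstport, protocol = int(row[6]), int(row[7])
print(dstport, protocol)  # 49153 6 (6 is the IANA protocol number for TCP)
```

A blank line splits into zero fields, so under this sketch it is rejected in the same way as an incomplete row.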
Logtag.py has been tested on mock flow log files of up to 8 MiB without noticeable performance impact. In the worst case, when few connections share the same port and protocol, the space required is roughly linear in the size of the input file. In many real-world networks, however, usage is dominated by a few services, and in this happy scenario space usage can be much lower.
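To make the space argument concrete, here is a small sketch that counts rows per (port, protocol) pair with `collections.Counter`. Whether logtag.py uses a `Counter` internally is an assumption; the point is only that memory follows the number of distinct pairs rather than the number of rows. `count_pairs` is a hypothetical name.

```python
from collections import Counter


def count_pairs(pairs):
    """Count occurrences of each (dstport, protocol) pair (hypothetical helper)."""
    counts = Counter()
    for pair in pairs:
        counts[pair] += 1
    return counts


# 1000 rows that all hit the same service: a single entry is stored.
concentrated = count_pairs([(443, "tcp")] * 1000)
# 1000 rows that each hit a different port: one entry per row is stored.
scattered = count_pairs((port, "tcp") for port in range(1000))
print(len(concentrated), len(scattered))  # 1 1000
```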
The `out` argument is the path to which the results will be written. Logtag.py does not clobber existing results: if a file already exists at this path, it exits with an error and leaves the file unaltered.
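One common way to get this no-clobber behaviour in Python is the exclusive-creation mode `"x"` of the built-in `open()`, which raises `FileExistsError` if the path already exists. This is a sketch of the technique only; it is not known whether logtag.py does it this way, and `write_results` is a hypothetical name.

```python
import os
import tempfile


def write_results(path, text):
    """Write text to path, refusing to overwrite an existing file (hypothetical helper)."""
    # Mode "x" creates the file and fails with FileExistsError if it already exists.
    with open(path, "x", encoding="utf-8") as out:
        out.write(text)


refused = False
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")
    write_results(path, "first\n")
    try:
        write_results(path, "second\n")
    except FileExistsError:
        refused = True  # the second write was rejected
    with open(path, encoding="utf-8") as check:
        kept = check.read()  # the original contents are still in place

print(refused, repr(kept))  # True 'first\n'
```

Compared with checking `os.path.exists()` before opening, the exclusive mode leaves no window in which another process could create the file between the check and the write.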