Parses a flow log file and maps each row to a tag based on a lookup table. Outputs a summary of tag frequencies and of the count of each unique port/protocol combination.
The lookup table is defined as a CSV file with 3 columns: dstport, protocol, tag.
The combination of dstport and protocol determines which tag is applied.
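For illustration, here is a minimal sketch of how rows from such a lookup table could be loaded into a dictionary keyed by (dstport, protocol). The file layout matches the description above; the function name and the lowercasing of the protocol (see the case-insensitivity note below) are assumptions, not code taken from analyze.py.

```python
import csv

def load_lookup(path):
    """Map (dstport, protocol) -> tag from a 3-column CSV: dstport,protocol,tag."""
    lookup = {}
    with open(path, newline="") as f:
        # Assumes the first row is the header: dstport,protocol,tag
        for row in csv.DictReader(f):
            # Lowercase the protocol so lookups can be case insensitive.
            key = (row["dstport"].strip(), row["protocol"].strip().lower())
            lookup[key] = row["tag"].strip()
    return lookup

# Example: a row "443,tcp,web" would mean traffic to port 443 over TCP is tagged "web".
```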
```
git clone https://github.com/marten-sova/flow-log-analyzer.git
cd flow-log-analyzer
python3 -m venv venv
source venv/bin/activate
```
If you get "permission denied", fix it with: chmod a+x ./venv/bin/activate
```
python3 analyze.py sample-flow-logs.txt sample-lookup-table.csv
```
```
python3 test.py
```
This script generates randomized flow logs and lookup tables for stress-testing the program. Entries are valid, randomized records/mappings, with a bias toward a subset of tag names and common protocols.
```
python3 generate_sample_data.py <flow_log_line_count> <lookup_table_line_count>
```
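As a rough sketch of the kind of biased random generation described above (the actual generate_sample_data.py may be structured differently, and the tag names and weights here are made up), the lookup-table side could look like this:

```python
import random
import sys

# Hypothetical pools; the real script's tag names and protocol set may differ.
TAGS = ["web", "email", "dns", "ssh", "other"]
PROTOCOLS = ["tcp", "udp", "icmp"]

def random_lookup_rows(n):
    """Generate n valid lookup rows, biased toward the first tags and toward tcp."""
    rows = ["dstport,protocol,tag"]
    for _ in range(n):
        port = random.randint(1, 65535)
        proto = random.choices(PROTOCOLS, weights=[6, 3, 1])[0]
        tag = random.choices(TAGS, weights=[5, 4, 3, 2, 1])[0]
        rows.append(f"{port},{proto},{tag}")
    return rows

if __name__ == "__main__":
    print("\n".join(random_lookup_rows(int(sys.argv[1]))))
```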
- Supports the default v2 log format only (a parsing sketch follows this list). Reference: https://docs.aws.amazon.com/vpc/latest/userguide/flow-log-records.html
- Tags can map to multiple (port, protocol) combinations. However, each (port, protocol) combination can have only one tag.
- The requirements state "The matches should be case insensitive". I assume this refers to the protocol strings in the lookup table. Tag names remain case sensitive.
- Assumes no tag is named "Untagged" (if one is, its matches will be counted as untagged!)
- Works with the sample data provided in the exercise description.
- Used the generate_sample_data.py script to test with logs of 10,000,000 entries (>1 GB) and lookup tables of 10,000 entries; it appears to work successfully.
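For reference, here is a sketch of how a single default v2 record might be tagged and counted. The dstport and protocol field positions follow the AWS default format linked above; the protocol-number table and function name are illustrative assumptions, not code from analyze.py.

```python
from collections import Counter

# Protocol numbers appearing in flow logs, mapped to the names used in the lookup CSV.
PROTOCOL_NAMES = {"6": "tcp", "17": "udp", "1": "icmp"}

def tag_record(line, lookup):
    """Return the tag for one default v2 flow log record, or "Untagged" if no match."""
    fields = line.split()
    dstport = fields[6]    # dstport is the 7th field of the v2 default format
    protocol = PROTOCOL_NAMES.get(fields[7], fields[7]).lower()
    return lookup.get((dstport, protocol), "Untagged")

# Counting tag frequencies across a whole log file:
# with open("sample-flow-logs.txt") as f:
#     tag_counts = Counter(tag_record(line, lookup) for line in f if line.strip())
```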