The program is written in Java 8, so you need a JRE 1.8 or later installed. You also need Maven to build the project.
You can build the project by running mvn clean install in the root directory. The resulting jar file is then found in the target directory.
Usage: java -jar PATH_TO_JAR_FILE [options], e.g. java -jar http-log-monitoring-1.0-SNAPSHOT.jar -AW 5 -AT 5
-P: specifies the path of the log file (/tmp/access.log by default)
-R: specifies the stats display rate in seconds (10s by default)
-AT: specifies the average section hits above which an alert should be emitted (10 by default)
-AW: specifies the alert window duration in seconds over which the average section hits are calculated (120s by default)
The program consumes an actively written-to W3C-formatted HTTP access log file and should follow these principles:
- Display stats every 10s about the traffic during those 10s: the sections of the web site with the most hits, as well as interesting summary statistics on the traffic as a whole. A section is defined as what's before the second '/' in the resource part of the log line. For example, the section for "/pages/create" is "/pages" (see the section-extraction sketch after this list).
- A user can keep the app running and monitor the log file continuously
- Whenever total traffic for the past 2 minutes exceeds a certain number on average, a message saying “High traffic generated an alert - hits = {value}, triggered at {time}” is printed. The default threshold is 10 requests per second and is overridable (see the sliding-window sketch after this list).
- Whenever the total traffic drops again below that value on average for the past 2 minutes, a message detailing when the alert recovered is printed.
- All messages showing when alerting thresholds are crossed remain visible on the page for historical reasons.
- The console-printed messages are not that great. The statistics could be more detailed, for example by including some stats on IPs.
- It is not ready for a tremendous amount of data. The places where HashMap is used can cause the JVM to crash with an OutOfMemoryError. HashMap could be replaced by a NoSQL database distributed over a chosen number of clusters. Each section name could be hashed with SHA-2 for example, converted back to base 10, and then taken modulo the chosen number of clusters to spread the data evenly across the servers (see the sharding sketch after this list). To go further, a probabilistic data structure such as a Bloom filter could be used to reduce the calls to the database.
- Nothing is persisted: we lose the statistics we have calculated, and if the program shuts down, we also lose all in-memory data.
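
As an illustration of the section rule above, here is a minimal sketch of how a section could be derived from the requested resource; the class and method names are illustrative, not the project's actual API:

```java
// Minimal sketch of the section rule: the section is everything before the
// second '/' of the requested resource. Illustrative names, not the project's API.
public final class SectionExtractor {

    public static String extractSection(String resource) {
        // "/pages/create" -> "/pages", "/index.html" -> "/index.html", "/" -> "/"
        int secondSlash = resource.indexOf('/', 1);
        return secondSlash == -1 ? resource : resource.substring(0, secondSlash);
    }

    public static void main(String[] args) {
        System.out.println(extractSection("/pages/create")); // prints "/pages"
        System.out.println(extractSection("/index.html"));   // prints "/index.html"
    }
}
```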
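
The alerting rule (trigger when the average over the alert window exceeds the threshold, recover once it drops back below) can be sketched with a simple per-second sliding window. The class below is a hypothetical illustration under those assumptions, not the project's actual implementation:

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window sketch of the alerting rule: keep per-second hit counts for
// the last windowSeconds seconds and compare the windowed average against the
// threshold. Illustrative only; the names do not come from the project.
public final class TrafficAlerter {

    private final Deque<long[]> buckets = new ArrayDeque<>(); // each entry: {epochSecond, hits}
    private final int windowSeconds; // alert window in seconds (default 120)
    private final double threshold;  // average hits per second that triggers an alert (default 10)
    private boolean alerting = false;

    public TrafficAlerter(int windowSeconds, double threshold) {
        this.windowSeconds = windowSeconds;
        this.threshold = threshold;
    }

    public void recordHit(Instant time) {
        long second = time.getEpochSecond();
        long[] last = buckets.peekLast();
        if (last != null && last[0] == second) {
            last[1]++;                                // same second: bump the current bucket
        } else {
            buckets.addLast(new long[] {second, 1});  // new second: open a new bucket
        }
        // Evict buckets that fell out of the window, then evaluate the alert state.
        while (!buckets.isEmpty() && buckets.peekFirst()[0] <= second - windowSeconds) {
            buckets.removeFirst();
        }
        long total = 0;
        for (long[] bucket : buckets) {
            total += bucket[1];
        }
        double average = (double) total / windowSeconds;
        if (!alerting && average > threshold) {
            alerting = true;
            System.out.printf("High traffic generated an alert - hits = %d, triggered at %s%n", total, time);
        } else if (alerting && average <= threshold) {
            alerting = false;
            System.out.printf("Traffic back to normal, alert recovered at %s%n", time);
        }
    }
}
```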
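
To make the sharding suggestion concrete, here is a hypothetical sketch of how a section name could be mapped to one of N clusters by hashing it with SHA-256 and taking the digest modulo the cluster count; it only illustrates the idea and is not code from the project:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the sharding suggestion: hash the section name with SHA-256,
// interpret the digest as a non-negative integer, and take it modulo the
// number of clusters to pick a shard. Illustrative only.
public final class SectionSharder {

    public static int shardFor(String section, int clusterCount) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(section.getBytes(StandardCharsets.UTF_8));
            // Signum 1 forces a non-negative BigInteger so the modulo result is a valid index.
            BigInteger value = new BigInteger(1, digest);
            return value.mod(BigInteger.valueOf(clusterCount)).intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every standard JVM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shardFor("/pages", 8)); // always the same index in [0, 7] for "/pages"
    }
}
```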