devinkramer edited this page Oct 2, 2015 · 15 revisions

Description

LogWatcher is a Python daemon that gathers metrics from the access logs of web applications (Apache and Tomcat have been tested) and sends them downstream to either a Graphite server or a Ganglia gmond listener in near-realtime. Metrics are named with the prefix "LW_" for easy identification. One instance of LogWatcher is required for each access log to be watched. The log file can be either statically named or pre-rotated with a timestamp in the filename. Metrics are typically collected and averaged per minute, but this is configurable.

Requirements

  • Python 2.6+ (with the time, os, re, sys, atexit, ConfigParser, getopt, and string modules)
  • Ganglia or Graphite

Installing

LogWatcher can be installed from the existing Python package at https://pypi.python.org/pypi by running

pip install LogWatcher

You can also build and deploy your own Python package from the GitHub source. A sample spec file is also available for building and deploying LogWatcher as an RPM.

Running

A sample init (start/stop) script and INI (configuration) file are also provided here; they are not part of the Python package. Typically these are deployed with a configuration management tool such as Puppet or Chef.

Testing

There is a test access log and test INI file available in the test directory. To test basic functionality after you have installed the LogWatcher package you can run the following via the command line:

python2.7 /usr/lib/python2.7/site-packages/logwatcher/logwatcher.py -D -V -c /app/logwatcher/etc/test.ini -b

The output should look like:

DEBUG: FOUND A NEW LOGFILE, we should switch (after finishing)
DEBUG: Last line was None (try 1)
DEBUG: opening logfile /tmp/access_test
DEBUG: log count = 0
DEBUG: current position is 0
DEBUG: readlines() returned 650 lines
DEBUG: Found new count metric: return_code_404
DEBUG: Found new count metric: isCust_NotSet
.
.
.
DEBUG: readlines() returned 0 lines

You may see errors about gmetric not being found if the Ganglia gmetric binary is not installed on your system.

Configuration

Go here for details about the LogWatcher configuration file. There is a sample INI that should produce the default LogWatcher metrics with very little customization.

Log Formatting

LogWatcher can be used to generate metrics from anything that can be found in the access logs using regular expressions. Some basic log format suggestions for Tomcat and Apache are as follows.

Basic Tomcat Log Format

<Valve className="org.apache.catalina.valves.AccessLogValve"
    directory="/var/log/access"
    prefix="access"
    resolveHosts="false"
    checkExists="true"
    rotatable="false"
    pattern="%a %v %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-agent}i&quot; &quot;%{REQUEST_DETAILS}r&quot; %D"
/>

Basic Apache Log Format

LogFormat "%h %{Host}i %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" custom_fmt

Additional custom metrics can be added and found via regex in the config. The following is a recommended format for these metrics:

[key=value]

Example Tomcat Log Line with Additional Custom Metrics:

1.23.45.678 logwather.com  - [26/Sep/2015:15:59:16 -0700] "GET /profile HTTP/1.1" 200 2989 "http://referrer.com/restaurants" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko" " [wsTime=7] [isCust=0] [ver=2] [showAds=true] [daoTimelisting=2.678338] [daoTimecgmdblisting=2.678338]  [oTime=0] [pTime=1]  [daoTimecontent=2.511219] [daoTimecgmdbcontent=2.511219] [clientIp=12.3.4.5.6]" 8
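
The bracketed [key=value] convention means every custom field can be captured with one regular expression. The following is a minimal sketch, not LogWatcher's own code: parse_custom_fields is a hypothetical helper, and the log line is a shortened version of the example above (referrer and user agent abbreviated).

```python
import re

# Hypothetical helper, not part of LogWatcher: collect every [key=value]
# pair found in one access log line into a dict of strings.
KEY_VALUE_RE = re.compile(r"\[(\w+)=([^\]]*)\]")

def parse_custom_fields(line):
    return dict(KEY_VALUE_RE.findall(line))

# Shortened version of the example Tomcat line (assumption: referrer and
# user agent replaced with short placeholders).
line = ('1.23.45.678 logwather.com  - [26/Sep/2015:15:59:16 -0700] '
        '"GET /profile HTTP/1.1" 200 2989 "-" "Mozilla/5.0" '
        '" [wsTime=7] [isCust=0] [ver=2] [showAds=true]" 8')

fields = parse_custom_fields(line)
print(fields["wsTime"], fields["isCust"], fields["showAds"])
```

The bracketed timestamp is not picked up because it contains no "=" directly after a word, which is one reason the bracketed key=value form is convenient to match.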

Metric Destinations

LogWatcher can currently send metrics either directly to a Graphite server or to Ganglia.

Ganglia

Ganglia is the default destination for metrics. LogWatcher expects and uses the /etc/gmond.conf file on your system.

Graphite

To send metrics to Graphite, use one of the following runtime options. Using either one disables the default behavior of sending metrics to Ganglia.

  -g --graphite-server <s> Use graphite, with server <s>
  -G --use-graphite        Use graphite, find server in /etc/graphite.conf
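
For orientation, the sketch below shows what a metric looks like on the wire when it is sent to Graphite. It assumes the server accepts Carbon's plaintext protocol (one "path value timestamp" line per metric, port 2003 by default); how LogWatcher itself talks to Graphite is not described on this page. The function names and the metric name LW_myapp_Queries are hypothetical, with "myapp" standing in for a distinguisher.

```python
import socket
import time

def format_graphite_line(name, value, timestamp=None):
    # Carbon plaintext format: "<metric path> <value> <unix timestamp>\n"
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (name, value, timestamp)

def send_to_graphite(server, lines, port=2003):
    # Assumption: a Carbon plaintext listener on its default port.
    sock = socket.create_connection((server, port), timeout=5)
    try:
        sock.sendall("".join(lines).encode("ascii"))
    finally:
        sock.close()

# Hypothetical metric name and fixed timestamp, for illustration only.
example = format_graphite_line("LW_myapp_Queries", 1200, 1443308400)
print(repr(example))
```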

Supported Metric Types

LogWatcher supports three primary types of metrics, plus types derived from the first two and special-use priming metrics. Most use a regular expression that captures a value.

counts (metrics_count)

The value captured as $1 (the first capture group) in the regex is counted, and a separate metric is created for each distinct $1 value found (as well as a _NotSet metric that counts lines not matching your regex). Note that metric names are dynamically generated from the values found. The metrics persist for the run time of the LW instance, typically months.
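
The counting behavior can be sketched as follows. This is an illustration under stated assumptions, not LogWatcher's implementation: count_metric is a hypothetical helper, and the name_value and name_NotSet naming is inferred from the return_code_404 and isCust_NotSet metrics in the test output above.

```python
import re
from collections import defaultdict

def count_metric(name, pattern, lines):
    """Count each distinct $1 value, plus a _NotSet count for non-matching lines."""
    regex = re.compile(pattern)
    counts = defaultdict(int)
    for line in lines:
        match = regex.search(line)
        if match:
            # One metric per distinct captured value, e.g. return_code_404.
            counts["%s_%s" % (name, match.group(1))] += 1
        else:
            counts["%s_NotSet" % name] += 1
    return dict(counts)

lines = [
    '"GET /a HTTP/1.1" 200 10',
    '"GET /b HTTP/1.1" 404 0',
    '"GET /c HTTP/1.1" 200 7',
    'garbage',
]
counts = count_metric("return_code", r'" (\d{3}) ', lines)
print(counts)
```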

sums (metrics_sum)

The values saved as $1 since the last notify event (based on the notify_schedule) are added together and saved as a single metric.
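
A minimal sketch of a sum metric, assuming a hypothetical helper named sum_metric and the wsTime field from the example log line. In a running daemon the total would be reset after each notify event; that part is not shown.

```python
import re

def sum_metric(pattern, lines):
    """Add up the numeric $1 value from every matching line."""
    regex = re.compile(pattern)
    total = 0.0
    for line in lines:
        match = regex.search(line)
        if match:
            total += float(match.group(1))
    return total

# Three lines seen since the last notify event (illustrative data).
lines = [
    '... [wsTime=7] ...',
    '... [wsTime=12] ...',
    '... no timing here ...',
]
ws_time_total = sum_metric(r"\[wsTime=([\d.]+)\]", lines)
print(ws_time_total)
```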

ratios (metrics_ratio)

These are derived from either counts or sums. The ratio is the value of the original metric divided by the Queries metric (unfiltered requests per minute). Ratios are useful for alerting, since they don't vary much with traffic changes during the day.
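
The arithmetic is simple, but a short sketch shows why ratios suit alerting: the value stays the same when traffic doubles and the error count doubles with it. ratio_metric is a hypothetical helper, and returning 0.0 for an interval with no queries is an assumption made here, not documented LogWatcher behavior.

```python
def ratio_metric(value, queries):
    """Divide a count or sum by the Queries metric for the same interval."""
    if queries == 0:
        # Assumption: report 0.0 rather than fail when there was no traffic.
        return 0.0
    return float(value) / queries

# Illustrative numbers: 404 responses against total requests per minute.
quiet_hour = ratio_metric(30, 1200)
busy_hour = ratio_metric(60, 2400)
print(quiet_hour, busy_hour)
```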

calculated (metrics_calc)

These ratios are derived from counts and/or sums using user-defined expressions. They can be used to configure ratio-style metrics on specific segments of traffic instead of all requests.

distribution (metrics_dist)

Each of these is a collection of counts showing the distribution of values over N buckets of size M. They are used primarily to provide data for processing-time histograms (typically 11 buckets of 100ms each, with the last bucket counting any value over 1000ms).
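
The bucketing can be sketched as below, using the typical 11 buckets of 100ms. The function name and the exact boundary handling (a value of exactly 1000ms lands in the last bucket) are assumptions for illustration.

```python
def distribution(values_ms, bucket_count=11, bucket_size_ms=100):
    """Count values into fixed-size buckets; the last bucket is open-ended."""
    buckets = [0] * bucket_count
    for value in values_ms:
        # Anything past the last boundary is clamped into the final bucket.
        index = min(int(value // bucket_size_ms), bucket_count - 1)
        buckets[index] += 1
    return buckets

# Illustrative processing times in milliseconds.
histogram = distribution([8, 95, 100, 250, 999, 1000, 4200])
print(histogram)
```

Here the first bucket holds 8 and 95, and the last bucket holds both 1000 and 4200.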

Automatic/Default Metrics

| Metric Name | Reported Units | Units | Description |
| --- | --- | --- | --- |
| LW_<distinguisher>_Total_Processing_Time | seconds | seconds | The sum of the processing time value from every log line, in ms, since the last notify event (based on the notify_schedule). Requires processing_time_regex and processing_time_units to be set (see the Config Options table below for suggested settings). |
| LW_<distinguisher>_Avg_Processing_Time | seconds | seconds | LW_<distinguisher>_Total_Processing_Time / LW_<distinguisher>_Queries |
| LW_<distinguisher>_Max_Processing_Time | seconds | seconds | The maximum value matching processing_time_regex since the last notify event (based on the notify_schedule option). |
| LW_<distinguisher>_exceeding_SLA | percent | percent | The percent of not-ignored log lines (see ignore_pattern option) processed since the last notify event (based on the notify_schedule option) whose processing time exceeds the sla_ms value. |
| LW_<distinguisher>_exceeding_SLA_ct | percent | decimal | The count of not-ignored log lines (see ignore_pattern) processed since the last notify event (based on the notify_schedule) whose processing time exceeds sla_ms. |
| LW_<distinguisher>_Queries | count | decimal | The total number of not-ignored log lines (see ignore_pattern option) processed since the last notify event (based on the notify_schedule). |
| LW_<distinguisher>_QPS | qps | decimal | The (average?) QPS, not including ignored log lines (see ignore_pattern), since the last notify event (based on the notify_schedule). |
| LW_LW_Version | string | string | The version of LogWatcher. |
| LW_<distinguisher>_ignored | count | decimal | A count of the number of log lines ignored (matching ignore_pattern) since the last notify event (based on the notify_schedule). |
| LW_<distinguisher>QPS | qps | decimal | The (average?) QPS for log lines matching brand_regex, not including ignored log lines (see ignore_pattern), since the last notify event (based on the notify_schedule). |
| LW_<distinguisher>_QPS_NULL_brand | qps | decimal | The (average?) QPS for log lines that do not match brand_regex, not including ignored log lines (see ignore_pattern), since the last notify event (based on the notify_schedule). |
| LW_LW_LogTime | seconds | decimal | |
| LW_LW_NewMetrics | float | decimal | Count of new metrics that were not sent on the last cycle, or what? It does seem to exclude some, or all, of the built-in metrics. |
| LW_LW_TotalMetrics | float | decimal | A count of the number of metrics LogWatcher is sending, not counting this metric. |
| LW_LW_NotifyTime | seconds | decimal | |
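
How the processing-time and query metrics relate to one another can be sketched from one interval's worth of data. This is an illustration only: interval_metrics is a hypothetical helper, the 60-second interval and the zero values for an empty interval are assumptions, and values are kept in milliseconds with no unit conversion.

```python
def interval_metrics(times_ms, sla_ms, interval_seconds=60):
    """Derive the per-interval metrics from the not-ignored lines' processing times."""
    queries = len(times_ms)
    total = sum(times_ms)
    over_sla = sum(1 for t in times_ms if t > sla_ms)
    return {
        "Queries": queries,
        "QPS": queries / float(interval_seconds),
        "Total_Processing_Time": total,
        # Avg is Total divided by Queries; 0.0 for an empty interval (assumption).
        "Avg_Processing_Time": total / float(queries) if queries else 0.0,
        "Max_Processing_Time": max(times_ms) if times_ms else 0,
        "exceeding_SLA_ct": over_sla,
        "exceeding_SLA": 100.0 * over_sla / queries if queries else 0.0,
    }

# Four requests in one interval, with a hypothetical sla_ms of 400.
metrics = interval_metrics([8, 120, 450, 2200], sla_ms=400)
print(metrics)
```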

Plugins

LogWatcher supports very simple LinePlugins. Plugins can modify the log lines, compute complex metrics, or even send some or all of the lines to a separate log file or another system (for example, Kafka). Note that lines excluded by the exclude filter are not sent to plugins. Plugin details are available here.