monk-ee/s3-log-parser

s3-log-parser

Parses log lines from an s3 log file

Usage:

import s3_log_parser
line_parser = s3_log_parser.make_parser("%BO %B %t %a %r %si %o %k \"%R\" %s %e %b %y %m %n \"%{Referer}i\" \"%{User-Agent}i\" %v")

This creates and returns a function, line_parser, which accepts a single line from an S3 log file in that format and returns the parsed values as a dictionary.
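
To illustrate the "parser factory" idea described above, here is a minimal, self-contained sketch. It is not the library's actual implementation: the function name make_simple_parser, the DIRECTIVES table, the dictionary keys, and the sample log line are all assumptions made for illustration, and only a handful of directives are covered.

```python
import re

# Hypothetical, simplified directive table (NOT the library's internals).
# Each directive maps to (output key, regex fragment).
DIRECTIVES = {
    "%BO": ("bucket_owner", r"\S+"),
    "%B": ("bucket", r"\S+"),
    "%t": ("time", r"\[[^\]]+\]"),
    "%a": ("remote_ip", r"\S+"),
    "%s": ("status", r"\d{3}|-"),
}


def make_simple_parser(format_string):
    """Compile format_string into a regex once and return a line-parsing closure."""
    parts = []
    for token in format_string.split(" "):
        if token not in DIRECTIVES:
            raise ValueError("unsupported directive: " + token)
        key, fragment = DIRECTIVES[token]
        parts.append("(?P<%s>%s)" % (key, fragment))
    pattern = re.compile("^" + " ".join(parts) + "$")

    def parse(line):
        # Match the whole line and return the named groups as a dictionary.
        match = pattern.match(line.strip())
        if match is None:
            raise ValueError("line does not match format: " + line)
        return match.groupdict()

    return parse


# Made-up sample line, for illustration only.
simple_parser = make_simple_parser("%BO %B %t %a %s")
fields = simple_parser("abc123 mybucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 200")
print(fields["bucket"], fields["status"])
```

Building the regular expression once in the factory and reusing it in the returned function avoids recompiling the format for every log line.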

Key:

%BO - bucket owner - The canonical user ID of the owner of the source bucket.
%B - bucket - The name of the bucket that the request was processed against. If the system receives a malformed 
request and cannot determine the bucket, the request will not appear in any server access log.
%t - date/time - The time at which the request was received. The format, using strftime() terminology, is 
[%d/%b/%Y:%H:%M:%S %z]
%a - remote ip - The apparent Internet address of the requester. Intermediate proxies and 
firewalls might obscure the actual address of the machine making the request.
%r - requester_id - The canonical user ID of the requester, or the string "Anonymous" for unauthenticated requests.
 If the requester was an IAM user, this field will return the requester's IAM user name along with the AWS root 
 account that the IAM user belongs to. This identifier is the same one used for access control purposes.
%si - s3_request_id - The request ID is a string generated by Amazon S3 to uniquely identify each request.
%o - operation - The operation listed here is declared as SOAP.operation, REST.HTTP_method.resource_type, 
WEBSITE.HTTP_method.resource_type, or BATCH.DELETE.OBJECT.
%k - key - The "key" part of the request, URL encoded, or "-" if the operation does not take a key parameter.
\"%R\" - request_first_line - The first line of the request: the Request-URI part of the HTTP request message.
%s - status - The numeric HTTP status code of the response.
%e - error - The Amazon S3 Error Code, or "-" if no error occurred.
%b - bytes - The number of response bytes sent, excluding HTTP protocol overhead, or "-" if zero.
%y - total_bytes - The total size of the object in question.
%m - total_time - The number of milliseconds the request was in flight from the server's perspective. This value is 
measured from the time your request is received to the time that the last byte of the response is sent. Measurements 
made from the client's perspective might be longer due to network latency.
%n - turnaround_time - The number of milliseconds that Amazon S3 spent processing your request. This value is 
measured from the time the last byte of your request was received until the time the first byte of the response was 
sent.
\"%{Referer}i\" - referer - The value of the HTTP Referer header, if present. HTTP user agents (e.g. browsers) 
typically set this header to the URL of the linking or embedding page when making a request.
\"%{User-Agent}i\" - user agent - The value of the HTTP User-Agent header.
%v - version_id - The version ID in the request, or "-" if the operation does not take a versionId parameter.
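
The key above gives the strftime() layout of %t and notes that %b uses "-" for zero. The sketch below shows one way to turn such raw string values into Python types. The helper names parse_s3_time and dash_to_int and the sample values are hypothetical; whether the library performs any such conversion itself is not stated here, so treat this as optional post-processing.

```python
from datetime import datetime


def parse_s3_time(raw):
    """Parse a %t value such as '[06/Feb/2019:00:00:38 +0000]' into an aware datetime.

    Note: %b matches month abbreviations in the current locale; this assumes
    an English/C locale.
    """
    return datetime.strptime(raw, "[%d/%b/%Y:%H:%M:%S %z]")


def dash_to_int(raw):
    """Convert a numeric field to int, treating the '-' placeholder as zero."""
    return 0 if raw == "-" else int(raw)


# Made-up sample values, for illustration only.
received = parse_s3_time("[06/Feb/2019:00:00:38 +0000]")
print(received.isoformat())
print(dash_to_int("-"), dash_to_int("2662992"))
```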

Based on apache-log-parser, © 2013 Rory McCann, released under the terms of the GNU GPL v3.
