# Documentation
Enterprise Log Search and Archive is a solution to achieve the following:
- Normalize, store, and index logs at unlimited volumes and rates
- Provide a simple and clean search interface and API
- Provide an infrastructure for alerting, reporting and sharing logs
- Control user actions with local or LDAP/AD-based permissions
- Plugin system for taking actions with logs
- Exist as a completely free and open-source project
ELSA accomplishes these goals by harnessing the highly specialized strengths of other open-source projects: Perl provides the glue that asynchronously ties the log receiver (Syslog-NG) together with storage (MySQL) and indexing (Sphinx Search), and serves the result over a web interface provided by Apache or any other web server, including a standalone pure-Perl server for a lighter footprint.
- Introduction
- [Table of Contents](#TableofContents)
- Why ELSA?
- Capabilities
- Installation
- Plugins
- File Locations
- Caveats for Local File Hosting
- Web Server
- Configuration
- Permissions
- Queries
- Archive Queries
- Web Services API
- Alerts
- Scheduled Queries
- Command-line Interface and API
- Performance Tuning
- Monitoring
- Adding Parsers
- Transforms
- Subsearches
- OSSEC Integration
- Bro Integration
- Calculating Disk Requirements
- GeoIP Support
- Configuring IDS to Forward Logs
- Eventlog-to-Syslog
- Dashboards
- Performance Considerations
- Troubleshooting
- Datasources
- Livetail
- Saved Searches (Macros)
- Importing Logs
- Host Checks
- Pcap/Stream/Block Integration
- Preferences
- Keyboard Shortcuts
- External Documentation
In designing ELSA, I tried the following components but found them too slow. Here they are, ordered from fastest to slowest by indexing speed (not scientifically tested):
- Tokyo Cabinet
- MongoDB
- TokuDB MySQL plugin
- Elastic Search (Lucene)
- Splunk
- HBase
- CouchDB
- MySQL Fulltext
Log reception rates greater than 50,000 events per second per node are achieved through the use of a fast pattern parser in Syslog-NG called PatternDB. The pattern parser allows Syslog-NG to normalize logs without resorting to computationally expensive regular expressions, which sustains high log reception rates. Events are piped directly to a Perl program which further normalizes the logs and prepares large text files for batch inserting into MySQL. MySQL is capable of inserting over 100,000 rows per second when batch loading like this. After each batch is loaded, Sphinx indexes the newly inserted rows into temporary indexes, then consolidates them into permanent indexes in larger batches every few hours.
Sphinx can create temporary indexes at a rate of 50,000 logs per second and consolidate these temporary indexes at around 35,000 logs per second, which becomes the terminal sustained rate for a given node. The effective bursting rate is around 100,000 logs per second, which is the upper bound of Syslog-NG on most platforms. If indexing cannot keep up, a backlog of raw text files will accumulate; in this way, peaks of several hours or more can be endured without log loss, but with an indexing delay.
The overall flow diagram looks like this:
```
Live, continuously:
    Network → Syslog-NG (PatternDB) → Raw text file
      or
    HTTP upload → Raw text file

Batch load (by default every minute):
    Raw text file → MySQL → Sphinx
```
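The normalization step above is driven by PatternDB rule files written in XML. As a minimal sketch (the ruleset/rule UUIDs, field names, and the sshd example are illustrative, not a pattern that ships with ELSA), a rule for an sshd login line could look like:

```xml
<patterndb version='4' pub_date='2014-01-01'>
  <ruleset name='sshd' id='00000000-0000-0000-0000-000000000001'>
    <pattern>sshd</pattern>
    <rules>
      <!-- Matches e.g.: "Accepted password for alice from 10.0.0.5 port 4242 ssh2" -->
      <rule provider='local' id='00000000-0000-0000-0000-000000000002' class='system'>
        <patterns>
          <pattern>Accepted @ESTRING:auth_method: @for @ESTRING:user: @from @ESTRING:srcip: @port @NUMBER:srcport@ ssh2</pattern>
        </patterns>
      </rule>
    </rules>
  </ruleset>
</patterndb>
```

The @ESTRING@ and @NUMBER@ parsers extract named fields at radix-tree speed, which is what lets Syslog-NG avoid regular expressions.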
### Installation

Installation is done by running the install.sh script, obtained either by downloading from the sources online or grabbing the install tarball featured on the ELSA Google Code home page. When install.sh runs, it checks for the existence of /etc/elsa_vars.sh to see if there are any local customizations, such as passwords, file locations, etc. to apply. The install.sh script will update itself if it finds a newer version online, so be sure to store any changes in /etc/elsa_vars.sh. The script should be run separately for a node install and a web install; you can install both like this: `sh install.sh node && sh install.sh web`. Installation will attempt to download and install all prerequisites and initialize databases and folders; it does not require any interaction. Currently, Linux and FreeBSD 8.x are supported, with Linux distros based on Debian (including Ubuntu), RedHat (including CentOS), and SuSE tested. install.sh should run and succeed on these distributions, assuming the defaults are chosen and no existing configurations conflict.
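For reference, /etc/elsa_vars.sh is a plain shell fragment of variable overrides. A minimal sketch, assuming you only want to relocate the standard directories described under File Locations (values shown are the defaults):

```sh
# /etc/elsa_vars.sh -- local overrides read by install.sh on every run.
# Only set what differs from the defaults; values here are illustrative.
BASE_DIR="/usr/local"   # install prefix for ELSA and the compiled Syslog-NG
DATA_DIR="/data"        # root for raw buffers, MySQL data, and Sphinx indexes
```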
### Updates

Updating an installation is done via the install.sh script (assuming your ELSA directory is /usr/local/elsa): `sh /usr/local/elsa/contrib/install.sh node update && sh /usr/local/elsa/contrib/install.sh web update`. This will check the web for any updates and apply them locally, taking into account local customizations in /etc/elsa_vars.sh.
### Plugins

ELSA ships with several plugins:
- Windows logs from Eventlog-to-Syslog
- Snort/Suricata logs
- Bro logs
- URL logs from httpry_logger
These plugins tell the web server what to do when a user clicks the "Info" link next to each log. A plugin can do anything, but it is designed to return useful information in a dialog panel in ELSA along with an actions menu. As an example that ships with ELSA: if a StreamDB (or OpenFPC) URL is configured, any log containing an IP address will have a "getPcap" option which autofills pcap request parameters for one-click access to the traffic related to the log being viewed.
New plugins can be added easily by subclassing the "Info" Perl class and editing the elsa_web.conf file to include them. Contributions are welcome!
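As a rough sketch of the shape of a plugin (the package name and method body are hypothetical; consult one of the shipped plugins for the exact interface your version expects):

```perl
package Info::MyLookup;    # hypothetical plugin name
use Moose;                 # assumes the Moose-based style of the shipped plugins
extends 'Info';            # subclass the "Info" base class

# Hypothetical hook: gather whatever the dialog panel should display for
# a given log. Check the shipped plugins for the real attribute and
# method names before writing your own.
sub BUILD {
    my $self = shift;
    # ... look up external context using fields parsed from the log ...
}

1;
```

Once written, reference the new class in elsa_web.conf so the web server loads it.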
### File Locations

The main ELSA configuration files are /etc/elsa_node.conf and /etc/elsa_web.conf. All configuration is controlled through these files, except for query permissions, which are stored in the database and administered through the web interface. Nodes read the elsa_node.conf file on every batch load, so changes may be made to it without having to restart Syslog-NG.

Most Linux distributions do not ship recent versions of Syslog-NG. Therefore, the install compiles it from source and installs it to $BASE_DIR/syslog-ng, with the configuration file in $BASE_DIR/syslog-ng/etc/, where it will be read by default. By default, $BASE_DIR is /usr/local and $DATA_DIR is /data. Syslog-NG writes raw files to $DATA_DIR/elsa/tmp/buffers/ and loads them into the index and archive tables at an interval configured in elsa_node.conf, which is 60 seconds by default. The files are deleted upon successful load. When the logs are bulk inserted into the database, Sphinx is called to index the new rows. When indexing is complete, the loader records the new index in the database, which makes it available to the next query. Indexes are stored in $DATA_DIR/sphinx and occupy about as much space as the raw data stored in MySQL.
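Putting the default paths together, the on-disk layout looks roughly like this:

```
/usr/local/elsa/            # ELSA code ($BASE_DIR/elsa)
/usr/local/syslog-ng/etc/   # compiled Syslog-NG and its configuration
/data/elsa/tmp/buffers/     # raw text buffers awaiting batch load
/data/sphinx/               # Sphinx indexes
```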
Archive tables typically compress at a 10:1 ratio and therefore use only about 5% of the total space allocated to logs, compared with the index tables and indexes themselves. The index tables are necessary because Sphinx searches return only the IDs of the matching logs, not the logs themselves, so a primary key lookup is required to retrieve the raw log for display. For this reason, archive tables alone are insufficient: they do not contain a primary key.
If desired, MySQL database files can be stored in a specific directory by adding the "mysql_dir" directive to elsa_node.conf and pointing it to a folder that has the proper permissions and SELinux/AppArmor security settings.
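For example (the path shown is illustrative), in elsa_node.conf:

```
"mysql_dir": "/data/elsa/mysql"
```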
### Hosting all files locally

If your ELSA web server will not have Internet access, you will need to host the Javascript for the web pages locally. To do this, after installing:

```sh
cd /usr/local/elsa/web/inc
wget "http://yuilibrary.com/downloads/yui2/yui_2.9.0.zip"
unzip yui_2.9.0.zip
```
Edit the elsa_web.conf file and set yui/local to be "inc" and comment out "version" and "modifier."
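The resulting fragment of elsa_web.conf would look something like this (the exact key layout and values are illustrative; "version" and "modifier" are the lines to comment out):

```
"yui": {
    "local": "inc"
    #"version": "2.9.0",
    #"modifier": "min"
}
```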
### Caveats for Local File Hosting

If Internet access is not available, some plugins will not function correctly. In particular, the whois plugin uses an external web service to do lookups, and these will not be possible without Internet connectivity. In addition, dashboards will not work if the client's browser cannot reach Google to pull down its graphing library.

### Web Server

The web frontend is typically served with Apache, but the Plack Perl module allows any web server to be used, including a standalone server called Starman, which can be downloaded from CPAN. Any implementation will still have all authentication features available because they are implemented in the underlying Perl. The server is backed by the ELSA web database (elsa_web by default), which stores user information including permissions, the query log, stored results, and query schedules for alerting.
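As a minimal sketch of the Starman option (the .psgi path is hypothetical; point Starman at the PSGI file your install provides):

```sh
# Install Starman from CPAN, then serve the ELSA PSGI app without Apache.
cpanm Starman
starman --listen :8080 /usr/local/elsa/web/lib/Web.psgi   # path is illustrative
```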
Admins are designated by configuration variables in the elsa_web.conf file, either by system group when using local auth, or by LDAP/AD group when using LDAP auth. To designate a group as an admin, add the group to the array in the configuration. Under the "none" auth mode, all users are admins because they are all logged in under a single pseudo-username.
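A sketch of such a designation (the admin_groups key reflects a common default elsa_web.conf layout; verify the key name against your own config):

```
"admin_groups": [ "root", "admin" ]
```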
The web server is required for both log collectors and log searchers (node and web) because searches query nodes (peers) using a web services API.
### Configuration

Most settings in the elsa_web.conf and elsa_node.conf files should be fine with the defaults, but there are a few important settings which need to be changed depending on the environment.

### elsa_web.conf

- Nodes: Contains the connection information to the log node databases which hold the actual data.
- Auth_method: Controls how authentication and authorization occur. For LDAP, the ldap settings must also be filled out.
- Link_key: Should be changed to something other than the default. It is used to salt the auth hashes for permalinks.
- Email: For alerts and archive query notifications, you need to set up the email server to use. If you wish to get the actual results from an alert, in addition to a link to the results, add the following config to the email section:

```
"email": {
    "include_data": 1
}
```
- Meta_db: Should point to the database which stores the web management information. This can reside on a node, but probably shouldn't. The performance won't be much of a factor, so running this locally on the web server should be fine.
- Excluded_classes: If you want to remove some classes from the menus and searches altogether, configure the excluded_classes entry like this:

```
"excluded_classes": { "BRO_SSL": 1 },
```

- APIKeys: The "apikeys" hash holds all known username/apikey combinations, such as:

```
"apikeys": { "elsa": "abc" }
```
- Peers: Configuration for how this ELSA node will talk to other ELSA nodes. Note that a configuration for itself (127.0.0.1) is required for any query to complete. An example configuration is:

```
"peers": {
    "127.0.0.1": {
        "url": "http://127.0.0.1/",
        "user": "elsa",
        "apikey": "abc"
    }
}
```
- Default OR: By default, all search terms must be present in an event to constitute a match (AND). If you wish, you can set the config value "default_or" to a true value to change the default behavior so that a search matches if any of the given terms is present:

```
"default_or": 1
```