[WIP] Add a README
SwooshyCueb committed Jul 26, 2022
1 parent 84de099 commit f534d8d
Showing 1 changed file with 106 additions and 0 deletions.
106 changes: 106 additions & 0 deletions irods_audit_elk_stack/README.md
@@ -0,0 +1,106 @@
# iRODS Audit Demonstration ELK Stack Container

This is an ELK-like stack container used for demonstrating the
[audit rule engine plugin](https://github.com/irods/irods_rule_engine_plugin_audit_amqp) for
[iRODS](https://irods.org/).

## CAVEAT EMPTOR

While some effort has been made to ensure this container lives up to certain standards of quality,
it is by no means production-ready.
This is by design.
The express purpose of this container is to demonstrate the audit rule engine plugin,
not to serve as an example of a properly configured ELK stack for use in production environments.
Case in point:
Elastic Stack security is explicitly disabled,
and the Python script standing in for Logstash was not written with performance or resilience in mind.

## Container Contents

### Overview

This Ubuntu Focal-based container contains Elasticsearch 8, Kibana 8, RabbitMQ (with AMQP 1.0 plugin and management plugin), and a Python daemon that was specifically written for this demonstration to stand in for an AMQP 1.0-capable Logstash.

RabbitMQ receives AMQP 1.0 messages containing JSON data from the audit plugin. The Python daemon takes the messages from RabbitMQ, does a little type conversion in the JSON, and puts the information in Elasticsearch. Kibana is configured with a sample dashboard that displays extracted metrics from the data in Elasticsearch.
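
The README does not say which conversions the daemon performs, so the following is only a minimal sketch of what "a little type conversion" could look like. It assumes that integer values arrive as JSON strings and should become real integers before indexing. The function names and the sample field names are hypothetical and are not taken from the actual script.

```python
import json
import re

# Matches optionally signed base-10 integers such as "42" or "-7".
_INT_RE = re.compile(r"-?\d+")


def coerce_integer_strings(value):
    """Recursively turn integer-looking strings into ints; leave the rest alone."""
    if isinstance(value, dict):
        return {key: coerce_integer_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [coerce_integer_strings(item) for item in value]
    if isinstance(value, str) and _INT_RE.fullmatch(value):
        return int(value)
    return value


def convert_message(body):
    """Decode one JSON message body and apply the type conversion."""
    return coerce_integer_strings(json.loads(body))


if __name__ == "__main__":
    # Hypothetical message; the field names are made up for illustration.
    sample = '{"pid": "1234", "rule_name": "example_rule", "file_size": "2048"}'
    print(convert_message(sample))
```

Converting before indexing matters because Elasticsearch infers a field's mapping from the first value it sees, so a number that arrives as a string would otherwise be mapped as text.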

Other than the Ubuntu release itself and the major versions of Temurin, Elasticsearch, and Kibana, the Dockerfile does not pin specific software versions. Packages can therefore be updated to their latest versions by rebuilding the Docker image without cache.

### Credentials

RabbitMQ is configured with an administrator account with username `test` and password `test`.
Security is explicitly disabled in Elasticsearch. No credentials are required for Kibana.

### Entrypoint

Upon running the container, RabbitMQ, Elasticsearch, the Logstash stand-in, and Kibana will start up, in that order.
Once all services are running, the entrypoint script runs `ip addr`, which makes the container's IP address easy to find.

#### Arguments

The entrypoint takes a single optional argument, `--es-java-heap-size`, to set the Elasticsearch Java heap size. By default, it is set to `512m`. It can be set to any value that Java would recognize, or `auto` to allow Elasticsearch/Java to decide on a heap size automatically.
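
The entrypoint script itself is not shown here, so the following is a sketch of the argument handling described above, not the real implementation. It assumes the heap size is applied by setting the JVM's minimum and maximum heap (`-Xms`/`-Xmx`) to the same value, and that `auto` means passing no heap flags at all. The function name `build_es_java_opts` is hypothetical.

```python
import argparse


def build_es_java_opts(argv):
    """Map the entrypoint argument onto a JVM options string for Elasticsearch."""
    parser = argparse.ArgumentParser()
    # Default taken from the README; any value Java accepts may be passed.
    parser.add_argument("--es-java-heap-size", default="512m")
    args = parser.parse_args(argv)

    size = args.es_java_heap_size
    if size == "auto":
        # Assumption: no heap flags lets Elasticsearch/Java size the heap itself.
        return ""
    # Assumption: min and max heap are pinned to the same value.
    return f"-Xms{size} -Xmx{size}"


if __name__ == "__main__":
    print(build_es_java_opts([]))
    print(build_es_java_opts(["--es-java-heap-size", "1g"]))
```

A string built this way could be handed to Elasticsearch through the `ES_JAVA_OPTS` environment variable, although the real entrypoint may use a different mechanism.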

### Relevant Ports

| Port | Description |
| ------: | :------------------------------------------------------------------------------------ |
| `5672` | RabbitMQ listens on this port for AMQP 1.0 (and AMQP 0-9-1) clients |
| `15672` | RabbitMQ management plugin listens on this port for web browsers and HTTP API clients |
| `5601` | Kibana listens on this port for web browsers and REST API clients |
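
To confirm that the services in the table are reachable from outside the container, a small TCP probe is enough. This is a sketch that only checks whether a port accepts connections; it does not speak AMQP or HTTP. The host address in the usage section is a placeholder, so substitute the address printed by `ip addr`.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    container_ip = "172.17.0.2"  # placeholder; use the address printed by `ip addr`
    for port in (5672, 15672, 5601):
        state = "open" if port_is_open(container_ip, port) else "closed"
        print(f"{port}: {state}")
```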

### Details

#### JVM

The JDK/JRE used in this container is [Temurin](https://adoptium.net/temurin) 17 with the HotSpot JVM.

The decision not to use Elasticsearch's bundled JDK/JRE was made for two reasons:
- To de-bloat the container image. Having multiple JDK/JRE installations uses a lot of space.
- To ensure everything uses the same JDK/JRE installation.

Temurin was chosen over the distro-provided JDK/JRE for a couple of reasons:
- The Hotspot AdoptOpenJDK flavor of JVM handles memory pressure very well.
- The AdoptOpenJDK flavors of JVM work well in containers.

Instead of using the [Eclipse-provided Focal-based Temurin 17 Docker image](https://hub.docker.com/_/eclipse-temurin?tab=tags&page=1&name=17-jre-focal) <sub>[[Dockerfile](https://github.com/adoptium/containers/blob/main/17/jre/ubuntu/focal/Dockerfile.releases.full)]</sub> as our base, we install the JDK Debian package from [Adoptium's apt repository](https://adoptium.net/installation/linux#_deb_installation_on_debian_or_ubuntu). The JDK/JRE in the Eclipse-provided containers is not set up to work properly with [Ubuntu/Debian's `java-common` system](https://manpages.debian.org/buster/java-common/update-java-alternatives.8.en.html).

`dpkg` is configured to drop includes, manpages, source zips, and samples from this package, so they are not installed in the container.

#### RabbitMQ

The [`rabbitmq_amqp1_0`](https://github.com/rabbitmq/rabbitmq-server/tree/master/deps/rabbitmq_amqp1_0) and [`rabbitmq_management`](https://github.com/rabbitmq/rabbitmq-server/tree/master/deps/rabbitmq_management) plugins are enabled. The `test` administrator account is created in the Dockerfile.

#### Elasticsearch

Elasticsearch is configured for a single-node cluster. Security is explicitly disabled, as are machine learning APIs. Both the HTTP and transport ports are configured to specific ports instead of a port range (`9200` and `9300`, respectively).

Elasticsearch is initialized with an (empty) index `irods_audit`, with a field limit of `2000`.
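
For reference, an index with a raised field limit can be created through the Elasticsearch REST API using the `index.mapping.total_fields.limit` setting. The sketch below shows one way to do that with only the standard library. It assumes Elasticsearch is reachable at `http://localhost:9200` with security disabled. It is not necessarily how the Dockerfile creates the index, and the function name is hypothetical.

```python
import json
import urllib.request


def index_creation_request(base_url="http://localhost:9200",
                           index="irods_audit",
                           field_limit=2000):
    """Build a PUT request that creates an empty index with a raised field limit."""
    body = {"settings": {"index.mapping.total_fields.limit": field_limit}}
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Requires a running Elasticsearch; fails if the index already exists.
    with urllib.request.urlopen(index_creation_request()) as response:
        print(response.status, response.read().decode("utf-8"))
```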

Starting with Elasticsearch 8, `init.d` scripts are no longer included in the deb packages, in favor of systemd unit files. As such, we provide our own `init.d` script based on the `init.d` script provided by the Elasticsearch 7 packages.

The Elasticsearch JVM is configured to not dump its heap on an out-of-memory error.

`dpkg` is configured to drop the bundled JVM from the Elasticsearch package, so it is not installed in the container.

#### Kibana

Kibana is initialized with a sample dashboard useful for demonstrating how one might use Kibana to aggregate metrics from audit data.

Starting with Kibana 8, `init.d` scripts are no longer included in the deb packages, in favor of systemd unit files. As such, we provide our own `init.d` script based on the `init.d` script provided by the Elasticsearch 7 packages.
Compared to other `init.d` script implementations for Kibana (and the systemd unit), our `init.d` script performs health checks on the running Kibana server. This means that the `start` command does not return until Kibana has finished starting up, and the `status` command tries to verify that Kibana is not degraded.
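
The kind of health check described above can be approximated by querying Kibana's `/api/status` endpoint. The sketch below is not the `init.d` script's actual logic. It assumes that Kibana 8 reports its overall state under `status.overall.level` and that `available` is the healthy value, which should be verified against the Kibana version in use. The function names are hypothetical.

```python
import json
import urllib.request


def overall_level(status_document):
    """Extract the overall status level, or None if the document lacks it."""
    return status_document.get("status", {}).get("overall", {}).get("level")


def kibana_is_healthy(status_document):
    """Treat only the 'available' level as healthy (assumed Kibana 8 format)."""
    return overall_level(status_document) == "available"


def fetch_status(base_url="http://localhost:5601", timeout=5.0):
    """Fetch and decode the status document from a running Kibana."""
    with urllib.request.urlopen(f"{base_url}/api/status", timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires a running Kibana on port 5601.
    print("healthy" if kibana_is_healthy(fetch_status()) else "not healthy")
```

A `start` command built on a check like this would poll until the check passes or a timeout expires, which is what allows it to return only once Kibana is actually ready.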

`dpkg` is configured to drop includes and manpages from Kibana's bundled `nodejs`, so they are not installed in the container.

#### Logstash Stand-In Python Script



## Updating This Container



## README TODO

- init scripts
- not-logstash details
- Updating the container
- Updating the `ndjson`
