official release of v3.2.0 (#1234)
* bumping version to 3.2.0

* migrating Athena function to use tf_lambda module (#1217)

* rename of athena function

* updating terraform generation code to use tf_lambda module

* updating tf_athena module to remove lambda code

* updates for packaging, rollback, and deploy

* misc updates related to config path renaming, etc

* removing no-longer-used method (athena is default)

* addressing PR feedback

* adding more granular time prefix to athena client

* fixing duplicate resource issues (#1218)

* fixing duplicate resource issues

* fixing some other bugs in #1217

* fixing tf targets for athena deploy (#1220)

* adding "--config-dir" flag to CLI to support specifying path for config files (#1224)

* adding support for supplying path to config via CLI flag

* misc touchups

* updating publishers to accept configurable paths (#1223)

* moving matchers outside of rules directory

* updating rules for new matcher path

* updating unit test for consistency

* making publisher locations configurable

* fixing typo

* updating tf_lambda module to remove extra resources (#1225)

* fixing rollback for all functions, removing 'all' flag for function deploys (#1222)

* updating rollback functionality to include all funcs

* updating tests to check for rollback of all funcs

* updating docs

* fixing tf cycle and index issue (#1226)

* Add missing dependency (#1228)

* Implements a v2 Lambda Output with AssumeRole (#1227)

* First draft of aws-lambda-v2

* Tests

* Fixup

* Fixup

* Fixup

* Fixup

* fixup

* adding terraform references for some buckets (#1229)

* adding athena terraform references instead of literals

* fixing tests

* GitHub Actions (#1231)

* port to github actions

* remove travis

* cover the 3.2 branch for now too

* initial updates to simplify lambda packaging logic (#1232)

* moving some precompiled files

* initial revamp to packaging to remove multiple packages

* taking out more trash

* update scheduled queries module

* updating deploy logic to suck garbage slightly less

* updates to unit tests

* addressing pr feedback

* addressing PR feedback

* small update to docs (#1233)

Co-authored-by: Ryxias <derek.wang@airbnb.com>
Co-authored-by: Paul Kehrer <paul.l.kehrer@gmail.com>
3 people authored Apr 9, 2020
1 parent 41da6b5 commit 4afadf5
Showing 91 changed files with 1,330 additions and 1,447 deletions.
47 changes: 47 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,47 @@
name: Actions CI
on:
  pull_request: {}
  push:
    branches:
      - master
      - release-3-2-0
    tags:
      - 'v*.*.*'

jobs:
  testing:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python:
          - 3.7
        task:
          - name: Lint
            command: |
              ./tests/scripts/pylint.sh
          - name: Test
            command: |
              ./tests/scripts/unit_tests.sh
              ./manage.py test rules
              ./manage.py test classifier
          - name: Docs
            command: |
              sphinx-build -W docs/source docs/build
          - name: Bandit
            command: |
              bandit --ini setup.cfg -r .
    name: "Python ${{ matrix.python }}/${{ matrix.task.name }}"
    steps:
      - uses: "actions/checkout@v2"
      - uses: "actions/setup-python@v1"
        with:
          python-version: ${{ matrix.python }}
      - name: Install requirements
        run: pip install -r requirements.txt
      - name: ${{ matrix.task.name }}
        run: ${{ matrix.task.command }}
      - name: Submit Coverage
        run: coveralls
        if: matrix.task.name == 'Test'
        env:
          COVERALLS_REPO_TOKEN: ${{ secrets.COVERALLS_REPO_TOKEN }}
19 changes: 0 additions & 19 deletions .travis.yml

This file was deleted.

4 changes: 2 additions & 2 deletions README.rst
@@ -1,8 +1,8 @@
StreamAlert - Serverless, Realtime Data Analysis Framework
==========================================================

.. image:: https://travis-ci.org/airbnb/streamalert.svg?branch=master
:target: https://travis-ci.org/airbnb/streamalert
.. image:: https://github.com/airbnb/streamalert/workflows/Actions%20CI/badge.svg
:target: https://github.com/airbnb/streamalert/actions?query=workflow%3AActions+CI

.. image:: https://coveralls.io/repos/github/airbnb/streamalert/badge.svg?branch=master
:target: https://coveralls.io/github/airbnb/streamalert?branch=master
7 changes: 7 additions & 0 deletions conf/global.json
@@ -13,6 +13,13 @@
],
"scheduled_query_locations": [
"scheduled_queries"
],
"publisher_locations": [
"publishers"
],
"third_party_libraries": [
"pathlib2==2.3.5",
"policyuniverse==1.3.2.1"
]
},
"infrastructure": {
7 changes: 3 additions & 4 deletions conf/lambda.json
@@ -50,8 +50,10 @@
"subnet_ids": []
}
},
"athena_partition_refresh_config": {
"athena_partitioner_config": {
"concurrency_limit": 10,
"memory": 128,
"timeout": 300,
"file_format": null,
"log_level": "info"
},
@@ -101,9 +103,6 @@
}
},
"sqs_record_batch_size": 10,
"third_party_libraries": [
"pathlib2"
],
"timeout": 60,
"vpc_config": {
"security_group_ids": [],
3 changes: 3 additions & 0 deletions conf/outputs.json
@@ -2,6 +2,9 @@
"aws-lambda": {
"sample-lambda": "function-name:qualifier"
},
"aws-lambda-v2": [
"sample-lambda"
],
"aws-s3": {
"bucket": "aws-s3-bucket"
},
2 changes: 1 addition & 1 deletion docs/source/apps.rst
@@ -150,7 +150,7 @@ The recommended process is to deploy both the `apps` function and the `classifie

.. code-block:: bash
python manage.py deploy --function classifier apps
python manage.py deploy --functions classifier apps
Authorizing the Slack App
2 changes: 1 addition & 1 deletion docs/source/architecture.rst
@@ -40,7 +40,7 @@ configured `outputs <outputs.html>`_. All alerts implicitly include a Firehose o
an S3 bucket that can be queried with Athena. Alerts will be retried indefinitely until they are
successfully delivered, at which point they will be removed from the DynamoDB table.

6. An "athena partition refresh" Lambda function runs periodically to onboard new StreamAlert data
6. An Athena Partitioner Lambda function runs periodically to onboard new StreamAlert data
and alerts into their respective Athena databases for historical search.

Other StreamAlert components include DynamoDB tables and Lambda functions for optional rule
8 changes: 8 additions & 0 deletions docs/source/config-global.rst
@@ -69,6 +69,12 @@ Configuration
],
"scheduled_query_locations": [
"scheduled_queries"
],
"publisher_locations": [
"publishers"
],
"third_party_libraries": [
"pathlib2==2.3.5"
]
}
}
@@ -82,6 +88,8 @@ Options
``matcher_locations`` Yes ``["matchers"]`` List of local paths where ``matchers`` are defined
``rule_locations`` Yes ``["rules"]`` List of local paths where ``rules`` are defined
``scheduled_query_locations`` Yes ``["scheduled_queries"]`` List of local paths where ``scheduled_queries`` are defined
``publisher_locations`` Yes ``["publishers"]`` List of local paths where ``publishers`` are defined
``third_party_libraries`` No ``["pathlib2==2.3.5"]`` List of third party dependencies that should be installed via ``pip`` at deployment time. These are libraries needed in rules, custom code, etc that are defined in one of the above settings.
============================= ============= ========================= ===============
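As a rough illustration of the ``third_party_libraries`` option described above, the sketch below builds a ``pip`` command that installs the pinned requirements into a build directory. The function name ``build_pip_command``, the ``general`` section key, and the ``build`` target directory are assumptions made for this example; this is not StreamAlert's actual packaging code.

```python
import sys
from typing import Dict, List


def build_pip_command(global_config: Dict, target_dir: str) -> List[str]:
    """Return a pip command that installs the configured libraries into target_dir.

    Assumes (hypothetically) that the option lives under a "general" section
    of the loaded global config. Returns an empty list when nothing is configured.
    """
    libraries = global_config.get("general", {}).get("third_party_libraries", [])
    if not libraries:
        return []  # the option is not required, so there may be nothing to install
    # "pip install --target <dir>" installs packages into the given directory
    return [sys.executable, "-m", "pip", "install", "--target", target_dir, *libraries]


config = {"general": {"third_party_libraries": ["pathlib2==2.3.5"]}}
command = build_pip_command(config, "build")
```

The resulting list could be passed to ``subprocess.run``. Pinning exact versions, as in the example config, keeps deployments reproducible.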


24 changes: 12 additions & 12 deletions docs/source/deployment.rst
@@ -35,20 +35,20 @@ To deploy new changes for all AWS Lambda functions:

.. code-block:: bash
python manage.py deploy --function all
python manage.py deploy
Optionally, to deploy changes for only a specific AWS Lambda function:

.. code-block:: bash
python manage.py deploy --function alert
python manage.py deploy --function alert_merger
python manage.py deploy --function apps
python manage.py deploy --function athena
python manage.py deploy --function classifier
python manage.py deploy --function rule
python manage.py deploy --function rule_promo
python manage.py deploy --function threat_intel_downloader
python manage.py deploy --functions alert
python manage.py deploy --functions alert_merger
python manage.py deploy --functions apps
python manage.py deploy --functions athena
python manage.py deploy --functions classifier
python manage.py deploy --functions rule
python manage.py deploy --functions rule_promo
python manage.py deploy --functions threat_intel_downloader
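The change from ``--function all`` to an optional ``--functions`` flag can be pictured with a small ``argparse`` sketch: omitting the flag selects every function, and one or more names may follow it. The parser below is a hypothetical stand-in written for illustration, not the real ``manage.py`` implementation; only the function names are taken from the examples above.

```python
import argparse

# Function names as they appear in the deploy examples above
FUNCTIONS = [
    "alert",
    "alert_merger",
    "apps",
    "athena",
    "classifier",
    "rule",
    "rule_promo",
    "threat_intel_downloader",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser for a deploy-style command."""
    parser = argparse.ArgumentParser(prog="manage.py deploy")
    parser.add_argument(
        "--functions",
        nargs="+",          # accept one or more function names
        choices=FUNCTIONS,  # reject unknown names
        default=FUNCTIONS,  # no flag means "deploy everything"
        metavar="FUNCTION",
        help="one or more functions to deploy (default: all)",
    )
    return parser


all_functions = build_parser().parse_args([]).functions
selected = build_parser().parse_args(["--functions", "classifier", "apps"]).functions
```

Making "all" the default removes the need for a special ``all`` value that could be confused with a function name.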
To apply infrastructure level changes (additional Kinesis Shards, new CloudTrails, etc), run:

@@ -95,8 +95,8 @@ to point to the previous version:

.. code-block:: bash
python manage.py rollback --function rule
python manage.py rollback --function alert
python manage.py rollback --function all
python manage.py rollback --functions rule
python manage.py rollback --functions alert
python manage.py rollback
This is helpful to quickly revert changes to Lambda functions, e.g. if a bad rule was deployed.
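Pointing an alias at "the previous version" can be sketched as a small helper. The example assumes that published Lambda versions are numbered sequentially and that the previous version is simply the current number minus one; the helper name ``previous_version`` is hypothetical, and the real rollback logic may differ.

```python
from typing import Optional


def previous_version(current: str) -> Optional[str]:
    """Return the version to roll back to, or None if rollback is not possible.

    Assumes published versions are numeric strings such as "7".
    """
    if not current.isdigit():
        return None  # e.g. "$LATEST" has no numbered predecessor
    number = int(current)
    if number <= 1:
        return None  # version 1 has nothing before it
    return str(number - 1)
```

A caller would skip any function for which this returns ``None`` rather than fail the whole rollback.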
6 changes: 3 additions & 3 deletions docs/source/getting-started.rst
@@ -109,7 +109,7 @@ Deploy

.. code-block:: bash
"athena_partition_refresh_config": {
"athena_partitioner_config": {
"concurrency_limit": 10,
"file_format": "parquet",
"log_level": "info"
@@ -237,7 +237,7 @@ alerts on any usage of the root AWS account. Change the rule decorator to:
python manage.py build
# Deploy a new version of all of the Lambda functions with the updated rule and config files
python manage.py deploy --function all
python manage.py deploy
.. note:: Use ``build`` and ``deploy`` to apply any changes to StreamAlert's
configuration or Lambda functions, respectively. Some changes (like this example) require both.
@@ -284,7 +284,7 @@ dropdown on the left and preview the ``alerts`` table:
:target: _images/athena-alerts-search.png

(Here, my name prefix is ``testv2``.) If no records are returned, look for errors
in the Athena Partition Refresh function or try invoking it directly.
in the Athena Partitioner function or try invoking it directly.

And there you have it! Ingested log data is parsed, classified, and scanned by the rules engine.
Any resulting alerts are delivered to your configured output(s) within a matter of minutes.
26 changes: 13 additions & 13 deletions docs/source/historical-search.rst
@@ -6,7 +6,7 @@ StreamAlert historical search feature is backed by Amazon S3 and `Athena <https:
By default, StreamAlert sends all alerts to S3, where they are searchable in an Athena table. StreamAlert
users also have the option to enable the historical search feature for data.

As of StreamAlert v3.1.0, a new field, ``file_format``, has been added to ``athena_partition_refresh_config``
As of StreamAlert v3.1.0, a new field, ``file_format``, has been added to ``athena_partitioner_config``
in ``conf/lambda.json``, defaulting to ``null``. This field allows users to configure how the data processed
by the Classifier is stored in the S3 bucket, either in ``parquet`` or ``json``.

@@ -39,7 +39,7 @@ The pipeline is:

#. StreamAlert creates an Athena Database, alerts kinesis Firehose and ``alerts`` table during initial deployment
#. Optionally create Firehose resources and Athena tables for historical data retention
#. S3 events will be sent to an SQS that is mapped to the Athena Partition Refresh Lambda function
#. S3 events will be sent to an SQS that is mapped to the Athena Partitioner Lambda function
#. The Lambda function adds new partitions when there are new alerts or data saved in S3 bucket via Firehose
#. Alerts, and optionally data, are available for searching via Athena console or the Athena API
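The partition step in the pipeline above can be sketched as a function that turns an S3 object key into an Athena ``ALTER TABLE ... ADD PARTITION`` statement. The key layout (``<table>/dt=YYYY-MM-DD-HH/...``), the ``dt`` partition column, and the function name are assumptions chosen for illustration; the actual partitioner function may parse keys differently.

```python
import re
from typing import Optional

# Assumed key layout: "<table>/dt=YYYY-MM-DD-HH/<file>"
PARTITION_KEY = re.compile(r"^(?P<table>[a-z0-9_]+)/dt=(?P<dt>\d{4}-\d{2}-\d{2}-\d{2})/")


def add_partition_statement(bucket: str, key: str) -> Optional[str]:
    """Build the Athena DDL that registers the partition containing this S3 key.

    Returns None when the key does not match the assumed layout.
    """
    match = PARTITION_KEY.match(key)
    if not match:
        return None
    table, dt = match.group("table"), match.group("dt")
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt = '{dt}') "
        f"LOCATION 's3://{bucket}/{table}/dt={dt}/'"
    )


statement = add_partition_statement("my-bucket", "alerts/dt=2020-04-09-12/file.parquet")
```

``ADD IF NOT EXISTS`` makes the statement safe to repeat, which matters because several S3 events typically arrive for the same hourly partition.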

@@ -50,30 +50,30 @@ Alerts Search
*************

* Review the settings for the :ref:`Alerts Firehose Configuration <alerts_firehose_configuration>` and
the :ref:`Athena Partition Refresh<configure_athena_partition_refresh_lambda>` function. Note that
the :ref:`Athena Partitioner<configure_athena_partitioner_lambda>` function. Note that
the Athena database and alerts table are created automatically when you first deploy StreamAlert.
* If the ``file_format`` value within the :ref:`Athena Partition Refresh<configure_athena_partition_refresh_lambda>`
* If the ``file_format`` value within the :ref:`Athena Partitioner<configure_athena_partitioner_lambda>`
function config is set to ``parquet``, you can run the ``MSCK REPAIR TABLE alerts`` command in
Athena to load all available partitions and then alerts can be searchable. Note, however, that the
``MSCK REPAIR`` command cannot load new partitions automatically.
* StreamAlert includes a Lambda function to automatically add new partitions for Athena tables when
the data arrives in S3. See :ref:`configure_athena_partition_refresh_lambda`
the data arrives in S3. See :ref:`configure_athena_partitioner_lambda`

.. code-block:: bash
{
"athena_partition_refresh_config": {
"athena_partitioner_config": {
"concurrency_limit": 10,
"file_format": "parquet",
"log_level": "info"
}
}
* Deploy the Athena Partition Refresh Lambda function
* Deploy the Athena Partitioner Lambda function

.. code-block:: bash
python manage.py deploy --function athena
python manage.py deploy --functions athena
* Search alerts in `Athena Console <https://console.aws.amazon.com/athena>`_

@@ -99,7 +99,7 @@ It is optional to store data in S3 bucket and available for search in Athena tab

.. code-block:: bash
python manage.py deploy --function classifier
python manage.py deploy --functions classifier
* Search data `Athena Console <https://console.aws.amazon.com/athena>`_

@@ -109,7 +109,7 @@ It is optional to store data in S3 bucket and available for search in Athena tab
.. image:: ../images/athena-data-search.png


.. _configure_athena_partition_refresh_lambda:
.. _configure_athena_partitioner_lambda:

*************************
Configure Lambda Settings
@@ -120,8 +120,8 @@ Open ``conf/lambda.json``, and fill in the following options:
=================================== ======== ==================== ===========
Key Required Default Description
----------------------------------- -------- -------------------- -----------
``enabled`` Yes ``true`` Enables/Disables the Athena Partition Refresh Lambda function
``enable_custom_metrics`` No ``false`` Enables/Disables logging of metrics for the Athena Partition Refresh Lambda function
``enabled`` Yes ``true`` Enables/Disables the Athena Partitioner Lambda function
``enable_custom_metrics`` No ``false`` Enables/Disables logging of metrics for the Athena Partitioner Lambda function
``log_level`` No ``info`` The log level for the Lambda function, can be either ``info`` or ``debug``. Debug will help with diagnosing errors with polling SQS or sending Athena queries.
``memory`` No ``128`` The amount of memory (in MB) allocated to the Lambda function
``timeout`` No ``60`` The maximum duration of the Lambda function (in seconds)
@@ -134,7 +134,7 @@ Key Required Default Descriptio
.. code-block:: json
{
"athena_partition_refresh_config": {
"athena_partitioner_config": {
"log_level": "info",
"memory": 128,
"buckets": {
8 changes: 4 additions & 4 deletions docs/source/outputs.rst
@@ -44,9 +44,9 @@ Adding a new configuration for a currently supported service is handled using ``
.. note::

``<SERVICE_NAME>`` above should be one of the following supported service identifiers.
``aws-cloudwatch-log``, ``aws-firehose``, ``aws-lambda``, ``aws-s3``, ``aws-sns``, ``aws-sqs``,
``carbonblack``, ``github``, ``jira``, ``komand``, ``pagerduty``, ``pagerduty-incident``,
``pagerduty-v2``, ``phantom``, ``slack``
``aws-cloudwatch-log``, ``aws-firehose``, ``aws-lambda``, ``aws-lambda-v2``, ``aws-s3``,
``aws-sns``, ``aws-sqs``, ``carbonblack``, ``github``, ``jira``, ``komand``, ``pagerduty``,
``pagerduty-incident``, ``pagerduty-v2``, ``phantom``, ``slack``

For example:

@@ -158,7 +158,7 @@ The ``OutputProperty`` object used in ``get_user_defined_properties`` is a ``nam

:cred_requirement:
A ``boolean`` that indicates whether this value is required for API access with this service. Ultimately, setting this value to ``True`` indicates
that the value should be encrypted and stored in Amazon S3.
that the value should be encrypted and stored in Amazon Systems Manager.
Default is: ``False``


2 changes: 1 addition & 1 deletion docs/source/rule-promotion.rst
@@ -76,7 +76,7 @@ function code.

.. code-block:: bash
python manage.py deploy --function rule_promo
python manage.py deploy --functions rule_promo
.. note::

2 changes: 1 addition & 1 deletion docs/source/rule-staging.rst
@@ -83,7 +83,7 @@ staged during a deploy. To allow for this, the Rules Engine can be deployed with

.. code-block:: bash
python manage.py deploy --function rule --skip-rule-staging
python manage.py deploy --functions rule --skip-rule-staging
This will force all new rules to send to user-defined outputs immediately upon deploy, bypassing
the default staging period. Alternatively, the ``--stage-rules`` and ``--unstage-rules`` flags