
Create alert merger which dispatches alerts from Dynamo to alert processors #642

Merged
7 commits merged into master from austin-merger on Mar 21, 2018

Conversation

@austinbyers (Contributor) commented Mar 19, 2018

to: @ryandeivert
cc: @airbnb/streamalert-maintainers
size: large

Background

This new Lambda function (the Alert Merger) will, as its name implies, eventually be responsible for alert merging. For now, it simply reads alerts from the alerts Dynamo table and dispatches them to the alert processor.

The lifecycle of an alert is now as follows:

  1. The rule processor generates alerts and saves them in batches to Dynamo
  2. The alert merger runs every minute, finding all alerts in Dynamo which have not yet been sent or which need to be retried
  3. The alert merger invokes the alert processor once for each alert, providing the Dynamo record as the payload
  4. The alert processor deletes the Dynamo record once it is sent successfully, or else updates the list of outputs which need to be retried

Note that this PR does not implement alert merging; it just sets up the last piece of infrastructure required to make it happen. A rough sketch of the dispatch flow is included below for illustration.
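
To make steps 2 and 3 above concrete, the dispatch loop boils down to scanning the alerts table and asynchronously invoking the alert processor once per pending record. The table name, function name, and scan logic below are hypothetical placeholders, not the actual implementation:

import json
import boto3

ALERTS_TABLE = 'streamalert_alerts'              # hypothetical table name
ALERT_PROCESSOR = 'streamalert_alert_processor'  # hypothetical function name

def dispatch_pending_alerts():
    """Find alerts that still need to be sent and hand each one to the alert processor."""
    table = boto3.resource('dynamodb').Table(ALERTS_TABLE)
    lambda_client = boto3.client('lambda')

    # The real merger filters for alerts that were never sent or need to be retried
    # and paginates the scan; a bare Scan is shown only to keep the sketch short.
    response = table.scan()
    for record in response.get('Items', []):
        lambda_client.invoke(
            FunctionName=ALERT_PROCESSOR,
            InvocationType='Event',                  # asynchronous: one invocation per alert
            Payload=json.dumps(record, default=str)  # the Dynamo record is the payload
        )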

Changes

  • Rule processor no longer invokes alert processor
  • Add a new Alert Merger Lambda function, which runs every minute
  • Update the alert processor with the new invocation format
  • Support metric filters in the tf_lambda module
  • Add a new AlertAttempts custom metric emitted by the Alert Merger

Testing

  • 100% test coverage for the alert merger's main module
  • Deploy to a test account and verify alerts end-to-end
  • Deploy just the alert merger: ./manage.py lambda deploy -p alert_merger
  • Rollback just the alert merger: ./manage.py lambda rollback -p alert_merger
  • Try an alert which can't be sent and verify it is retried
  • Enable and disable custom metrics

@austinbyers austinbyers added this to the 2.0.0 milestone Mar 19, 2018
@austinbyers austinbyers requested a review from ryandeivert March 19, 2018 20:08
@coveralls commented Mar 19, 2018

Coverage Status

Coverage decreased (-0.2%) to 95.554% when pulling a53994f on austin-merger into e6bd483 on master.

ryandeivert previously approved these changes Mar 19, 2018
@ryandeivert (Contributor) left a comment


😮 this is so good!!!! no real requests from me :) well done!


def handler(event, context):  # pylint: disable=unused-argument
    """Entry point for the alert merger."""
    global MERGER  # pylint: disable=global-statement
ryandeivert (Contributor) commented:

I thought the global keyword was supposed to be avoided whenever possible. We could favor class properties (not instance properties), which would keep the client active for the life of the container. I thought we were doing this elsewhere, but I'm having trouble finding it... basically this:

class Test(object):
    CLS_PROP = None

    def __init__(self):
        if not Test.CLS_PROP:
            Test.CLS_PROP = "now created"

    def get_prop(self):
        return self.CLS_PROP

for i in range(2):
    # when i == 1 the object is already cached
    print 'Class prop before init #%d: %s' % (i+1, Test.CLS_PROP)
    tester = Test()
    print 'Class prop after init #%d: %s' % (i+1, Test.CLS_PROP)
    print 'Instance prop after init #%d: %s' % (i+1, tester.get_prop())

I like this because it still encapsulates the client within the class that actually uses it.

EDIT: Pardon my naivety - I was thinking this was a client wrapper but quickly realized it is not.

austinbyers (Contributor, Author) replied:

No, that's a good point. After some discussion, I decided to go with your suggestion and cache the instantiation in the class itself. It makes the handler itself super clean:

def handler(event, context):
    return AlertProcessor.get_instance(context.invoked_function_arn).run(event)
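
For context, this is roughly what that class-level caching looks like; the attribute names and setup details here are illustrative, not the exact code:

class AlertProcessor(object):
    """Sends an alert to its configured outputs (illustrative sketch)."""
    _INSTANCE = None  # cached for the life of the Lambda container

    def __init__(self, invoked_function_arn):
        # Expensive one-time setup (boto3 clients, output config) happens here
        self.arn = invoked_function_arn

    @classmethod
    def get_instance(cls, invoked_function_arn):
        """Create the processor on the first invocation and reuse it afterwards."""
        if cls._INSTANCE is None:
            cls._INSTANCE = cls(invoked_function_arn)
        return cls._INSTANCE

    def run(self, event):
        """Process a single alert payload and return the send results."""
        # ... send to each configured output ...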

except ClientError as error:
    if error.response['Error']['Code'] == 'ConditionalCheckFailedException':
        LOGGER.warn('Conditional update failed: %s', error.response['Error']['Message'])
    else:
ryandeivert (Contributor) commented:

Can you add some context as to why this particular error is okay and only logs, while others will be raised?

austinbyers (Contributor, Author) replied:

It's possible (though very unlikely) that the alert processor could delete the entry from the table before the alert merger updates it with its additional state. If that happens, that's totally fine - we want the rows to be deleted when we don't need them anymore.

I removed the logging entirely; it shouldn't even be a warning, just an unusual order of operations. I also added some clarifying comments.
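
The pattern in question is a conditional update guarded on the row still existing; the key schema and attribute names below are hypothetical, but they show why ConditionalCheckFailedException is safe to swallow:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('streamalert_alerts')  # hypothetical table name

def record_dispatch(rule_name, alert_id):
    """Track a dispatch attempt, but only if the alert row still exists."""
    try:
        table.update_item(
            Key={'RuleName': rule_name, 'AlertID': alert_id},
            UpdateExpression='ADD Attempts :one',
            # Fails if the alert processor already deleted the row, which is fine:
            # the alert was delivered and there is no state left to update.
            ConditionExpression='attribute_exists(AlertID)',
            ExpressionAttributeValues={':one': 1}
        )
    except ClientError as error:
        if error.response['Error']['Code'] != 'ConditionalCheckFailedException':
            raise  # any other DynamoDB error is unexpected and should propagate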

@austinbyers austinbyers dismissed ryandeivert’s stale review March 21, 2018 00:48

rebase + new metric == substantial change

@austinbyers (Contributor, Author) commented Mar 21, 2018

@ryandeivert Thanks for the feedback, I've addressed it! I also rebased with your recent change (#643) and added support for custom metrics in the alert merger. I tested again end-to-end with a new deploy, but I felt these were substantial enough code changes to warrant another review when you get a chance!

@ryandeivert (Contributor) left a comment

thanks for these changes! 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑 🐑

-    if not MERGER:
-        MERGER = AlertMerger()
-    MERGER.dispatch()
+    AlertMerger.get_instance().dispatch()
ryandeivert (Contributor) commented:

this is excellent! :) thanks for dealing with my requests!

@@ -60,6 +64,9 @@ class MetricLogger(object):
    FIREHOSE_FAILED_RECORDS = 'FirehoseFailedRecords'
    NORMALIZED_RECORDS = 'NormalizedRecords'

    # Alert Merger metric names
    ALERT_ATTEMPTS = 'AlertAttempts'
ryandeivert (Contributor) commented:

🎉
