CloudWatch Logs Firehose Method

CloudWatch Logs Firehose Method Overview

AWS CloudWatch Logs contains valuable log events from AWS services such as Lambda and API Gateway, as well as custom applications running on AWS. These events can be easily ingested into Splunk for analysis, correlation, and long-term storage. Events are streamed from CloudWatch Logs to Firehose, processed by a Lambda function, then sent on to Splunk from Firehose via the HTTP Event Collector (HEC); a minimal sketch of this Lambda processor appears after the diagram below. In the event that Firehose cannot reach Splunk, events are sent to an S3 backsplash bucket for later retrieval.

Visual overview:

Mermaid code of visual overview:

graph TB;
	cwl[CloudWatch Logs]
	adf[Amazon Data Firehose]
	splunk[Splunk]
	s3Backsplash[(Firehose Backsplash Bucket)]
    lambda[Lambda Processor]
	cwl-->adf
    adf-->lambda
    lambda-->adf
	adf-->|HEC|splunk
	adf-->|On failure sending to Splunk|s3Backsplash
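
To make the "Lambda Processor" step in the diagram more concrete, below is a minimal sketch of what a Firehose transformation function for CloudWatch Logs data can look like. It is illustrative only and is not the exact code deployed by cwlToSplunk.yml; in particular, the environment-variable names and default metadata values here are assumptions made for this example.

# Minimal sketch of a Firehose transformation Lambda for CloudWatch Logs data.
# Assumptions: metadata is passed in via environment variables (names here are
# hypothetical, not necessarily what cwlToSplunk.yml uses), and each record is
# the standard gzip-compressed CloudWatch Logs subscription payload.
import base64
import gzip
import json
import os


def handler(event, context):
    out = []
    for record in event["records"]:
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

        # CONTROL_MESSAGE records are CloudWatch Logs housekeeping; drop them.
        if payload.get("messageType") != "DATA_MESSAGE":
            out.append({"recordId": record["recordId"], "result": "Dropped", "data": record["data"]})
            continue

        # Re-emit each log event in HEC /services/collector/event format so the
        # metadata fields are set per event rather than on the HEC token.
        hec_events = "".join(
            json.dumps({
                "time": log_event["timestamp"] / 1000.0,
                "host": os.environ.get("SPLUNK_HOST", "aws"),
                "source": os.environ.get("SPLUNK_SOURCE", payload["logGroup"]),
                "sourcetype": os.environ.get("SPLUNK_SOURCETYPE", "aws:cloudwatchlogs"),
                "index": os.environ.get("SPLUNK_INDEX", "main"),
                "event": log_event["message"],
            }) + "\n"
            for log_event in payload["logEvents"]
        )
        out.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(hec_events.encode("utf-8")).decode("utf-8"),
        })
    return {"records": out}

The key point is that each log event is re-emitted in HEC event format with the index, host, source, and sourcetype set per event, which is why the sourcetype on the HEC token can be left as Automatic in step 3 of the deployment instructions below.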

Splunk Dashboard

There is code for a Splunk dashboard located at https://github.com/splunk/splunk-aws-gdi-toolkit/blob/main/CloudWatchLogs-Firehose-Resources/splunkDashboard.json that can be used to check the health of the different events coming in through this method.

Deployment Instructions

These instructions are for configuring Amazon CloudWatch Logs events to be sent to Splunk. As part of the deployment, a CloudWatch Logs subscription is created so that CloudWatch Logs events are streamed to Splunk. This can be significantly cheaper than a pull-based method that calls the CloudWatch API to retrieve the data, and it also reduces the delay in getting events into Splunk.

  1. Install the Splunk Add-on for Amazon Web Services (AWS).
    • If you use Splunk Cloud, install the add-on on the ad-hoc search head or ad-hoc search head cluster.
    • If you do not use Splunk Cloud, install this add-on on the HEC endpoint (probably either the indexer(s) or heavy forwarder), your indexer(s), and your search head(s).
  2. Configure any firewall rules in front of Splunk to receive data from Amazon Data Firehose.
    • Reference the AWS documentation for the IP ranges required. Make sure to add the IP ranges from the region you'll be deploying the CloudFormation to.
    • If you use Splunk Cloud, you'll want to add the relevant IP range(s) to the HEC feature in the IP allowlist.
    • If you do not use Splunk Cloud, you'll need to consult with your Splunk Architect, Splunk Admin, and/or network team to determine which firewall rules to change and where.
  3. Create an HEC token in Splunk to ingest the events, with indexer acknowledgment turned on, following these specific instructions (a quick way to test the token is sketched after the deployment example below):
    • Make sure to enable indexer acknowledgment.
    • Leave the sourcetype set to Automatic. The Lambda function will set the sourcetype on each event before sending it back to Firehose.
    • Select the index(es) you want the data to be sent to.
    • Amazon Data Firehose does check the format of the token, so we recommend letting Splunk generate it rather than setting it manually through inputs.conf.
    • If you use Splunk Cloud, follow these instructions.
    • If you do not use Splunk Cloud, the HEC token will need to be created on the Splunk instance that will be receiving this data (probably either the indexer(s) or a heavy forwarder). Instructions for this can be found here.
  4. Deploy the CloudWatchLogs-Firehose-Resources/cwlToSplunk.yml CloudFormation template in each region, in each account you want to collect CloudWatch Logs events from. This may mean you need to deploy this CloudFormation template multiple times. This template will create the necessary resources to stream CloudWatch Logs events to Firehose, transform the events so they are formatted correctly, then send them on to Splunk. These parameters need to be changed from the default values:
    • cwlLogGroupName: The CloudWatch log group name containing events to be sent to Splunk.
    • logType: A description of the logs to be sent to Splunk. Used in AWS resource naming.
    • splunkHECEndpoint: https://{{url}}:{{port}}
      • For Splunk Cloud, this will be https://http-inputs-firehose-{{stackName}}.splunkcloud.com:443, where {{stackName}} is the name of your Splunk Cloud stack.
      • For non-Splunk Cloud deployments, consult with your Splunk Architect or Splunk Admin.
    • splunkHECToken: The value of the HEC token from step 3.
    • splunkHost: The value of the host field to set on the CloudWatch Logs events.
    • splunkIndex: The index the events will be sent to. This needs to be a traditional event index.
    • splunkSource: The value of the source field to set on the events.
    • splunkSourcetype: The value of the sourcetype field to set on the events. Usually a custom sourcetype or one defined in the AWS TA.
  5. Verify the data is being ingested. The easiest way to do this is to wait a few minutes, then run a search like index={{ splunkIndex }} sourcetype={{ sourcetype }} | head 100, where {{ splunkIndex }} and {{ sourcetype }} are the splunkIndex and splunkSourcetype parameter values set in step 4.
  • Deploy cwlToSplunk.yml:
aws cloudformation create-stack \
  --region us-west-2 \
  --stack-name lambda-cwl-to-splunk \
  --capabilities CAPABILITY_NAMED_IAM \
  --template-body file://cwlToSplunk.yml \
  --parameters \
    ParameterKey=cwlLogGroupName,ParameterValue="'/aws/lambda/0123456789012-us-west-2-route53-lambda-function'" \
    ParameterKey=logType,ParameterValue=route53lambda \
    ParameterKey=splunkHECEndpoint,ParameterValue=https://http-inputs-firehose-contoso.splunkcloud.com:443 \
    ParameterKey=splunkHECToken,ParameterValue=01234567-89ab-cdef-0123-456789abcdef \
    ParameterKey=splunkHost,ParameterValue=aws \
    ParameterKey=splunkIndex,ParameterValue=aws \
    ParameterKey=splunkSource,ParameterValue=aws \
    ParameterKey=splunkSourcetype,ParameterValue=lambdaLog \
    ParameterKey=stage,ParameterValue=prod \
    ParameterKey=cloudWatchAlertEmail,ParameterValue=jsmith@contoso.com \
    ParameterKey=contact,ParameterValue=jsmith@contoso.com
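
To confirm that the HEC token created in step 3 accepts events before (or after) deploying the template, a quick test can be run against the HEC event endpoint. The sketch below uses Python with the requests library; the URL, token, index, and sourcetype are the same placeholder values used in the example command above and should be replaced with your own.

# Quick sanity check of the HEC token created in step 3. A sketch only; the
# endpoint URL, token, index, and sourcetype below are placeholders.
import json
import uuid

import requests

HEC_URL = "https://http-inputs-firehose-contoso.splunkcloud.com:443/services/collector/event"
HEC_TOKEN = "01234567-89ab-cdef-0123-456789abcdef"

response = requests.post(
    HEC_URL,
    headers={
        "Authorization": f"Splunk {HEC_TOKEN}",
        # Required because indexer acknowledgment is enabled on the token.
        "X-Splunk-Request-Channel": str(uuid.uuid4()),
    },
    data=json.dumps({
        "event": "HEC connectivity test from the GDI toolkit deployment steps",
        "sourcetype": "lambdaLog",
        "index": "aws",
    }),
    timeout=10,
)
print(response.status_code, response.text)  # Expect 200 and {"text":"Success","code":0,"ackId":...}

A 200 response containing an ackId confirms that the token is valid and that indexer acknowledgment is enabled, which Amazon Data Firehose requires.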

FAQ

  • What events in a LogGroup are sent to Splunk? Since no subscription filter pattern is specified, all events in the log group are sent to Splunk by default (a read-only sketch for inspecting the subscription filter follows this list).
  • How can I see what resources are being deployed? You can see the resources that are going to be deployed in the CloudFormation template.
  • Do I have to create 1 HEC token per sourcetype + index combination, or can I use the same HEC token for multiple sourcetype + index combinations? You can re-use an existing HEC token as long as it is configured to send to all of the indexes you want to send data to with it.
  • Why is the CAPABILITY_NAMED_IAM capability required to deploy cwlToSplunk.yml? There are IAM roles and permissions with custom names defined. The IAM roles/permissions are required to grant resources access to other resources (for example, CloudWatch Logs putting records to the Firehose, and Firehose invoking the Lambda processor), and custom names were set for uniform naming across all the resources deployed in the stack.
  • I found a typo. What should I do? Feel free to message me on the community Slack or submit a PR. I'm really bad at typos. Thank you!
  • How can I see statistics (event latency, license usage, event count, etc) about data coming in through this method of sending data to Splunk? There is code for a dashboard located in CloudWatchLogs-Firehose-Resources/splunkDashboard.json (see the Splunk Dashboard section above) that reports on this type of information.
  • Why not use the built-in ability for Amazon Data Firehose (KDF) to uncompress the data from CloudWatch Logs, instead of using a Lambda function like you are now? Great question! I did test out this functionality, but it has two major drawbacks and one minor one as compared to the existing solution. The first is that KDF will send batches of data to Splunk's HEC raw endpoint (as opposed to the event endpoint) and have Splunk handle things like event breaking and timestamp extraction. This puts additional load on the indexers, which is something I try to avoid. The second is that the data format coming from KDF is just different enough to require a different sourcetype to handle those items like event breaking and timestamp extraction. I'd rather rely on the aws:cloudwatchlogs sourcetype built into the AWS TA. The minor issue is that since the data is being sent to the raw endpoint, metadata fields (index, host, sourcetype, source, etc) are set at the HEC token level. This means for each unique set of metadata values, a new HEC token will need to be created, which can lead to lots of HEC tokens being created; another thing I try to avoid. I'd rather continue to use the event endpoint where the metadata fields can be set source-side, and not at the HEC token.
  • What if I want to send data to Edge Processor? The recommended way to get the data to Edge Processor is to set up the architecture defined in the EP SVA, with Firehose acting as the HEC source. You will need to configure a signed certificate on the load balancer, since Amazon Data Firehose requires that the destination it sends to has a signed certificate and an appropriate DNS entry. The DNS Round Robin architecture could also be used.
  • What if I want to send data to Ingest Processor? To get data to Ingest Processor, first send it to the Splunk Cloud environment like normal, then, when creating the pipeline, specify a partition that applies to the data being sent from the Splunk AWS GDI Toolkit. For more information on how to do this, refer to the Ingest Processor documentation.
  • I have another question. What should I do? Feel free to message me on the community Slack! I'd love to help.
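
Related to the first FAQ item above: the subscription filter that the stack creates on the log group can be inspected to confirm its filter pattern. The sketch below is read-only and uses boto3; the region and log group name are the hypothetical values from the deployment example.

# Sketch: inspect the subscription filter on a log group and see its (empty)
# filter pattern. The region and log group name below are hypothetical examples.
import boto3

logs = boto3.client("logs", region_name="us-west-2")

filters = logs.describe_subscription_filters(
    logGroupName="/aws/lambda/0123456789012-us-west-2-route53-lambda-function"
)
for f in filters["subscriptionFilters"]:
    # An empty "filterPattern" means every event in the log group is forwarded.
    print(f["filterName"], repr(f.get("filterPattern", "")), f["destinationArn"])

If only a subset of events should be forwarded, the filter pattern would normally be changed where the subscription filter is defined in the CloudFormation template, rather than overwritten by hand, so the deployed stack does not drift.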

Troubleshooting

  • The CloudFormation template fails to deploy.
    • Verify that the role you are using to deploy the CloudFormation template has the appropriate permissions to deploy the resources you're trying to deploy.
    • Verify that the parameters are set correctly on the stack.
    • Also, in the CloudFormation console, check the events on the failed stack for hints as to where it failed.
    • Also see the official Troubleshooting CloudFormation documentation.
  • Events aren't getting to Splunk. What should I check? In this order, check the following:
    1. That the CloudFormation template deployed without error.
    2. That the parameters (especially splunkHECEndpoint and splunkHECToken) are correct.
    3. That events are not being delivered to the S3 backsplash bucket instead of Splunk. The easiest way to check is to look at the bucket through the AWS Console; objects in the bucket are events that Firehose could not deliver to Splunk.
    4. That the Lambda transformation function is executing by going to the Lambda function in the AWS Console, clicking the "Monitor" tab, then the "Metrics" sub-tab and looking at the "Invocations" pane.
    5. That the Lambda function is executing successfully by going to the Lambda function in the AWS Console, clicking the "Monitor" tab, then the "Metrics" sub-tab and looking at the "Error count and success rate (%)" pane.
    6. That the Lambda function isn't producing errors by going to the Lambda function in the AWS Console, clicking the "Monitor" tab, clicking "View logs in CloudWatch", then checking the events in the Log streams.
    7. That the Amazon Data Firehose is receiving records by going to the Firehose delivery stream in the AWS Console, clicking the "Monitoring" tab if it's not selected, and viewing the "Incoming records" pane.
    8. That the Amazon Data Firehose is sending records to Splunk by going to the Firehose delivery stream in the AWS Console, clicking the "Monitoring" tab if it's not selected, and viewing the "Delivery to Splunk success" pane. You can also view the "Destination error logs" pane on that same page.
    9. That there are no errors related to ingestion in Splunk.
    10. That any firewall ports are open from Firehose to Splunk.
  • Not all of the events are getting to Splunk or Amazon Data Firehose is being throttled.
    • Amazon Data Firehose has a number of quotas associated with each Firehose. You can check whether you're being throttled by navigating to the monitoring tab in the AWS console for the Firehose and checking if the "Throttled records (Count)" value is greater than zero. If the Firehose is being throttled, you can use the Kinesis Firehose Service quota increase form to request that quota be increased.
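
As an alternative to clicking through the console, the same throttling check can be scripted. The sketch below uses boto3 to pull the ThrottledRecords metric for the delivery stream; the region and delivery stream name are hypothetical and should be replaced with the ones created by your stack.

# Sketch: check the ThrottledRecords metric for the delivery stream instead of
# using the console. The delivery stream name below is a hypothetical example.
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")
end = datetime.now(timezone.utc)

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Firehose",
    MetricName="ThrottledRecords",
    Dimensions=[{"Name": "DeliveryStreamName", "Value": "route53lambda-firehose"}],
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=300,
    Statistics=["Sum"],
)
# Any non-zero Sum means the Firehose is being throttled and a quota increase
# may be needed.
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])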