In this post, I show how you can build a business intelligence capability for streaming IoT device data using AWS IoT Core, Amazon Firehose, Amazon S3, Amazon Athena and Amazon QuickSight.
In this walkthrough, you run a script to mimic multiple sensors publishing messages on an IoT MQTT topic, with one message published every second. The events get sent to AWS IoT, where an IoT rule is configured to send all message to Firehose. From there, Firehose writes the messages in batches to objects stored in S3. In S3, you set up a table in Athena and use QuickSight to analyze the IoT data.
Archiecture for IoT-Athena-QuickSight
- Kinesis Firehose Delivery Steam Name:
IoT-to-BI-Example
- S3 Bucket:
your bucket
- S3 Prefix:
iot_to_bi_example/
- Convert record format from json to Glue automatically
- Buffer conditions:
64 MiB or 300 seconds
- Name:
Iot_to_bi_demoRule
- Attribute: *
- Topic Filter:
/health/#
- Add Action:
Send messages to an Amazon Kinesis Firehose stream (from previous section).
- Select Separator:
"\n (newline)"
Running the heartrate.py python script generates fictitious IoT messages.
python hearrate.py
{"heartRate": 72, "userId": "Brady", "rateType": "NORMAL", "dateTime": "2021-03-03 15:20:35"}
{"heartRate": 78, "userId": "Beatrice", "rateType": "NORMAL", "dateTime": "2021-03-03 15:20:36"}
{"heartRate": 65, "userId": "Beatrice", "rateType": "NORMAL", "dateTime": "2021-03-03 15:20:37"}
{"heartRate": 96, "userId": "Bonny", "rateType": "NORMAL", "dateTime": "2021-03-03 15:20:38"}
- Create a table using the following query
CREATE EXTERNAL TABLE `heartrate_iot_data`(
heartRate int,
userId string,
rateType string,
dateTime timestamp)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
LOCATION 's3://<CREATED-BUCKET>/iot_to_bi_example/';
- Create the query
# preview data
SELECT userid, rateType, heartRate, dateTime FROM heartrate_iot_data limit 20;
# count the user heart rate data
SELECT userid, rateType, COUNT(userid) FROM heartrate_iot_data GROUP BY userid,rateType
-
Choose Visualize
The
Edit/Preview
data button allows you to prepare the dataset differently prior to visualizing it. Examples include being able to format and transform your data, create alias data fields, change data types, and filter or join your data. Here we keey everything as default