Game Analytics Pipeline on AWS


Overview

The games industry is increasingly adopting the Games-as-a-Service operating model, where games have become more like a service than a product, and recurring revenue is frequently generated through in-app purchases, subscriptions, and other techniques. With this change, it is critical to develop a deeper understanding of how players use the features of games and related services. This understanding allows game developers to continually adapt and make the changes necessary to keep players engaged.

The Game Analytics Pipeline guidance helps game developers apply a flexible and scalable DataOps methodology to their games, allowing them to continuously integrate and continuously deploy (CI/CD) a scalable, serverless data pipeline for ingesting, storing, and analyzing telemetry data generated by games and services. The guidance supports streaming ingestion of data, so users can gain critical insights from their games and other applications in near real time and focus on expanding and improving the game experience almost immediately, instead of managing the underlying infrastructure operations. Because the guidance is codified as a CDK application, game developers can determine the modules or components that best fit their use case, and test and QA the architecture before deploying it into production. This modular system also allows additional AWS capabilities, such as AI/ML models, to be integrated into the architecture to further support real-time decision making and automated LiveOps using AIOps, further enhancing player engagement. In short, developers can focus on expanding game functionality rather than managing the underlying infrastructure.

Architecture

Prerequisites

Before deploying the sample code, ensure that the following required tools have been installed:

  • AWS Cloud Development Kit (CDK) 2.92
  • Python 3
  • NodeJS 16.20.0

NOTE: It is recommended that you configure and deploy the sample code using a pre-configured AWS Cloud9 development environment. Refer to the Individual user setup for AWS Cloud9 for more information on how to set up Cloud9 as the only user in the AWS account. The Cloud9 IDE may have a newer version of the CDK installed; therefore, run the npm install -g aws-cdk@2.92.0 --force command to ensure that version 2.92.0 of the CDK is installed. Additionally, because you will be building NodeJS packages, ensure that there is sufficient disk space on the Cloud9 instance. See Resize an Amazon EBS volume for more information.
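
Before moving on, you can quickly confirm that the required tool versions are in place. The first three commands simply print the installed versions, and the final command is the CDK pin from the note above:

    cdk --version         # should report 2.92.0
    node --version        # should report v16.20.0
    python3 --version     # any Python 3 release
    npm install -g aws-cdk@2.92.0 --force   # re-pin the CDK if the reported version differs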

Sample Code Configuration and Customization

Before deploying the sample code, customize it to suit your specific usage requirements. Guidance configuration and customization are managed using a config.yaml file, located in the infrastructure folder of the repository.

Configuration Setup

The following steps walk you through customizing the sample code configuration to suit your usage requirements:

  1. A configuration template file, called config.yaml.TEMPLATE, has been provided as a reference for use case customizations. Run the following command to create a usable copy of this file:

    cp ./infrastructure/config.yaml.TEMPLATE ./infrastructure/config.yaml
  2. Open the ./infrastructure/config.yaml file for editing.

Custom Settings

The following settings can be adjusted to suit your use case; a complete example configuration is shown after this list:

  • WORKLOAD_NAME
    • Description: The name of the workload that will be deployed. This name is used as a prefix for any component deployed into your AWS Account.
    • Type: String
    • Example: "GameAnalyticsPipeline"
  • CDK_VERSION
    • Description: The version of the CDK installed in your environment. To see the current version of the CDK, run the cdk --version command. The guidance has been tested using CDK version 2.92.0. If you are using a different version of the CDK, ensure that this version is also reflected in the ./infrastructure/package.json file.
    • Type: String
    • Example: "2.92.0"
  • NODE_VERSION
    • Description: The version of NodeJS being used. The default value is set to "latest", and should only be changed if you require a specific version.
    • Type: String
    • Example: "latest"
  • PYTHON_VESION
    • Description: The version of Python being used. The default value is set to "3.8", and should only be changed if you require a specific version.
    • Type: String
    • Example: "3.8"
  • DEV_MODE
    • Description: Whether or not to enable developer mode. This mode enables synthetic data and shorter retention times. It is recommended that you set this value to true when first deploying the sample code for testing, as this setting enables S3 versioning and prevents S3 buckets from being deleted on teardown. This setting can be changed at a later time, and the infrastructure re-deployed through CI/CD.
    • Type: Boolean
    • Example: true
  • ENABLE_STREAMING_ANALYTICS
    • Description: Whether or not to enable the Kinesis Data Analytics component/module of the guidance. It is recommended to set this value to true when first deploying the sample code for testing, as this setting allows you to verify whether streaming analytics is required for your use case. This setting can be changed at a later time, and the guidance re-deployed through CI/CD.
    • Type: Boolean
    • Example: true
  • STREAM_SHARD_COUNT
    • Description: The number of Kinesis shards (sequences of data records) to use for the data stream. The default value has been set to 1 for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD. For information about determining the shards required for your use case, refer to Amazon Kinesis Data Streams Terminology and Concepts in the Amazon Kinesis Data Streams Developer Guide.
    • Type: Integer
    • Example: 1
  • CODECOMMIT_REPO
    • Description: The name of the AWS CodeCommit repository used as source control for the codified infrastructure and CI/CD pipeline.
    • Type: String
    • Example: "game-analytics-pipeline"
  • RAW_EVENTS_PREFIX
    • Description: The prefix for new/raw data files stored in S3.
    • Type: String
    • Example: "raw_events"
  • PROCESSED_EVENTS_PREFIX
    • Description: The prefix for processed data files stored in S3.
    • Type: String
    • Example: "processed_events"
  • RAW_EVENTS_TABLE
    • Description: The name of the AWS Glue table in which all new/raw data is cataloged.
    • Type: String
    • Example: "raw_events"
  • GLUE_TMP_PREFIX
    • Description: The name of the temporary data store for AWS Glue.
    • Type: String
    • Example: "glueetl-tmp"
  • S3_BACKUP_MODE
    • Description: Whether or not to enable Kinesis Data Firehose to send a backup of new/raw data to S3. The default value has been set to false for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD.
    • Type: Boolean
    • Example: false
  • CLOUDWATCH_RETENTION_DAYS
    • Description: The number of days for which Amazon CloudWatch retains logs. The default value has been set to 30 for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD.
    • Type: Integer
    • Example: 30
  • API_STAGE_NAME
    • Description: The name of the REST API stage for the Amazon API Gateway configuration endpoint for sending telemetry data to the pipeline. This provides an integration option for applications that cannot integrate with Amazon Kinesis directly. The API also provides configuration endpoints for admins to use for registering their game applications with the guidance, and generating API keys for developers to use when sending events to the REST API. The default value is set to live.
    • Type: String
    • Example: "live"
  • EMAIL_ADDRESS
    • Description: The email address that receives operational notifications delivered by CloudWatch.
    • Type: String
    • Example: "user@example.com"
  • accounts
    • Description: Leverages the CDK's cross-account, cross-region capabilities to deploy separate CI/CD pipeline stages to separate AWS Accounts and AWS Regions. For more information on cross-account CI/CD pipelines using the CDK, refer to the Building a Cross-account CI/CD Pipeline workshop.
    • Example:
      accounts:
        - NAME: "QA"
          ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
          REGION: "<QA-ACCOUNT-REGION>"
        - NAME: "PROD"
          ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
          REGION: "<PROD-ACCOUNT-REGION>"

      NOTE: It is recommended that you use the same AWS Account and AWS Region for both the QA and PROD stages when first deploying the guidance.
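
Putting these settings together, a filled-in config.yaml might look like the following sketch. The values are only illustrative (taken from the examples above), and the authoritative set of keys is whatever your copy of config.yaml.TEMPLATE contains:

    WORKLOAD_NAME: "GameAnalyticsPipeline"
    CDK_VERSION: "2.92.0"
    NODE_VERSION: "latest"
    PYTHON_VESION: "3.8"
    DEV_MODE: true
    ENABLE_STREAMING_ANALYTICS: true
    STREAM_SHARD_COUNT: 1
    CODECOMMIT_REPO: "game-analytics-pipeline"
    RAW_EVENTS_PREFIX: "raw_events"
    PROCESSED_EVENTS_PREFIX: "processed_events"
    RAW_EVENTS_TABLE: "raw_events"
    GLUE_TMP_PREFIX: "glueetl-tmp"
    S3_BACKUP_MODE: false
    CLOUDWATCH_RETENTION_DAYS: 30
    API_STAGE_NAME: "live"
    EMAIL_ADDRESS: "user@example.com"
    accounts:
      - NAME: "QA"
        ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
        REGION: "<QA-ACCOUNT-REGION>"
      - NAME: "PROD"
        ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
        REGION: "<PROD-ACCOUNT-REGION>"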

Sample Code Deployment

Once you have added your own custom configuration settings and saved the config.yaml file, the following steps can be used to deploy the CI/CD pipeline:

  1. Build the sample code dependencies by running the following command:
    npm run build
  2. Bootstrap the sample code by running the following command:
    npm run deploy.bootstrap
  3. Deploy the sample code by running the following command:
    npm run deploy

After the sample code has been deployed, two CloudFormation stacks are created within your AWS Account and AWS Region (a quick way to verify them is shown after the list):

  1. PROD-<WORKLOAD NAME>: The deployed version of the guidance infrastructure.
  2. <WORKLOAD NAME>-Toolchain: The CI/CD Pipeline for the guidance.
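
If you want to confirm that both stacks deployed successfully, one option is to check their status with the AWS CLI. The stack names below assume the example WORKLOAD_NAME of "GameAnalyticsPipeline"; substitute your own workload name:

    aws cloudformation describe-stacks --stack-name PROD-GameAnalyticsPipeline --query "Stacks[0].StackStatus"
    aws cloudformation describe-stacks --stack-name GameAnalyticsPipeline-Toolchain --query "Stacks[0].StackStatus"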

Deployed Infrastructure

The PROD-<WORKLOAD NAME> stack hosts the deployed production version of the AWS resources, which you can validate and further optimize for your use case.

CI/CD Toolchain

Once the deployed infrastructure has been validated, or further optimized for your use case, you can trigger continuous deployment by committing any updated source code to the newly created CodeCommit repository, using the following steps:

  1. Copy the clone URL of the CodeCommit repository that you specified in the config.yaml file. See the View repository details (console) section of the AWS CodeCommit User Guide for more information on how to view the clone URL for the repository.
  2. Create a new Git repository by running the following commands:
    rm -rf .git
    git init --initial-branch=main
  3. Add the CodeCommit repository as the origin, using the following command:
    git remote add origin <CodeCommit Clone URL>
  4. Commit the code to trigger the CI/CD process by running the following commands:
    git add -A
    git commit -m "Initial commit"
    git push --set-upstream origin main

Next Steps

Make any further code changes needed to optimize the guidance for your use case. Committing these changes will trigger a subsequent continuous integration and deployment of the production stack, PROD-<WORKLOAD NAME>.

Cleanup

To clean up any of the deployed resources, you can either delete the stack through the AWS CloudFormation console, or run the cdk destroy command.

NOTE: Deleting the deployed resources will not delete the Amazon S3 bucket, in order to protect any game data already ingested and stored within the data lake. The Amazon S3 bucket, and its data, can be deleted from Amazon S3 using the Amazon S3 console, AWS SDKs, AWS Command Line Interface (AWS CLI), or REST API. See the Deleting Amazon S3 objects section of the user guide for more information.
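
As a sketch of what a command-line teardown might look like, assuming the bucket name placeholder below is replaced with the analytics bucket created in your account (deleting it permanently removes any ingested data, and if S3 versioning was enabled you may also need to remove object versions before the bucket can be deleted):

    cdk destroy
    # Optional and destructive: delete the analytics S3 bucket and all of its objects
    aws s3 rb s3://<your-analytics-bucket-name> --force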


Security

See CONTRIBUTING for more information.


License

This library is licensed under the MIT-0 License. See the LICENSE file.
