Does the AWS Clickstream solution take into account session_id stickiness? #1041

AdamUrbanfox · 2024-04-10T10:16:53Z

AdamUrbanfox
Apr 10, 2024

I have followed the implementation guide for the AWS clickstream solution, currently using Kinesis on demand, sinking to an S3 bucket (just following the steps in the setup). I'm not bothered with setting up data processing and analytics dashboards. I simply want to use the data ingestion module and consume from the stream (MSK or Kinesis) using custom built processing consumers.

Requirements: I need to spin up a fleet of consumers as I potentially will be dealing with huge volumes of data that requires parallelised consumption and processing of the data in the stream. All events that belong to a single session (example '_session_id' ) MUST end up in the same consumer.

If the events get randomly passed into different partitions or shards, then the session becomes split across multiple consumers which completely breaks my processing of the data stream. I know kinesis and Kafka support partition keys so that data arriving into the streams with the same partition key (ideally a session ID of some kind) are ensured to end up in the same partition and therefore end up in the same consumer. The documentation is not clear about how incoming events sent from the client application within a single "_session_id" (or some other identifier of a users events) are split up across partitions within within the data stream.

The clickstream solution seems designed to dump the data into an S3 bucket which only fine if I wasn't doing real-time processing of my data. I could simply sort my S3 data into sessions and do batch processing then, but this is not what I want.

Are there any experts on the AWS clickstream solution that can tell me if partition keys are taken into account in order to sort the session's events into partitions/shards? Or is the ingestion module designed to just dump all incoming data randomly into whatever partition/shard it wants? Session stickiness across partitions is an absolute requirement of my project.

EDIT: Maybe another related question is: If it does not partition the data by a session_id or other similar ID, is it possible to modify the solution to customise it? I'm guessing it would be clone the AWS clickstream analytics repo -> modify the source code for the vector server (the configuration toml files I'm assuming) to partition by session_id -> re-bootstrap the cdk, re-deploy the stack with the modified vector server configuration? Is this possible? Or are there components that the solution pulls from image repositories that are not created by the CDK and therefore non-modifiable? In other words, is the entire solution completely customisable by forking and modifying the GitHub repo?

Thanks.

zxkane · 2024-04-10T14:54:24Z

zxkane
Apr 10, 2024

Clickstream does not support session stickiness out-of-box. This does not align with the design philosophies of the ingestion module. However, it is a high-throughput and reliable ingestion server that can handle hundreds of thousands of requests per second in our benchmark. It is SDK-agnostic and supports Clickstream SDKs, GTM, and other third-party SDKs. It's important to note that the session might not always be available in the data. Clickstream SDKs compress multiple events and then base64 encodes them by default before sending the data. However, this process may consume lots of compute resources and increase latencies if parsing the attribute of each event on the ingestion server.

To process events per session in real-time, you can have a consumer process the events in KDS/MSK. This involves decoding, uncompressing, and extracting the session information of events. Once this is done, the events can be put into another downstream stream (KDS/MSK topic) for consumption by your business consumers. If you have a large volume of requests, MSK might be a cost-efficient option.

If you want to customize the ingestion server, there is an option to forward events with the session serving as the partition key. If you are using Clickstream SDKs, you will also need to customize them to send events per batch in the same session. The complete source code, including the container image of the ingestion server, is available in this repository, and you can modify it to meet your specific requirements.

5 replies

AdamUrbanfox Apr 10, 2024
Author

Thank you for your extremely well put answer. I think I have identified the file to modify for adding the partition key as
src/ingestion-server/server/images/vector/config/vector-kinesis-ack.toml
If I change partition_key_field to "attributes._session_id" , how do I re-deploy the the solution? Is it as simple as to re-bootstrap with:
$ npx cdk bootstrap
then, re-run:
$ npx cdk deploy cloudfront-s3-control-plane-stack-global --parameters Email=<your email> --require-approval never
?

Presumably then, when I log into the newly deployed dashboard and create a new project/pipeline through the control, it will use new stacks with the updated configuration/modifications to deploy the new pipeline (i.e, it will use the new partition key). I'm new to cloud formation stacks and CDK, so any clarity you can provide about the re-deployment steps after making changes is appreciated.

zxkane Apr 11, 2024

The Clickstream solution has a critical on-click user experience to simplify the deployment. There is an in-house procedure to build and publish (using scripts in deployment/) the assets (CloudFormation templates, Lambda code, and container images) to AWS-owned accounts. It's worthy of another long guide.

I recommend a simpler procedure for deploying the customized modules from source code.

create a project with pipeline in solutions's web console
use this script to update the existing stack (for example, Ingest stack in your case) with local code

If this is your first time deploying with AWS CDK, you must do cdk bootstrap first.

AdamUrbanfox Apr 11, 2024
Author

I used the solution web console to create a new pipeline (kinesis pipeline in this scenario).

Then I ran the following command:

% bash e2e-deploy.sh ingestToKinesisStackName Clickstream-Ingestion-kinesis-766d3e26c06041919a1eae8b14117961

and all seemed to run fine until it failed to deploy due to the error shown below. Presumably I've run a correct deploy command? but not with the correct stackNameKey and stackName. This seems to re-deploy the entire pipeline and not just the ingestion server, correct?
This results in an error (pasted at the end of this comment) where the nested stack for the s3 sink seems to fail due to it already existing + whatever other errors might be occurring.

I'm guessing there is a way to only update the ingestion server specific part of the pipeline stack without redeploying the entire pipeline (which would also bypass this error that occurs when the S3 sink update is attempted).

What are the correct parameters to provide to the following command in order to only redeploy the ingestion server?

% bash e2e-deploy.sh stackNameKey stackName

It's probably something like the following but I'm not sure what stackNameKey should be used deploy nested ingestion server stack correctly (given that I'm assuming that the correct stack to modify has the following name pattern: Clickstream-Ingestion-kinesis-XXXXXXXXXXXX-IngestionServerK2C0100NestedStackIngestion-XXXXXXXXXXXX )

% bash e2e-deploy.sh <ingestion server stack name key???> Clickstream-Ingestion-kinesis-XXXXXXXXXXXX-IngestionServerK2C0100NestedStackIngestion-XXXXXXXXXXXX

EDIT: Does there exist an obvious place in the repository with a list of the stack name keys that can be used?
I think you are correct about this process being worthy of a long guide. The solution is great as it is, so I can see a modification/re-deployment tutorial going a long way for developers who wish to customise their deployments in a more fine tuned way to suit their requirements.

Here is the error when re-deploying the entire pipeline stack for reference:

`12:05:39 PM | CREATE_FAILED | AWS::Lambda::EventSourceMapping | kinesisDataStreamT...rovisioned1418BC98
Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-k
in-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Co

12:05:39 PM | CREATE_FAILED | AWS::Lambda::EventSourceMapping | kinesisDataStreamT...rovisioned1418BC98
Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-k
in-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Co
12:05:39 PM | CREATE_FAILED | AWS::Lambda::EventSourceMapping | kinesisToS3LambdaK...ed1418BC98CEBA1101
Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-k
in-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Co
de: 409, Request ID: 294c2699-82e9-47db-8763-c5158a0511a6)" (RequestToken: d9309714-4fed-f75e-e3be-019a1d23cb8d, HandlerErrorCode: AlreadyExists)

12:05:42 PM | UPDATE_FAILED | AWS::CloudFormation::Stack | kinesisDataStreamT...ckResourceA3CE46B9
Embedded stack arn:aws:cloudformation:eu-west-1:714462417551:stack/Clickstream-Ingestion-kinesis-766d3e26c0604-kinesisDataStreamToS3ProvisionedNestedStac-15LYIZ5YR372P/c5c361d0-f7ed-11ee-90dc-0211
61952dff was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [kinesisToS3LambdaKinesisEventSourceClickstreamIngestionkin
esis766d3e26c06041919a1eae8b14117961kinesisDataStreamToS3ProvisionedkinesisStreamProvisioned1418BC98CEBA1101]. The following resource(s) failed to update: [kinesisDataStreamMetricsCustomResource].

❌ Clickstream-Ingestion-kinesis-766d3e26c06041919a1eae8b14117961 failed: Error: The stack named Clickstream-Ingestion-kinesis-766d3e26c06041919a1eae8b14117961 failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-kin-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Code: 409, Request ID: 294c2699-82e9-47db-8763-c5158a0511a6)" (RequestToken: d9309714-4fed-f75e-e3be-019a1d23cb8d, HandlerErrorCode: AlreadyExists), Embedded stack arn:aws:cloudformation:eu-west-1:714462417551:stack/Clickstream-Ingestion-kinesis-766d3e26c0604-kinesisDataStreamToS3ProvisionedNestedStac-15LYIZ5YR372P/c5c361d0-f7ed-11ee-90dc-021161952dff was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [kinesisToS3LambdaKinesisEventSourceClickstreamIngestionkinesis766d3e26c06041919a1eae8b14117961kinesisDataStreamToS3ProvisionedkinesisStreamProvisioned1418BC98CEBA1101]. The following resource(s) failed to update: [kinesisDataStreamMetricsCustomResource].
at FullCloudFormationDeployment.monitorDeployment (/Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:433:10615)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:436:200335)
at async /Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:436:181173

❌ Deployment failed: Error: The stack named Clickstream-Ingestion-kinesis-766d3e26c06041919a1eae8b14117961 failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-kin-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Code: 409, Request ID: 294c2699-82e9-47db-8763-c5158a0511a6)" (RequestToken: d9309714-4fed-f75e-e3be-019a1d23cb8d, HandlerErrorCode: AlreadyExists), Embedded stack arn:aws:cloudformation:eu-west-1:714462417551:stack/Clickstream-Ingestion-kinesis-766d3e26c0604-kinesisDataStreamToS3ProvisionedNestedStac-15LYIZ5YR372P/c5c361d0-f7ed-11ee-90dc-021161952dff was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [kinesisToS3LambdaKinesisEventSourceClickstreamIngestionkinesis766d3e26c06041919a1eae8b14117961kinesisDataStreamToS3ProvisionedkinesisStreamProvisioned1418BC98CEBA1101]. The following resource(s) failed to update: [kinesisDataStreamMetricsCustomResource].
at FullCloudFormationDeployment.monitorDeployment (/Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:433:10615)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:436:200335)
at async /Users/adamprior/VSCodeProjects/clickstream-analytics-on-aws/node_modules/.pnpm/aws-cdk@2.136.0/node_modules/aws-cdk/lib/index.js:436:181173

The stack named Clickstream-Ingestion-kinesis-766d3e26c06041919a1eae8b14117961 failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "The event source arn (" arn:aws:kinesis:eu-west-1:714462417551:stream/Clickstream_clickstream_collector_ehtc_c5c361d0 ") and function (" Clickstream-Ingestion-kin-kinesisToS3LambdaB660939-405lqf9IM29V ") provided mapping already exists. Please update or delete the existing mapping with UUID fa5ca6de-b52e-4283-92bb-1dda46802a97 (Service: Lambda, Status Code: 409, Request ID: 294c2699-82e9-47db-8763-c5158a0511a6)" (RequestToken: d9309714-4fed-f75e-e3be-019a1d23cb8d, HandlerErrorCode: AlreadyExists),

Embedded stack arn:aws:cloudformation:eu-west-1:714462417551:stack/Clickstream-Ingestion-kinesis-766d3e26c0604-kinesisDataStreamToS3ProvisionedNestedStac-15LYIZ5YR372P/c5c361d0-f7ed-11ee-90dc-021161952dff was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [kinesisToS3LambdaKinesisEventSourceClickstreamIngestionkinesis766d3e26c06041919a1eae8b14117961kinesisDataStreamToS3ProvisionedkinesisStreamProvisioned1418BC98CEBA1101]. The following resource(s) failed to update: [kinesisDataStreamMetricsCustomResource]. `

AdamUrbanfox Apr 11, 2024
Author

EDIT 2:
I deleted the lambda function mapping which allowed the deployment to not fail at the s3 sink lambda function update, but then it just fails at:
1:19:03 PM | UPDATE_FAILED | AWS::ApplicationAutoScaling::ScalingPolicy | IngestionServerK1C.../Target/CpuScaling Resource handler returned message: "Only one TargetTrackingScaling policy for a given metric specification is allowed. (Service: ApplicationAutoScaling, Status Code: 400, Request ID: 4bc1f539-a757-44cb-b5f7-6d 4313d2f1d6)" (RequestToken: e24b3012-8b8d-29ae-7645-bf94c48f1ba5, HandlerErrorCode: GeneralServiceException) 1:19:04 PM | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack | Clickstream-Ingest...tion-1K1P715PS9USL The following resource(s) failed to update: [IngestionServerclickstreamingestionserviceecsserviceTaskCountTargetCpuScalingC3839581]. 1:19:04 PM | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack | Clickstream-Ingest...tion-1K1P715PS9USL The following resource(s) failed to update: [IngestionServerclickstreamingestionserviceecsserviceTaskCountTargetCpuScalingC3839581]. 1:19:07 PM | UPDATE_FAILED | AWS::CloudFormation::Stack | IngestionServerK2C...estedStackResource Embedded stack arn:aws:cloudformation:eu-west-1:714462417551:stack/Clickstream-Ingestion-kinesis-766d3e26c0604-IngestionServerK2C0100NestedStackIngestion-1K1P715PS9USL/56a44f70-f7ee-11ee-9a8b-0accc56650a7 was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to update: [IngestionServerclickstreamingestionserviceecsserviceTaskCountTargetCpuScalingC383958 1].

It seems like doing the re-deploy to update the stack that exists already does not work as expected (resources already existing) Likely more errors will occur if I can even find a hack to fix this new error.

It seems like I will have to do a from-scratch deployment of a brand new pipeline using the CDK, with the huge list of parameters using
% npx cdk deploy stack_name --parameters ...

How do I find the correct stack name to put in this command and the parameter list?

zxkane Apr 12, 2024

Adam, you used the script correctly. But the failure was caused by the logical ID of some resources changing after synthesizing the stack with a different name.

There is a workaround for it.

copy the stack ID/ARN for the ingestion stack Clickstream-Ingestion-kinesis-xxx
delete the stack Clickstream-Ingestion-kinesis-xxx in CloudFormation console
use the script to recreate the stack by stack ID/ARN

bash e2e-deploy.sh -n ingestToKinesisStackName -s <stack arn>

NOTE: the script has different parameters when using the latest code

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Does the AWS Clickstream solution take into account session_id stickiness? #1041

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 1 comment 5 replies

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

Does the AWS Clickstream solution take into account session_id stickiness? #1041

AdamUrbanfox Apr 10, 2024

Replies: 1 comment · 5 replies

zxkane Apr 10, 2024

AdamUrbanfox Apr 10, 2024 Author

zxkane Apr 11, 2024

AdamUrbanfox Apr 11, 2024 Author

AdamUrbanfox Apr 11, 2024 Author

zxkane Apr 12, 2024

AdamUrbanfox
Apr 10, 2024

Replies: 1 comment 5 replies

zxkane
Apr 10, 2024

AdamUrbanfox Apr 10, 2024
Author

AdamUrbanfox Apr 11, 2024
Author

AdamUrbanfox Apr 11, 2024
Author