Enable auto-scaling for sidekiq #20

Open · michaelwittig opened this issue Dec 2, 2022 · 8 comments

Labels: enhancement (New feature or request)

Comments

@michaelwittig (Contributor)

See the earlier discussion in #1:

    My Sidekiq task is regularly pegging at 100% CPU utilization... definitely need some guidance on configuring scaling...

Originally posted by @scrappydog in #1 (comment)

    @scrappydog Same for us. I'm not sure if that is an issue. It likely doesn't matter if the background tasks utilize all resources as long as they finish without much delay. For us, we see spikes to 100%, but only for minutes. Do you see the same pattern?

Screenshot 2022-11-28 at 09 42 10

Originally posted by @michaelwittig in #1 (comment)

    That looks very similar to utilization on my instance.

    My inner system admin really "wants" to add another task... but I agree, as long as jobs are completing in a reasonable time it's not an immediate issue.

    BUT we are running tiny instances for testing... we NEED a way to scale up... :-)

Originally posted by @scrappydog in #1 (comment)

    I bumped the CPU allocation up on the Sidekiq task to CPU .5 vCPU | Memory 3 GB... 

    This feels happier for now... but it doesn't address the real scalability question...

Originally posted by @scrappydog in #1 (comment)

    ![image](https://user-images.githubusercontent.com/125875/204807795-541c039e-3b58-4bb2-922f-5f1e3d528938.png)

    Upgraded about halfway through this graph... definitely a lot better!

Originally posted by @scrappydog in #1 (comment)

@scrappydog

image

Status update after a couple of days with the Sidekiq task at 0.5 vCPU | 3 GB memory.

@compuguy commented Dec 4, 2022

There is a way to do auto-scaling for most of the Sidekiq queues, except for the scheduler: you can only have one of those. This article helped me with some of my experiments with scaling Sidekiq: https://nora.codes/post/scaling-mastodon-in-the-face-of-an-exodus/. At a minimum you need 1 GB of memory for each instance. I'm not sure about the thread count, though; the default is 5, but it might make sense to reduce it to maybe 2, depending on the number of CPU units each container instance has (I'm using 0.5 for each).
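For what it's worth, a minimal sketch of what target-tracking auto-scaling for the (non-scheduler) Sidekiq service could look like in the template. The resource names (`Cluster`, `SidekiqService`), the capacity range, and the 70% CPU target are illustrative assumptions, not values from this repo:

```yaml
# Hypothetical sketch: scale the Sidekiq worker service on average CPU.
# Assumes an ECS cluster `Cluster` and an ECS service `SidekiqService` exist.
SidekiqScalableTarget:
  Type: 'AWS::ApplicationAutoScaling::ScalableTarget'
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: 'ecs:service:DesiredCount'
    ResourceId: !Sub 'service/${Cluster}/${SidekiqService.Name}'
    RoleARN: !Sub 'arn:aws:iam::${AWS::AccountId}:role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService'
    MinCapacity: 1   # never scale to zero; jobs must keep draining
    MaxCapacity: 4   # illustrative upper bound
SidekiqScalingPolicy:
  Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
  Properties:
    PolicyName: sidekiq-cpu-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref SidekiqScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 70.0       # assumed target; tune to taste
      ScaleOutCooldown: 60    # react quickly to queue spikes
      ScaleInCooldown: 300    # scale in slowly to avoid flapping
```

Target tracking keeps the service's average CPU near the target by adjusting DesiredCount, which matches the spiky-CPU pattern in the charts above better than fixed capacity would.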

@vesteinn commented Dec 8, 2022

Were you able to integrate these changes into the CloudFormation configuration, @compuguy? After increasing the Cpu and Memory flags I'm still seeing full load.

@compuguy commented Dec 11, 2022

I honestly went down a different road, @vesteinn. I moved the mail and scheduler queues to their own separate service with 0.25 vCPU and 0.5 GB of memory. You can only have one scheduler queue per Mastodon instance, so I left it together with the mail queue, which wasn't using much CPU or RAM. Then I made the SidekiqService container run the rest of the needed queues via AppCommand: 'bash,-c,bundle exec sidekiq -q default -q pull -q push -q ingress' with 0.5 vCPU and 1 GB of memory (see: https://github.com/compuguy/mastodon-on-aws/blob/istoleyourpw-deploy/mastodon.yaml#L269).

Memory seems to be good, but I still get way too many CPUUtilizationTooHighAlarms, especially when trends are updating. On the bright side, it is scaling up the instances when needed. I'm thinking of going to 1 vCPU, which would require upping the memory per container to 2 GB. Here's a CPU utilization chart for the past week (a sketch of the queue split follows the screenshot):

Screenshot from 2022-12-11 17-54-56
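A rough sketch of that split as two Fargate task definitions, assuming Mastodon's standard queue names (default, pull, push, ingress, mailers, scheduler). Resource names, the image tag, and the `-c 2` concurrency (picking up the thread-count idea from earlier) are hypothetical; execution role, environment, and logging are omitted for brevity:

```yaml
# Hypothetical sketch of the queue split described above.
SidekiqWorkerTask:            # scalable workers: default, pull, push, ingress
  Type: 'AWS::ECS::TaskDefinition'
  Properties:
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc
    Cpu: '512'                # 0.5 vCPU
    Memory: '1024'            # 1 GB
    ContainerDefinitions:
      - Name: sidekiq
        Image: 'tootsuite/mastodon:v4.0.2'   # example tag
        Command: ['bash', '-c', 'bundle exec sidekiq -c 2 -q default -q pull -q push -q ingress']
SidekiqSchedulerTask:         # singleton: scheduler + mailers
  Type: 'AWS::ECS::TaskDefinition'
  Properties:
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc
    Cpu: '256'                # 0.25 vCPU
    Memory: '512'             # 0.5 GB
    ContainerDefinitions:
      - Name: sidekiq-scheduler
        Image: 'tootsuite/mastodon:v4.0.2'
        # Only one scheduler process is allowed per Mastodon instance, so the
        # service running this task keeps DesiredCount: 1 and is never auto-scaled.
        Command: ['bash', '-c', 'bundle exec sidekiq -q scheduler -q mailers']
```

The point of the split is that only the worker service is attached to a scaling policy; the scheduler/mailers service stays fixed at one task.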

@pegli (Contributor) commented Dec 22, 2022

I wanted to share an incident report I created after a member of my instance reported problems uploading videos:

https://hub.montereybay.social/blog/degraded-service-video-transcoding-failures.html

tl;dr: iPhone video transcoding with ffmpeg was causing CPU and memory usage to spike on the Sidekiq service. Changing vCPUs from 0.25 -> 0.5 and memory from 0.5 GB -> 1 GB in the Task Definition and redeploying that service resolved the issue, at least temporarily.
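For reference, that change maps to the following task-definition values (Fargate expresses CPU in units, where 1024 units = 1 vCPU, and only permits certain CPU/memory pairings); the resource name is a placeholder:

```yaml
SidekiqTask:
  Type: 'AWS::ECS::TaskDefinition'
  Properties:
    Cpu: '512'      # was '256' (0.25 vCPU -> 0.5 vCPU)
    Memory: '1024'  # was '512' (0.5 GB -> 1 GB; a valid pairing with 512 CPU units)
```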

My instance is still pretty small at 19 users. If anyone would like me to report additional statistics, let me know what you want to see -- I'm happy to share operational metrics.

@michaelwittig (Contributor, Author)

@pegli We increased memory from 0.5 to 1 GB in #16. The CPU is still at 0.25 vCPU, which is not a lot of horsepower :)

Yes, we are interested in metrics! RequestCountPerTarget for both ALB target groups (web and streaming), as well as CPU and memory of web, streaming, and sidekiq.

@pegli (Contributor) commented Dec 22, 2022

At your service! https://hub.montereybay.social/Operations.html now has a public CloudWatch dashboard with all of those metrics.

@michaelwittig (Contributor, Author)

@pegli That's cool :) Do you mind sharing the JSON definition (open the dashboard in the CloudWatch UI, click Actions -> View/edit source) of the dashboard? We could add it to the template.
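If it helps, a rough sketch of how such a dashboard could be embedded in the template once the JSON source is known. The widget layout, metric dimensions, and resource names (`Cluster`, `SidekiqService`, `WebTargetGroup`, `StreamingTargetGroup`) are guesses covering the metrics requested above, not pegli's actual dashboard:

```yaml
# Hypothetical sketch: CloudWatch dashboard with the requested metrics.
Dashboard:
  Type: 'AWS::CloudWatch::Dashboard'
  Properties:
    DashboardName: !Sub '${AWS::StackName}-mastodon'
    DashboardBody: !Sub |
      {
        "widgets": [
          {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
              "title": "Sidekiq CPU / memory",
              "region": "${AWS::Region}",
              "stat": "Average",
              "metrics": [
                ["AWS/ECS", "CPUUtilization", "ClusterName", "${Cluster}", "ServiceName", "${SidekiqService.Name}"],
                ["AWS/ECS", "MemoryUtilization", "ClusterName", "${Cluster}", "ServiceName", "${SidekiqService.Name}"]
              ]
            }
          },
          {
            "type": "metric",
            "x": 12, "y": 0, "width": 12, "height": 6,
            "properties": {
              "title": "RequestCountPerTarget (web / streaming)",
              "region": "${AWS::Region}",
              "stat": "Sum",
              "metrics": [
                ["AWS/ApplicationELB", "RequestCountPerTarget", "TargetGroup", "${WebTargetGroup.TargetGroupFullName}"],
                ["AWS/ApplicationELB", "RequestCountPerTarget", "TargetGroup", "${StreamingTargetGroup.TargetGroupFullName}"]
              ]
            }
          }
        ]
      }
```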
