Stack creation fails, then cleanup fails as well #37

Open
nleskiw opened this issue Jun 4, 2023 · 4 comments

Comments

@nleskiw
Contributor

nleskiw commented Jun 4, 2023

I'm trying this for the first time, and the creation process fails.
There are 2 pending tasks, create SidekiqService and CreateCertificate. It looks like creating the Sidekiq service triggers a circuit breaker, which causes CloudFormation to delete the stacks. The deletion leaves the hosted zone and the ALB in a "deletion failed" state.

I can manually clean it up by deleting some records in the hosted zone and an S3 bucket.

I'm not sure how to troubleshoot this. The best I found was some logs with a Ruby error about a table not existing.
mastodon_on_aws.csv
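
For anyone else trying to narrow down which resource fails first, here is a minimal sketch that filters CloudFormation stack events down to the failed ones. It is not part of this project. The event keys (`LogicalResourceId`, `ResourceStatus`, `ResourceStatusReason`, `Timestamp`) are the ones boto3's `describe_stack_events` returns; the stack name and region in the usage comment are placeholders, and `boto3` has to be installed separately.

```python
from typing import Any, Dict, Iterable, List, Tuple


def failed_events(events: Iterable[Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """Return (logical id, status, reason) for every *_FAILED event, oldest first."""
    failed = [e for e in events if e.get("ResourceStatus", "").endswith("_FAILED")]
    failed.sort(key=lambda e: e["Timestamp"])
    return [
        (e["LogicalResourceId"], e["ResourceStatus"], e.get("ResourceStatusReason", ""))
        for e in failed
    ]


def fetch_events(stack_name: str, region: str) -> List[Dict[str, Any]]:
    """Collect all events of one stack via the CloudFormation API."""
    import boto3  # third-party; imported here so failed_events works without it

    client = boto3.client("cloudformation", region_name=region)
    paginator = client.get_paginator("describe_stack_events")
    events: List[Dict[str, Any]] = []
    for page in paginator.paginate(StackName=stack_name):
        events.extend(page["StackEvents"])
    return events


# Usage (placeholder stack name and region, adjust to your deployment):
# for logical_id, status, reason in failed_events(fetch_events("mastodon-on-aws", "us-west-2")):
#     print(logical_id, status, reason)
```

If the stack has already been deleted, passing the stack ID instead of the stack name should still return its events. The first `CREATE_FAILED` entry is usually the one to look at, since the later failures tend to be consequences of the rollback.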

I was using us-east-1, but saw that other people had issues with that region, so I then tried us-west-2 several times, with the same result.

If there's any more info I could provide, or if you have any hints, I'm all ears.

EDIT: I went ahead and registered a domain with Route53, so that the entire stack would be controlled by AWS, including DNS and registrar, and attempted to deploy. Same result. Cert and Sidekiq never finish, and the stack goes into rollback and then fails to clean itself up. I'm more concerned that it doesn't start a Mastodon service than about the cleanup.

@nleskiw
Contributor Author

nleskiw commented Jul 3, 2023

Is this just an issue I'm having? Is this working for everyone else?

@andreaswittig
Contributor

@nleskiw Could you please share the logs from the web service/container?

@nleskiw
Contributor Author

nleskiw commented Apr 11, 2024

@andreaswittig

I've pulled the latest changes and tried this again:

These are the logs from the Sidekiq service that dies and triggers the rollback/deletion:

sidkiq_logs.csv

I don't see anything named "web service / container" in the CF stack. Can you be more specific about which logs you're interested in?

Here's what I see in the CF Stack:

Alb
AlbAccessLogBucket
Alerting
Bucket
Cache
Certificate
ClientSg
CloudFront
Cluster
Database
Dkim1Record
Dkim2Record
Dkim3Record
EmailIdentity
EmailUser
EmailUserAccessKey
HostedZone
HttpListener
Key
LambdaLogGroup
LambdaPolicy
LambdaRole
Record
Redirect
S3Policy
ScheduledTask
Secret
SidekiqService
SmtpPasswordConverter
SmtpPasswordConverterLamdaFunction
Vpc

Not sure which of these is the web service / container. Perhaps the Sidekiq container is starting before the web container and that's why it can't find the DB table it's looking for?

I believe this may be ultimately why Sidekiq dies:

2024-04-11T19:48:04.280Z pid=7 tid=5en WARN: ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation "users" does not exist
--
LINE 9:  WHERE a.attrelid = '"users"'::regclass
^
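
To illustrate the ordering idea above: a minimal sketch of a start-up gate that polls until the schema exists before a worker is launched. This is only an assumption about how such a race could be avoided, not a description of how this project starts its containers. `wait_for_schema`, its parameters, and the `public` schema in `TABLE_CHECK_SQL` are all hypothetical; the caller supplies `table_exists`, for example a function that runs the query with whatever PostgreSQL client is available.

```python
import time
from typing import Callable

# PostgreSQL's to_regclass returns NULL when the relation does not exist,
# so this query yields true only once the "users" table has been created.
# The "public" schema is an assumption.
TABLE_CHECK_SQL = "SELECT to_regclass('public.users') IS NOT NULL"


def wait_for_schema(
    table_exists: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll table_exists() until it returns True or the attempts run out.

    Returns True as soon as the table is visible, False if it never appears.
    """
    for attempt in range(1, attempts + 1):
        if table_exists():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # give the migration time to finish
    return False
```

A worker entrypoint could call `wait_for_schema(...)` and exit non-zero on `False`, so that a missing schema shows up as a clear start-up failure instead of a `PG::UndefinedTable` error in the middle of the logs.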

@andreaswittig
Contributor

@nleskiw That's a good point. I've made a small fix. See v0.20.1.


2 participants