Switch integration tests to provision stacks in the CFT region in production #3459
Comments
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
++ to using the more-stable CFT region in production. One of the reasons for using the QA environment was that deployments in that environment are auto-terminated after 24 hours. We will need to build some automation for handling that when we switch over to the CFT region in production.
++ QA cloud stability is definitely a constant issue we hit when running the tests. Does the region provide access to snapshot builds?
This is blocked by #3463 and #3456. I would also add a daily job that cleans up all integration test deployments older than 24h (we need to label them or otherwise make them easily discoverable, not just by name).
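A minimal sketch of the selection step such a daily cleanup job might use, assuming deployments can be listed with a creation time and a set of tags. The `Deployment` record, the `purpose=integration-tests` tag, and the `find_stale` helper are all hypothetical names introduced for illustration; none of them come from the cloud API or from this repository. The sketch only picks the deployments to delete and makes no API calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Deployment:
    """Hypothetical record for one cloud deployment; not a real API type."""
    id: str
    name: str
    created_at: datetime
    tags: dict = field(default_factory=dict)


def find_stale(deployments, now, max_age=timedelta(hours=24),
               tag_key="purpose", tag_value="integration-tests"):
    """Return deployments that carry the cleanup tag and are older than max_age.

    Matching on a tag rather than on the name is the assumption here:
    untagged deployments are never selected, however old they are.
    """
    return [
        d for d in deployments
        if d.tags.get(tag_key) == tag_value and now - d.created_at > max_age
    ]


# Example data: one old tagged deployment, one fresh tagged one,
# and one old deployment without the tag.
now = datetime(2023, 9, 25, 12, 0, tzinfo=timezone.utc)
deployments = [
    Deployment("dep-old", "it-run-1", now - timedelta(hours=30),
               {"purpose": "integration-tests"}),
    Deployment("dep-new", "it-run-2", now - timedelta(hours=2),
               {"purpose": "integration-tests"}),
    Deployment("dep-other", "manual-test", now - timedelta(hours=48)),
]

stale = find_stale(deployments, now)
print([d.id for d in stale])  # prints ['dep-old']
```

Passing `now` in explicitly keeps the age check deterministic, so the same function can be unit tested and reused by whatever scheduler ends up running the job.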
@pmoust in the CFT region, is there any way to tag a deployment so that it is removed after a given time period?
As a first step to move away from QA, I am preparing a quick PR to move to staging (it's not the long-term solution, but it should help a bit).
Pulled into the current sprint as we have too many stability problems in the non-prod testing regions. Let's move to the gcp-us-west2 CFT region. We will pair this change with #3463, which should minimize the number of deployments we leak. Separately, we will create a scheduled job to detect orphaned deployments, but we will no longer consider this a blocker for this change. The thinking is that most of the deployments we leak happen because they fail to come up, which should be less likely in the CFT region in prod.
We have recently had some stability challenges with the QA GCP region because it does not carry the same reliability guarantees as the staging or production regions.
Luckily, there is a dedicated cloud-first testing (CFT) region for internal use, gcp-us-west2 (Los Angeles), with the same stability guarantees as production.
Let's switch our integration tests to use that region. We could also decide to use staging cloud, which would give us more cloud providers and regions to test against, but with slightly less stability, as it will still run pre-production cloud code. Generally, the CFT region is the best place for internal testing like our integration tests.