fix(e2e-tests): increase timeouts for AWS tests MONGOSH-1924 #2286

Merged on Dec 4, 2024 (1 commit)

Conversation

addaleax commented Dec 3, 2024

This should hopefully address AWS auth test flakiness on s390x, where DNS queries can take multiple seconds each due to the host CI setup.

Example strace that indicates a 5 second delay for a DNS query: https://docs.google.com/document/d/1jWxKTZWma5nN5I3a9iTANnulzm_9GlLanEUbf9q9vh0/edit?tab=t.0
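
To make the timeout reasoning concrete, here is a rough back-of-the-envelope sketch (not code from this PR). It assumes each DNS query costs about 5 seconds, as in the strace above, and checks whether a given number of sequential lookups still fits within a one-minute test timeout. The function names, the lookup counts, and the 10-second allowance for non-DNS work are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helpers for illustration only; not part of the mongosh test suite.

/** Worst-case time spent on sequential DNS lookups, in milliseconds. */
function dnsOverheadMs(lookupCount: number, perLookupMs: number): number {
  return lookupCount * perLookupMs;
}

/** True if DNS overhead plus the remaining test work fits within the timeout. */
function fitsWithinTimeout(
  lookupCount: number,
  perLookupMs: number,
  otherWorkMs: number,
  timeoutMs: number
): boolean {
  return dnsOverheadMs(lookupCount, perLookupMs) + otherWorkMs <= timeoutMs;
}

const PER_LOOKUP_MS = 5_000; // roughly 5 s per query, as seen in the strace
const OTHER_WORK_MS = 10_000; // assumed time for everything except DNS
const TIMEOUT_MS = 60_000; // one-minute test timeout

// 4 lookups: 20 s of DNS + 10 s of other work = 30 s
console.log(fitsWithinTimeout(4, PER_LOOKUP_MS, OTHER_WORK_MS, TIMEOUT_MS));
// 12 lookups: 60 s of DNS + 10 s of other work = 70 s
console.log(fitsWithinTimeout(12, PER_LOOKUP_MS, OTHER_WORK_MS, TIMEOUT_MS));
```

Under these assumptions, a handful of slow lookups fits comfortably within a minute, while a dozen would not, so a timeout hit at the higher limit would point to a larger problem than per-query latency alone.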


nirinchev left a comment

Is there something we can work on together with devprod to reduce DNS lookup times on these hosts? If you've already talked to them, might be good to add a comment in the code that summarizes the motivation/intricacies of this setup, as well as any steps we might want to take in the future. To be clear, this seems like a reasonable bandaid, but ideally, we'd want to improve the latency to avoid wasting CI resources.

addaleax commented Dec 3, 2024

@nirinchev Oh yeah, I was chatting with Thomas from the DevProd team who also helped with providing debug access to a host. I don't think there's a known root cause here – but they're eager to investigate this on their side.

nirinchev commented

Cool. If they have a ticket tracking this, would you mind linking to it in a code comment? It'll provide valuable context for future readers.

addaleax commented Dec 3, 2024

@nirinchev I don't think there's a ticket right now. We do have:

"We recently got honeycomb metrics functional on this platform, so hopefully stuff like that will start showing up in traces"

so that's likely going to help the DevProd team with keeping an eye on things, and if we feel that this is really an issue that is significant enough to require special attention, we can always open a new ticket with them. But I also have to be honest and say that this problem doesn't seem bad enough that it requires extra attention at this point – if the tests start timing out with the new 1 minute limit I'd be a lot more concerned.

nirinchev commented

@addaleax fair enough - I agree it's sufficiently unimportant that creating a ticket is not something we need to do at this point.

addaleax merged commit 6c9a9b4 into main on Dec 4, 2024
142 of 146 checks passed
addaleax deleted the 1924-dev branch on December 4, 2024, 10:38