jdk_time target fails on some arm32_linux nodes #3360
Comments
Will be part of the top level #2662 |
Some jdk_time_2 tests failed on test-docker-ubuntu2004-armv7l-4 with the last jdk17 release.
However, all of these tests passed when run individually on the machine itself using the latest jdk17 release.
|
Looks like the tests time out when run together in jdk_time, but pass when run individually. https://ci.adoptium.net/job/Grinder/8698/console
|
Running jdk_time_2 from my own branch, with |
I think the jdk_time test target might be a bit much for the Docker static containers. The Grinder run has caused test-docker-ubuntu2004-armv7l-4 to disconnect from Jenkins
|
Rerunning on a non docker container (odroid machine) https://ci.adoptium.net/job/Grinder/8702/console |
Passed on the odroid machine. Also passed on test-docker-ubuntu2004-armv7l-2 https://ci.adoptium.net/job/Grinder/8701/console |
Interesting ... @Haroon-Khel Are they all new containers, and did you cap the CPU/memory on them? From the Grinder logs, the runs on both containers mentioned in #3360 (comment) and #3360 (comment) were both running with
Noting that the calculation for concurrency is the one I was looking at adjusting in adoptium/aqa-tests#4792 (in case you want to see where it is and play with it), but I don't think that will affect this particular situation. |
I've updated my branch to run the test with concurrency:2
Yes, the images are built with
And are run with
Bear in mind the jdk_time_2 tests passed on test-docker-ubuntu2004-armv7l-2, another static container, but one running on dockerhost-equinix-ubuntu2004-armv8-1, not dockerhost-equinix-ubuntu2204-armv8-1. Link to the Grinder: https://ci.adoptium.net/job/Grinder/8701/console |
https://ci.adoptium.net/job/Grinder/8703/console passed. So we need to limit concurrency on the Docker static containers, and even though it passes with -concurrency:81 on some, it shouldn't be using a value this high |
adoptium/aqa-tests#4792 looks good, but I want to first look at why the test scripts are reporting 160 cores and not the set 4 (or whatever) cores of the docker containers |
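For anyone wanting to check what a container actually sees, here is a minimal sketch (not the aqa-tests logic) that prints three different CPU counts a Linux process can observe: the logical CPUs in the system, the CPUs in the process affinity mask, and the cgroup v2 CPU quota. The helper names `parse_cpu_max` and `cpu_report` are made up for illustration, and the `/sys/fs/cgroup/cpu.max` path assumes cgroup v2; on cgroup v1 hosts the quota is simply left out.

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse the contents of a cgroup v2 cpu.max file ("<quota> <period>").

    Returns the quota as a number of CPUs, or None when the quota is "max"
    (meaning no quota is set).
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def cpu_report(cpu_max_path="/sys/fs/cgroup/cpu.max"):
    """Collect the different CPU counts visible to this process."""
    # Logical CPUs in the system; not reduced by a CPU-time quota.
    report = {"logical": os.cpu_count()}

    # Linux-only: CPUs this process may be scheduled on (reflects cpusets).
    if hasattr(os, "sched_getaffinity"):
        report["affinity"] = len(os.sched_getaffinity(0))

    # cgroup v2 only: CPU-time quota expressed as a number of CPUs.
    path = Path(cpu_max_path)
    try:
        if path.is_file():
            report["quota"] = parse_cpu_max(path.read_text())
    except OSError:
        pass  # unreadable file: leave the quota out of the report

    return report


if __name__ == "__main__":
    print(cpu_report())
```

Running this inside and outside a container should show which of the three numbers a given limit changes. A CPU-time quota caps how much CPU the container gets without shrinking the affinity mask, whereas a cpuset shrinks the mask itself, so a script that only looks at the total core count can still see the full host.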
Unfortunately I'm not yet convinced which is why it's still in draft, but I need to get back to it ;-) |
I'm having some luck with
Container started with
Container started with
So a solution is to change https://github.com/adoptium/aqa-tests/blob/c27e9e649919f839444c638844e133a0d13edd04/openjdk/openjdk.mk#L22 to use |
I've redeployed test-docker-ubuntu2004-armv7l-4 with
Rerunning jdk_time_2 from my branch on it. Concurrency should be 4/2 + 1 = 3, which shouldn't cause a timeout |
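To make the arithmetic concrete, here is a small sketch of the calculation as it appears from the numbers in this thread: half the detected cores plus one. The function name `expected_concurrency` is hypothetical and this is not copied from openjdk.mk; it only reproduces the two data points mentioned here (4 cores giving 3, and 160 cores giving the -concurrency:81 seen in the Grinder runs).

```python
def expected_concurrency(cores):
    """Hypothetical helper: half the detected cores (integer division) plus one."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores // 2 + 1


if __name__ == "__main__":
    for cores in (4, 160):
        print(cores, "cores ->", expected_concurrency(cores))
```

With the container limited to 4 visible cores this gives 3, while the full 160 host cores give 81, which matches the value that was overloading the static containers.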
Looks good |
Grinder passed 👍🏻 |
Passes for jdk11 too https://ci.adoptium.net/job/Grinder/8711/console |
The remaining arm32 nodes on dockerhost-equinix-ubuntu2204-armv8-1 have been restarted with --cpuset-cpus="0-3" |
Do you know what the difference in behaviour is between the two options, i.e. is there likely to be any downside to this change? We should check whether this is happening elsewhere too, e.g. on X64 and ppcle systems, to see if they are the same (trying to use more cores than they should be at present). Nice find! |
Please set the title to indicate the test name and machine name where known.
As indicated by Jan 2024 CPU triage, in adoptium/aqa-tests#4982 (comment), the jdk_time test target was failing with timeouts on ci.adoptium.net arm32 nodes. It was able to pass on the temurin-compliance arm32 linux node.
To make it easy for the infrastructure team to repeat and diagnose, please
answer the following questions:
Test_
job on https://ci.adoptium.net which showed the failure
Any other details:
Suspect related to the findings in #2002 but not confirmed