System unavailable: test-azure-win2012r2-x64-1 cannot clean workspace #1669
Comments
These are mostly covered under #1573 |
Hmmm, there doesn't obviously seem to be a process using that file at the moment. I've queued up a new run of that job on the machine so we can see if it recurs (@lumpfish you have Grinder running just now, so it will happen once your run is finished) |
Now running https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_extended.system_x86-64_windows/365 on test-azure-win2012r2-x64-1 after renaming that directory out of the workspace (which seems to have been successful) |
(ref #1396 ) |
@Willsparker Can you take a look at this, given that moving the directory doesn't appear to have resolved it? The job in the previous comment failed, so I've taken the machine offline. Let's try to understand what's going on here rather than just deleting / rebooting, as it wasn't immediately obvious to me what was causing the problem. |
Yep - pretty sure it's leftover Java processes (again). So, the Java process 14984 is apparently using various [...] Tests aren't clearing up their Java processes properly, and Jenkins isn't handling them properly. So, 2 options: We run your [...] As the machine is disabled, I'm going to leave those 2 Java processes running, just in case we want to trial anything. I'd like to try running [...] |
Can you tell how long the processes have been there for? I recently fixed this issue which would have meant system tests did not clean up processes: adoptium/STF#89 (merged 05/11/2020). That would only account for processes started via extended.system or sanity.system (and also potentially a few grinders). |
The issue I have with using [...] Looking at azure-test-1, if I run [...] |
Tasklist unfortunately doesn't seem to let you filter on the whole command line. There's probably a way of doing it through PowerShell, but I'm not sure what the right incantation is. We'd likely need to do that to be able to reliably filter out the Jenkins agent process (and determine which test is causing the problem). You can see the details in each individual process's properties in Task Manager, but that's not scriptable. The tlist tool will do it, but that's not a default part of a Windows installation AFAIK. If we can do it with standard system tools, that would be preferable |
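The filtering asked for above could look roughly like this. It's only a sketch in Python, not the PowerShell that ends up being used later in this thread, and it assumes the process list (name plus full command line) has already been collected some other way. `ProcInfo`, `leftover_java` and the `agent.jar` marker are all made-up names for illustration; the real agent command line isn't shown in this issue.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Image names treated as "a Java process" (assumption for this sketch).
JAVA_NAMES = {"java", "java.exe", "javaw.exe"}


@dataclass(frozen=True)
class ProcInfo:
    """Hypothetical record for one running process."""
    pid: int
    name: str
    command_line: str


def leftover_java(procs: Iterable[ProcInfo], agent_marker: str) -> List[ProcInfo]:
    """Return Java processes whose command line does not contain agent_marker.

    agent_marker is whatever substring identifies the Jenkins agent on the
    machine; it is a caller-supplied assumption, not something this issue states.
    """
    marker = agent_marker.lower()
    result = []
    for proc in procs:
        if proc.name.lower() not in JAVA_NAMES:
            continue  # not a Java process at all
        if marker in proc.command_line.lower():
            continue  # this is the agent itself, leave it alone
        result.append(proc)
    return result


if __name__ == "__main__":
    sample = [
        ProcInfo(1, "java.exe", "java -jar agent.jar"),
        ProcInfo(14984, "java.exe", "java -cp test.jar SomeTest"),
        ProcInfo(3, "cmd.exe", "cmd /c echo hi"),
    ]
    for proc in leftover_java(sample, "agent.jar"):
        print(proc.pid, proc.command_line)
```

Matching on the whole command line rather than just the image name is the point here: every candidate is `java.exe`, so the name alone can't separate the agent from a stuck test.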
Got it - in PowerShell.
(typed, apologies for bad formatting). And Jenkins agent is running the |
Did you kill off the jobs - I can't see those last three on the machine any more? |
No? 😕 |
Scratch that - it needs to be run as Administrator to get that information |
Had to clear out several machines today which had a load of leftover processes, mostly from the recently added extended test suite runs. Details below:
https://ci.adoptopenjdk.net/computer/test-godaddy-win2016-x64-1/
https://ci.adoptopenjdk.net/computer/test-godaddy-win2016-x64-4/
https://ci.adoptopenjdk.net/computer/test-azure-win2012r2-x64-3/ |
There we go, I've found a more generalised solution that will find all the Java Processes that have been created since the Jenkins job started running:
|
Thanks Will! @smlambert @andrew-m-leonard Does the above look feasible for inclusion into the test scripts? It should kill off any processes created since the start of the job. Worth considering what we want to do in this situation - should the test job be marked as "failed" or "warning"? I suspect we should display the list and (at the very least) make it a warning condition. |
Sounds good to me |
As we add new testing environment checks (adoptium/TKG#45) into TKG, we have started with warnings. |
@smlambert Do you agree that we should add the above proposed operations from Will into the testing scripts? Would adding it into |
I linked to adoptium/TKG#45 (see 4th entry in checklist) because these checks may more properly belong in TKG where we have a set of machine checks. |
So, I guess I misunderstood what Will's solution does, which is using Jenkins info as a corral around the processes started. This does not help us identify issues in other CI systems, or when someone runs on their laptop via commandline. It would be good to cover those cases (which is why we proposed putting them into TKG). Would be worth a design discussion so we consider all options. |
@Willsparker's solution is simply a "run command to store the start time" then "check for new java processes since then". It's not tied to Jenkins at all (unless we tie it to the presence of |
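To make the "store the start time, then check for new java processes" idea concrete, here is a minimal Python sketch. It is not Will's actual PowerShell, and it assumes the Java process list with creation times has already been gathered by whatever tool is available on the machine. `JavaProc`, `new_since` and `summarise` are hypothetical names, and reporting a warning rather than a failure just follows the suggestion made earlier in this thread.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class JavaProc(NamedTuple):
    """Hypothetical record for one Java process."""
    pid: int
    created: datetime  # process creation time


def new_since(procs: Iterable[JavaProc], start: datetime) -> List[JavaProc]:
    """Return the processes created at or after `start`."""
    return [p for p in procs if p.created >= start]


def summarise(leftovers: List[JavaProc]) -> str:
    """Build a one-line status message; leftovers are a warning, not a failure."""
    if not leftovers:
        return "OK: no new java processes"
    pids = ", ".join(str(p.pid) for p in sorted(leftovers, key=lambda p: p.pid))
    return f"WARNING: leftover java processes: {pids}"


if __name__ == "__main__":
    # Step 1: store the start time before the tests run.
    start = datetime(2020, 11, 10, 12, 0, 0)
    # Step 2: after the tests, compare creation times against it.
    seen = [
        JavaProc(100, datetime(2020, 11, 10, 11, 0, 0)),    # predates the run
        JavaProc(14984, datetime(2020, 11, 10, 12, 5, 0)),  # started during the run
    ]
    print(summarise(new_since(seen, start)))
```

Nothing in this depends on Jenkins: the only inputs are a timestamp taken before the tests and a process listing taken afterwards, so the same check would work from a command line on a laptop.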
I think I'm going to close this and move follow-on discussion to #1573 to avoid fragmented discussions - we will add @Willsparker's proposed code into |
test-azure-win2012r2-x64-1 is failing to run tests due to not being able to clean its workspace: https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_extended.system_x86-64_windows/361/