-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing synchronization in SemaphoreStep
#235
Conversation
static synchronized State get() { | ||
File home = Jenkins.get().getRootDir(); | ||
State state = states.get(home); | ||
if (state == null) { | ||
LOGGER.info(() -> "Initializing state in " + home); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some extra logging here and in stop
in case the fault lied there, but it does not seem to have.
final Map<String,Object> returnValues = new HashMap<String,Object>(); | ||
final Map<String,Throwable> errors = new HashMap<String,Throwable>(); | ||
final Set<String> started = new HashSet<String>(); | ||
final Map<String,String> contexts = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just code cleanup
@@ -96,47 +101,61 @@ private String k() { | |||
return id + "/" + number; | |||
} | |||
|
|||
/** Marks the step as having successfully completed; or, if not yet started, makes it do so synchronously when started. */ | |||
/**@deprecated use {@link #success(String, Object)} */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there are any tests actually obtaining this object directly. The normal usage involves solely static
calls.
synchronized (s) { | ||
if (!s.contexts.containsKey(k)) { | ||
System.err.println("Planning to unblock " + k + " as success"); | ||
LOGGER.info(() -> "Planning to unblock " + k + " as success"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Logger
is better style and gives us timestamps too.
s.returnValues.put(k, returnValue); | ||
return; | ||
} | ||
c = getContext(s, k); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible point where synchronization was lacking.
@@ -147,29 +166,50 @@ public static void waitForStart(@NonNull String k, @CheckForNull Run<?,?> b) thr | |||
if (b != null && !b.isBuilding()) { | |||
throw new AssertionError(JenkinsRule.getLog(b)); | |||
} | |||
s.wait(1000); | |||
s.wait(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might shave off a fraction of a second from tests to check this more promptly.
*/ | ||
public final class SemaphoreStep extends AbstractStepImpl implements Serializable { | ||
public final class SemaphoreStep extends Step implements Serializable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaning up deprecations.
Object returnValue = null; | ||
Throwable error = null; | ||
boolean success = false, failure = false, sync = true; | ||
synchronized (s) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point where synchronization was inadequate. Rather cumbersome idiom to avoid holding the lock while making Pipeline API calls, to avoid possible deadlocks.
#188 does not seem to have been complete; various
HashMap
-valued fields inState
were accessed without consistent locking.Motivated by frequent flakes in
SCMRetrieverTest.selfTestLibraries
(frompipeline-groovy-lib
) in PCT. The test would time out waiting for one of the branch builds to complete; log messages Blocking… indicated that the step had been started on the right string, but then later the log would show Planning to unblock… rather than Unblocking…, as ifcontexts
did not contain the string which had been added. I was able to reproduce such a failure locally after updating the core & BOM to newest 2.414.x and running the test about ten times in a row. My hypothesis is that unsynchronized access corrupted theHashMap
.