
More comments and logging fixes. #2362

Merged: anshulpundir merged 1 commit into moby:master on Oct 3, 2017

Conversation

anshulpundir (Contributor)

Signed-off-by: Anshul Pundir anshul.pundir@docker.com

codecov bot commented Aug 29, 2017

Codecov Report

Merging #2362 into master will decrease coverage by 0.21%.
The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master    #2362      +/-   ##
==========================================
- Coverage   60.49%   60.28%   -0.22%     
==========================================
  Files         128      128              
  Lines       26275    26277       +2     
==========================================
- Hits        15895    15841      -54     
- Misses       8976     9035      +59     
+ Partials     1404     1401       -3

@@ -542,6 +542,9 @@ func (n *Node) Run(ctx context.Context) error {
n.done()
}()

// Flag to indicate if this manager node was the raft leader in the previous iteration of the loop.
// It is mainly used to cancel out proposals that this node might have made when it was the leader
// in the previous iteration. It not only needed to avoid deadlocking processCommitted().
Collaborator

I don't quite understand the last sentence - seems like it's missing a word.

Contributor Author

I can't understand it either. Don't remember what I was trying to say. I'll fix/remove it.
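
For readers following along, here is a minimal, hedged sketch of the pattern this comment block is describing (not the swarmkit code; the type names below are invented for illustration): a proposer blocks on a wait registered under its request ID, and cancelAll unblocks every such waiter when leadership is lost, so nothing stays parked waiting for an entry that can no longer commit.

```go
// Standalone sketch, not swarmkit code. waitRegistry stands in for n.wait.
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errCancelled = errors.New("proposal cancelled: node is no longer leader")

type waitRegistry struct {
	mu      sync.Mutex
	waiters map[uint64]chan error
}

// register creates a wait for a proposal; the proposer blocks on the channel.
func (w *waitRegistry) register(id uint64) chan error {
	w.mu.Lock()
	defer w.mu.Unlock()
	ch := make(chan error, 1)
	w.waiters[id] = ch
	return ch
}

// cancelAll is called when leadership is lost (the branch discussed in the
// diff above): every blocked proposer gets an error and returns instead of
// hanging on an entry that will never be committed.
func (w *waitRegistry) cancelAll() {
	w.mu.Lock()
	defer w.mu.Unlock()
	for id, ch := range w.waiters {
		ch <- errCancelled
		delete(w.waiters, id)
	}
}

func main() {
	w := &waitRegistry{waiters: map[uint64]chan error{}}
	ch := w.register(1)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Stand-in for a blocked processInternalRaftRequest-style call.
		fmt.Println("proposal 1:", <-ch)
	}()

	// Leadership lost before the proposal could commit.
	w.cancelAll()
	wg.Wait()
}
```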

@@ -630,6 +633,8 @@ func (n *Node) Run(ctx context.Context) error {
// cancelAll, or by its own check of signalledLeadership.
n.wait.cancelAll()
} else if !wasLeader && rd.SoftState.RaftState == raft.StateLeader {
// If this node was the leader in the previous iteration,
// set wasLeader to make sure all outstanding proposals are cancelled.
Collaborator

This case is actually when the node becomes the leader.

!wasLeader - wasn't the leader before
rd.SoftState.RaftState == raft.StateLeader - is the leader now

anshulpundir (Contributor Author) commented Sep 21, 2017

What is the meaning of the variable 'wasLeader'? Would 'isLeader' be more appropriate? Currently this is confusing because 'wasLeader', to me, sounds like the node was the leader in the past, and from your explanation it sounds like that's not what it means. @aaronlehmann

Collaborator

The name might be confusing. I think renaming it to isLeader would be okay. The idea was that it's a long-lived state variable, not an instantaneous value.

I think the reason I named it that way was that if it was isLeader, this line would be } else if !isLeader && rd.SoftState.RaftState == raft.StateLeader {, which seems kind of nonsensical: we're the leader and we're not the leader. wasLeader tried to convey that it was the past state.
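
A compact illustration of the semantics under discussion (a standalone sketch, not swarmkit code): the flag is long-lived state carried across iterations of the run loop and only flips at the two transitions, so the condition reads as "was not leader on the previous iteration, but raft reports leader now".

```go
// Standalone sketch of the leadership-transition tracking, with invented
// types; rd.SoftState.RaftState plays the role of raftState here.
package main

import "fmt"

type raftState int

const (
	stateFollower raftState = iota
	stateLeader
)

func main() {
	states := []raftState{stateFollower, stateLeader, stateLeader, stateFollower}

	// Long-lived flag carried across loop iterations; it records whether the
	// node was leader on the previous iteration, which is why it was named
	// wasLeader rather than isLeader.
	wasLeader := false
	for i, s := range states {
		if wasLeader && s != stateLeader {
			fmt.Printf("iteration %d: lost leadership\n", i)
			wasLeader = false
		} else if !wasLeader && s == stateLeader {
			fmt.Printf("iteration %d: became leader\n", i)
			wasLeader = true
		}
	}
}
```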

// processInternalRaftRequest proposes a value to be appended to the raft log.
// It calls Propose() on etcd/raft, which calls back into the raft FSM,
// which then sends a message to each of the participating nodes
// in the raft groupp to apply a log entry and then waits for it to be applied
Collaborator

typo: groupp
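
As a rough sketch of the propose-and-wait flow the comment describes (simplified, invented types; not the actual processInternalRaftRequest implementation): the caller registers a wait keyed by a request ID before proposing, hands the encoded request to raft, and blocks until the apply path triggers the wait or the context expires.

```go
// Standalone sketch; proposer stands in for the etcd/raft node's Propose,
// and waitList.trigger stands in for the apply path.
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// proposer is the subset of the raft node interface this sketch needs.
type proposer interface {
	Propose(ctx context.Context, data []byte) error
}

type waitList struct {
	mu      sync.Mutex
	waiters map[uint64]chan []byte
}

func (w *waitList) register(id uint64) chan []byte {
	w.mu.Lock()
	defer w.mu.Unlock()
	ch := make(chan []byte, 1)
	w.waiters[id] = ch
	return ch
}

// trigger is called once the entry is committed and applied.
func (w *waitList) trigger(id uint64, value []byte) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if ch, ok := w.waiters[id]; ok {
		ch <- value
		delete(w.waiters, id)
	}
}

// propose registers a wait, proposes the value, and blocks until applied.
func propose(ctx context.Context, p proposer, w *waitList, id uint64, data []byte) ([]byte, error) {
	ch := w.register(id) // register before proposing to avoid a race
	if err := p.Propose(ctx, data); err != nil {
		return nil, err
	}
	select {
	case v := <-ch:
		return v, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// fakeRaft "applies" the entry asynchronously, standing in for the raft group
// reaching consensus and the FSM applying the log entry.
type fakeRaft struct {
	w  *waitList
	id uint64
}

func (f *fakeRaft) Propose(ctx context.Context, data []byte) error {
	go func() {
		time.Sleep(10 * time.Millisecond)
		f.w.trigger(f.id, data)
	}()
	return nil
}

func main() {
	w := &waitList{waiters: map[uint64]chan []byte{}}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	v, err := propose(ctx, &fakeRaft{w: w, id: 1}, w, 1, []byte("update"))
	if err != nil {
		fmt.Println("propose failed:", err)
		return
	}
	fmt.Printf("applied: %s\n", v)
}
```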

@@ -630,6 +633,8 @@ func (n *Node) Run(ctx context.Context) error {
// cancelAll, or by its own check of signalledLeadership.
n.wait.cancelAll()
} else if !wasLeader && rd.SoftState.RaftState == raft.StateLeader {
// If this node was the leader in the previous iteration,
// set wasLeader to make sure all outstanding proposals are cancelled.
wasLeader = true
Contributor Author

Why do we let the FSM proceed here and not cancel the outstanding proposals right away? @aaronlehmann

Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
anshulpundir changed the title from "Add more comments #2360." to "More comments and logging fixes." on Oct 2, 2017
@@ -1681,7 +1689,7 @@ func (n *Node) processInternalRaftRequest(ctx context.Context, r *api.InternalRa

// Do this check after calling register to avoid a race.
if atomic.LoadUint32(&n.signalledLeadership) != 1 {
log.G(ctx).Errorf("node %x is no longer leader, aborting propose", n.opts.ID)
log.G(ctx).Error("node is no longer leader, aborting propose")
Contributor

@anshulpundir why remove node IDs from these log messages?

Contributor Author

The node ID is already added as a field to the context logger: https://github.com/docker/swarmkit/blob/master/node/node.go#L286 @nishanttotla

Contributor

Ah, okay got it.
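
For context, a small sketch of the context-logger pattern referenced above, using plain logrus with invented helper names (withLogger and g stand in for swarmkit's log.WithLogger and log.G): the node ID is attached once as a logger field, so individual messages such as the one in this diff no longer need to interpolate it.

```go
// Standalone sketch, not the swarmkit log package itself.
package main

import (
	"context"

	"github.com/sirupsen/logrus"
)

type loggerKey struct{}

// withLogger stores a logger entry in the context (stand-in for log.WithLogger).
func withLogger(ctx context.Context, entry *logrus.Entry) context.Context {
	return context.WithValue(ctx, loggerKey{}, entry)
}

// g retrieves the context logger, falling back to the standard logger
// (stand-in for log.G).
func g(ctx context.Context) *logrus.Entry {
	if entry, ok := ctx.Value(loggerKey{}).(*logrus.Entry); ok {
		return entry
	}
	return logrus.NewEntry(logrus.StandardLogger())
}

func main() {
	// Attach the node ID once, when the node starts; "3f1a9c" is a made-up ID.
	ctx := withLogger(context.Background(), logrus.WithField("node.id", "3f1a9c"))

	// Later log calls pick up the field automatically, so the message itself
	// no longer needs to format "node %x".
	g(ctx).Error("node is no longer leader, aborting propose")
}
```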

nishanttotla (Contributor) left a comment

LGTM

anshulpundir merged commit e28f07e into moby:master on Oct 3, 2017
denverdino pushed a commit to AliyunContainerService/swarmkit that referenced this pull request Nov 5, 2017
Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>