Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-node integration tests do not log node IDs correctly #32574

Closed
DaveCTurner opened this issue Aug 2, 2018 · 4 comments · Fixed by #32751
Closed

Multi-node integration tests do not log node IDs correctly #32574

DaveCTurner opened this issue Aug 2, 2018 · 4 comments · Fixed by #32751
Assignees
Labels
>bug :Core/Infra/Logging Log management and logging utilities v7.0.0-beta1

Comments

@DaveCTurner
Copy link
Contributor

Digging into
#32552 was made a little trickier by the lack of the correct node name in a few log messages. The log excerpt below has examples of all the cases of this that I think I encountered. I think this was introduced in #31588.

  1> [2018-08-01T06:44:59,262][INFO ][o.e.n.Node               ] [testMultipleNodesShutdownNonMasterNodes] closed
  1> [2018-08-01T06:44:59,263][INFO ][o.e.n.Node               ] [testMultipleNodesShutdownNonMasterNodes] stopping ...
  1> [2018-08-01T06:44:59,263][DEBUG][o.e.t.d.TestZenDiscovery ] [node_t1] filtered ping responses: (ignore_non_masters [false])
  1> 	--> ping_response{node [{node_t1}{N7pZoArIQkagoOjazUUnpg}{CHz8p-1ZSWK6x_N0-Frh7Q}{127.0.0.1}{127.0.0.1:36663}], id[1831], master [null],cluster_state_version [9], cluster_name[TEST-CHILD_VM=[1]-CLUSTER_SEED=[4742294414363736227]-HASH=[181D1FE4F82A7D]-cluster]}
  1> [2018-08-01T06:44:59,263][WARN ][o.e.t.d.TestZenDiscovery ] [node_t1] not enough master nodes discovered during pinging (found [[Candidate{node={node_t1}{N7pZoArIQkagoOjazUUnpg}{CHz8p-1ZSWK6x_N0-Frh7Q}{127.0.0.1}{127.0.0.1:36663}, clusterStateVersion=9}]], but needed [3]), pinging again
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.IndicesService     ] [[indices_shutdown[T#1]]] [test] closing ... (reason [NO_LONGER_ASSIGNED])
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.IndicesService     ] [[indices_shutdown[T#1]]] [test/xM1OBH02QE2z4aIJH9p5Ug] closing index service (reason [NO_LONGER_ASSIGNED][shutdown])
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.IndexService       ] [[indices_shutdown[T#1]]] [test] [3] closing... (reason: [shutdown])
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.s.IndexShard       ] [[indices_shutdown[T#1]]] [test][3] state: [STARTED]->[CLOSED], reason [shutdown]
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.e.Engine           ] [[indices_shutdown[T#1]]] [test][3] flushing shard on close - this might take some time to sync files to disk
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.e.Engine           ] [[indices_shutdown[T#1]]] [test][3] close now acquiring writeLock
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.e.Engine           ] [[indices_shutdown[T#1]]] [test][3] close acquired writeLock
  1> [2018-08-01T06:44:59,266][DEBUG][o.e.i.t.Translog         ] [[indices_shutdown[T#1]]] [test][3] translog closed
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.e.Engine           ] [[indices_shutdown[T#1]]] [test][3] engine closed [api]
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.s.Store            ] [[indices_shutdown[T#1]]] [test][3] store reference count on close: 0
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.IndexService       ] [[indices_shutdown[T#1]]] [test] [3] closed (reason: [shutdown])
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.c.b.BitsetFilterCache] [[indices_shutdown[T#1]]] [test] clearing all bitsets because [close]
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.c.q.IndexQueryCache] [[indices_shutdown[T#1]]] [test] full cache clear, reason [close]
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.c.b.BitsetFilterCache] [[indices_shutdown[T#1]]] [test] clearing all bitsets because [close]
  1> [2018-08-01T06:44:59,267][DEBUG][o.e.i.IndicesService     ] [[indices_shutdown[T#1]]] [test/xM1OBH02QE2z4aIJH9p5Ug] closed... (reason [NO_LONGER_ASSIGNED][shutdown])
  1> [2018-08-01T06:44:59,267][INFO ][o.e.n.Node               ] [testMultipleNodesShutdownNonMasterNodes] stopped
  1> [2018-08-01T06:44:59,267][INFO ][o.e.n.Node               ] [testMultipleNodesShutdownNonMasterNodes] closing ...
  1> [2018-08-01T06:44:59,269][INFO ][o.e.n.Node               ] [testMultipleNodesShutdownNonMasterNodes] closed
  1> [2018-08-01T06:44:59,269][INFO ][o.e.c.MinimumMasterNodesIT] [testMultipleNodesShutdownNonMasterNodes] after test
@DaveCTurner DaveCTurner added >bug :Core/Infra/Logging Log management and logging utilities v7.0.0 labels Aug 2, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@ywelsch
Copy link
Contributor

ywelsch commented Aug 2, 2018

@nik9000: see IndicesService.doStop(). Same issue with the SchedulerEngine I think.

@jasontedor
Copy link
Member

@nik9000 Would you take a look?

@nik9000
Copy link
Member

nik9000 commented Aug 9, 2018

Picking this up now.

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Aug 9, 2018
Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.

Closes elastic#32574

I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
nik9000 added a commit that referenced this issue Aug 20, 2018
Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.

Closes #32574

I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
nik9000 added a commit that referenced this issue Aug 20, 2018
Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.

Closes #32574

I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
jasontedor pushed a commit that referenced this issue Aug 21, 2018
Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.

Closes #32574

I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Logging Log management and logging utilities v7.0.0-beta1
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants