Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.io.IOException: Read end dead permanently - master disconnects docker attached agents #581

Closed
spbkelt opened this issue Dec 14, 2017 · 9 comments

Comments

@spbkelt
Copy link

spbkelt commented Dec 14, 2017

jenkins: 2.73.3
docker-plugin: 1.0.4

We are facing constantly issue with disconnecting master -> agent

Cannot contact DLB1-68a3142a64ab: java.io.IOException: remote file operation failed: /home/jenkins/workspace/FirstRain Projects/RC Builds/Tools@2/tools/parent_tools at hudson.remoting.Channel@72a90e54:DLB1-68a3142a64ab: java.io.IOException: Read end dead

Jenkins master logs:

Dec 14, 2017 2:45:11 PM INFO hudson.slaves.ChannelPinger$1 onDead
Ping failed. Terminating the channel DLB1-68a3142a64ab.
java.io.IOException: Read end dead
	at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:262)
	at java.io.PipedInputStream.receive(PipedInputStream.java:226)
	at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:93)
	at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:89)
	at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:58)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:45)
	at hudson.remoting.Channel.send(Channel.java:671)
	at hudson.remoting.Request.callAsync(Request.java:229)
	at hudson.remoting.Channel.callAsync(Channel.java:937)
	at hudson.remoting.PingThread.ping(PingThread.java:106)
	at hudson.remoting.PingThread.run(PingThread.java:86)

Dec 14, 2017 2:45:11 PM WARNING hudson.remoting.Channel close
Having to terminate early
java.io.IOException: Read end dead
	at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:262)
	at java.io.PipedInputStream.receive(PipedInputStream.java:226)
	at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:93)
	at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:89)
	at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:58)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:45)
	at hudson.remoting.Channel.send(Channel.java:671)
	at hudson.remoting.Channel.close(Channel.java:1389)
	at hudson.remoting.Channel.close(Channel.java:1356)
	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:731)
	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:649)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Docker container is live and running at the same time

docker ps | grep 68a3142a64ab
68a3142a64ab        ***/ignite/firstrain/jenkins-agent:1.0                               "/bin/sh"                23 minutes ago       Up 23 minutes                                                                                                                                                                                                               dreamy_bardeen

I can even connect and run command inside it

docker exec -it 68a3142a64ab bash
[jenkins@68a3142a64ab ~]$ ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jenkins        1  0.0  0.0  11780  2704 pts/0    Ss+  14:24   0:00 /bin/sh
jenkins     1554 19.7  0.0  11784  3056 pts/1    Ss   14:48   0:00 bash
jenkins     1572  0.0  0.0  47456  3324 pts/1    R+   14:49   0:00 ps au
@ndeloof
Copy link
Contributor

ndeloof commented Dec 15, 2017

Can you give more details on this build ?
How long does it take ?
Does it produce some huge amount of logs ?
or maybe is it quiet for a long time during the build ?

@magn2
Copy link

magn2 commented Dec 15, 2017

Same problem here since I migrated from "ssh key injection" to "attach".
Seems to work sometimes, other times gets the mentioned error.
I am now back to 1.0.4.

@ndeloof
Copy link
Contributor

ndeloof commented Dec 15, 2017

seems to be comparable issue with github.com/KostyaSha/yet-another-docker-plugin/pull/216/files/1c2e885d30bda0638e423c0df66489f7eaf2a5b3

@ndeloof
Copy link
Contributor

ndeloof commented Dec 15, 2017

@spbkelt as you exec into container, can you check with ps -ef the slave.jar agent process is still running ? Please also check agent.log in agent's working directory

@ndeloof
Copy link
Contributor

ndeloof commented Dec 15, 2017

I've tested running a job to run endless ping localhost
as well as another one running sleep 7200 so there's not traffic
in both cases I can't reproduce (at least on docker.sock, will try with tcp)

Can you describe what's happening on your build at the time this issue occurs, so I can try to establish which condition trigger this error ?

@spbkelt
Copy link
Author

spbkelt commented Dec 18, 2017

New logs:

Connecting to docker container d3356f5856d08a820d72ae5b38743ccc64cfeaa4feab1ce65bbdf8d50af8c143
<===[JENKINS REMOTING CAPACITY]===>ERROR: channel started

Slave.jar version: 3.10.2
This is a Unix agent
Evacuated stdout
Agent successfully connected and online
ERROR: Dec 18, 2017 6:50:04 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.io.UTFDataFormatException
	at java.io.ObjectInputStream$BlockDataInputStream.readUTFSpan(ObjectInputStream.java:3137)
	at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3062)
	at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2874)
	at java.io.ObjectInputStream.readString(ObjectInputStream.java:1639)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501)
	at java.lang.Throwable.readObject(Throwable.java:914)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
	at hudson.remoting.Command.readFrom(Command.java:97)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)


ERROR: channel stopped

ERROR: Dec 18, 2017 6:50:04 PM hudson.remoting.UserRequest perform
WARNING: LinkageError while performing UserRequest:org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1@32dbb042
java.lang.NoClassDefFoundError: jnr/posix/JavaPOSIX
	at jnr.posix.POSIXFactory.getJavaPOSIX(POSIXFactory.java:51)
	at jnr.posix.POSIXFactory.loadPOSIX(POSIXFactory.java:109)
	at jnr.posix.LazyPOSIX.loadPOSIX(LazyPOSIX.java:36)
	at jnr.posix.LazyPOSIX.posix(LazyPOSIX.java:32)
	at jnr.posix.LazyPOSIX.chmod(LazyPOSIX.java:62)
	at hudson.FilePath._chmod(FilePath.java:1611)
	at hudson.FilePath.access$1500(FilePath.java:197)
	at hudson.FilePath$29.invoke(FilePath.java:1596)
	at hudson.FilePath$29.invoke(FilePath.java:1592)
	at hudson.FilePath.act(FilePath.java:998)
	at hudson.FilePath.act(FilePath.java:976)
	at hudson.FilePath.chmod(FilePath.java:1592)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.createSshKeyFile(CliGitAPIImpl.java:1708)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1624)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$300(CliGitAPIImpl.java:71)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:380)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:589)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:153)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:146)
	at hudson.remoting.UserRequest.perform(UserRequest.java:208)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:360)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: jnr.posix.JavaPOSIX
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:157)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 27 more


ERROR: Dec 18, 2017 6:50:04 PM hudson.remoting.Request$2 run
WARNING: Failed to send back a reply to the request hudson.remoting.Request$2@61970ff8
hudson.remoting.ChannelClosedException: channel is already closed
	at hudson.remoting.Channel.send(Channel.java:667)
	at hudson.remoting.Request$2.run(Request.java:372)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.UTFDataFormatException
	at java.io.ObjectInputStream$BlockDataInputStream.readUTFSpan(ObjectInputStream.java:3137)
	at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3062)
	at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2874)
	at java.io.ObjectInputStream.readString(ObjectInputStream.java:1639)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501)
	at java.lang.Throwable.readObject(Throwable.java:914)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
	at hudson.remoting.Command.readFrom(Command.java:97)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)

Can you give more details on this build ?

I have general pipeline with stages where mvn command is running

How long does it take ?

Normally it takes more than 10 minutes overall. The longest one was 14 min

Does it produce some huge amount of logs ?

No

or maybe is it quiet for a long time during the build ?

No

@alexeygrigorovaurea
Copy link

alexeygrigorovaurea commented Dec 20, 2017

Hi guys I have same issue when using windows docker agent via TCP socket and some observations here:
when attach method is chosen I see from logs remoting is started via docker exec and it keeps connection to docker daemon (i got it from 3 "true" arguments in a method call below) (not sure what for, is it possible to start remoting process in detached mode? ).
I suggest after some time docker host drops network connection and process is terminated. At least it happen to me when I use docker exec -it to same host from terminal and stay inactive for a while

read tcp 10.0.2.15:54206->10.x.x.x:2375: read: connection reset by peer

Dec 20, 2017 5:03:30 AM FINE com.github.dockerjava.core.command.AbstrDockerCmd exec
Cmd: 670ba4a21f8bc8ee9bba7162c41630c51e2ee8fb9aca779cde385f9c0eabe827,true,true,true,false,,{java,-jar,c://remoting-3.10.2.jar}

@ndeloof
Copy link
Contributor

ndeloof commented Dec 20, 2017

@alexeygrigorovaurea this is by design for this "attach" connector : is uses docker interactive mode as a transport layer for jenkins remoting. Starting remoting agent detached require another transport protocol, either SSH or JNLP, to connect master with agent.

Please note remoting channel is never inactive : PingThread do at least ensure some regular traffic. Also, I never noticed a container being killed for inactive interactive mode (I use this a lot).

@alexeygrigorovaurea
Copy link

Hi @ndeloof , thanks for explanantion
I have tried to play with pingthread by decreasing ping interval to 3 minutes but still have no luck and see error below, and at the end switched to JNLP as you suggested, it seems work better in my environment.

java.io.IOException: Read end dead
	at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:262)
	at java.io.PipedInputStream.receive(PipedInputStream.java:226)
	at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:93)
	at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:89)
	at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:58)
	at java.io.OutputStream.write(OutputStream.java:75)
	at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants