Skip to content

Commit

Permalink
wsd: reduce noise while draining with ECONNRESET
Browse files Browse the repository at this point in the history
Now that we re-use connections, we have revealed
a corner-case that causes some noise.

Remote connections might not retain HTTP
connections past their immediate purpose and
rudely reset the connection after a certain
timeout.

Apache has such a short timeout of between 5
and 15 seconds, by default. And, it seems,
HA Proxy is similar. When the connection is
not reused by the client within the timeout,
it is simply closed.

This wakes us up and we get revents with
POLLIN, POLLHUP, and POLLERR (that is, 0x19).

This patch handles this case by recognizing
that we often get POLLHUP and the socket
is closed before doing read(2). That is,
read(2) *is* expected to fail with ECONNRESET.

We handle this case specifically by avoiding
reporting it as an error. We will still log
an error if we get ECONNRESET while reading,
when we haven't been idle, even though
technically that is also possible.
(Ideally, we would flag whether the socket
is reusable or not and also suppress a close
immediately after reading all the data.)

As such, this is *not* a functional change;
we only tollerate draining a dead connection
without noisily logging.

Here is a sample where socket #38 is used to
request a static file through HTTP GET:

16.150238 [ websrv_poll ] TRC  #38: Handling poll events of websrv_poll at index 4 (of 7): 0x1| net/Socket.cpp:558
16.150244 [ websrv_poll ] TRC  #38: Read 506 bytes in addition to 0 buffered bytes| net/Socket.hpp:1166
16.150246 [ websrv_poll ] TRC  #38: Incoming data buffer 506 bytes, read result: 506, events: 0x1 (not closed)| net/Socket.hpp:1350
16.150270 [ websrv_poll ] INF  #38: parseHeader: Client HTTP Request: GET, uri `/browser/ebaa6a6/images/lc_addcitationnote.svg` HTTP/1.1, sz[header 506, content -1], offset 506, chunked false, host: office.nc-presales.top / user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0 / accept: image/avif,image/webp,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5 / accept-language: en,en-US;q=0.5 / accept-encoding: gzip, deflate, br, zstd / dnt: 1 / sec-fetch-dest: image / sec-fetch-mode: no-cors / sec-fetch-site: same-origin / sec-gpc: 1 / priority: u=5, i / x-real-ip: 176.145.78.103 / x-forwarded-for: 176.145.78.103| net/Socket.cpp:1340
16.150276 [ websrv_poll ] DBG  #38: Handling request: /browser/ebaa6a6/images/lc_addcitationnote.svg, closeConnection false| wsd/ClientRequestDispatcher.cpp:670
16.150287 [ websrv_poll ] TRC  Fileserver request: /browser/ebaa6a6/images/lc_addcitationnote.svg| wsd/FileServer.cpp:524
16.150303 [ websrv_poll ] TRC  #38: Sending compressed : file [/browser/dist/images/lc_addcitationnote.svg]: Date: Thu, 31 Oct 2024 12:25:16 / Server: COOLWSD HTTP Server 24.04.7.3 / Content-Type: image/svg+xml / Content-Encoding: gzip / Content-Length: 489 / Cache-Control: max-age=11059200 / ETag: "ebaa6a6" / X-Content-Type-Options: nosniff| wsd/FileServer.cpp:732
16.150317 [ websrv_poll ] TRC  #38: Wrote 248 bytes of 248 buffered data| net/Socket.hpp:1490
16.150323 [ websrv_poll ] TRC  #38: Wrote 489 bytes of 489 buffered data| net/Socket.hpp:1490
16.150326 [ websrv_poll ] DBG  #38: Handled request: /browser/ebaa6a6/images/lc_addcitationnote.svg, inBuf[sz 506 -> 0, rm 506], connection open true| wsd/ClientRequestDispatcher.cpp:952
16.150341 [ websrv_poll ] TRC  #38: setupPollFds getPollEvents: 0x1| net/Socket.hpp:880
16.201424 [ websrv_poll ] TRC  #38: Handling poll events of websrv_poll at index 4 (of 7): 0x0| net/Socket.cpp:558
16.201547 [ websrv_poll ] TRC  #38: setupPollFds getPollEvents: 0x1| net/Socket.hpp:880
16.211803 [ websrv_poll ] TRC  #38: Handling poll events of websrv_poll at index 4 (of 7): 0x0| net/Socket.cpp:558
16.211810 [ websrv_poll ] TRC  #38: setupPollFds getPollEvents: 0x1| net/Socket.hpp:880
27.991710 [ websrv_poll ] TRC  #38: Handling poll events of websrv_poll at index 4 (of 7): 0x0| net/Socket.cpp:558
27.991836 [ websrv_poll ] TRC  #28: setupPollFds getPollEvents: 0x1| net/Socket.hpp:880
27.991838 [ websrv_poll ] TRC  #38: setupPollFds getPollEvents: 0x1| net/Socket.hpp:880
27.991840 [ websrv_poll ] TRC  ppoll start, timeoutMicroS: 64000000 size 4| net/Socket.cpp:430
32.993143 [ websrv_poll ] TRC  Poll completed with 2 live polls max (64000000us)| net/Socket.cpp:448
32.993196 [ websrv_poll ] TRC  #38: Handling poll events of websrv_poll at index 3 (of 4): 0x19| net/Socket.cpp:558
32.993201 [ websrv_poll ] ERR  #38: Read failed, have 0 buffered bytes (ECONNRESET: Connection reset by peer)| net/Socket.hpp:1156
32.993210 [ websrv_poll ] TRC  #38: Incoming data buffer 0 bytes, read result: -1, events: 0x19 (closed)| net/Socket.hpp:1350
32.993212 [ websrv_poll ] DBG  #38: Closed after reading| net/Socket.hpp:1373
32.993213 [ websrv_poll ] TRC  #38: Closed. Firing onDisconnect.| net/Socket.hpp:1445

Notice how the socket remained open after
serving the file for over 15 seconds,
until we logged the error.

With this patch, the ERR line will become:
32.993201 [ websrv_poll ] TRC  #38: Read failed, because the connection is reset by peer, have 0 buffered bytes: ECONNRESET| net/Socket.hpp:1156

Change-Id: Id440fc25ff899b44bc4db0b267cde84cca78282d
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
  • Loading branch information
Ashod authored and mmeeks committed Nov 2, 2024
1 parent 23d1b15 commit 6bcced1
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions net/Socket.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1257,8 +1257,20 @@ class StreamSocket : public Socket,
last_errno = errno; // Save only on error.

if (len < 0 && last_errno != EAGAIN && last_errno != EWOULDBLOCK)
LOG_SYS_ERRNO(last_errno, "Read failed, have " << _inBuffer.size()
<< " buffered bytes");
{
if (last_errno == ECONNRESET && _inBuffer.empty())
{
// Unexpected, but often intentional on an idle connection.
LOGA_TRC(Socket, "Read failed because the connection is reset by peer, "
"have 0 buffered bytes: ECONNRESET");
}
else
{
// Unexpected read error while draining the read buffer.
LOG_SYS_ERRNO(last_errno, "Read failed, have " << _inBuffer.size()
<< " buffered bytes");
}
}
else if (len < 0)
LOGA_TRC(Socket, "Read failed (" << len << "), have " << _inBuffer.size()
<< " buffered bytes ("
Expand Down Expand Up @@ -1459,7 +1471,7 @@ class StreamSocket : public Socket,

if (events & POLLIN)
{
// readIncomingData returns false only if the read len is 0 (closed).
// readIncomingData returns 0 only if the read len is 0 (closed).
// Oddly enough, we don't necessarily get POLLHUP after read(2) returns 0.
const int read = readIncomingData();
const int last_errno = errno;
Expand All @@ -1486,6 +1498,7 @@ class StreamSocket : public Socket,
}
else if (read == 0 || (read < 0 && (last_errno == EPIPE || last_errno == ECONNRESET)))
{
// There is nothing more to read; either we got EOF, or we drained after ECONNRESET.
LOG_DBG("Closed after reading");
closed = true;
}
Expand Down

0 comments on commit 6bcced1

Please sign in to comment.