IORef based socket #303

kazu-yamamoto · 2018-02-02T06:47:19Z

This implements the idea described in #302.

kazu-yamamoto · 2018-02-02T06:48:17Z

vdukhovni · 2018-02-02T15:10:32Z

Network/Socket/Handle.hs

@@ -17,7 +18,7 @@ import Network.Socket.Types

 socketToHandle :: Socket -> IOMode -> IO Handle
 socketToHandle s mode = do
-    let fd = fromIntegral $ fdSocket s
+    fd <- fromIntegral <$> fdSocket s


This function should invalidate the socket's file descriptor making shoud avoid behaviour impossible. Basically, emulate close, but without closing the socket's file descriptor. Or if you prefer (where available) dup(2) the socket's file descriptor, and close the socket.

Actually, replace that should with a MUST. Given that Sockets (in this branch of the code) have finalizers that automatically close them, and doing so would then break the Handle, and trigger a second close outside the IORef guard in sockets, this is NOT optional. You must either invalidate the socket, or at least dup the file descriptor and pass that to the handle, but invalidation is I think more appropriate here, since continued I/O via the socket is not supported or expected.

Thanks.
And done in 7e6eea8

eborden

There are many operations in the library that now let you operate on a file descriptor of -1. Do we want to intercept the possible error here and display an error about the socket being closed?

Something like:

fdSocketOrThrow :: Socket -> IO CInt
fdSocketOrThrow s = do
  fd <- fdSocket s
  if fd == (-1)
    then throw ioError "yada yada attempting to operate on a closed socket"
    else pure fd

eborden · 2018-02-02T15:00:37Z

Network/Socket/Types.hsc

+unsafeFdSocket = unsafePerformIO . fdSocket
+{-# NOINLINE unsafeFdSocket #-}
+
+-- | Creating a socket form a file descriptor.


typo form to from

eborden · 2018-02-02T15:07:09Z

Network/Socket/Types.hsc

+close (Socket ref) = do
+    oldfd <- atomicModifyIORef' ref $ \cur -> (-1, cur)
+    when (oldfd /= -1) $
+        closeFdWith (void . c_close . fromIntegral) (fromIntegral oldfd)


Do we want to throw an error when attempting to close an already closed socket? I can see the argument for both ends of this.

The user might want to know when they are closing a socket that has already been closed. This is available in the underlying api:

close() returns zero on success. On error, -1 is returned, and errno is set appropriately.

Running close on an already closed socket is a safe enough operation. Throwing an exception seems overly explosive.

Maybe we need close', which returns a value indicating whether the socket has been closed or not. The possibilities are binary, but a Bool doesn't really carry enough context to be useful here.

If this close is to be safe for finalization, it MUST NOT throw exceptions. So if exceptions are to be thrown, there would need to be a second (need not be exported) close' for finalization.

vdukhovni · 2018-02-02T15:59:06Z

Network/Socket/Types.hsc

+newtype Socket = Socket (IORef CInt)
+
+instance Show Socket where
+    show s = "<socket " ++ show (unsafeFdSocket s) ++ ">"


Not sure that unsafeFdSocket s is a good idea here. Other than distinguishing different sockets, the actual value of fd is not actually that interesting.

To me, it would make more sense to show the Socket type and address family, and ideally if bound the local address, if connected remote address... But some of those properties mutate over time and require IO to query. Perhaps, to avoid the I/O barrier the "show" string could be computed once when the socket is created? Sockets are expensive enough to create that carrying an extra string should not be burdensome. The string could be a ByteString if that's more efficient to store.

<socket STREAM AF_INET _fd_> <socket DGRAM AF_INET6 _fd_> <socket STREAM UNIX _fd_> ...

Are there backwards compatibility requirements to retain the legacy Show format? Note, I'm not sure whether my alternative suggestion is wise.

Perhaps a minimal Show implementation is fine, and a more complete socket description could be provided as an IO action, that would also display the local address, the remote address (if connected), ...

displaySocket :: Socket -> IO String

Perhaps even a ShortByteString

Either way, I think it is best to avoid unsafePerformIO in the Show implementation. Precomputing the string also means that we can display it even for closed sockets (where it is the description of the socket when it was opened)

Precomputing is done in 40801af
I ensured that show string is exactly the same as the previous one.

vdukhovni · 2018-02-02T16:19:16Z

Is CInt representation of Sockets with -1 as the invalid sentinel value also to be used on Windows? Windows sockets are unsigned ints with INVALID_SOCKET instead of -1 as the sentinel value. INVALID_SOCKET happens to be (~0) thus same as (unsigned) -1 on 2's complement machines, so not sure whether we should do anything special here or not.

vdukhovni · 2018-02-05T05:46:47Z

Network/Socket/Handle.hs

    hSetBuffering h NoBuffering
    return h
+  where
+    err _ = ioError $ userError $ "socketToHandle: already a Handle"


A more accurate error message might be "socket is no longer valid", since it might also have been closed.

Yeah. But I kept the old error message.

Previously the mStat MVar distinguished between sockets that got converted to handles and other states (e.g. already closed). We no longer do that, so perhaps the message could be more general to avoid misleading error messages that don't match the symptoms... Not critical, but something to consider.

OK. Probably, I was too nervous. I changed it to "socket is no longer valid".

vdukhovni · 2018-02-05T05:51:37Z

Network/Socket/Types.hsc

+invalidateSocket (Socket ref) errorAction normalAction = do
+    oldfd <- atomicModifyIORef' ref $ \cur -> (-1, cur)
+    if oldfd == -1 then errorAction oldfd else normalAction oldfd
+
 -----------------------------------------------------------------------------

 -- | Close the socket. Sending data to or receiving data from closed socket
 --   may lead to undefined behaviour.


As it turns out, we no longer have that problem. When a socket is closed, any later I/O operations will encounter an invalid fd and raise an error (bad file descriptor or similar). The only issue is that in multi-thread programs if one thread is doing I/O and another is closing the socket, then the thread that got the fd just before it was closed, might be using a stale, still valid value.

This is in a semantic improvement. We now have safety for sequential operations. And only risk undefined behaviour in the unlikely case that a thread is closing a socket while another thread is using it.

vdukhovni · 2018-02-05T05:56:12Z

Network/Socket/Types.hsc

@@ -74,27 +73,24 @@ import Network.Socket.Imports
 -----------------------------------------------------------------------------

 -- | Basic type for a socket.
-newtype Socket = Socket (IORef CInt)
+data Socket = Socket (IORef CInt) String


I see you decided to go with String rather than ShortByteString or similar. Perhaps slightly more expensive, but easily changed later, without compatibility issues. Are there any module dependency barriers here (circular dependencies or dependencies we want to avoid) to changing this later? Am I wrong about String being a "more expensive" representation to carry along with the socket?

Of course this is unlikely to matter much...

GHC 7.6 uses bytestring-0.10.0.2 which does not provide ShortByteString unfortunately.

Does it make more sense to carry around an ascii encoded ByteString and pack it whenever it needs to be shown? Utilizing String here means we are introducing a fair amount of indirection and bloat for a Show instance.

That was my instinct also. (Though of course ByteString show is unpack, but that's just detail).
Perhaps the approach should instead be to store the minimal information needed to avoid having to use unsafePerformIO later. To that end, just the lazy byte string form of the fd is enough.
We could even store an originalFd as a Word32 value, whose different type should be enough to discourage misuse (bypassing the IORef).

data Socket = Socket (IORef CInt) Word32 ... mkSocket :: CInt -> IO Socket mkSocket fd = do ref <- newIORef fd let s = Socket ref $ fromIntegral fd void $ mkWeakIORef ref $ close s return s

and then in the show instance use that:

show (Socket _ wfd) = "<socket: " ++ show wfd ++ ">"

There's probably no need to pre-format the whole string (was my suggestion, sorry). We could keep in mind that Haskell is lazy, and so the (current) string would I think remain an unevaluated thunk until actually used. Not sure whether the thunk is cheaper to keep around than the string.

I changed it to data Socket = Socket (IORef CInt) CInt.

That's fine. The field is not exported, and the module in which it is visible is small. So the immutable original fd is unlikely to get misused.

vdukhovni · 2018-02-05T06:00:35Z

I no longer see any major issues, looks OK to me. At this point any wiser feedback should ideally come from folks deeply versed in Haskell issues, rather than my system programming background.

If anyone else is following along, what do you think of adding the finalizer? Is it a good idea? Or just unnecessary overhead? Either way a mutable fd is I think a win, as it makes sockets safer against various types of misuse. The finalizer could be a luxury, if everyone had been getting along without it all this time... I have no more answers, only questions. Thanks for taking care of the issues I found.

kazu-yamamoto · 2018-02-05T06:16:24Z

I no longer see any major issues, looks OK to me. At this point any wiser feedback should ideally come from folks deeply versed in Haskell issues, rather than my system programming background.

Since 2.7.0.0 deprecates many APIs, we need to wait for a long time to release 3.0.0.0. Maybe a half or full year.

kazu-yamamoto · 2018-02-05T06:30:39Z

There are many operations in the library that now let you operate on a file descriptor of -1. Do we want to intercept the possible error here and display an error about the socket being closed?

Something like:
fdSocketOrThrow :: Socket -> IO CInt
fdSocketOrThrow s = do
  fd <- fdSocket s
  if fd == (-1)
    then throw ioError "yada yada attempting to operate on a closed socket"
    else pure fd

I don't have a strong opinion. But when (-1) returned by fdSocket is used in following syscalls, another error occurs eventually. So, unless there are big problems, I would keep this at this moment.

vdukhovni · 2018-02-05T07:12:51Z

It is probably OK to let the OS return EBADF for now. If someone finds a compelling reason to check at the Haskell layer, this can be done later. The main win here is that we no longer try to use a previously closed socket's fd. That may even uncover the culprit in the open issue in:

snoyberg/http-client#252
haskell-tls/hs-tls#179

Since perhaps now attempts to use such sockets might be caught a bit earlier with a less confusing context. Or at least the hypothesis that closed sockets are being re-used might be confirmed.

kazu-yamamoto · 2018-02-06T01:43:18Z

@eborden All issues have been fixed, I believe. Please review this again.

eborden

👍

kazu-yamamoto · 2018-02-06T03:03:37Z

@eborden @vdukhovni Thank you for your comments and review. Merged.

vdukhovni · 2018-02-06T03:10:39Z

Thanks. You're welcome.

kazu-yamamoto added 6 commits February 2, 2018 13:35

IORef for Socket.

b1a37e4

fix for Windows.

a20d24a

fix for Linux.

3de7d80

moving close from Close to Types.

2a5c69a

renaming Close to Shutdown.

96d73e3

Safe closing and GCing Socket.

7916e72

kazu-yamamoto requested a review from eborden February 2, 2018 06:47

vdukhovni reviewed Feb 2, 2018

View reviewed changes

eborden requested changes Feb 2, 2018

View reviewed changes

vdukhovni reviewed Feb 2, 2018

View reviewed changes

kazu-yamamoto added 4 commits February 5, 2018 13:54

fix a typo.

610dcb3

invalidating Socket in socketToHandle.

7e6eea8

pre-calculating the show string of Socket.

40801af

defining invalidSocket.

d536bdd

vdukhovni reviewed Feb 5, 2018

View reviewed changes

kazu-yamamoto added 2 commits February 6, 2018 10:36

a proper error message for socketToHandle.

a529f96

keeping CInt instead of String in Socket.

d555e65

eborden approved these changes Feb 6, 2018

View reviewed changes

kazu-yamamoto added a commit to kazu-yamamoto/network that referenced this pull request Feb 6, 2018

Merge PR haskell#303.

b9f3463

kazu-yamamoto merged commit d555e65 into haskell:master Feb 6, 2018

kazu-yamamoto deleted the socket-ioref branch February 6, 2018 03:02

kazu-yamamoto mentioned this pull request Feb 7, 2018

GCing Sockets is a bad diea #302

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IORef based socket #303

IORef based socket #303

kazu-yamamoto commented Feb 2, 2018

kazu-yamamoto commented Feb 2, 2018

vdukhovni Feb 2, 2018

vdukhovni Feb 2, 2018

kazu-yamamoto Feb 5, 2018

eborden left a comment

eborden Feb 2, 2018

kazu-yamamoto Feb 5, 2018

eborden Feb 2, 2018

vdukhovni Feb 2, 2018

vdukhovni Feb 2, 2018

vdukhovni Feb 2, 2018

vdukhovni Feb 2, 2018

kazu-yamamoto Feb 5, 2018

vdukhovni commented Feb 2, 2018

vdukhovni Feb 5, 2018

kazu-yamamoto Feb 5, 2018

vdukhovni Feb 5, 2018

kazu-yamamoto Feb 6, 2018

vdukhovni Feb 5, 2018

vdukhovni Feb 5, 2018

kazu-yamamoto Feb 5, 2018 •

edited

Loading

eborden Feb 5, 2018

vdukhovni Feb 5, 2018 •

edited

Loading

kazu-yamamoto Feb 6, 2018

vdukhovni Feb 6, 2018

vdukhovni commented Feb 5, 2018

kazu-yamamoto commented Feb 5, 2018

kazu-yamamoto commented Feb 5, 2018

vdukhovni commented Feb 5, 2018 •

edited

Loading

kazu-yamamoto commented Feb 6, 2018

eborden left a comment

kazu-yamamoto commented Feb 6, 2018

vdukhovni commented Feb 6, 2018

IORef based socket #303

IORef based socket #303

Conversation

kazu-yamamoto commented Feb 2, 2018

kazu-yamamoto commented Feb 2, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eborden left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented Feb 2, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kazu-yamamoto Feb 5, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni Feb 5, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vdukhovni commented Feb 5, 2018

kazu-yamamoto commented Feb 5, 2018

kazu-yamamoto commented Feb 5, 2018

vdukhovni commented Feb 5, 2018 • edited Loading

kazu-yamamoto commented Feb 6, 2018

eborden left a comment

Choose a reason for hiding this comment

kazu-yamamoto commented Feb 6, 2018

vdukhovni commented Feb 6, 2018

kazu-yamamoto Feb 5, 2018 •

edited

Loading

vdukhovni Feb 5, 2018 •

edited

Loading

vdukhovni commented Feb 5, 2018 •

edited

Loading