-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IORef based socket #303
IORef based socket #303
Conversation
Cc: @vdukhovni |
Network/Socket/Handle.hs
Outdated
@@ -17,7 +18,7 @@ import Network.Socket.Types | |||
|
|||
socketToHandle :: Socket -> IOMode -> IO Handle | |||
socketToHandle s mode = do | |||
let fd = fromIntegral $ fdSocket s | |||
fd <- fromIntegral <$> fdSocket s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function should invalidate the socket's file descriptor making shoud avoid
behaviour impossible. Basically, emulate close
, but without closing the socket's file descriptor. Or if you prefer (where available) dup(2)
the socket's file descriptor, and close the socket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, replace that should
with a MUST. Given that Sockets (in this branch of the code) have finalizers that automatically close them, and doing so would then break the Handle, and trigger a second close outside the IORef guard in sockets, this is NOT optional. You must either invalidate the socket, or at least dup
the file descriptor and pass that to the handle, but invalidation is I think more appropriate here, since continued I/O via the socket is not supported or expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
And done in 7e6eea8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many operations in the library that now let you operate on a file descriptor of -1
. Do we want to intercept the possible error here and display an error about the socket being closed?
Something like:
fdSocketOrThrow :: Socket -> IO CInt
fdSocketOrThrow s = do
fd <- fdSocket s
if fd == (-1)
then throw ioError "yada yada attempting to operate on a closed socket"
else pure fd
Network/Socket/Types.hsc
Outdated
unsafeFdSocket = unsafePerformIO . fdSocket | ||
{-# NOINLINE unsafeFdSocket #-} | ||
|
||
-- | Creating a socket form a file descriptor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo form
to from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Network/Socket/Types.hsc
Outdated
close (Socket ref) = do | ||
oldfd <- atomicModifyIORef' ref $ \cur -> (-1, cur) | ||
when (oldfd /= -1) $ | ||
closeFdWith (void . c_close . fromIntegral) (fromIntegral oldfd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to throw an error when attempting to close an already closed socket? I can see the argument for both ends of this.
- The user might want to know when they are closing a socket that has already been closed. This is available in the underlying api:
close() returns zero on success. On error, -1 is returned, and errno is set appropriately.
- Running
close
on an already closed socket is a safe enough operation. Throwing an exception seems overly explosive.
Maybe we need close'
, which returns a value indicating whether the socket has been closed or not. The possibilities are binary, but a Bool
doesn't really carry enough context to be useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this close
is to be safe for finalization, it MUST NOT throw exceptions. So if exceptions are to be thrown, there would need to be a second (need not be exported) close'
for finalization.
Network/Socket/Types.hsc
Outdated
newtype Socket = Socket (IORef CInt) | ||
|
||
instance Show Socket where | ||
show s = "<socket " ++ show (unsafeFdSocket s) ++ ">" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that unsafeFdSocket s
is a good idea here. Other than distinguishing different sockets, the actual value of fd
is not actually that interesting.
To me, it would make more sense to show the Socket type and address family, and ideally if bound the local address, if connected remote address... But some of those properties mutate over time and require IO to query. Perhaps, to avoid the I/O barrier the "show" string could be computed once when the socket is created? Sockets are expensive enough to create that carrying an extra string should not be burdensome. The string could be a ByteString if that's more efficient to store.
<socket STREAM AF_INET _fd_>
<socket DGRAM AF_INET6 _fd_>
<socket STREAM UNIX _fd_>
...
Are there backwards compatibility requirements to retain the legacy Show format? Note, I'm not sure whether my alternative suggestion is wise.
Perhaps a minimal Show
implementation is fine, and a more complete socket description could be provided as an IO action, that would also display the local address, the remote address (if connected), ...
displaySocket :: Socket -> IO String
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps even a ShortByteString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way, I think it is best to avoid unsafePerformIO
in the Show
implementation. Precomputing the string also means that we can display it even for closed sockets (where it
is the description of the socket when it was opened)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precomputing is done in 40801af
I ensured that show string is exactly the same as the previous one.
Is |
Network/Socket/Handle.hs
Outdated
hSetBuffering h NoBuffering | ||
return h | ||
where | ||
err _ = ioError $ userError $ "socketToHandle: already a Handle" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more accurate error message might be "socket is no longer valid", since it might also have been closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. But I kept the old error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously the mStat
MVar distinguished between sockets that got converted to handles and other states (e.g. already closed). We no longer do that, so perhaps the message could be more general to avoid misleading error messages that don't match the symptoms... Not critical, but something to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Probably, I was too nervous. I changed it to "socket is no longer valid".
Network/Socket/Types.hsc
Outdated
invalidateSocket (Socket ref) errorAction normalAction = do | ||
oldfd <- atomicModifyIORef' ref $ \cur -> (-1, cur) | ||
if oldfd == -1 then errorAction oldfd else normalAction oldfd | ||
|
||
----------------------------------------------------------------------------- | ||
|
||
-- | Close the socket. Sending data to or receiving data from closed socket | ||
-- may lead to undefined behaviour. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it turns out, we no longer have that problem. When a socket is closed, any later I/O operations will encounter an invalid fd
and raise an error (bad file descriptor or similar). The only issue is that in multi-thread programs if one thread is doing I/O and another is closing the socket, then the thread that got the fd
just before it was closed, might be using a stale, still valid value.
This is in a semantic improvement. We now have safety for sequential operations. And only risk undefined behaviour in the unlikely case that a thread is closing a socket while another thread is using it.
Network/Socket/Types.hsc
Outdated
@@ -74,27 +73,24 @@ import Network.Socket.Imports | |||
----------------------------------------------------------------------------- | |||
|
|||
-- | Basic type for a socket. | |||
newtype Socket = Socket (IORef CInt) | |||
data Socket = Socket (IORef CInt) String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you decided to go with String
rather than ShortByteString
or similar. Perhaps slightly more expensive, but easily changed later, without compatibility issues. Are there any module dependency barriers here (circular dependencies or dependencies we want to avoid) to changing this later? Am I wrong about String being a "more expensive" representation to carry along with the socket?
Of course this is unlikely to matter much...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GHC 7.6 uses bytestring-0.10.0.2 which does not provide ShortByteString
unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make more sense to carry around an ascii encoded ByteString
and pack
it whenever it needs to be shown? Utilizing String
here means we are introducing a fair amount of indirection and bloat for a Show
instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my instinct also. (Though of course ByteString show is unpack
, but that's just detail).
Perhaps the approach should instead be to store the minimal information needed to avoid having to use unsafePerformIO
later. To that end, just the lazy byte string form of the fd is enough.
We could even store an originalFd
as a Word32
value, whose different type should be enough to discourage misuse (bypassing the IORef).
data Socket = Socket (IORef CInt) Word32
...
mkSocket :: CInt -> IO Socket
mkSocket fd = do
ref <- newIORef fd
let s = Socket ref $ fromIntegral fd
void $ mkWeakIORef ref $ close s
return s
and then in the show
instance use that:
show (Socket _ wfd) = "<socket: " ++ show wfd ++ ">"
There's probably no need to pre-format the whole string (was my suggestion, sorry). We could keep in mind that Haskell is lazy, and so the (current) string would I think remain an unevaluated thunk until actually used. Not sure whether the thunk is cheaper to keep around than the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to data Socket = Socket (IORef CInt) CInt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. The field is not exported, and the module in which it is visible is small. So the immutable original fd
is unlikely to get misused.
I no longer see any major issues, looks OK to me. At this point any wiser feedback should ideally come from folks deeply versed in Haskell issues, rather than my system programming background. If anyone else is following along, what do you think of adding the finalizer? Is it a good idea? Or just unnecessary overhead? Either way a mutable |
Since 2.7.0.0 deprecates many APIs, we need to wait for a long time to release 3.0.0.0. Maybe a half or full year. |
I don't have a strong opinion. But when (-1) returned by |
It is probably OK to let the OS return EBADF for now. If someone finds a compelling reason to check at the Haskell layer, this can be done later. The main win here is that we no longer try to use a previously closed socket's fd. That may even uncover the culprit in the open issue in: snoyberg/http-client#252 Since perhaps now attempts to use such sockets might be caught a bit earlier with a less confusing context. Or at least the hypothesis that closed sockets are being re-used might be confirmed. |
@eborden All issues have been fixed, I believe. Please review this again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@eborden @vdukhovni Thank you for your comments and review. Merged. |
Thanks. You're welcome. |
This implements the idea described in #302.