Task- and Span-based Socket.SendFile #45132

Closed
wants to merge 9 commits into from

gfoidl
Member

@gfoidl gfoidl commented Nov 23, 2020

Fixes #42591
Fixes #43846

For the Windows implementation, this adds a new internal type, FileSendSocketAsyncEventargs.
Because SendFileAsync on Unix consists of three "send-calls" (one for the prebuffer, one for the file, one for the postbuffer), at the moment I see no way to unify the use of FileSendSocketAsyncEventargs for both flavors.

@Dotnet-GitSync-Bot
Collaborator

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@gfoidl
Member Author

gfoidl commented Nov 23, 2020

@stephentoub PTAL -- I've never been so far down in the internals of networking IO, so I'm not sure if this approach is OK. This PR is created to get some initial feedback (and hopefully hints for the correct route).

@ghost

ghost commented Nov 23, 2020

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Author: gfoidl
Assignees: -
Labels:

area-System.Net.Sockets, new-api-needs-documentation

Milestone: -

@@ -443,6 +446,7 @@ public enum SocketAsyncOperation
Send = 7,
SendPackets = 8,
SendTo = 9,
SendFile = 10,
Member Author

Is this needed or shall Send be used for SendFileAsync?

Member

Since we are not exposing the SocketAsyncEventArgs-based overload publicly there is no point extending the public API with this option, so let's remove it I guess. (Adding a comment about the workaround when SocketAsyncOperation.Send is being assigned.) @geoffkizer agreed?

Contributor

SocketAsyncEventArgs already has SendPackets, which is effectively a more general version of SendFile. So we shouldn't be adding SendFile to this enum.

I suppose it does raise the question of whether we should have a Task-based SendPacketsAsync method, though...

}

// Send the file, if any
if (fileStream != null)
{
var tcs = new TaskCompletionSource<SocketError>();
errorCode = SocketPal.SendFileAsync(_handle, fileStream, (_, socketError) => tcs.SetResult(socketError));
AsyncValueTaskMethodBuilder<SocketError> taskBuilder = AsyncValueTaskMethodBuilder<SocketError>.Create();
Member Author

Instead of the TaskCompletionSource, this one is used, and the callback (below) is updated to take a state, so no closures need to be allocated.
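To illustrate the closure point in isolation, here is a minimal sketch. It is not the PR's code: `StartOperation`, `ResultBox`, and the value 42 are hypothetical stand-ins for a PAL-style API that reports completion through a callback.

```csharp
using System;

// Hypothetical stand-in for an API that reports completion via a callback.
static class CallbackApiSketch
{
    // Closure-friendly shape: the callback only receives the result.
    public static void StartOperation(Action<int> callback) => callback(42);

    // State-passing shape: the caller hands over its state explicitly.
    public static void StartOperation<TState>(TState state, Action<TState, int> callback) => callback(state, 42);
}

sealed class ResultBox { public int Value; }

static class ClosureVsStateSketch
{
    public static int WithClosure()
    {
        var box = new ResultBox();
        // The lambda captures 'box', so a closure object and a delegate are allocated per call.
        CallbackApiSketch.StartOperation(result => box.Value = result);
        return box.Value;
    }

    public static int WithState()
    {
        var box = new ResultBox();
        // A static lambda (C# 9) cannot capture anything; the state travels as an argument
        // and the delegate instance can be cached by the compiler.
        CallbackApiSketch.StartOperation(box, static (state, result) => state.Value = result);
        return box.Value;
    }
}
```

Both variants produce the same result; only the allocation behavior differs.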

}

private void EndSendFileInternal(IAsyncResult asyncResult)
{
TaskToApm.End(asyncResult);
}

internal sealed class FileSendSocketAsyncEventargs : SocketAsyncEventArgs, IValueTaskSource
Member Author

Unfortunately this one is needed here too, to allow

Interlocked.Exchange(ref _fileSendEventArgs, null)?.Dispose();

@scalablecory
Contributor

As on unix SendFileAsync consists of three "send-calls" (one for prebuffer, one for the file, one for the postbuffer) at the moment I see no way to unify the use of FileSendSocketAsyncEventargs for both flavors.

Doesn't need to be in this PR, but consider setting the TCP_CORK option before sending, and removing it afterward (assuming it wasn't already set). This will make sending pre-/post- buffers more efficient.
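A rough sketch of what corking around the three sends could look like from managed code, assuming Linux. The constants `IPPROTO_TCP = 6` and `TCP_CORK = 3` are the Linux header values, and `SendCorked` and `EncodeFlag` are hypothetical helpers, not part of this PR.

```csharp
using System;
using System.Net.Sockets;

static class TcpCorkSketch
{
    // Linux-only values from netinet/in.h and netinet/tcp.h (assumption: running on Linux).
    private const int IPPROTO_TCP = 6;
    private const int TCP_CORK = 3;

    // Encodes a boolean option as the native int the kernel expects.
    public static byte[] EncodeFlag(bool enabled) => BitConverter.GetBytes(enabled ? 1 : 0);

    public static void SendCorked(Socket socket, byte[] preBuffer, string fileName, byte[] postBuffer)
    {
        // Remember whether the caller had already corked the socket.
        Span<byte> current = stackalloc byte[sizeof(int)];
        socket.GetRawSocketOption(IPPROTO_TCP, TCP_CORK, current);
        bool alreadyCorked = BitConverter.ToInt32(current) != 0;

        if (!alreadyCorked) socket.SetRawSocketOption(IPPROTO_TCP, TCP_CORK, EncodeFlag(true));
        try
        {
            socket.Send(preBuffer);
            socket.SendFile(fileName);
            socket.Send(postBuffer);
        }
        finally
        {
            // Removing the cork lets the kernel flush any partial frame that is still queued.
            if (!alreadyCorked) socket.SetRawSocketOption(IPPROTO_TCP, TCP_CORK, EncodeFlag(false));
        }
    }
}
```

This sketch does nothing about concurrent sends on the same socket; another send issued while the cork is set would be held back as well.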

@gfoidl
Member Author

gfoidl commented Nov 24, 2020

to unify the use of FileSendSocketAsyncEventargs for both flavors

Windows has the (nice) TransmitFile API, which Linux doesn't have. One idea: the Linux PAL could provide a similar API (which in turn does the three "send-calls"), so the managed side can be unified around FileSendSocketAsyncEventargs.
(This should be done in a separate PR, if it is done at all.)

TCP_CORK

Good point. I'll have a look into this (later today).

@@ -2012,16 +2024,17 @@ public SocketError SendFileAsync(SafeFileHandle fileHandle, long offset, long co
return errorCode;
}

-var operation = new SendFileOperation(this)
+var operation = new SendFileOperation<TState>(this)
Member Author

On Unix we don't cache this operation.
Other operations are cached, and the Windows counterpart is cached too.

Should it be cached here as well, or is re-creating it each time fine?

@antonfirsov
Member

consider setting the TCP_CORK option before sending, and removing it afterward (assuming it wasn't already set)

@scalablecory how should this work if there are multiple overlapping SendFile-s? Don't we need some sort of semaphore logic to count the number of outstanding sends? What if TCP_CORK gets enabled in the middle of a send burst?

@@ -356,6 +359,31 @@ internal Task<int> SendToAsync(ArraySegment<byte> buffer, SocketFlags socketFlag
return tcs.Task;
}

public ValueTask SendFileAsync(string? fileName, CancellationToken cancellationToken = default)
Member Author

Doc comments are in gfoidl@feade37

Will merge them once the rest of the PR is OK to avoid unnecessary CI builds.

@scalablecory
Contributor

consider setting the TCP_CORK option before sending, and removing it afterward (assuming it wasn't already set)

@scalablecory how should this work if there are multiple overlapping SendFile-s? Don't we need some sort of semaphore logic to count the number of outstanding sends? What if TCP_CORK gets enabled in the middle of a send burst?

Are overlapping SendFile valid? Do you mean queued?

I think sending pre- and post- buffer as separate I/Os has the same issue, right? You'd just have 5 ops to do as one serialized package rather than 3.

@@ -28,6 +28,9 @@ public partial class Socket
/// <summary>Cached instance for send operations that return <see cref="Task{Int32}"/>.</summary>
private TaskSocketAsyncEventArgs<int>? _multiBufferSendEventArgs;

/// <summary>Cached instance for file send operations that return <see cref="ValueTask"/>.</summary>
private FileSendSocketAsyncEventargs? _fileSendEventArgs;
Contributor

Can we just use one of the existing send instances (above) here?

Member Author

In the current state of the PR, no: FileSendSocketAsyncEventargs is specific to Windows, and the potential parent classes are sealed. Unsealing them might change the perf characteristics; or would that be OK?

But I'll try something to unify this: adding the fields to SocketAsyncEventArgs and using AwaitableSocketAsyncEventArgs, so that one field could be reused.

[InlineData(false, true)]
[InlineData(true, false)]
[InlineData(true, true)]
public async Task SendFileAsync_NoFile_Succeeds(bool usePreBuffer, bool usePostBuffer)
Contributor

A lot of other socket tests (e.g. SendReceive, connect, etc) are generalized for different "modes" (sync, various async, span vs no span, etc) by deriving from SocketTestHelperBase and then instantiating appropriate versions of the class. We don't do this currently with SendFile, but since we are adding all these overloads, it probably makes sense to. That means adding SendFile to SocketHelperBase and providing relevant implementations on the derived classes, e.g. SocketHelperApm etc.

Member

I think rebasing tests on SocketTestHelperBase is a must-have with this change.

Looks like most of the tests can be refactored to the base class (SendFile<T> : SocketTestHelperBase<T> where T : SocketHelperBase), the ones that are specific to an (APM/Sync/Task) implementation can be moved to the subclass.

Member Author

Makes sense, but can we do this in a different PR? Quite a bit of rework is needed, because

public class SendFileTest : FileCleanupTestBase
already inherits from FileCleanupTestBase (I guess this may be why it hasn't been done so far).

}
}
}

if (errorCode != SocketError.Success)
{
UpdateSendSocketErrorForDisposed(ref errorCode);
UpdateStatusAfterSocketErrorAndThrowException(errorCode);

if (errorCode == SocketError.OperationAborted)
Contributor

Instead of checking for OperationAborted, I think we should just check if the cancellationToken is cancelled and throw if it is (i.e. call cancellationToken.ThrowIfCancellationRequested)
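As a minimal sketch of that suggestion (not the PR's code), the helper below consults the token first and only then falls back to a SocketException. `ThrowIfFailed` is a hypothetical name, and whether the socket's connected state should also be updated is left out on purpose.

```csharp
using System.Net.Sockets;
using System.Threading;

static class CancellationSketch
{
    // Hypothetical helper: turns the error code of a finished send into an exception.
    public static void ThrowIfFailed(SocketError errorCode, CancellationToken cancellationToken)
    {
        if (errorCode == SocketError.Success) return;

        // Treat the token as the source of truth for cancellation rather than
        // inferring it from SocketError.OperationAborted.
        cancellationToken.ThrowIfCancellationRequested();

        throw new SocketException((int)errorCode);
    }
}
```

With this shape, a failure that coincides with a cancelled token surfaces as OperationCanceledException, and any other failure surfaces as SocketException.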

Contributor

Though we may want to call UpdateStatusAfterSocketError too, not sure.

Member Author

we may want to call UpdateStatusAfterSocketError too, not sure.

I'm not sure either, but I think we won't.

internal void UpdateStatusAfterSocketError(SocketError errorCode)
{
// If we already know the socket is disconnected
// we don't need to do anything else.
if (NetEventSource.Log.IsEnabled()) NetEventSource.Error(this, $"errorCode:{errorCode}");
if (_isConnected && (_handle.IsInvalid || (errorCode != SocketError.WouldBlock &&
errorCode != SocketError.IOPending && errorCode != SocketError.NoBufferSpaceAvailable &&
errorCode != SocketError.TimedOut)))
{
// The socket is no longer a valid socket.
if (NetEventSource.Log.IsEnabled()) NetEventSource.Info(this, "Invalidating socket.");
SetToDisconnected();
}
}
would disconnect the socket, and cancellation is no reason to disconnect, is it?

@geoffkizer
Contributor

If I'm understanding this correctly, you've taken two different approaches here for Windows and Unix. For Unix, you updated the existing async SendFile path to support Tasks directly. For Windows, you added a new SendFile operation to SocketAsyncEventArgs.

Both approaches have their merits, but we should use the same approach for all platforms.

This is all complicated by the fact that SocketAsyncEventArgs doesn't actually support a SendFile operation today. It does, however, support SendPackets, which can do everything SendFile does. On Windows this ends up calling TransmitPackets instead of TransmitFile, but I don't think we care about that distinction.

So I would suggest taking one of the following approaches on both Windows and Linux:
(1) Just update the existing APM-based async paths to be Task-based (as you've done currently on Unix).
(2) Implement on top of SocketAsyncEventArgs (like you've done on Windows), but use SendPackets in the implementation.

I think (2) is better long-term, as it would leverage SAEA caching and remove some existing code; but (1) is simpler and addresses the immediate issue.

@geoffkizer
Contributor

BTW, it might make sense to separate out the new sync overloads into a separate PR, just to keep each change more manageable; up to you.

@geoffkizer
Contributor

I think sending pre- and post- buffer as separate I/Os has the same issue, right? You'd just have 5 ops to do as one serialized package rather than 3.

This is a good point -- I think we have an issue with the Unix impl of SendFile today, since there's no serialization here. If you do a concurrent Send, then that send could slip in between one of the parts of the SendFile and you won't get the expected results.

Does that sound right?

That said, we don't need to fix this as part of this PR.
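For illustration only, one way such serialization could look is an async lock around the multi-part send. This is a sketch under assumptions: `SerializedSender` and the injected `send` delegate are hypothetical and stand in for the real socket calls.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that keeps the parts of one logical send together.
sealed class SerializedSender
{
    private readonly SemaphoreSlim _sendLock = new SemaphoreSlim(1, 1);
    private readonly Func<ReadOnlyMemory<byte>, Task> _send; // stands in for the real socket send

    public SerializedSender(Func<ReadOnlyMemory<byte>, Task> send) => _send = send;

    // Sends all parts back to back; a concurrent caller waits until the whole group is done.
    public async Task SendPartsAsync(params ReadOnlyMemory<byte>[] parts)
    {
        await _sendLock.WaitAsync();
        try
        {
            foreach (ReadOnlyMemory<byte> part in parts)
            {
                await _send(part);
            }
        }
        finally
        {
            _sendLock.Release();
        }
    }
}
```

Plain Send calls would have to go through the same lock for the guarantee to hold, which is part of why this belongs in a separate change.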

@gfoidl
Member Author

gfoidl commented Nov 25, 2020

we have an issue with the Unix impl of SendFile today, since there's no serialization here. If you do a concurrent Send, then that send could slip in between one of the parts of the SendFile and you won't get the expected results.

Does that sound right?

Yep, that's right. The current implementation doesn't have any serialization, nor does this PR.
As you said, this should be addressed separately. I didn't find any (related) issue for this. Create one?

This is all complicated by the fact that SocketAsyncEventArgs doesn't actually support a SendFile operation today. It does, however, support SendPackets, which can do everything SendFile does

With SendPackets, I don't see how to handle preBuffer and postBuffer.
On Windows, TransmitFile is a handy API that supports them directly.

If the Unix PAL provided an API similar to TransmitFile on Windows, we could unify the implementation quite easily.
I'll try to move the differences down into the PAL, so the "entry" can be unified with SAEA... hopefully that won't make things even more complicated.

(1) Just update the existing APM-based async paths to be Task-based (as you've done currently on Unix).

For Unix there was already SendFileInternalAsync, which got updated for the new APIs.
On Windows there was no matching counterpart, so FileSendSocketAsyncEventargs was introduced (it's specific to Windows). BeginSendFile etc. don't support cancellation either (or I don't know how to plumb it in).

the new sync overloads into a separate PR, just to keep each change more manageable; up to you.

This is basically just adding span overloads and forwarding the array-based ones to them. That's why it is in the same PR (it's the easy part), so I'd like to keep them here.

@gfoidl
Member Author

gfoidl commented Nov 25, 2020

With SendPackets, I don't see how to handle preBuffer and postBuffer.

One possibility is to use multiple SendPacketsElement instances (one for the preBuffer, one for the file, and one for the postBuffer). But that's three allocations, plus one for the _sendPacketsElements array (and another for the internal clone and for pinning), all of which can be avoided by using TransmitFile and memories.
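A sketch of that SendPackets-based shape, using only the public SocketAsyncEventArgs API. `BuildElements` and `SendFileViaPacketsAsync` are hypothetical names; the sketch allocates a fresh SocketAsyncEventArgs per call and has no cancellation support, so it shows the allocations being discussed rather than a final design.

```csharp
using System.Collections.Generic;
using System.Net.Sockets;
using System.Threading.Tasks;

static class SendPacketsSketch
{
    // Builds up to three elements: preBuffer, file, postBuffer (null or empty parts are skipped).
    public static SendPacketsElement[] BuildElements(byte[] preBuffer, string fileName, byte[] postBuffer)
    {
        var elements = new List<SendPacketsElement>(3);
        if (preBuffer != null && preBuffer.Length > 0) elements.Add(new SendPacketsElement(preBuffer));
        if (fileName != null) elements.Add(new SendPacketsElement(fileName));
        if (postBuffer != null && postBuffer.Length > 0) elements.Add(new SendPacketsElement(postBuffer));
        return elements.ToArray();
    }

    // Hypothetical Task-returning wrapper over Socket.SendPacketsAsync.
    public static Task SendFileViaPacketsAsync(Socket socket, byte[] preBuffer, string fileName, byte[] postBuffer)
    {
        var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        var args = new SocketAsyncEventArgs { SendPacketsElements = BuildElements(preBuffer, fileName, postBuffer) };
        args.Completed += (_, e) => Complete(e, tcs);

        // A false return value means the operation completed synchronously
        // and the Completed event will not be raised.
        if (!socket.SendPacketsAsync(args)) Complete(args, tcs);
        return tcs.Task;
    }

    private static void Complete(SocketAsyncEventArgs e, TaskCompletionSource<bool> tcs)
    {
        if (e.SocketError == SocketError.Success) tcs.SetResult(true);
        else tcs.SetException(new SocketException((int)e.SocketError));
        e.Dispose();
    }
}
```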

BTW: on Unix there's the same serialization issue.

@scalablecory
Contributor

This is a good point -- I think we have an issue with the Unix impl of SendFile today, since there's no serialization here. If you do a concurrent Send, then that send could slip in between one of the parts of the SendFile and you won't get the expected results.

Does that sound right?

I don't think sync needs to be serialized, but async at least should be.

@geoffkizer
Contributor

I don't think sync needs to be serialized, but async at least should be.

Why not sync?

@geoffkizer
Contributor

As you said, this should be addressed separately. I didn't find any (related) issue for this. Create one?

Yeah, that would be great, thanks.

@geoffkizer
Contributor

One possibility is to use multiple SendPacketsElement instances (one for the preBuffer, one for the file, and one for the postBuffer). But that's three allocations, plus one for the _sendPacketsElements array (and another for the internal clone and for pinning), all of which can be avoided by using TransmitFile and memories.

Yes, that's what I was suggesting. I'm not sure the allocations matter much, but if we had concerns about this, we could cache them. I'd rather see us using the existing SendPackets code than creating a new SendFile operation.

@geoffkizer
Contributor

We need to add Memory support on SendPackets as well: #45267

@gfoidl
Member Author

gfoidl commented Nov 27, 2020

Raised issue for the serialization problem --> #45274

(will comment on other points later...)

@gfoidl
Member Author

gfoidl commented Nov 30, 2020

Re: SendPackets

If we decide to go this route, #45267 should be done first in order to use the memory-support.

Just had a quick look at how SendFileAsync could be implemented via SendPackets. It's relatively straightforward for Windows and *nix.
But I don't like the (allocation) overhead that the SendPackets method has (and that can't be avoided in an easy way), whereas the specific sendfile operation doesn't introduce these allocations.
So it could be the other way around: if SendPackets "pattern matches" to preBuffer, file, and postBuffer, then use SendFileAsync.

I don't have any feeling how much these APIs will be used and how much concern we should put into avoiding allocations. So maybe it's best to just go with the simplest approach, and improve if there's (real) demand for it?

@gfoidl
Member Author

gfoidl commented Jan 14, 2021

update: based on recent discussion this PR should be finished on top of #46975

@antonfirsov
Member

antonfirsov commented Jan 14, 2021

I'm not sure the allocations matter much, but if we had concerns about this, we could cache them. I'd rather see us using the existing SendPackets code than creating a new SendFile operation.

@geoffkizer I think we should be careful here. Many sources and discussions refer to sendfile / TransmitFile as a go-to operation for high-performance scenarios. Since it's more general, winsock TransmitPackets might introduce some perf penalty, if used instead of TransmitFile. This, together with extra allocations might be unacceptable in high-perf use-cases.

The question is if we want to be fast by default, or deliver a functionally complete solution first, and investigate/improve performance later. Is there any value in a Task-based SendFileAsync if it's not the zero-overhead API users expect it to be?

@geoffkizer
Contributor

Since it's more general, winsock TransmitPackets might introduce some perf penalty, if used instead of TransmitFile.

I think this is quite unlikely -- unless we have some additional info, I think we should assume TransmitPackets is as efficient as TransmitFile.

The extra allocations for using the SendPackets operation could be a concern; I don't have a strong opinion here. If it's easy to avoid this extra allocation, that would be good. If it's a real pain, I think we could probably live with it as is.

@gfoidl
Member Author

gfoidl commented Jan 20, 2021

To move forward my plan looks like:

  1. separate the fix for Add Span overloads for Socket.SendFile #43846 (span overloads) into its own PR -- the change is less related than I first thought
  2. close this PR
  3. create a PR for SendFileAsync based on TransmitPackets -- so the functionality is delivered
  4. try a PR for SendFileAsync based on TransmitFile -- to avoid the allocations needed for TransmitPackets

Does this seem reasonable?

@antonfirsov
Member

@gfoidl this plan sounds perfect to me, it seems to deal with all the concerns raised here, please go ahead!

@gfoidl
Member Author

gfoidl commented Jan 20, 2021

@gfoidl gfoidl closed this Jan 20, 2021
@gfoidl
Member Author

gfoidl commented Jan 20, 2021

Just started working on the TransmitPackets based version, and realized that there's no support for cancellation.
So the need to plumb cancellation in, the extra allocations, and the fact that on Unix there's already SendFileInternalAsync that can be used make the TransmitPackets version less attractive.

I need a bit more thought on this, but ATM it seems that TransmitFile looks better for windows.

@geoffkizer
Contributor

As I said above,

So I would suggest taking one of the following approaches on both Windows and Linux:
(1) Just update the existing APM-based async paths to be Task-based (as you've done currently on Unix).
(2) Implement on top of SocketAsyncEventArgs (like you've done on Windows), but use SendPackets in the implementation.

I'm fine to do (1) for now.

Longer-term, we want to achieve the following goals:
(A) Ensure we have a complete set of Socket APIs across the board (i.e., everything supports Task, Memory, Span, cancellation where appropriate)
(B) Ensure we have first-class implementations of all the Task-based APIs (i.e. not wrapped around old IAsyncResult implementations, as these add unnecessary overhead)
(C) Reimplement the old IAsyncResult APIs by wrapping the Task-based APIs. This allows us to get rid of a ton of code and some ugly dependencies we currently have on the old IAsyncResult infrastructure. Socket is one of the few remaining places in .NET Core that we haven't completed this.
(D) Implement the Task-based APIs using SAEA under the covers. This allows us to cache SAEA instances to avoid unnecessary allocations, and also reduces redundant code paths.

If we want to punt on (D) for now because it's easier to just implement SendFileAsync directly, that's fine; but eventually we will want to meet all the goals above for SendFileAsync, so let's ensure whatever we do is working towards them, or at least not working away from them.
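To make goal (C) concrete, here is a minimal sketch of Begin/End methods layered over a Task. It is an illustration under assumptions, not the runtime's internal TaskToApm helper: `ApmOverTaskSketch` is a hypothetical name, and it relies only on the fact that Task implements IAsyncResult.

```csharp
using System;
using System.Threading.Tasks;

static class ApmOverTaskSketch
{
    // Wraps an already-started Task so it can be handed out as an IAsyncResult.
    public static IAsyncResult Begin(Task task, AsyncCallback callback, object state)
    {
        // The TaskCompletionSource carries the caller's state object as AsyncState.
        var tcs = new TaskCompletionSource<bool>(state);
        task.ContinueWith(t =>
        {
            if (t.IsFaulted) tcs.TrySetException(t.Exception.InnerExceptions);
            else if (t.IsCanceled) tcs.TrySetCanceled();
            else tcs.TrySetResult(true);

            // Invoke the APM callback only after the result is observable.
            callback?.Invoke(tcs.Task);
        }, TaskScheduler.Default);
        return tcs.Task;
    }

    // Blocks until completion and rethrows the original exception, if any.
    public static void End(IAsyncResult asyncResult) => ((Task)asyncResult).GetAwaiter().GetResult();
}
```

A BeginSendFile/EndSendFile pair could then be thin forwards to a Task-based SendFileAsync, which is the direction of goal (C).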

@karelz karelz added this to the 6.0.0 milestone Jan 26, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Feb 25, 2021
@gfoidl gfoidl deleted the socket-sendfileasync branch May 3, 2021 17:26
Development

Successfully merging this pull request may close these issues.

Add Span overloads for Socket.SendFile
Add Task-based async API for Socket.SendFile
6 participants