Auto merge of #75272 - the8472:spec-copy, r=KodrAus
specialize io::copy to use copy_file_range, splice or sendfile

Fixes #74426.
Also covers #60689 but only as an optimization instead of an official API.

The specialization only covers std-owned structs so it should avoid the problems with #71091

Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.

There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.

The test case executes the following:

```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4) = 4
[pid 103776] write(4, "iklmn", 5) = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5
```

- 0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted
- 𝖬 `write` to drain the `BufReader`/`BufWriter` wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes to `BufWriter`.
- 𝖭 `copy_file_range`/`splice`/`sendfile` until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.

## Benchmarks

```
OLD

test io::tests::bench_file_to_file_copy ... bench: 21,002 ns/iter (+/- 750) = 6240 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 35,704 ns/iter (+/- 1,108) = 3671 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 142,640 ns/iter (+/- 77,851) = 918 MB/s

NEW

test io::tests::bench_file_to_file_copy ... bench: 14,745 ns/iter (+/- 519) = 8889 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 6,128 ns/iter (+/- 227) = 21389 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
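
From the caller's side nothing changes: the wrappers named in the commit message (`BufReader`, `BufWriter`, `Take`) are simply passed to `io::copy`. The following is a minimal sketch of such a call, not code from this commit. The helper name `copy_prefix` and the temp-file names are hypothetical, and whether the kernel fast path is actually taken depends on the platform and std version. The observable result should be the same either way.

```rust
use std::fs::{self, File};
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Hypothetical helper: copies at most `limit` bytes from `src` to `dst`
/// through buffered wrappers, the shape of call the commit message describes.
fn copy_prefix(src: &Path, dst: &Path, limit: u64) -> io::Result<u64> {
    let reader = BufReader::new(File::open(src)?);
    let mut writer = BufWriter::new(File::create(dst)?);
    // `Take` caps the number of bytes `io::copy` may transfer.
    let mut limited = reader.take(limit);
    let copied = io::copy(&mut limited, &mut writer)?;
    // Flush explicitly so that write errors are not lost when `writer` drops.
    writer.flush()?;
    Ok(copied)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch files in the system temp directory.
    let dir = std::env::temp_dir();
    let src = dir.join(format!("io_copy_sketch_src_{}", std::process::id()));
    let dst = dir.join(format!("io_copy_sketch_dst_{}", std::process::id()));

    fs::write(&src, b"abcdefghijklmnopq")?; // 17 bytes of input
    let copied = copy_prefix(&src, &dst, 10)?;

    assert_eq!(copied, 10);
    assert_eq!(fs::read(&dst)?, b"abcdefghij");

    fs::remove_file(&src)?;
    fs::remove_file(&dst)?;
    Ok(())
}
```

Running a program like this under `strace` on Linux is one way to check which syscalls a given toolchain actually issues.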
Showing 11 changed files with 930 additions and 152 deletions.
@@ -0,0 +1,88 @@

```rust
use crate::io::{self, ErrorKind, Read, Write};
use crate::mem::MaybeUninit;

/// Copies the entire contents of a reader into a writer.
///
/// This function will continuously read data from `reader` and then
/// write it into `writer` in a streaming fashion until `reader`
/// returns EOF.
///
/// On success, the total number of bytes that were copied from
/// `reader` to `writer` is returned.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with filesystem paths, see the [`fs::copy`] function.
///
/// [`fs::copy`]: crate::fs::copy
///
/// # Errors
///
/// This function will return an error immediately if any call to [`read`] or
/// [`write`] returns an error. All instances of [`ErrorKind::Interrupted`] are
/// handled by this function and the underlying operation is retried.
///
/// [`read`]: Read::read
/// [`write`]: Write::write
///
/// # Examples
///
/// ```
/// use std::io;
///
/// fn main() -> io::Result<()> {
///     let mut reader: &[u8] = b"hello";
///     let mut writer: Vec<u8> = vec![];
///
///     io::copy(&mut reader, &mut writer)?;
///
///     assert_eq!(&b"hello"[..], &writer[..]);
///     Ok(())
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read,
    W: Write,
{
    cfg_if::cfg_if! {
        if #[cfg(any(target_os = "linux", target_os = "android"))] {
            crate::sys::kernel_copy::copy_spec(reader, writer)
        } else {
            generic_copy(reader, writer)
        }
    }
}

/// The general read-write-loop implementation of
/// `io::copy` that is used when specializations are not available or not applicable.
pub(crate) fn generic_copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read,
    W: Write,
{
    let mut buf = MaybeUninit::<[u8; super::DEFAULT_BUF_SIZE]>::uninit();
    // FIXME: #42788
    //
    //   - This creates a (mut) reference to a slice of
    //     _uninitialized_ integers, which is **undefined behavior**
    //
    //   - Only the standard library gets to soundly "ignore" this,
    //     based on its privileged knowledge of unstable rustc
    //     internals;
    unsafe {
        reader.initializer().initialize(buf.assume_init_mut());
    }

    let mut written = 0;
    loop {
        let len = match reader.read(unsafe { buf.assume_init_mut() }) {
            Ok(0) => return Ok(written),
            Ok(len) => len,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(unsafe { &buf.assume_init_ref()[..len] })?;
        written += len as u64;
    }
}
```
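
The fallback read-write loop can be reproduced outside the standard library without the `unsafe` buffer handling. The sketch below is a safe approximation on stable Rust, not the code from this commit: it zero-initializes the buffer instead of using `MaybeUninit`. The names `copy_loop`, `BUF_SIZE` and `InterruptOnce` are hypothetical, and the 8 KiB buffer size is an assumption standing in for the internal `DEFAULT_BUF_SIZE`. The `InterruptOnce` reader exists only to exercise the `ErrorKind::Interrupted` retry branch.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Assumed stand-in for std's internal `DEFAULT_BUF_SIZE`.
const BUF_SIZE: usize = 8 * 1024;

/// Safe approximation of the generic loop: read a chunk, write it all, repeat.
fn copy_loop<R, W>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read + ?Sized,
    W: Write + ?Sized,
{
    // Zero-initialized buffer; costs a memset but needs no `unsafe`.
    let mut buf = [0u8; BUF_SIZE];
    let mut written = 0u64;
    loop {
        let len = match reader.read(&mut buf) {
            Ok(0) => return Ok(written), // EOF
            Ok(len) => len,
            // Interrupted reads are retried rather than reported.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..len])?;
        written += len as u64;
    }
}

/// Hypothetical reader that fails once with `Interrupted`, then delegates.
struct InterruptOnce<R> {
    inner: R,
    interrupted: bool,
}

impl<R: Read> Read for InterruptOnce<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            return Err(io::Error::new(ErrorKind::Interrupted, "simulated EINTR"));
        }
        self.inner.read(buf)
    }
}

fn main() -> io::Result<()> {
    let mut reader = InterruptOnce { inner: &b"hello"[..], interrupted: false };
    let mut out: Vec<u8> = Vec::new();

    let copied = copy_loop(&mut reader, &mut out)?;

    assert_eq!(copied, 5);
    assert_eq!(out, b"hello");
    Ok(())
}
```

Every byte in this loop passes through a userspace buffer, which is the cost the `copy_file_range`/`splice`/`sendfile` path is meant to avoid.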