downloads/installs slowly #184

Open
cgwalters opened this issue Mar 11, 2020 · 17 comments

@cgwalters
Member

In kola testiso we spawn a local webserver and point qemu at it. The legacy installer downloads and writes to disk in about one second. This new one takes...about 4 minutes. Something must be going terribly wrong 😄

I didn't bother to try to debug this much yet, but it seems like the host webserver is spending a lot of time idle.

@jlebon
Member

jlebon commented Mar 11, 2020

I'm going to guess two things: not using O_DIRECT anymore (previously via dd's oflag=direct), and possibly using a small block size (with dd we had bs=1M). std::io::copy does opportunistically use copy_file_range, but I'm not sure if this works for block devices. It may be falling back to a read & write loop in 8K blocks, which... will take a while. Probably worth using our own 1M read & write loop, possibly with O_DIRECT.

Edit: halfway through that comment I actually did check the Rust implementation but forgot to link to the relevant bit: https://github.com/rust-lang/rust/blob/c20d7eecbc0928b57da8fe30b2ef8528e2bdd5be/src/libstd/sys/unix/fs.rs#L1202
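
For reference, a minimal sketch of the 1 MiB read & write loop idea (illustrative only, not the project's code; copy_1m is a hypothetical name, and an O_DIRECT variant would additionally need block-aligned buffers):

use std::io::{ErrorKind, Read, Write};

// Copy with a 1 MiB buffer instead of the 8 KiB io::copy falls back to.
fn copy_1m<R: Read, W: Write>(src: &mut R, dest: &mut W) -> std::io::Result<u64> {
    let mut buf = vec![0u8; 1024 * 1024];
    let mut written = 0u64;
    loop {
        let len = match src.read(&mut buf) {
            Ok(0) => return Ok(written), // EOF
            Ok(len) => len,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        dest.write_all(&buf[..len])?;
        written += len as u64;
    }
}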

@bgilbert
Contributor

bgilbert commented Apr 2, 2020

That's fs::copy. io::copy seems to be doing a straightforward copy loop.

@jlebon
Member

jlebon commented Apr 2, 2020

Ahh yup, you're right. So we're definitely falling back to 8k blocks then.

@bgilbert
Contributor

bgilbert commented Apr 2, 2020

We could pretty easily reimplement copy() on top of copy_n() from #187.
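
A hedged sketch of that idea (the real copy_n() lives in #187 and its signature may differ; here it's stubbed on top of Read::take):

use std::io::{self, Read, Write};

// Stand-in for #187's copy_n(): copy up to `n` bytes, returning the
// number actually copied.
fn copy_n<R: Read, W: Write>(src: &mut R, dest: &mut W, n: u64) -> io::Result<u64> {
    io::copy(&mut src.take(n), dest)
}

// copy() reimplemented on top of copy_n(): request 1 MiB chunks until a
// short count signals EOF.
fn copy<R: Read, W: Write>(src: &mut R, dest: &mut W) -> io::Result<u64> {
    const CHUNK: u64 = 1024 * 1024;
    let mut total = 0u64;
    loop {
        let n = copy_n(src, dest, CHUNK)?;
        total += n;
        if n < CHUNK {
            return Ok(total);
        }
    }
}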

@cgwalters
Member Author

I did this but it fails to compile because...well, Rust correctly points out that it wouldn't be thread-safe 🎉

diff --git a/Cargo.lock b/Cargo.lock
index 5f75a3a..594370f 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -119,6 +119,7 @@ dependencies = [
  "byte-unit 3.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
  "clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)",
  "cpio 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
+ "crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)",
  "error-chain 0.12.2 (registry+https://github.com/rust-lang/crates.io-index)",
  "flate2 1.0.14 (registry+https://github.com/rust-lang/crates.io-index)",
  "maplit 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -146,6 +147,16 @@ dependencies = [
  "cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
 ]
 
+[[package]]
+name = "crossbeam-utils"
+version = "0.7.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+dependencies = [
+ "autocfg 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
+ "cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
+ "lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
+]
+
 [[package]]
 name = "dtoa"
 version = "0.4.5"
@@ -1195,6 +1206,7 @@ dependencies = [
 "checksum core-foundation-sys 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)" = "e7ca8a5221364ef15ce201e8ed2f609fc312682a8f4e0e3d4aa5879764e0fa3b"
 "checksum cpio 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "f00177cda1bb5c5b0a05b1b506125c776411cdfc6a41085126881266e0cdc242"
 "checksum crc32fast 1.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ba125de2af0df55319f41944744ad91c71113bf74a4646efff39afe1f6842db1"
+"checksum crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)" = "c3c7c73a2d1e9fc0886a08b93e98eb643461230d5f1925e4036204d5f2e261a8"
 "checksum dtoa 0.4.5 (registry+https://github.com/rust-lang/crates.io-index)" = "4358a9e11b9a09cf52383b451b49a169e8d797b68aa02301ff586d70d9661ea3"
 "checksum encoding_rs 0.8.22 (registry+https://github.com/rust-lang/crates.io-index)" = "cd8d03faa7fe0c1431609dfad7bbe827af30f82e1e2ae6f7ee4fca6bd764bc28"
 "checksum error-chain 0.12.2 (registry+https://github.com/rust-lang/crates.io-index)" = "d371106cc88ffdfb1eabd7111e432da544f16f3e2d7bf1dfe8bf575f1df045cd"
diff --git a/Cargo.toml b/Cargo.toml
index 956524c..481d8a6 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -27,6 +27,7 @@ lto = true
 byte-unit = "^3.0"
 clap = "^2.33"
 cpio = "^0.2"
+crossbeam-utils = "0.7.2"
 error-chain = { version = "^0.12", default-features = false }
 flate2 = "^1.0"
 nix = "^0.17"
diff --git a/src/download.rs b/src/download.rs
index 45761be..38ea83c 100644
--- a/src/download.rs
+++ b/src/download.rs
@@ -13,12 +13,14 @@
 // limitations under the License.
 
 use byte_unit::Byte;
+use crossbeam_utils::thread;
 use error_chain::bail;
 use flate2::read::GzDecoder;
 use nix::unistd::isatty;
 use progress_streams::ProgressReader;
 use std::fs::{remove_file, File, OpenOptions};
-use std::io::{copy, stderr, BufRead, BufReader, Read, Seek, SeekFrom, Write};
+use std::io::{stderr, BufRead, BufReader, Read, Seek, SeekFrom, Write};
+use std::os::unix::io::FromRawFd;
 use std::num::NonZeroU32;
 use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
@@ -162,6 +164,35 @@ fn write_image_and_sig(
     Ok(())
 }
 
+fn threaded_pipe_copy<R : Read + Send>(src: R, dest: &mut File) -> Result<()> {
+    let mut buf = Vec::new();
+    buf.reserve_exact(64 * 1024 * 1024);
+    let mut buf = buf.as_mut_slice();
+
+    let (pipe_r, pipe_w) = nix::unistd::pipe()?;
+    let mut pipe_r = unsafe { std::fs::File::from_raw_fd(pipe_r) };
+    let mut pipe_w = unsafe { std::fs::File::from_raw_fd(pipe_w) };
+    Ok(thread::scope(|s| -> Result<()> {
+        let h1 = s.spawn(move |_| -> Result<()> {
+            loop {
+                let len = match src.read(buf) {
+                    Ok(0) => return Ok(()),
+                    Ok(len) => len,
+                    Err(ref e) if e.kind() == std::io::ErrorKind::Interrupted => continue,
+                    Err(e) => return Err(e.into()),
+                };
+                pipe_w.write_all(&buf[..len])?;
+            }
+        });
+        let h2 = s.spawn(move |_| -> Result<()> {
+            std::io::copy(&mut pipe_r, dest)?;
+            Ok(())
+        });
+        h1.join().expect("thread join")?;
+        Ok(h2.join().expect("thread join")?)
+    }).expect("scope join")?)
+}
+
 /// Copy the image to disk and verify its signature.
 pub fn write_image(
     source: &mut ImageSource,
@@ -286,7 +317,7 @@ pub fn write_image(
     // sparse-copy the image, falling back to non-sparse copy if hardware
     // acceleration is unavailable.  But BLKZEROOUT doesn't support
     // BLKDEV_ZERO_NOFALLBACK, so we'd risk gigabytes of redundant I/O.
-    copy(&mut decompress_reader, dest).chain_err(|| "decoding and writing image")?;
+    threaded_pipe_copy(&mut decompress_reader, dest).chain_err(|| "decoding and writing image")?;
 
     // verify_reader has now checked the signature, so fill in the first MiB
     dest.seek(SeekFrom::Start(0))

This is an interesting case because we definitely want to run decompression in a separate thread; but it's also intermixed with progress and GPG and...
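
For what it's worth, a hedged sketch of one shape the borrow checker does accept (not the diff above: it replaces the pipe with a bounded std::sync::mpsc channel, so each buffer is owned by exactly one thread at a time; crossbeam-utils as in the diff):

use std::fs::File;
use std::io::{ErrorKind, Read, Write};
use std::sync::mpsc::sync_channel;

// Read/decompress on a scoped thread; write on the current thread.
// Owned buffers are handed off through a small channel for backpressure.
fn threaded_copy<R: Read + Send>(mut src: R, dest: &mut File) -> std::io::Result<()> {
    let (tx, rx) = sync_channel::<Vec<u8>>(4);
    crossbeam_utils::thread::scope(|s| -> std::io::Result<()> {
        let reader = s.spawn(move |_| -> std::io::Result<()> {
            loop {
                let mut buf = vec![0u8; 1024 * 1024];
                let len = match src.read(&mut buf) {
                    Ok(0) => return Ok(()), // EOF; dropping tx ends the write loop
                    Ok(len) => len,
                    Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
                    Err(e) => return Err(e),
                };
                buf.truncate(len);
                if tx.send(buf).is_err() {
                    return Ok(()); // writer hung up; it reports its own error
                }
            }
        });
        for buf in rx {
            dest.write_all(&buf)?;
        }
        reader.join().expect("reader thread panicked")?;
        Ok(())
    })
    .expect("scope panicked")
}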

@cgwalters
Member Author

Also, we should be using O_DIRECT, right?
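
(For reference, opening with O_DIRECT from Rust is easy via OpenOptionsExt; the catch is that O_DIRECT requires the buffer address, length, and file offset to be suitably aligned, so a plain Vec-backed copy loop would also need an aligned buffer. A sketch, assuming the libc crate:)

use std::fs::{File, OpenOptions};
use std::os::unix::fs::OpenOptionsExt; // for custom_flags()

// Open a device for writing with O_DIRECT, bypassing the page cache.
fn open_direct(path: &str) -> std::io::Result<File> {
    OpenOptions::new()
        .write(true)
        .custom_flags(libc::O_DIRECT)
        .open(path)
}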

@cgwalters
Member Author

cgwalters commented Apr 23, 2020

OK wow I think this got worse:

[core@ibm-p8-kvm-03-guest-02 ~]$ [   24.279163] coreos-installer[1543]: Read disk 55.2 MiB/854.7 MiB (6%)
[   25.282045] coreos-installer[1543]: Read disk 66.4 MiB/854.7 MiB (7%)
[   26.284139] coreos-installer[1543]: Read disk 75.7 MiB/854.7 MiB (8%)
[   27.508213] coreos-installer[1543]: Read disk 76.3 MiB/854.7 MiB (8%)
[   28.547356] coreos-installer[1543]: Read disk 76.3 MiB/854.7 MiB (8%)
[   30.954612] coreos-installer[1543]: Read disk 76.9 MiB/854.7 MiB (9%)
[   32.736845] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[   34.501298] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[   36.193813] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[   37.962886] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[   39.774410] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[   41.607017] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[   42.616462] coreos-installer[1543]: Read disk 77.2 MiB/854.7 MiB (9%)
[   44.969971] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[   46.725812] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[   48.567503] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[   49.973242] coreos-installer[1543]: Read disk 79.4 MiB/854.7 MiB (9%)
[   50.975088] coreos-installer[1543]: Read disk 94.3 MiB/854.7 MiB (11%)
[   53.773692] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[   55.649689] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[   57.514102] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[   59.439394] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[   61.198396] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[   62.947567] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[   64.696044] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[   66.244836] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[   68.015308] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[   69.766382] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[   71.554605] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[   73.299237] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[   75.035509] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[   76.786172] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[   78.529187] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[   80.278279] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[   82.031717] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[   83.782988] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[   85.560431] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[   87.320502] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[   89.070928] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[   90.853538] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[   92.603431] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[   94.353034] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[   95.387746] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[   96.390873] coreos-installer[1543]: Read disk 103.9 MiB/854.7 MiB (12%)
[   97.390964] coreos-installer[1543]: Read disk 107.8 MiB/854.7 MiB (12%)
[   98.395250] coreos-installer[1543]: Read disk 111.4 MiB/854.7 MiB (13%)
[   99.401100] coreos-installer[1543]: Read disk 115.4 MiB/854.7 MiB (13%)
[  100.416172] coreos-installer[1543]: Read disk 118.0 MiB/854.7 MiB (13%)
[  101.422613] coreos-installer[1543]: Read disk 120.9 MiB/854.7 MiB (14%)
[  102.429683] coreos-installer[1543]: Read disk 124.7 MiB/854.7 MiB (14%)
[  103.429702] coreos-installer[1543]: Read disk 128.5 MiB/854.7 MiB (15%)
[  104.433470] coreos-installer[1543]: Read disk 131.7 MiB/854.7 MiB (15%)
[  105.435846] coreos-installer[1543]: Read disk 135.8 MiB/854.7 MiB (15%)
[  106.437101] coreos-installer[1543]: Read disk 138.7 MiB/854.7 MiB (16%)
[  107.437893] coreos-installer[1543]: Read disk 143.0 MiB/854.7 MiB (16%)
[  108.444530] coreos-installer[1543]: Read disk 146.3 MiB/854.7 MiB (17%)

That's ~1.08 MiB/s ((146.3 - 55.2) MiB over the ~84 seconds shown), which is absolutely awful.

@dustymabe
Member

I've noticed it being really really slow lately too. Like REALLY slow.

@cgwalters
Member Author

cgwalters commented Apr 24, 2020

Here's a profile from current git master:

https://perfht.ml/2xLdVoJ

Basically I think what this is saying is that we are losing out by doing decompression and I/O on the same core; this type of thing is where classical Unix pipelines like curl | gunzip > /dev/disk win, since each stage gets a core of its own.

@cgwalters
Member Author

BTW I learned about the cool profiler GUI above from this: https://www.reddit.com/r/rust/comments/g0wkp0/til_firefox_profiler_makes_an_awesome_gui_for_perf/

cgwalters added a commit to cgwalters/coreos-installer that referenced this issue Apr 24, 2020
@cgwalters
Member Author

OK and here's some code I wrote to try to do the decompression/write in separate threads:
https://github.com/cgwalters/coreos-installer/tree/copy-faster
But...it seems not to help much: it takes us from about 25 seconds down to 20 or so:
https://profiler.firefox.com/public/7aa5ff8e8b9225854fcbb3cf793874fb9c2363da/calltree/?globalTrackOrder=0&localTrackOrderByPid=1936-0-1~&thread=0&v=4

@lucab
Contributor

lucab commented Apr 24, 2020

@cgwalters your initial profile also had a lot of trips through read and write; are we possibly missing some buffering? Also, does anything change if you switch the underlying implementation in flate2?
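
One hedged experiment along those lines (illustrative sketch: give both sides of the decoder 1 MiB buffers so neither direction issues tiny syscalls; flate2's backend can separately be switched via Cargo features, e.g. default-features = false with features = ["zlib"]):

use flate2::read::GzDecoder;
use std::fs::File;
use std::io::{BufReader, BufWriter, Write};

// Buffer the compressed input and the device output around the decoder.
fn buffered_copy(src: File, dest: File) -> std::io::Result<u64> {
    let mut decoder = GzDecoder::new(BufReader::with_capacity(1 << 20, src));
    let mut out = BufWriter::with_capacity(1 << 20, dest);
    let n = std::io::copy(&mut decoder, &mut out)?;
    out.flush()?; // don't rely on flush-on-drop, which swallows errors
    Ok(n)
}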

@cgwalters
Member Author

Hmm you know, a raw

[root@toolbox ~]# time /bin/sh -c 'gunzip < /host/var/mnt/workdir-tmp/fedora-coreos-31.20200327.dev.2-metal.x86_64.raw.gz > /dev/vdb'

real	0m21.823s
user	0m10.501s
sys	0m1.403s

Is basically the same order as the installer, as well as:

[root@toolbox ~]# time /bin/sh -c 'gunzip < /host/var/mnt/workdir-tmp/fedora-coreos-31.20200327.dev.2-metal.x86_64.raw.gz | dd of=/dev/vdb iflag=fullblock bs=8M oflag=direct'
341+0 records in
341+0 records out
2860515328 bytes (2.9 GB, 2.7 GiB) copied, 21.8208 s, 131 MB/s

real	0m21.823s
user	0m10.621s
sys	0m1.622s
[root@toolbox ~]# 

In fact I'm being stupid here, 20s is nowhere near the minutes I've seen elsewhere. So the problem may actually lie somewhere else in testiso.go somehow?

@cgwalters
Member Author

And of course for my tests I should have fetched over HTTP...I just thought the problem was lower level than that. Here's an HTTP fetch profile:

https://profiler.firefox.com/public/5c8ac1a2603e6d04abe1eb8101b389f164fdc0e2/calltree/?globalTrackOrder=0&localTrackOrderByPid=2366-0-1-2~&thread=0&v=4

@dustymabe
Member

dustymabe commented Apr 27, 2020

I've noticed better performance using --image-file:

$ time sudo coreos-installer install /dev/sda --ignition-file example.ign --insecure --image-file ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz
Copying image from ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz
Reading signature from ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz.sig
Couldn't read signature file: No such file or directory (os error 2)
Signature not found; skipping verification as requested
> Read disk 712.8 MiB/712.8 MiB (100%)   
Writing Ignition config
Install complete.

real    0m53.368s
user    0m10.889s
sys     0m4.245s

This is opposed to 8+ minutes at times over local HTTP (via python3 -m http.server).

@bgilbert
Contributor

bgilbert commented Jul 7, 2020

#290 switched to a copy() implementation with a larger buffer.

@bgilbert
Contributor

I spent some more time on this and got nowhere, except to find that #290 was incorrect. #315 should be more correct, but may not help much either.
