
storage: concurrent disk image write. #2395

Open · wants to merge 5 commits into base: main
Conversation

@iwanbk (Member) commented Aug 13, 2024

Write the image concurrently to speed it up compared to the previous sequential write.

Description

Change the image DiskWrite from sequential to concurrent to make it faster. A sketch of the idea follows.
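
The approach splits the image into fixed-size sections and writes each one in its own goroutine. A minimal sketch of the idea, assuming illustrative values for the chunk size and worker count (not the actual zos values):

```go
package storage

import (
	"io"
	"os"

	"golang.org/x/sync/errgroup"
)

// diskWrite copies a disk image from source into file in concurrent chunks.
// chunkSize and workers are illustrative assumptions, not the actual zos values.
func diskWrite(file *os.File, source io.ReaderAt, size int64) error {
	const workers = 10               // 5-10 goroutines performed best in the benchmark below
	const chunkSize int64 = 64 << 20 // 64 MiB per section (assumed)

	group := new(errgroup.Group)
	group.SetLimit(workers)

	for start := int64(0); start < size; start += chunkSize {
		start := start // capture for the closure (pre-Go 1.22 loop semantics)
		length := chunkSize
		if start+length > size {
			length = size - start
		}
		group.Go(func() error {
			// each goroutine writes its own non-overlapping region of the file
			wr := io.NewOffsetWriter(file, start)
			rd := io.NewSectionReader(source, start, length)
			_, err := io.Copy(wr, rd)
			return err
		})
	}
	// errgroup: no point in continuing if one of the goroutines fails
	return group.Wait()
}
```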

Changes

  • change to concurrent write

Related Issues

Fixes:

Checklist

  • Tests included -> manual test
  • Build pass
  • Documentation
  • Code format and docstring

@muhamadazmy (Member) left a comment


In general this looks good, but I would love to see a benchmark comparing the performance of the two approaches.

@iwanbk (Member, Author) commented Aug 13, 2024

@muhamadazmy
The benchmark is already in the linked issue. I'm including it here as well, since this is indeed the better place for it.

I've tried to parallelize the io.Copy in this environment:

  • zos node: in Indonesia, running on my old 2016 PC under QEMU (Indonesia is close to Australia)
  • os image: nixos

the results:

  • original io.Copy: 1 hour (much worse than the reported 20 mins :))
  • 5 goroutines: 20 mins
  • 10 goroutines: 20 mins
  • 15 goroutines: 24 mins

@muhamadazmy (Member)

That's cool.

One thing I have to add: flists with raw images are obsolete and should not be used anymore. Zos still supports them for backwards compatibility only, but new workloads definitely should not be allowed to use this kind of image.

@muhamadazmy (Member)

Forgot to say: make sure to clean up the cache between benchmark runs. rfs caches the downloaded content in zos-cache, which means the second run will always go faster than the first since it doesn't have to download the image again.

@iwanbk (Member, Author) commented Aug 13, 2024

> Forgot to say: make sure to clean up the cache between benchmark runs. rfs caches the downloaded content in zos-cache, which means the second run will always go faster than the first since it doesn't have to download the image again.

Sure. It only took ~2 mins with the cache.

@iwanbk (Member, Author) commented Aug 13, 2024

Actually, I not only deleted the cache but also deleted all the QEMU CoW disks, because I don't know how to delete just the cache. I guess it is under /var/cache/modules/flistd?

```
var/cache/modules/flistd # ls
cache       flist       log         mountpoint  pid         ro
```

pkg/storage/disk.go
```go
// the sequential write is slow because the data source is from the remote server.
var (
	// use errgroup because there is no point in continuing if one of the goroutines failed
	group = new(errgroup.Group)
```
Collaborator:

should be usable on the zero value or with a context if required

@iwanbk (Member, Author):

> should be usable on the zero value

Could you elaborate more on this?

> or with a context if required

Not possible for now:

  • the func doesn't take a context
  • io.Copy itself doesn't take a context

I think making it context-aware will need a separate ticket.
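
For reference, what the reviewer likely means: errgroup's zero value is ready to use without a constructor, and errgroup.WithContext is the form to use when cancellation is needed. A tiny sketch:

```go
package storage

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// the zero value works as-is; new(errgroup.Group) is equivalent
func zeroValue() error {
	var group errgroup.Group
	group.Go(func() error { return nil })
	return group.Wait()
}

// WithContext derives a ctx that is cancelled as soon as any goroutine
// returns a non-nil error, so workers can stop early
func withContext(parent context.Context) error {
	group, ctx := errgroup.WithContext(parent)
	group.Go(func() error { return ctx.Err() })
	return group.Wait()
}
```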

@iwanbk (Member, Author):

> I think making it context-aware will need a separate ticket.

DiskWrite is handled by storaged and called by provisiond through zbus, so a plain Go context will not really work here. But of course we can start with a context inside storaged.

As a side note, I noticed that DiskWrite keeps going even though provisiond has already cancelled the deployment/contract. So we already have a real cancellation issue over zbus.
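
For the record, if a context ever gets threaded into storaged, the usual workaround for io.Copy not accepting one is a reader wrapper that checks the context on every Read. A sketch under that assumption (names here are illustrative):

```go
package storage

import (
	"context"
	"io"
)

// ctxReader makes an io.Copy abort early by failing Read once ctx is cancelled.
type ctxReader struct {
	ctx context.Context
	r   io.Reader
}

func (c *ctxReader) Read(p []byte) (int, error) {
	if err := c.ctx.Err(); err != nil {
		return 0, err
	}
	return c.r.Read(p)
}

// copyCtx is io.Copy with best-effort cancellation between Read calls.
func copyCtx(ctx context.Context, dst io.Writer, src io.Reader) (int64, error) {
	return io.Copy(dst, &ctxReader{ctx: ctx, r: src})
}
```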

Collaborator:

Can you please mention the cancellation problem as a known issue at the top of the PR, so it can easily be found later on?

@iwanbk (Member, Author):

> Can you please mention the cancellation problem as a known issue at the top of the PR, so it can easily be found later on?

Maybe create a separate issue instead, because it is not really specific to this PR.

```go
	}
	wr := io.NewOffsetWriter(file, start)
	rd := io.NewSectionReader(source, start, len)
	_, err = io.Copy(wr, rd)
```
Collaborator:

I believe using io.CopyBuffer could avoid temporary buffers: https://pkg.go.dev/io#CopyBuffer

Collaborator:

Also, is there no need to call file.Sync?

@iwanbk (Member, Author):

> I believe using io.CopyBuffer could avoid temporary buffers: https://pkg.go.dev/io#CopyBuffer

The difference between io.Copy and io.CopyBuffer is that CopyBuffer lets us choose our own buffer; io.Copy also uses a buffer internally. The slowness here comes from downloading the file from the hub, so I don't think the buffer will make a big difference here.

@iwanbk (Member, Author):

> Also, is there no need to call file.Sync?

I don't think so; we're copying to rfs anyway. CMIIW.

Collaborator:

io.CopyBuffer uses a buffer you provide, instead of io.Copy allocating one on every call (it's still an allocation to save, imo).

@iwanbk (Member, Author):

> instead of io.Copy allocating one on every call

No, the buffer used by io.Copy is also allocated only once per call. See https://github.com/golang/go/blob/master/src/io/io.go#L417-L427

If what you mean is creating a global buffer outside the DiskWrite func, then it would be a lot of work for very small benefit.
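
For completeness, the io.CopyBuffer variant under discussion would look roughly like this, as a drop-in replacement for the group.Go body in the diff above. Note the buffer has to be per-goroutine, since sharing one buffer across concurrent copies is unsafe; the 1 MiB size is an assumption:

```go
group.Go(func() error {
	wr := io.NewOffsetWriter(file, start)
	rd := io.NewSectionReader(source, start, len)
	// one reusable buffer per goroutine; a single global buffer would
	// not work here because the copies run concurrently
	buf := make([]byte, 1<<20)
	_, err := io.CopyBuffer(wr, rd, buf)
	return err
})
```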

@xmonader (Collaborator)

All in all this looks good, but I also wonder whether that code path should be enabled in the case of HDD-only nodes?

@iwanbk (Member, Author) commented Aug 13, 2024

> All in all this looks good, but I also wonder whether that code path should be enabled in the case of HDD-only nodes?

Can you elaborate more on this? Is it because an HDD-only node will be slow? And what is provisiond's behavior regarding this?

@xmonader (Collaborator)

> All in all this looks good, but I also wonder whether that code path should be enabled in the case of HDD-only nodes?
>
> Can you elaborate more on this? Is it because an HDD-only node will be slow? And what is provisiond's behavior regarding this?

zos right now also supports HDD nodes. I believe the sequential nature of HDDs means the performance would be impacted (made worse) by the concurrency.

@xmonader (Collaborator) commented Aug 13, 2024

On another note, I'd also add the concept of retries to the code if possible.
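
A sketch of what per-chunk retries could look like if added at this layer; writeChunkWithRetry is a hypothetical helper, and the attempt count and backoff are assumptions:

```go
package storage

import (
	"io"
	"os"
	"time"
)

// writeChunkWithRetry retries a single section copy a few times before
// giving up. Hypothetical helper; attempts and backoff are assumed values.
func writeChunkWithRetry(file *os.File, source io.ReaderAt, start, length int64) error {
	const attempts = 3
	var err error
	for i := 0; i < attempts; i++ {
		// recreate the writer/reader so a retry rewrites the chunk from its start
		wr := io.NewOffsetWriter(file, start)
		rd := io.NewSectionReader(source, start, length)
		if _, err = io.Copy(wr, rd); err == nil {
			return nil
		}
		time.Sleep(time.Duration(i+1) * time.Second) // linear backoff (assumed)
	}
	return err
}
```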

@iwanbk (Member, Author) commented Aug 13, 2024

> zos right now also supports HDD nodes. I believe the sequential nature of HDDs means the performance would be impacted (made worse) by the concurrency.

This is true if we write to a regular file, but that is not the case here:

  1. The destination is rfs, and rfs doesn't store the image in a single regular file.
  2. Also be aware that the slowness is on the download side (from the hub), not on the write side to the disk image.

As a side note, the current code favors SSD disks.

@iwanbk (Member, Author) commented Aug 13, 2024

> On another note, I'd also add the concept of retries to the code if possible.

Why not handle it inside rfs?

@iwanbk (Member, Author) commented Aug 16, 2024

> zos right now also supports HDD nodes. I believe the sequential nature of HDDs means the performance would be impacted (made worse) by the concurrency.

I assume that won't happen, because the slowness of the remote rfs will mask it. But since it is hard to test, I think disabling it on HDD-only nodes is safer.

Fixed in caf83ac @xmonader
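
The fix in caf83ac isn't shown in this thread, but gating the worker count on disk type could look roughly like this; isSSD is a hypothetical flag, not the actual zos check:

```go
// inside the write path, before spawning workers:
workers := 10
if !isSSD { // hypothetical flag: fall back to a single sequential writer on HDD-only nodes
	workers = 1
}
group := new(errgroup.Group)
group.SetLimit(workers)
```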
