feat: improve dag/import #9721

Merged
merged 2 commits into ipfs:master from better-dag-import on Mar 22, 2023

Conversation

MichaelMure (Contributor)

  • don't bypass the CoreApi
  • don't use a goroutine and a return channel for importWorker, when what's happening is really just a synchronous call
  • only PinLock() when we are going to pin
  • use cid.Set instead of an explicit map (a small sketch follows this list)
  • fail the request early if any pinning fails; no need to try to pin more if the request has already failed
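
For the cid.Set bullet, a minimal sketch (not the actual kubo code; names and the demo CID are made up for illustration) of what swapping an explicit map for go-cid's cid.Set looks like:

package main

import (
	"fmt"

	"github.com/ipfs/go-cid"
)

// uniqueRoots collects the unique root CIDs seen across all imported CARs,
// using cid.Set instead of a hand-rolled map[cid.Cid]struct{}.
func uniqueRoots(batches [][]cid.Cid) []cid.Cid {
	roots := cid.NewSet()
	for _, batch := range batches {
		for _, c := range batch {
			roots.Visit(c) // adds c and reports whether it was new
		}
	}
	return roots.Keys()
}

func main() {
	c, err := cid.Decode("bafkqaaa") // empty identity CID, just for the demo
	if err != nil {
		panic(err)
	}
	fmt.Println(uniqueRoots([][]cid.Cid{{c}, {c}})) // the duplicate is collapsed
}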

MichaelMure force-pushed the better-dag-import branch 2 times, most recently from 5178319 to d62b713, on March 14, 2023 18:00
Jorropo self-requested a review on March 14, 2023 18:02
Jorropo (Contributor) commented Mar 14, 2023

That's way too clean looking, it's suspicious.
Does --progress still work? (I think that's why it was using a goroutine in the first place.)

MichaelMure (Contributor, Author)

There is no --progress on that command.

Jorropo (Contributor) commented Mar 14, 2023

Yes my bad, idk how the code managed to be so bad before.

MichaelMure marked this pull request as ready for review on March 21, 2023 15:37
Jorropo (Contributor) left a comment

Correct me if I'm wrong, previously it would concurrently import the dag and pin it in two different goroutines. This causes a situation where we reread from the blockstore what we just wrote to it.
Also, if the rereads are faster than the writes, for a split second we would try to fetch over bitswap blocks that we are about to receive in the CAR.

Now you do that serially, which makes the latency worse because the rereads are not parallelised with the writes.

I think this could be solved by a new consumer-driven API for pinning: you start a pin, the pinner code maintains internal state (cid.Set and friends), and it exposes some callback through which the consumer code (ipfs dag import) can add more blocks to the pin state.

So I would expect something like this (pseudocode; ? means the error handling is omitted):

bs := api.Blockstore()

c, roots, ? := car.NewReader(req.Body)
pinstate := api.Pin().AddClientDriven(bs, roots, options.Pin.Recursive(true)) // we need bs in case we have blocks out of order
forLoop:
for {
  block, err := c.Next()
  switch err {
  case nil:
  case io.EOF:
    break forLoop
  default:
    // Handle error
  }
  
  ? = bs.Put(block)
  ? = pinstate.Add(block)
}

? = pinstate.Commit() // ensure that the pin state is complete

This still has perf similar to the two-goroutine solution but does not double-read.
Implementing AddClientDriven sounds somewhat straightforward (implement a tricoloring (generator or event loop) pattern).
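
Not an existing API, just a rough sketch of the kind of state such a consumer-driven pinner could keep: a grey set of still-missing CIDs updated as blocks are fed in. LinksFunc and every name here are made up for illustration; only go-cid and go-block-format are assumed.

package pinstate

import (
	"fmt"

	blocks "github.com/ipfs/go-block-format"
	"github.com/ipfs/go-cid"
)

// LinksFunc extracts the child CIDs referenced by a block; how the block is
// decoded is left to the caller (an assumption of this sketch).
type LinksFunc func(blocks.Block) ([]cid.Cid, error)

type State struct {
	wanted *cid.Set // grey: referenced by the DAG but not yet provided
	done   *cid.Set // black: provided and traversed
	links  LinksFunc
}

func New(roots []cid.Cid, links LinksFunc) *State {
	s := &State{wanted: cid.NewSet(), done: cid.NewSet(), links: links}
	for _, r := range roots {
		s.wanted.Add(r)
	}
	return s
}

// Add feeds one block from the CAR stream into the pin state.
func (s *State) Add(b blocks.Block) error {
	c := b.Cid()
	if !s.wanted.Has(c) {
		// A duplicate, an extra block, or a child arriving before its parent.
		// The real thing would fall back to the blockstore for that last case,
		// which is why the pseudocode above hands bs to AddClientDriven.
		return nil
	}
	s.wanted.Remove(c)
	s.done.Add(c)
	children, err := s.links(b)
	if err != nil {
		return err
	}
	for _, ch := range children {
		if !s.done.Has(ch) {
			s.wanted.Add(ch) // becomes grey until its block shows up
		}
	}
	return nil
}

// Commit succeeds only once every block reachable from the roots was provided.
func (s *State) Commit() error {
	if s.wanted.Len() > 0 {
		return fmt.Errorf("pin incomplete: %d blocks still missing", s.wanted.Len())
	}
	return nil
}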

MichaelMure (Contributor, Author)

Correct me if I'm wrong, previously it would concurrently import the dag and pin it in two different goroutines. This causes a situation where we reread from the blockstore what we just wrote to it. Also, if the rereads are faster than the writes, for a split second we would try to fetch over bitswap blocks that we are about to receive in the CAR.

No. Because of https://github.com/ipfs/kubo/blob/master/core/commands/dag/import.go#L55, that goroutine is nothing more than a function call. And yes, all the blocks would be written, then read again.
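
For what it's worth, a tiny illustration (not the kubo code) of why that pattern is just a function call in disguise:

package main

import "fmt"

func importWorker() error {
	// ... do the import ...
	return nil
}

func main() {
	// Before: a goroutine and a channel, but the very next statement blocks
	// on the result, so nothing actually runs concurrently with the worker.
	errCh := make(chan error, 1)
	go func() { errCh <- importWorker() }()
	err := <-errCh

	// After: just call it.
	err = importWorker()
	fmt.Println(err)
}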

Now you do that serially, which makes the latency worse because the rereads are not parallelised with the writes.

As per above, I'm not changing that, merely making it explicit.

The underlying problem really is that there is no guarantee that the input CAR file has all the blocks required for the pin. (Nor, btw, that there are no extra blocks, but that's another matter.)

To make things a little more fun, it seems like kubo accepts multiple CAR files in the same request (is there really a point?), so the DAG can be spread over those multiple files. This is why I have to wait for all CARs to be imported before starting to check the DAGs.

This is why this process needs to 1) accept and write the blocks and 2) check the pinned DAG's completeness. It'd be great to have a way to do that on the fly while writing, but I don't think that exists?
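
Roughly, the shape I mean (a hedged sketch; importCAR, checkComplete and the other names are hypothetical stand-ins for the real kubo plumbing, not the actual command code):

package dagimport

import (
	"context"
	"fmt"
	"io"

	"github.com/ipfs/go-cid"
)

// importCAR writes every block of one CAR to the blockstore and returns its
// roots; checkComplete walks one root and errors if any block is missing.
type importCAR func(ctx context.Context, car io.Reader) ([]cid.Cid, error)
type checkComplete func(ctx context.Context, root cid.Cid) error

// importThenVerify is the two-phase flow: import every CAR first (the DAG may
// be spread across them), only then verify each root's DAG and pin it,
// failing the request on the first error.
func importThenVerify(ctx context.Context, cars []io.Reader, imp importCAR, check checkComplete, pin func(cid.Cid) error) error {
	roots := cid.NewSet()
	// Phase 1: accept and write the blocks from all CARs.
	for _, c := range cars {
		rs, err := imp(ctx, c)
		if err != nil {
			return err
		}
		for _, r := range rs {
			roots.Visit(r)
		}
	}
	// Phase 2: check completeness of each pinned DAG, then pin.
	return roots.ForEach(func(r cid.Cid) error {
		if err := check(ctx, r); err != nil {
			return fmt.Errorf("root %s: %w", r, err)
		}
		return pin(r)
	})
}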

I think this could be solved by a new consumer-driven API for pinning: you start a pin, the pinner code maintains internal state (cid.Set and friends), and it exposes some callback through which the consumer code (ipfs dag import) can add more blocks to the pin state.

As per above, that doesn't make sense imho, but if you go in that direction I would very much like you to consider ipfs/boxo#378. It would be so much better to have this command, which has knowledge of the context, what blocks are being written and so on, check the DAG and finally trigger PinWithMode. The Pinner definitely does not need to be part of the equation.

Jorropo (Contributor) left a comment

I see, this all makes sense now.
It does sound like https://github.com/ipfs/go-ipfs-pinner/issues/25 is actually what I wanted, I just didn't know about it.

Thx for the refactor, the code is much better now.

Jorropo (Contributor) commented Mar 22, 2023

Unrelated pubsub sharness failure. (I should really fix this flaky CI)

Jorropo merged commit 1457b4f into ipfs:master on Mar 22, 2023
galargh added a commit that referenced this pull request Mar 23, 2023
hsanjuan (Contributor)

@MichaelMure if you're worried about slow dag import, it is mostly the stuff mentioned in #9678. One of the main things you can do is write blocks directly, without going through the IPLD layer batch-commit thing, which forces decoding and encoding blocks for nothing and commits in tiny batches that slow things down.
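
A hedged sketch of that idea: stream blocks out of the CAR and batch them straight into the blockstore, with no IPLD decode/encode round trip. The function name and batch size are arbitrary (not what kubo actually does); go-car's CarReader and the boxo blockstore interface are assumed.

package dagimport

import (
	"context"
	"io"

	"github.com/ipfs/boxo/blockstore"
	blocks "github.com/ipfs/go-block-format"
	"github.com/ipfs/go-cid"
	car "github.com/ipld/go-car"
)

// importRaw copies every block of a CAR into the blockstore as opaque
// bytes + CID and returns the CAR's roots.
func importRaw(ctx context.Context, bs blockstore.Blockstore, r io.Reader) ([]cid.Cid, error) {
	cr, err := car.NewCarReader(r)
	if err != nil {
		return nil, err
	}
	const batchSize = 128 // arbitrary; bigger batches mean fewer datastore commits
	batch := make([]blocks.Block, 0, batchSize)
	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		err := bs.PutMany(ctx, batch)
		batch = batch[:0]
		return err
	}
	for {
		blk, err := cr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		// No IPLD decode/encode round trip; the block goes in as-is.
		batch = append(batch, blk)
		if len(batch) == batchSize {
			if err := flush(); err != nil {
				return nil, err
			}
		}
	}
	if err := flush(); err != nil {
		return nil, err
	}
	return cr.Header.Roots, nil
}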
