Now that the Streams API is widely supported, would it make sense to have some built-in IndexedDB API for streaming data to/from IndexedDB?
The problem right now is that it's somewhat difficult and inefficient to write such functionality on your own. For example, if you want to create a ReadableStream that outputs all of the data in a giant object store, you can't just naively iterate over a cursor in ReadableStream.pull, because the transaction will automatically close at some point. So you wind up caught between the stream, which tries to read only part of the data into memory at a time, and IndexedDB, which closes a transaction as soon as it's no longer active. Something like this:
// Note: this assumes a promise-based wrapper such as the `idb` library,
// where openCursor() and cursor.continue() return promises.
const makeReadableStream = (db, store) => {
  let prevKey;
  return new ReadableStream({
    async pull(controller) {
      // Each pull() gets a fresh transaction (the previous one has already
      // auto-closed), so resume from just after the last key we emitted.
      const range = prevKey !== undefined
        ? IDBKeyRange.lowerBound(prevKey, true)
        : undefined;
      // Read at least this many records per transaction to amortize the
      // cost of repeatedly opening transactions.
      const MIN_BATCH_SIZE = 100;
      let batchCount = 0;
      let cursor = await db.transaction(store).store.openCursor(range);
      while (cursor) {
        controller.enqueue(`${JSON.stringify(cursor.value)}\n`);
        prevKey = cursor.key;
        batchCount += 1;
        if (controller.desiredSize > 0 || batchCount < MIN_BATCH_SIZE) {
          cursor = await cursor.continue();
        } else {
          // The stream has enough queued; stop and wait for the next pull().
          break;
        }
      }
      console.log(`Done batch of ${batchCount} objects`);
      if (!cursor) {
        // Actually done with this store, not just paused
        console.log("Completely done");
        controller.close();
      }
    },
  }, {
    highWaterMark: 100,
  });
};
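For context, here's roughly how a stream like that might be consumed, e.g. to build an export file. This is just a sketch, assuming the `idb` wrapper; "my-db" and "my-store" are placeholder names, not anything from the example above.

// A minimal consumption sketch (placeholder names throughout).
import { openDB } from "idb";

const exportToBlob = async () => {
  const db = await openDB("my-db");
  const stream = makeReadableStream(db, "my-store");
  // Response can wrap a ReadableStream and collect it into a Blob,
  // which can then be handed to a download link.
  return new Response(stream).blob();
};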
In addition to that code being a little complicated to write, it's also probably slower than it needs to be due to creating many transactions over the course of a large stream.
I wrote a blog post about this a few years ago, and if I search I still can't find anyone else talking about doing stuff like this. But a couple of people find that article through Google every day, and every now and again someone emails me about it, so I'm not literally the only person interested in this. Although I admit it's probably a niche use case, I do have hundreds of users every day exporting large amounts of data from IndexedDB in my video games, and that uses code similar to what I wrote in that blog post.
What would be better is maybe an API analogous to getAll: a method on IDBObjectStore and IDBIndex that takes an IDBKeyRange and returns a stream of all matching records. And then maybe also an equivalent API for writing a stream of data to an object store.
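Sketching what that might look like (purely hypothetical; getAllStream and putStream are invented names for this sketch, not part of any spec):

// Hypothetical API sketch: getAllStream/putStream do not exist anywhere today.
const readable = db
  .transaction("my-store", "readonly")
  .objectStore("my-store")
  .getAllStream(IDBKeyRange.lowerBound(0)); // ReadableStream of matching records

const writable = db
  .transaction("backup-store", "readwrite")
  .objectStore("backup-store")
  .putStream(); // WritableStream that puts each chunk into the store

// With transactions kept alive for the lifetime of the streams, copying a
// large store would reduce to a single pipe, with no manual cursor batching.
await readable.pipeTo(writable);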
TPAC 2024: We discussed streaming large values with IDB reads and writes. We pointed out that this can be accomplished today using File and Blob, which can then be stored in IDB. However, this potentially forces developers to implement a two-phase commit between the large value in File/Blob and the IDB transaction.
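For reference, the File/Blob approach mentioned here looks roughly like this (a sketch assuming the `idb` wrapper; "my-db", "exports", and hugeNdjsonString are placeholder names):

import { openDB } from "idb";

const db = await openDB("my-db", 1, {
  upgrade(db) {
    db.createObjectStore("exports");
  },
});

// Write: the Blob holds the large value, so the put itself stays cheap.
// hugeNdjsonString stands in for the large value being exported.
const blob = new Blob([hugeNdjsonString], { type: "application/x-ndjson" });
await db.put("exports", blob, "latest");

// Read: Blob.stream() yields a ReadableStream without loading the whole
// value into memory at once.
const stored = await db.get("exports", "latest");
const stream = stored.stream();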
Perhaps there is still an opportunity to improve IndexedDB's API ergonomics when interacting with the Streams API, to reduce the amount of boilerplate code required by the example above. Also, as noted above, providing developers with explicit transaction lifetime control may reduce the number of transactions required when streaming data from IDB.
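To illustrate that last point, explicit lifetime control might look something like this. The autoClose option is invented for this sketch and does not exist in any spec or implementation; only tx.commit() is real today, and the awaited cursor again assumes a promise wrapper like `idb`.

// Hypothetical: opt out of automatic transaction closing (invented option).
const tx = db.transaction("my-store", "readonly", { autoClose: false });
const store = tx.objectStore("my-store");

// pull() could then resume one long-lived cursor across arbitrary awaits,
// instead of opening a fresh transaction per batch as in the example above.
let cursor = await store.openCursor();

// ...stream records...

// And end the transaction explicitly once the stream closes.
tx.commit();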