As found in the unit tests [1], to open a fragment we require:
- 4 IOPS per data file
- (optionally) 1 IOP for a deletion file
If we started storing the size of data files in the manifest, we could skip the first IOP to get the size of the file.
The remaining IOPS retrieve data from the end of the file. I haven't fully investigated, but I bet they all fall within the last block for many files, so we might be able to get them all in a single IOP.
With these two optimizations, that would bring our IOPS per fragment open down to 1 per data file plus (maybe) 1 for the deletion file.
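To make the IOP accounting concrete, here is a rough sketch of the idea, not Lance's actual reader. `CountingStore`, `read_tail`, and the 64 KiB `TAIL_BLOCK` are all hypothetical names and values made up for illustration. The sketch assumes that everything needed to open a data file sits within the last block, which is the unverified guess from above.

```rust
use std::cell::Cell;

/// Hypothetical in-memory stand-in for an object store that counts requests.
struct CountingStore {
    data: Vec<u8>,
    iops: Cell<usize>,
}

impl CountingStore {
    fn new(data: Vec<u8>) -> Self {
        Self { data, iops: Cell::new(0) }
    }

    /// One request to learn the object size (a HEAD-style call).
    fn size(&self) -> usize {
        self.iops.set(self.iops.get() + 1);
        self.data.len()
    }

    /// One request for the byte range `start..end`.
    fn get_range(&self, start: usize, end: usize) -> &[u8] {
        self.iops.set(self.iops.get() + 1);
        &self.data[start..end]
    }

    /// Number of requests issued so far.
    fn iops(&self) -> usize {
        self.iops.get()
    }
}

/// Assumed tail block size; an illustrative value, not one taken from Lance.
const TAIL_BLOCK: usize = 64 * 1024;

/// Fetches the tail of a data file with a single range request.
/// If the size is already known (for example, recorded in the manifest),
/// the size lookup is skipped entirely.
fn read_tail(store: &CountingStore, known_size: Option<usize>) -> Vec<u8> {
    let size = match known_size {
        Some(s) => s,
        None => store.size(), // costs one extra IOP
    };
    let start = size.saturating_sub(TAIL_BLOCK);
    store.get_range(start, size).to_vec()
}
```

Calling `read_tail(&store, Some(size))` issues one request, while `read_tail(&store, None)` issues two. Whether a single tail read really covers everything the open path needs still has to be checked against real files.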
Footnotes
1. https://github.com/lancedb/lance/blob/d6476fd4fe09737104540d5d05d25a2cde3f643b/rust/lance/src/dataset/fragment.rs#L2418-L2425