This repository has been archived by the owner on Apr 10, 2024. It is now read-only.
I'm thinking we can come up with a plan to yield a better .append implementation that defers stitching together arrays until it's actually needed for computations.
We can do this by having a virtual pandas::Table interface that consolidates fragmented columns only when they are requested. Will think some more about this.
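To make the idea concrete, here is a rough Python sketch of deferred consolidation for a single column. It is not the proposed pandas::Table interface: the class name FragmentedColumn and its methods are hypothetical, and NumPy arrays stand in for whatever the real column storage would be.

```python
import numpy as np


class FragmentedColumn:
    """Hypothetical column that keeps appended chunks separate until needed."""

    def __init__(self, values):
        self._chunks = [np.asarray(values)]

    def append(self, values):
        # Only records the new chunk; existing data is not copied here.
        self._chunks.append(np.asarray(values))

    def __len__(self):
        # The length is available without stitching the chunks together.
        return sum(len(chunk) for chunk in self._chunks)

    @property
    def is_fragmented(self):
        return len(self._chunks) > 1

    def consolidated(self):
        # Stitch the chunks into one contiguous array on first request,
        # then cache the result so later calls are free.
        if len(self._chunks) > 1:
            self._chunks = [np.concatenate(self._chunks)]
        return self._chunks[0]


col = FragmentedColumn([1, 2])
col.append([3, 4])
col.append([5])
total = col.consolidated().sum()  # the single concatenation happens here
```

With this shape, n appends cost one concatenation in total instead of n, as long as nothing asks for the contiguous data in between.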
The obvious alternative is to allow pandas objects to be backed by dynamic arrays. This is possible now that we require arrays to be 1D and contiguous.
This has the advantage of still using eager evaluation, so you don't need to build machinery for deferred evaluation. You also still get predictable performance, even if you inspect the array in between appends. I would guess that looking at DataFrames while they are being appended to piece by piece is pretty common, even if only to check the size.
The downside is that this wouldn't really work with the current interface, because such appends need to be in-place. Also, dynamic arrays reduce speed and increase memory requirements by small constant multiples.
Maybe it would make sense to deprecate DataFrame.append and instead make an alternative DynamicDataFrame (sub?)class that does an in-place append?
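For illustration, a minimal sketch of what a dynamic-array-backed column with in-place append might look like. GrowableColumn and its methods are hypothetical names, not an existing pandas API, and the doubling growth policy is just one common choice.

```python
import numpy as np


class GrowableColumn:
    """Hypothetical 1D column backed by an over-allocated contiguous buffer."""

    def __init__(self, dtype=np.float64, capacity=8):
        self._buf = np.empty(capacity, dtype=dtype)
        self._size = 0

    def append(self, values):
        # In-place append: mutates this column instead of returning a new one.
        values = np.atleast_1d(np.asarray(values, dtype=self._buf.dtype))
        needed = self._size + len(values)
        if needed > len(self._buf):
            # Grow geometrically so the copying cost is amortized over appends.
            new_capacity = max(needed, 2 * len(self._buf))
            new_buf = np.empty(new_capacity, dtype=self._buf.dtype)
            new_buf[:self._size] = self._buf[:self._size]
            self._buf = new_buf
        self._buf[self._size:needed] = values
        self._size = needed

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return len(self._buf)

    def view(self):
        # Contiguous view of the filled part; no copy, always up to date.
        return self._buf[:self._size]


col = GrowableColumn(capacity=2)
col.append([1.0, 2.0])
col.append([3.0])  # exceeds capacity, so the buffer is reallocated once
```

The unused tail of the buffer is the memory overhead mentioned above, and the occasional reallocation is the speed cost. In exchange, checking the length or reading the data between appends never triggers extra work.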