Define data model for representing points and trajectories #12
Comments
In a discussion with @lochhh, we agreed to first try using the SLEAP data model.
Just adding some more thoughts on this -- we've gone back and forth a lot on the appropriate data structure for pose data. SLEAP's object-oriented model is clean and Pythonic (it's basically a bunch of The downside is that it's not always the most efficient depending on the access pattern. When you're doing labeling, random access creation of a single point or instance is necessary since users label one animal at a time. But imagine repeated serialization/deserialization -- if you have a Python object for every point, you're going to be instantiating hundreds of thousands to millions of little objects! When you're doing complex queries, it's super inefficient. Consider the use case where you want to ask for all the frames in which there are N animals with body part pairs A and B within distance K of each other. This now requires a full iteration over all T frames (where often T >> 1e6), and every instance within the frame, resulting in an O(T * N) operation -- assuming the labels are stored sequentially and not hashed by something else (like in multi-video projects). I think the best of both worlds -- and what we'd eventually like to have in If you're going down the object-oriented model route, consider using a framework like attrs or plain dataclasses for readability and reducing boilerplate. See also these considerations with regard to performance and usability: [1] [2] [3] In any case, give it a go for your test cases, benchmark it, and feel free to reach out if you need any feedback or have any for us!
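To make the trade-off described above concrete, here is a minimal sketch in Python. It is not SLEAP's actual data model: the classes `Point`, `Instance` and `LabeledFrame` and the functions `frames_close_loop` and `frames_close_array` are hypothetical names chosen for illustration. It also assumes one particular reading of the query, namely "frames with at least `min_animals` instances whose own body parts A and B lie within distance K", and a dense `(T, N, P, 2)` array layout for the array variant.

```python
import math
from dataclasses import dataclass

import numpy as np


# Hypothetical object-oriented model: one small Python object per point.
@dataclass
class Point:
    x: float
    y: float


@dataclass
class Instance:
    points: dict[str, Point]  # body part name -> Point


@dataclass
class LabeledFrame:
    frame_idx: int
    instances: list[Instance]


def frames_close_loop(frames, part_a, part_b, max_dist, min_animals):
    """Object-based query: visits every instance of every frame in Python."""
    hits = []
    for frame in frames:  # T iterations
        count = 0
        for inst in frame.instances:  # up to N iterations per frame
            a, b = inst.points[part_a], inst.points[part_b]
            if math.hypot(a.x - b.x, a.y - b.y) <= max_dist:
                count += 1
        if count >= min_animals:
            hits.append(frame.frame_idx)
    return hits


def frames_close_array(tracks, idx_a, idx_b, max_dist, min_animals):
    """Array-based query on an assumed (T, N, P, 2) array of coordinates."""
    diff = tracks[:, :, idx_a] - tracks[:, :, idx_b]  # (T, N, 2)
    dist = np.linalg.norm(diff, axis=-1)  # (T, N)
    counts = (dist <= max_dist).sum(axis=1)  # matching animals per frame
    return np.flatnonzero(counts >= min_animals).tolist()


# Toy data: two frames, one animal each, two body parts.
frames = [
    LabeledFrame(0, [Instance({"nose": Point(0, 0), "tail": Point(3, 4)})]),
    LabeledFrame(1, [Instance({"nose": Point(0, 0), "tail": Point(30, 40)})]),
]
parts = ["nose", "tail"]
# Dense conversion; assumes every frame has the same number of instances.
tracks = np.array(
    [
        [[[inst.points[p].x, inst.points[p].y] for p in parts] for inst in f.instances]
        for f in frames
    ],
    dtype=float,
)

print(frames_close_loop(frames, "nose", "tail", 10.0, 1))
print(frames_close_array(tracks, 0, 1, 10.0, 1))
```

Both functions do the same amount of arithmetic; the array version simply moves the per-frame, per-instance loop into vectorized NumPy and avoids creating one Python object per point. In this toy example the array row index happens to coincide with `frame_idx`, which would not hold in general.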
Thank you for chiming in on this @talmo. The SLEAP data model appealed to us precisely because of the flexibility you mentioned (and a desire to not reinvent the wheel), but the performance considerations may indeed become a bottleneck. Not so much for our envisioned alpha product (import, smooth and plot tracks) but definitely for more complex kinematic analyses like the example you mentioned. I am keen to stay in touch and follow the developments over at For now, we will likely try adopting the Just out of curiosity, have you given dask much thought? We have benefited from Also thanks for the
After some research and internal discussions, we decided to try using
Multiple
I'll give it a try and see if we can discover some unknown "cons" before we fully commit to it as a backend.
Would definitely recommend If using it for training data, I'd advise checking out some of the discussions in rly/ndx-pose#9 for workflow-specific considerations. Basically, you may not want to over-optimize for timeseries since most annotation for pose is done in single images that are explicitly not consecutive in time.
Thanks for the input! Most of the things we want to do will operate on the prediction results only.
Define custom classes for representing points (animal body parts) and series of points (animal trajectories) in space.
These could be sub-classes of np.record and np.recarray respectively, to access fields (e.g. 'x', 'y', 'name', 'confidence') as attributes. This is the approach SLEAP takes. We could also directly use or subclass the SLEAP objects.
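As a rough illustration of the proposal above, the following sketch shows how a NumPy record array exposes fields as attributes. The dtype `POINT_DTYPE`, the class `PointArray` and its methods `from_rows` and `above` are hypothetical and are not taken from SLEAP; only the field names come from the description above. For brevity it subclasses np.recarray only, so single elements are plain np.record scalars.

```python
import numpy as np

# Assumed field layout; the field names follow the issue description.
POINT_DTYPE = np.dtype(
    [("x", "f8"), ("y", "f8"), ("name", "U16"), ("confidence", "f8")]
)


class PointArray(np.recarray):
    """Hypothetical series of points whose fields are readable as attributes."""

    @classmethod
    def from_rows(cls, rows):
        # Build a structured array, then view it as this recarray subclass.
        return np.array(rows, dtype=POINT_DTYPE).view(cls)

    def above(self, threshold):
        # Keep only points whose confidence reaches the threshold.
        return self[self.confidence >= threshold]


pts = PointArray.from_rows(
    [(1.0, 2.0, "nose", 0.9), (3.0, 4.0, "tail", 0.4)]
)

print(pts.x)              # all x coordinates as a plain ndarray
print(pts[0].confidence)  # a single element is an np.record
print(pts.above(0.5).name)
```

A subclass of np.record for single points could be added in the same spirit; whether the extra class is worth it probably depends on how much per-point behaviour is needed.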