pandas has (or had) an open bug where dtypes passed into the read_csv call are ignored: pandas#9435
If you have a buggy version of pandas, all fields are read as f4, which makes it impossible to use particle_id as an index correctly: you end up with collisions due to truncation.
Ex: 369295375619592193 vs 369295375602810880
This silently fails and will lead to a lot of confusion.
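A minimal sketch (not from the issue report) of why the two IDs above collide: float32 has only a 24-bit significand, far too small for these ~59-bit integer IDs, so both values round to the same float.

```python
# Sketch: two distinct particle_id values become equal after a float32 cast.
import numpy as np

a = np.int64(369295375619592193)
b = np.int64(369295375602810880)

print(a == b)                          # False: the IDs are distinct
print(np.float32(a) == np.float32(b))  # True: both truncate to the same float32
```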
AFAIK, it is still an open issue and my brief investigation showed no consensus.
I'm using Python 3.5.0, pandas 0.17.0, and numpy 1.9.3 on Win10.
I had to modify _load_event_data() so that it does not pass dtypes, in order to get full-precision particle_id values.
If you don't see this behavior, I can try to make a pull request for you.
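A minimal sketch of that workaround, assuming a trackml-style particles CSV (the helper name load_particles is hypothetical; only the particle_id column name comes from the issue):

```python
# Hypothetical sketch: skip the dtype argument to read_csv (it is ignored by
# affected pandas versions anyway) and cast afterwards, so particle_id keeps
# full 64-bit integer precision instead of being truncated to f4.
import pandas as pd

def load_particles(path):
    particles = pd.read_csv(path)  # let pandas infer dtypes
    # Explicit cast as a safeguard; int64 holds the full ID value.
    particles['particle_id'] = particles['particle_id'].astype('int64')
    return particles
```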
After further digging, this is actually related to pandas#11617: slicing the entire table by rows using at/loc/iterrows coerces the dtypes of the sliced rows. 0.23.1 still exhibits this behavior.
You can see this by:

```python
hits, cells, particles, truths = load_event(...)
particles.iloc[0].particle_id.dtype  # float32, not the expected int64
```
Why it downcasts to float32, I don't know, but changing the other dtypes to f8/i8 as needed causes everything to be cast to float64. That is a workaround, but float64 only provides 53 bits of precision for particle_id.
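A minimal sketch, assuming a trackml-style particles table (the two-column DataFrame is illustrative, not from the issue), of access patterns that sidestep the row-slice coercion: select the column first, so pandas never has to find a common dtype for a mixed-type row.

```python
# Sketch: row-first access coerces dtypes; column-first access preserves them.
import pandas as pd

particles = pd.DataFrame({
    'particle_id': pd.Series([369295375619592193, 369295375602810880], dtype='int64'),
    'px': pd.Series([0.1, 0.2], dtype='float32'),
})

# Row-first: the sliced row is a Series with a single common dtype,
# so particle_id is coerced to float and loses precision.
print(particles.iloc[0].particle_id)     # 3.692953756195922e+17

# Column-first: the int64 column dtype is preserved end to end.
print(particles['particle_id'].iloc[0])  # 369295375619592193, exact

# itertuples() also preserves per-column dtypes, unlike iterrows().
for row in particles.itertuples():
    print(row.particle_id)               # exact int64 values
```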