performance of median and iqr compared to python libraries #3462
Comments
This has nothing to do with DataFrames. You want a median algorithm that runs in O(n), which means a selection algorithm, for example one built on the median-of-medians concept. The implementation in Statistics.jl that you are using does not seem to be O(n), but I could be wrong.
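To illustrate the selection idea mentioned above, here is a minimal Python sketch of quickselect with a random pivot. It runs in expected O(n) time. It is not the deterministic median-of-medians algorithm, and it is not what Statistics.jl does. The names `quickselect` and `median` are illustrative only.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) in expected O(n) time."""
    items = list(values)  # work on a copy so the caller's data is untouched
    if not 0 <= k < len(items):
        raise IndexError("k is out of range")
    while True:
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        n_equal = sum(1 for x in items if x == pivot)
        if k < len(lows):
            # The answer lies among the elements smaller than the pivot.
            items = lows
        elif k < len(lows) + n_equal:
            # The k-th element is the pivot itself.
            return pivot
        else:
            # Discard the pivot and everything below it, then shift k.
            k -= len(lows) + n_equal
            items = [x for x in items if x > pivot]


def median(values):
    """Median via selection instead of a full sort."""
    n = len(values)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return float(quickselect(values, mid))
    # Even length: average the two middle order statistics.
    return (quickselect(values, mid - 1) + quickselect(values, mid)) / 2
```

A full sort costs O(n log n), while each selection pass here only keeps the side of the partition that contains the wanted rank. The worst case is still quadratic with unlucky pivots, which is the gap that median-of-medians pivot selection closes.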
@stensmo I was hoping for a function from DataFrames.jl that would be comparable in performance to
In Julia DataFrames, you can apply (almost) any function, including your own. The median function does not belong to DataFrames; it is a standard function in Julia. Writing your own functions and applying them to a DataFrame is super easy in Julia. That is why you will love it, but it takes some time to get used to. You can apply the standard Julia median function to a DataFrame, use a superfast implementation that someone else wrote, or write one yourself.
@nalimilan - this issue should be migrated to Statistics.jl but I do not have privileges to do so. Could you please do it? Thank you!
I can't either. Apparently that's only possible between repos of the same org. Anyway, it seems this is already a known problem, and you had even made a PR for it? JuliaStats/Statistics.jl#91 EDIT: specifically, computing the IQR seems very similar to JuliaStats/Statistics.jl#84
I'd like to efficiently and in parallel compute the median = q_0.5 and IQR = q_0.75 - q_0.25 of each column in a dataframe. Let's compare the 3 most used libraries:
pandas:
polars:
DataFrames.jl + Julia:
Is there a better, more efficient way to compute medians and IQRs in Julia?
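For reference, the two quantities defined above can be pinned down with a small standard-library Python sketch. It is meant only to fix the definitions (median = q_0.5, IQR = q_0.75 - q_0.25 per column), not to be fast or parallel. The helper name `median_and_iqr`, the sample columns, and the choice of the `"inclusive"` interpolation method are assumptions for illustration; libraries can differ in their interpolation rule.

```python
from statistics import quantiles


def median_and_iqr(column):
    """Return (median, IQR) of one column using quartile cut points."""
    # n=4 yields the three quartiles q_0.25, q_0.5, q_0.75.
    q25, q50, q75 = quantiles(column, n=4, method="inclusive")
    return q50, q75 - q25


# Hypothetical column data standing in for a dataframe.
columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [10.0, 20.0, 40.0, 80.0],
}

# One (median, IQR) pair per column.
summary = {name: median_and_iqr(vals) for name, vals in columns.items()}
```

Each column is independent, so the per-column calls are the natural unit to distribute across threads or processes in whichever library is used.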