Move more logic into Axis type? #15
Comments
Another way we could simplify things by making Axis objects more capable would be to make them an |
I still like the idea of making the |
Yes, checking upon construction seems like a great idea. Supporting unsorted axes seems like a nightmare. But that raises a question: do we allow any monotonic sequence, or only increasing? I'm fine with purely increasing. |
I had initially restricted this to purely increasing, but we decided relatively early on that monotonicity was sufficient. Part of what enables this is that we error upon scalar indexing by a duplicated axis value. |
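To make the point about duplicates concrete, here is a minimal sketch of scalar lookup on a monotonic (non-decreasing) axis that errors when the requested value is repeated. It is written in Python for illustration only and is not the package's implementation; `scalar_lookup` and the sample `times` data are hypothetical names.

```python
from bisect import bisect_left, bisect_right


def scalar_lookup(axis_values, value):
    """Return the single position of `value` in a non-decreasing sequence.

    Hypothetical helper for illustration; not part of any library.
    """
    lo = bisect_left(axis_values, value)   # first position whose element is >= value
    hi = bisect_right(axis_values, value)  # first position whose element is > value
    if lo == hi:
        raise KeyError(f"{value!r} is not an axis value")
    if hi - lo > 1:
        # The value is repeated, so a scalar answer would be ambiguous.
        raise ValueError(f"{value!r} appears {hi - lo} times on this axis")
    return lo


times = [0.0, 0.5, 0.5, 1.0]  # monotonic, with one repeated value
print(scalar_lookup(times, 1.0))  # 3
```

Both searches are binary searches, so the duplicate check costs only one extra O(log n) lookup per scalar index.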
Thanks for the clarification, I hadn't thought of the repeated value issue. Can you point me to some background, especially anything that might clarify what "use" repeated indices have? I should add a clarification of my own: my real question was regarding monotonically decreasing sequences. I don't see the need to support them personally, but the basic concept of fast searching applies to any monotonic sequence, so I thought I'd ask. |
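As a rough illustration of why fast searching carries over to decreasing sequences, the sketch below runs a binary search with the comparison flipped. It is Python written for illustration only; `search_decreasing` and `depths` are hypothetical names, and nothing here reflects how the package searches its axes.

```python
def search_decreasing(values, target):
    """Return the first index i with values[i] <= target.

    Assumes `values` is sorted in non-increasing order.
    Hypothetical helper for illustration only.
    """
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] > target:
            lo = mid + 1  # everything up to mid is still too large
        else:
            hi = mid      # mid could be the first element <= target
    return lo


depths = [40, 30, 30, 10]  # monotonically decreasing, with a repeat
print(search_decreasing(depths, 30))  # 1
```

The only change from the increasing case is the direction of the comparison, which is why any monotonic sequence admits the same O(log n) lookup.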
I'm not sure how much of my thinking here made it onto public venues… or still remains in my head for that matter (see my edit above). Here's a bit of a brain dump… which should probably be split across multiple issues. This comes down to how I've categorized axis types into Dimensional or Categorical. It determines two things: which indexing-by-axis-value methods are supported and how lookup is performed.
So: I think a decreasing sequence could work just as well as an increasing one (assuming it implements the search interface properly), but I've not given that case much thought. At a minimum, it'll most likely break my The restriction on Scalar indexing in Dimensional It's worth noting that pandas' indexing scheme currently allows this in some regards: if an "axis" is float or integer, then by-axis-value lookup takes priority over location-based indexing in their default A better example is xarray, where they have dramatically simplified this scheme: As far as repeated axis values go, this was implemented in #16 for a use-case in Sims.jl. It's not hard to check for multiple values upon scalar lookup and error. And to be honest, I was surprised at how little impact this had on the code. Pandas allows duplicates but is "type-unstable" here, returning a scalar, or a vector if there are dupes. Interestingly, xarray also allows dupes and always returns xarrays from indexing… even scalar indexing by integer locations. |
Now that the axis information is stored as proper Axis types, one possible way to both allow an unchecked constructor and fix #14 would be to move the sort check into the Axis constructor. We could even go further and make Axis an abstract supertype of all these traits, with the Axis constructor returning the appropriate type. This could include, for example, a SortedDimensionalAxis type.
This would, of course, make the abstract Axis constructor type-unstable. It also moves more of the axis trait complexity into the foreground.
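The proposal can be sketched roughly as follows. This is Python written for illustration, not the package's Julia code: `SortedDimensionalAxis` is the name suggested above, while `CategoricalAxis`, `make_axis`, and the rule "all-numeric values mean a dimensional axis" are assumptions made only for this example.

```python
from dataclasses import dataclass
from numbers import Real


@dataclass(frozen=True)
class SortedDimensionalAxis:
    name: str
    values: tuple


@dataclass(frozen=True)
class CategoricalAxis:  # hypothetical counterpart, for illustration
    name: str
    values: tuple


def is_monotonic_increasing(values):
    """True if each element is <= its successor (duplicates allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def make_axis(name, values):
    """Stand-in for the abstract Axis constructor (hypothetical).

    The sort check lives here, so the concrete classes themselves stay
    unchecked. The return type depends on the runtime values, which is
    the type instability mentioned above.
    """
    values = tuple(values)
    # Assumed classification rule: all-numeric values are dimensional.
    if values and all(isinstance(v, Real) for v in values):
        if not is_monotonic_increasing(values):
            raise ValueError(f"axis {name!r} must be monotonically increasing")
        return SortedDimensionalAxis(name, values)
    return CategoricalAxis(name, values)


print(type(make_axis("time", [0.0, 0.5, 1.0])).__name__)  # SortedDimensionalAxis
print(type(make_axis("channel", ["a", "b"])).__name__)    # CategoricalAxis
```

Constructing `SortedDimensionalAxis` directly skips the check, which corresponds to the unchecked constructor; only the abstract entry point pays for validation and has a value-dependent return type.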