Add concat operation #6

Closed
tomwhite opened this issue May 6, 2022 · 5 comments

Comments

@tomwhite
Member

tomwhite commented May 6, 2022

Needed to implement https://data-apis.org/array-api/latest/API_specification/generated/signatures.manipulation_functions.concat.html.

This could be implemented as a Zarr view.
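For reference, the behaviour the spec asks for can be illustrated with NumPy as a stand-in. This is only a sketch of the semantics, not cubed code: it uses `np.concatenate` (the array API spells the function `concat`), and the array values are made up for illustration.

```python
import numpy as np

# Two small inputs with the same shape (illustrative values only).
a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

# Join along an existing axis: all other dimensions must match.
rows = np.concatenate([a, b], axis=0)   # stack the rows of a and b
cols = np.concatenate([a, b], axis=1)   # place the columns side by side

# axis=None flattens every input before joining.
flat = np.concatenate([a, b], axis=None)
```

The output never gains a new dimension; only the size along `axis` changes (or the result is 1-D when `axis=None`).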

@tomwhite
Member Author

Rather than implementing it as a view, it could be written as a map_blocks operation, like here. Requires #24.
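A rough sketch of the block-wise idea, in plain NumPy rather than cubed: each output block is built by reading only the parts of the inputs that overlap it. The helper name `concat_block` and the chunk size are hypothetical and chosen for illustration; this is not the code that was eventually committed.

```python
import numpy as np

def concat_block(arrays, axis, start, stop):
    """Return output[start:stop] along `axis` of the concatenation of `arrays`,
    reading only the slices of each input that overlap that range."""
    pieces = []
    offset = 0  # where the current input begins along the output axis
    for arr in arrays:
        n = arr.shape[axis]
        lo = max(start, offset)
        hi = min(stop, offset + n)
        if lo < hi:  # this input overlaps the requested output range
            index = [slice(None)] * arr.ndim
            index[axis] = slice(lo - offset, hi - offset)
            pieces.append(arr[tuple(index)])
        offset += n
    return np.concatenate(pieces, axis=axis)

# Illustrative inputs: 5 rows and 3 rows, joined along axis 0.
a = np.arange(10).reshape(5, 2)
b = np.arange(10, 16).reshape(3, 2)

chunk = 3  # hypothetical output chunk size along axis 0
total = a.shape[0] + b.shape[0]
blocks = [
    concat_block([a, b], 0, s, min(s + chunk, total))
    for s in range(0, total, chunk)
]
result = np.concatenate(blocks, axis=0)
```

Here the second output block straddles the boundary between `a` and `b`, which is the case a per-block function has to handle.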

@tomwhite changed the title from "Add concat primitive" to "Add concat operation" on Jun 16, 2022
@tomwhite added the core label on Jun 16, 2022
@tomwhite
Member Author

Use map_direct introduced in 8e6aaf5

@tomwhite
Member Author

Fixed in 3d35cca

@TomNicholas
Member

Hey, we (@rabernat and I) have been following this project, and we're really excited by the possibilities here 😁

I would really like to try cubed out on some common pangeo / xarray-type workloads soon, and we will hopefully be trying that in a SciPy hack at the end of this week.

I'm assuming / hoping that this concat PR going in means that all the basic array operations we need to test simple array workloads are now available? I'm also thinking of what operations would need to be present before I could try wrapping a cubed array with xarray...

@tomwhite
Member Author

Hi @TomNicholas - thanks for reaching out!

That sounds like a great idea for a SciPy hack. Let me know if/how I can help.

Most simple array operations are now available; there is a table here: https://github.com/tomwhite/cubed/blob/main/api_status.md. Basic indexing works, but note that you can't currently mix integers and slices across different dimensions (so x[1:,:] works, but x[1,:] doesn't, for example).
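One possible workaround for the integer-plus-slice case, sketched with NumPy as a stand-in: take a length-one slice and then drop that axis. Whether `squeeze` is available in cubed at this point is an assumption, so check the status table first.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Slice-only indexing keeps the axis with length 1 ...
kept = x[1:2, :]

# ... and squeeze then removes it, matching what x[1, :] would return.
row = np.squeeze(kept, axis=0)
```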

I started looking at xarray integration with the array API (see tomwhite/xarray@c72a1c4, and tomwhite/xarray@929812a), which you may find useful as a starting point. Xarray has quite an extensive API itself, so there will likely be some work in wiring it up to use the array API. (Ref: pydata/xarray#3232 (comment))

In terms of runtime and scaling, once you have something working locally (on a tiny dataset, to check the API is providing what you need), then I would suggest trying Lithops AWS (https://github.com/tomwhite/cubed/tree/main/examples#lithops-aws-lambda-s3), since it is the runtime I've spent the most time with. There are other options too - happy to discuss in more detail.
