Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fork arrow2 and get rid of polars #4789

Closed
emilk opened this issue Jan 11, 2024 · 6 comments · Fixed by #4883
Closed

Fork arrow2 and get rid of polars #4789

emilk opened this issue Jan 11, 2024 · 6 comments · Fixed by #4883
Assignees
Labels
🏹 arrow concerning arrow blocked can't make progress right now dependencies concerning crates, pip packages etc enhancement New feature or request 📉 performance Optimization, memory use, etc
Milestone

Comments

@emilk
Copy link
Member

emilk commented Jan 11, 2024

While we want to migrate from the arrow2 crate to arrow (#3741), it is a big task that we would rather punt on right now. It is technical debt, but the debt is not going to grow significantly. The gains don't justify the potential rabbit hole of paint it could turn into.

One of the major reasons to migrate away from arrow2 is because DataType has a huge overhead, especially when cloned.

We have a PR to fix it (jorgecarleitao/arrow2#1469) but it is unmerged, because arrow2 in unmaintained.

So: we fork arrow2 as re_arrow2, merge our PR, and solve our immediate memory issue.

Since polars require arrow2, we need to stop using it. We only have it for a few tests.


We should revisit the migrating away from arrow2 when we start exposing arrow things to the users (e.g. support data queries in the SDK) and/or when we want to interface with a some data frame crate.

@emilk emilk added enhancement New feature or request 🏹 arrow concerning arrow 📉 performance Optimization, memory use, etc dependencies concerning crates, pip packages etc labels Jan 11, 2024
@emilk emilk added this to the 0.13 milestone Jan 11, 2024
@emilk emilk self-assigned this Jan 12, 2024
@emilk
Copy link
Member Author

emilk commented Jan 12, 2024

We should replace our polars use with our own DataTable

@kylebarron
Copy link

fwiw, Polars also forked arrow2 into polars-arrow, and I don't think it has a dependency on arrow2 anymore

@emilk
Copy link
Member Author

emilk commented Jan 12, 2024

We want to get rid of polars anyhow because of its huge dependency tree

@emilk emilk removed their assignment Jan 15, 2024
@emilk emilk added the blocked can't make progress right now label Jan 15, 2024
@emilk
Copy link
Member Author

emilk commented Jan 15, 2024

Removing polars blocked on landing caching to avoid conflicts

emilk added a commit that referenced this issue Jan 15, 2024
### What
* Part of #4789

We're using polars for some tests.
This PR replaces polars with our own data tables for the tests.


https://github.com/rerun-io/rerun/blob/main/crates/re_data_store/tests/data_store.rs
is left, and a pain in the ass.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using newly built examples:
[app.rerun.io](https://app.rerun.io/pr/4801/index.html)
* Using examples from latest `main` build:
[app.rerun.io](https://app.rerun.io/pr/4801/index.html?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[app.rerun.io](https://app.rerun.io/pr/4801/index.html?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG

- [PR Build Summary](https://build.rerun.io/pr/4801)
- [Docs
preview](https://rerun.io/preview/5f0bfe1996e12b974d6ed306977628e8e9572386/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/5f0bfe1996e12b974d6ed306977628e8e9572386/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

---------

Co-authored-by: Clement Rey <cr.rey.clement@gmail.com>
@emilk emilk self-assigned this Jan 15, 2024
@emilk
Copy link
Member Author

emilk commented Jan 15, 2024

Forking of arrow2 happening in rerun-io/re_arrow2#2

I suggest we use it as arrow2 = { package = "re_arrow2", version = "0.18" }. That way we can keep using arrow2 in the code for a smaller diff.

@emilk
Copy link
Member Author

emilk commented Jan 15, 2024

re_arrow2 is not live at https://github.com/rerun-io/re_arrow2 with

  • New name
  • Arcification of DataType
  • Updated deps

Once polars is gone we can publish re_arrow2 to crates.io and start using it

@emilk emilk removed their assignment Jan 15, 2024
@teh-cmc teh-cmc self-assigned this Jan 19, 2024
teh-cmc added a commit that referenced this issue Jan 23, 2024
All the grunt work left to get rid of polars.

- Remove all helpers and APIs built specifically for polars'
`DataFrame`.
- Refactor tests that rely on dataframe joins to not require join
semantics in the first place (`re_data_store` has no knowledge of those
anyway).
- The one test that does require join semantics has moved over to
`re_query`, where join semantics belong.
- All `polars-*` dep have been removed.

Don't look at the commit log as it makes no sense: i changed strategies
a bunch of times on the way.

---

- Part of #4789
- DNR: requires #4856

---

Part of the tiny datatype deduplication PR series:
- #4880
- #4883
teh-cmc added a commit that referenced this issue Jan 24, 2024
Grunt work to switch to `re_arrow2` and all the breaking changes that
come with it.

- Fixes #4789

---

Part of the tiny datatype deduplication PR series:
- #4880
- #4883
emilk added a commit that referenced this issue Jul 5, 2024
## What
* Closes #5315

## egui changelog

### ⭐ Added
* Add `Image::uri()` [#4720](emilk/egui#4720) by
[@rustbasic](https://github.com/rustbasic)

### 🔧 Changed
* Better documentation for `Event::Zoom`
[#4778](emilk/egui#4778) by
[@emilk](https://github.com/emilk)
* Hide tooltips when scrolling
[#4784](emilk/egui#4784) by
[@emilk](https://github.com/emilk)
* Smoother animations [#4787](emilk/egui#4787)
by [@emilk](https://github.com/emilk)
* Hide tooltip on click [#4789](emilk/egui#4789)
by [@emilk](https://github.com/emilk)

### 🐛 Fixed
* Fix default height of top/bottom panels
[#4779](emilk/egui#4779) by
[@emilk](https://github.com/emilk)
* Show the innermost debug rectangle when pressing all modifier keys
[#4782](emilk/egui#4782) by
[@emilk](https://github.com/emilk)
* Fix occasional flickering of pointer-tooltips
[#4788](emilk/egui#4788) by
[@emilk](https://github.com/emilk)

## eframe changelog
* Web: only capture clicks/touches when actually over canvas
[#4775](emilk/egui#4775) by
[@lucasmerlin](https://github.com/lucasmerlin)

## Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6785?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6785?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!

- [PR Build Summary](https://build.rerun.io/pr/6785)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
emilk added a commit that referenced this issue Jul 5, 2024
## What
* Closes #5315

## egui changelog

### ⭐ Added
* Add `Image::uri()` [#4720](emilk/egui#4720) by
[@rustbasic](https://github.com/rustbasic)

### 🔧 Changed
* Better documentation for `Event::Zoom`
[#4778](emilk/egui#4778) by
[@emilk](https://github.com/emilk)
* Hide tooltips when scrolling
[#4784](emilk/egui#4784) by
[@emilk](https://github.com/emilk)
* Smoother animations [#4787](emilk/egui#4787)
by [@emilk](https://github.com/emilk)
* Hide tooltip on click [#4789](emilk/egui#4789)
by [@emilk](https://github.com/emilk)

### 🐛 Fixed
* Fix default height of top/bottom panels
[#4779](emilk/egui#4779) by
[@emilk](https://github.com/emilk)
* Show the innermost debug rectangle when pressing all modifier keys
[#4782](emilk/egui#4782) by
[@emilk](https://github.com/emilk)
* Fix occasional flickering of pointer-tooltips
[#4788](emilk/egui#4788) by
[@emilk](https://github.com/emilk)

## eframe changelog
* Web: only capture clicks/touches when actually over canvas
[#4775](emilk/egui#4775) by
[@lucasmerlin](https://github.com/lucasmerlin)

## Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6785?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6785?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!

- [PR Build Summary](https://build.rerun.io/pr/6785)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏹 arrow concerning arrow blocked can't make progress right now dependencies concerning crates, pip packages etc enhancement New feature or request 📉 performance Optimization, memory use, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants