Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK DataLoaders 3: barebones C and C++ support #5330

Merged
merged 9 commits into from
Feb 29, 2024

Conversation

teh-cmc
Copy link
Member

@teh-cmc teh-cmc commented Feb 28, 2024

Exposes raw DataLoaders to the C++ SDK through 2 new methods: RecordingStream::log_file_from_path & RecordingStream::log_file_from_contents.

Everything streams asynchronously, as expected.

There's no way to configure the behavior of the DataLoader at all, yet. That comes next.

⚠️ I'm carrying bytes around in a string_view (see rr_bytes) and related, and the internet has been unhelpful to tell e whether I should or shouldn't.

rec.log_file_from_path(filepath);

image


Part of series of PR to expose configurable DataLoaders to our SDKs:

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!

@teh-cmc teh-cmc added do-not-merge Do not merge this PR 🌊 C++ API C/C++ API specific include in changelog 🪵 Log & send APIs Affects the user-facing API for all languages labels Feb 28, 2024
Copy link

Size changes

Name cmc/sdk_dataloader_1_barebones_rust 5330/merge Change
re_sdk --no-default-features 117 crates 187 crates 70 crates

@teh-cmc teh-cmc force-pushed the cmc/sdk_dataloader_1_barebones_rust branch from 6439524 to 48498ce Compare February 28, 2024 16:21
@teh-cmc teh-cmc force-pushed the cmc/cmc/sdk_dataloader_3_barebones_cpp branch from b997a7f to 66144c4 Compare February 28, 2024 17:01
@teh-cmc teh-cmc marked this pull request as ready for review February 28, 2024 17:15
Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't looked through the entire pr yet but to pre-empt the question: using std::string_view to pass around non-string bytes is super confusing and strange, let's not do that. I wish the obvious next thing to say was "just use array view" but that's std::span and it's C++20. So std::byte* data, size_t data_size is the way to go I fear..

@teh-cmc teh-cmc force-pushed the cmc/sdk_dataloader_1_barebones_rust branch from bcd19c1 to bcf4aa7 Compare February 29, 2024 08:40
@teh-cmc teh-cmc force-pushed the cmc/cmc/sdk_dataloader_3_barebones_cpp branch from 4e2bd3d to c5ecc9c Compare February 29, 2024 08:41
@teh-cmc teh-cmc force-pushed the cmc/sdk_dataloader_1_barebones_rust branch from bcf4aa7 to 9e28fc9 Compare February 29, 2024 09:00
@teh-cmc teh-cmc force-pushed the cmc/cmc/sdk_dataloader_3_barebones_cpp branch from c5ecc9c to cc8f15d Compare February 29, 2024 09:00
Base automatically changed from cmc/sdk_dataloader_1_barebones_rust to main February 29, 2024 12:02
teh-cmc added a commit that referenced this pull request Feb 29, 2024
Exposes raw `DataLoader`s to the Rust SDK through 2 new methods:
`RecordingStream::log_file_from_path` &
`RecordingStream::log_file_from_contents`.

Everything streams asynchronously, as expected.

There's no way to configure the behavior of the `DataLoader` at all,
yet. That comes next.

```rust
rec.log_file_from_path(filepath)?;
```

![image](https://github.com/rerun-io/rerun/assets/2910679/fac1514a-a00b-40c3-8950-df076080e1fc)

This highlighted some brittleness on the shutdown path, see:
- #5335 

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
@teh-cmc teh-cmc force-pushed the cmc/cmc/sdk_dataloader_3_barebones_cpp branch from cc8f15d to fc4d655 Compare February 29, 2024 12:04
@teh-cmc teh-cmc removed the do-not-merge Do not merge this PR label Feb 29, 2024
Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feeling fairly strongly about the api for raw bytes not using a string_view, otherwise (minus typos) lgtm

crates/rerun_c/src/ptr.rs Show resolved Hide resolved
crates/rerun_c/src/rerun.h Outdated Show resolved Hide resolved
crates/rerun_c/src/rerun.h Outdated Show resolved Hide resolved
rerun_cpp/src/rerun/recording_stream.cpp Outdated Show resolved Hide resolved
rerun_cpp/src/rerun/recording_stream.cpp Outdated Show resolved Hide resolved
rerun_cpp/src/rerun/recording_stream.cpp Outdated Show resolved Hide resolved
teh-cmc added a commit that referenced this pull request Feb 29, 2024
Exposes raw `DataLoader`s to the Python SDK through 2 new methods:
`RecordingStream.log_file_from_path` &
`RecordingStream.log_file_from_contents`.

Everything streams asynchronously, as expected.

There's no way to configure the behavior of the `DataLoader` at all,
yet. That comes next.

```python
rr.log_file_from_path(filepath)
```

![image](https://github.com/rerun-io/rerun/assets/2910679/0fe2d39c-f069-44a6-b836-e31001b3adaa)


---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
teh-cmc and others added 2 commits February 29, 2024 15:40
Co-authored-by: Andreas Reich <andreas@rerun.io>
Co-authored-by: Andreas Reich <andreas@rerun.io>
@teh-cmc teh-cmc requested a review from Wumpf February 29, 2024 15:06
@teh-cmc teh-cmc merged commit 990a51b into main Feb 29, 2024
14 of 20 checks passed
@teh-cmc teh-cmc deleted the cmc/cmc/sdk_dataloader_3_barebones_cpp branch February 29, 2024 15:36
teh-cmc added a commit that referenced this pull request Feb 29, 2024
Introduces `RecordingStream::clone_weak`, so `DataLoader` threads
started from the SDK don't prevent the recording from flushing once
`main` goes out of scope, all the while making sure that `DataLoader`s
run to completion.


![image](https://github.com/rerun-io/rerun/assets/2910679/94bf7b16-cf9b-40ce-a190-d255328af3f8)


- Mitigates #5335 

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
teh-cmc added a commit that referenced this pull request Feb 29, 2024
Adds new `RecommendedLoadSettings` that gets passed to all `DataLoader`s
-- builtin and external -- in order to customize their behaviors.

This includes:
- A recommended recording ID
- The ID of the currently opened recording in the viewer (not
implemented)
  - Related: #5350
- A recommended entity path prefix
- A recommended timepoint

```bash
cargo r -p rerun-loader-rust-file -- run_wasm/src/main.rs  --recording-id this-one --entity-path-prefix a/b/c  --time sim_time=1000 --time wall_time=1709204046 --sequence sim_frame=42 | rerun -
```

![image](https://github.com/rerun-io/rerun/assets/2910679/631d9798-c198-4e86-b6f8-32d0b849e7b2)


Checks:
- [x] external loader ran manually (`loader | rerun`)
- [x] external loader via rerun (`rerun xxx.rs`)
- [x] log_file with external loader (`log_file xxx.rs`)

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
teh-cmc added a commit that referenced this pull request Mar 1, 2024
Introduces the new `DataLoaderSettings` business to Python and update
examples accordingly (`external_data_loader` & `log_file`).

```bash
python examples/python/external_data_loader/main.py --recording-id this-one --entity-path-prefix a/b/c  --time sim_time=1000 --time wall_time=1709204046 --sequence sim_frame=42 examples/python/dna/main.py | rerun -
```

![image](https://github.com/rerun-io/rerun/assets/2910679/bfda567d-3d16-42cd-be8e-8b1a0767a784)



Checks:
- [x] external loader ran manually (`python loader | rerun`)
- [x] external loader via rerun (`rerun xxx.py`)
- [x] log_file with external loader (`log_file xxx.py`)

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
- #5361
teh-cmc added a commit that referenced this pull request Mar 4, 2024
This makes the `log_file` APIs behave more like the standard `log` APIs;
i.e. they inherit the state of their associated `RecordingStream`
(app_id, rec_id, timepoint, etc...).

Also inherit the application ID while we're at it.

Makes the API much nicer to use _and_ much more consistent with the
rest.

Checks:
- [x] external loader ran manually (`python loader | rerun`)
- [x] external loader via rerun (`rerun xxx.py`)
- [x] log_file with external loader (`log_file xxx.py`)
- [x] external loader ran manually (`loader | rerun`)
- [x] external loader via rerun (`rerun xxx.rs`)
- [x] log_file with external loader (`log_file xxx.rs`)

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
- #5379
- #5361
- #5388

---------

Co-authored-by: Andreas Reich <andreas@rerun.io>
teh-cmc added a commit that referenced this pull request Mar 4, 2024
Introduces the new `DataLoaderSettings` business to C++ and update
examples accordingly (`external_data_loader` & `log_file`).

```bash
./build/debug/examples/cpp/log_file/example_log_file --recording-id this-one --entity-path-prefix a/b/c  --time sim_time=1000 --time wall_time=1709204046 --sequence sim_frame=42 rerun_cpp/tests/main.cpp | rerun -
```

![image](https://github.com/rerun-io/rerun/assets/2910679/b979a24c-29b6-473b-91b1-de3832bea436)


Checks:
- [x] external loader ran manually (`loader.exe | rerun`)
- [x] external loader via rerun (`rerun xxx.cpp`)
- [x] log_file with external loader (`log_file xxx.cpp`)

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
- #5379
- #5361
- #5388
teh-cmc added a commit that referenced this pull request Mar 6, 2024
I guess that's good enough 🤷. I don't know, my brain has been
completely friend by C++ non-sense all day.

This includes a fix to make sure that a viewer that was spawned from the
python SDK is still allowed to spawn dataloaders implemented in python
(`RERUN_APP_ONLY` shenaniganeries).

- Fixes #4526 

---

Part of series of PR to expose configurable `DataLoader`s to our SDKs:
- #5327 
- #5328 
- #5330
- #5337
- #5351
- #5355
- #5379
- #5361
- #5388
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🌊 C++ API C/C++ API specific include in changelog 🪵 Log & send APIs Affects the user-facing API for all languages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants