Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arrow2-convert migration 1: TUID, RowId, TableId #3853

Merged
merged 13 commits into from
Oct 16, 2023

Conversation

teh-cmc
Copy link
Member

@teh-cmc teh-cmc commented Oct 13, 2023

This PR marks start of the migration away from arrow2-convert.

We start with our control columns, and more specifically our TUID-based control columns: TUID, RowId, TableId.

This introduces a new convention: the rerun.controls. prefix (e.g. rerun.controls.RowId).
This likely means that this PR completely breaks old rrd files (I haven't tested, but that would make sense).

These control components now go through the same deserialization stack as everything else (re_types::Loggable).
A nice side-effect is that this is generally 2x to 3x faster than going through arrow2-convert:

arrow/serialize/arrow2_convert
                        time:   [428.60 ns 428.89 ns 429.29 ns]
                        thrpt:  [2.3294 Melem/s 2.3316 Melem/s 2.3332 Melem/s]
arrow/serialize/arrow2  
                        time:   [176.92 ns 177.03 ns 177.18 ns]
                        thrpt:  [5.6440 Melem/s 5.6486 Melem/s 5.6524 Melem/s]

arrow/deserialize/arrow2_convert
                        time:   [119.56 ns 119.75 ns 120.08 ns]
                        thrpt:  [8.3281 Melem/s 8.3506 Melem/s 8.3641 Melem/s]
arrow/deserialize/arrow2
                        time:   [70.062 ns 70.118 ns 70.200 ns]
                        thrpt:  [14.245 Melem/s 14.262 Melem/s 14.273 Melem/s]

arrow2-convert migration PR series:


Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested demo.rerun.io (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

@teh-cmc teh-cmc added 🏹 arrow concerning arrow 🚀 performance Optimization, memory use, etc 🚜 refactor Change the code, not the functionality exclude from changelog PRs with this won't show up in CHANGELOG.md labels Oct 13, 2023
arrow/serialize/arrow2_convert
                        time:   [428.60 ns 428.89 ns 429.29 ns]
                        thrpt:  [2.3294 Melem/s 2.3316 Melem/s 2.3332 Melem/s]
arrow/serialize/arrow2  time:   [176.92 ns 177.03 ns 177.18 ns]
                        thrpt:  [5.6440 Melem/s 5.6486 Melem/s 5.6524 Melem/s]

arrow/deserialize/arrow2_convert
                        time:   [119.56 ns 119.75 ns 120.08 ns]
                        thrpt:  [8.3281 Melem/s 8.3506 Melem/s 8.3641 Melem/s]
arrow/deserialize/arrow2
                        time:   [70.062 ns 70.118 ns 70.200 ns]
                        thrpt:  [14.245 Melem/s 14.262 Melem/s 14.273 Melem/s]
@teh-cmc teh-cmc force-pushed the cmc/arrow2convert_be_gone branch from 6e7d628 to f5f01d6 Compare October 13, 2023 12:25
@teh-cmc teh-cmc marked this pull request as ready for review October 13, 2023 12:32
crates/re_format/src/arrow.rs Outdated Show resolved Hide resolved
## Enable converting Tuid to arrow2
arrow2_convert = ["dep:arrow2", "dep:arrow2_convert"]
## Enable (de)serialization using Arrow.
arrow = ["dep:re_types", "dep:arrow2", "dep:thiserror"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having re_tuid depend on re_types feels very backwards to me. re_tuid is a very low-level building block. This would be like adding a dependency on re_types to the uuid crate. I think it makes much more sense to impl re_types::Loggable for Tuid in re_types instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess if we start to add these kinds of external integrations into re_types, we might as well just split the core traits into their own crate already and avoid the intermediate headache.

I'll do the split in a follow up.

@teh-cmc teh-cmc merged commit a96b190 into main Oct 16, 2023
30 checks passed
@teh-cmc teh-cmc deleted the cmc/arrow2convert_be_gone branch October 16, 2023 13:02
teh-cmc added a commit that referenced this pull request Oct 18, 2023
… out dependencies (#3897)

This introduces the `rust.attr.override_crate` attribute, which is
necessary for the serde-codegen story.

This attribute also allows us to generate some fundamental types such as
`InstanceKey` directly into `re_types_core` rather than `re_types`,
making it possible to reduce the number of crates that depend on the
giant `re_types`.

- The `AutoSpaceViews` & `PanelView` components are now back into their
respective home (`re_viewport` & `re_viewer`, respectively).
They were temporarily moved it because we had no support for
`custom_crate`.
They will be joined by their more complicated siblings in the next PR,
which implements the necessary serde-codegen support.

- `InstanceKey`, `Clear` and `ClearIsRecursive` are now generated into
`re_types_core`.
This allows `re_query`, `re_arrow_store` and `re_data_store` to drop
their dependency on `re_types`.

- Generated code now picks up arrow from `re_types_core::external`
instead of importing it directly.
This means crates hosting generated code are not forced into import
`arrow2` directly.

---


`arrow2-convert` migration PR series:
- #3853 
- #3855
- #3897
- #3902
teh-cmc added a commit that referenced this pull request Oct 19, 2023
…3902)

This introduces the `rust.attr.serde_type` attribute, allowing you to
use any `serde`-compatible Rust types in our IDL.
```
table ViewportLayout (
  "attr.rust.derive": "PartialEq",
  "attr.rust.override_crate": "re_viewport"
) {
  space_view_keys: [ubyte] (order: 100, "attr.rust.serde_type": "std::collections::BTreeSet<re_viewer_context::SpaceViewId>");

  tree: [ubyte] (order: 101, "attr.rust.serde_type": "egui_tiles::Tree<re_viewer_context::SpaceViewId>");

  auto_layout: bool (order: 102);
}
```

This unblocks further blueprint experimentations, and is the last
blocker to sunset `arrow2-convert`.

- `SpaceViewComponent`, `SpaceViewMaximized` & `ViewportLayout` are now
all implemented that way.
- `re_viewport` is now free of `arrow2-convert`.

---

`arrow2-convert` migration PR series:
- #3853 
- #3855
- #3897
- #3902
- #3917
teh-cmc added a commit that referenced this pull request Oct 19, 2023
The end of our wonderful journey.

- `NumInstances` control column now has an actual dedicated component
type.
- `EntityPath` is now a component.
- `Into<Cow<Self>>` impls have been cleaned up to generate way less
code.
- `arrow2_convert` is fully gone.


---

`arrow2-convert` migration PR series:
- #3853 
- #3855
- #3897
- #3902
- #3917
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏹 arrow concerning arrow exclude from changelog PRs with this won't show up in CHANGELOG.md 🚀 performance Optimization, memory use, etc 🚜 refactor Change the code, not the functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants