feat(rust): Allow setting custom object_store client/cloud options #21007

Open
wants to merge 2 commits into
base: main

@PrettyWood PrettyWood commented Jan 30, 2025

We currently use Polars in our Rust web service and need to support custom certificates in our file connector.
While this functionality already exists in object_store via the ClientOptions struct (which allows setting root certificates through the with_root_certificate method), Polars does not yet expose this struct.
This PR aims to add this capability to Polars.

Polars CloudOptions implements PartialEq, Eq and Hash, whereas ClientOptions in object_store does not. Furthermore, the defaults Polars applies to ClientOptions differ from those of object_store (see the get_client_options function).

So I decided to create a new type, PlClientOptions, that carries those Polars defaults and exposes only what we want to set: the 3 existing fields (pure refactoring) plus root certificates (the new feature).

Some questions:

  • Does the overall approach look good to you?
  • Is it OK if we ignore those certificates for PartialEq, Eq and Hash? If so, do you prefer the current approach with custom implementations, or the derivative crate with #[derivative(PartialEq="ignore")], which is more explicit and readable but adds a new dependency?
  • Should I simplify #[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))] to #[cfg(feature = "cloud")]?
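
For illustration, here is a minimal sketch of what the "custom implementations" option in the second question could look like. The types are simplified stand-ins: `Options`, `Certificate` and the `allow_http` field are hypothetical names chosen for this example, not the actual Polars or object_store definitions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a certificate type without `PartialEq`/`Hash`.
#[derive(Clone, Debug)]
struct Certificate(Vec<u8>);

/// Hypothetical, simplified options struct (field names are illustrative).
#[derive(Clone, Debug)]
struct Options {
    allow_http: bool,
    root_certificates: Vec<Certificate>,
}

// Hand-written impls that deliberately skip `root_certificates`.
impl PartialEq for Options {
    fn eq(&self, other: &Self) -> bool {
        self.allow_http == other.allow_http
    }
}

impl Eq for Options {}

impl Hash for Options {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.allow_http.hash(state);
    }
}

/// Small helper to compute a hash with the standard library hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let without = Options { allow_http: true, root_certificates: vec![] };
    let with = Options {
        allow_http: true,
        root_certificates: vec![Certificate(vec![1, 2, 3])],
    };
    println!("certificate bytes: {}", with.root_certificates[0].0.len());

    // The two values differ only in their certificates, yet compare equal.
    assert_eq!(without, with);
    assert_eq!(hash_of(&without), hash_of(&with));
}
```

With these impls, two option sets that differ only in their certificates are indistinguishable to anything keyed on equality or hash, which is the trade-off the question is asking about.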

We can test this PR against AWS but not against GCP or Azure. Since the code is the same for all of them, the AWS test should be enough!

Thank you so much :)

@PrettyWood PrettyWood force-pushed the feat/custom-client-options branch from d6de45d to e71cff5 Compare January 30, 2025 14:25
@PrettyWood PrettyWood changed the title feat(rust): allow setting custom client options feat(rust): Allow setting custom client options Jan 30, 2025
@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature rust Related to Rust Polars and removed title needs formatting labels Jan 30, 2025
@PrettyWood PrettyWood force-pushed the feat/custom-client-options branch 2 times, most recently from 26b7468 to ad25eb8 Compare January 30, 2025 15:56
@PrettyWood PrettyWood force-pushed the feat/custom-client-options branch from ad25eb8 to b2ac3e9 Compare January 30, 2025 16:25
@PrettyWood PrettyWood marked this pull request as ready for review January 30, 2025 16:31

codecov bot commented Jan 30, 2025

Codecov Report

Attention: Patch coverage is 72.22222% with 15 lines in your changes missing coverage. Please review.

Project coverage is 79.88%. Comparing base (96a2d01) to head (be39376).
Report is 106 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/polars-io/src/cloud/client_options.rs | 74.35% | 10 Missing ⚠️ |
| crates/polars-io/src/cloud/options.rs | 61.53% | 5 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #21007      +/-   ##
==========================================
+ Coverage   79.34%   79.88%   +0.53%     
==========================================
  Files        1579     1594      +15     
  Lines      224319   227689    +3370     
  Branches     2573     2600      +27     
==========================================
+ Hits       177976   181879    +3903     
+ Misses      45755    45213     -542     
- Partials      588      597       +9     

@alexander-beedie
Collaborator

@nameexhaustion: Any thoughts on this one? I imagine you may already have an approach in mind, or have an opinion on how it might best integrate with all the (very useful!) related work you've been doing :)

@alexander-beedie alexander-beedie changed the title feat(rust): Allow setting custom client options feat(rust): Allow setting custom object_store client/cloud options Feb 8, 2025
@nameexhaustion
Collaborator

I'm ok with having this PR. It is relatively simple, and currently it is not possible to set these options from the Rust side.

> is it ok if we ignore those certificates for PartialEq, Eq and Hash

We should ensure that Hash/PartialEq take all fields into account; otherwise we may end up with caching bugs. If the underlying struct does not support this, I recommend an approach similar to what we do with PlCredentialProvider, where we wrap the field in an Arc<> and use the pointer value.
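
As a rough sketch of the Arc-and-pointer idea (not the actual PlCredentialProvider code): the wrapper below compares and hashes by the address of the shared allocation, so it needs nothing from the inner type. `Certificate` is a hypothetical stand-in, and the `ptr_value` helper name is an assumption made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for a type that implements neither `PartialEq` nor `Hash`.
#[derive(Debug)]
struct Certificate(Vec<u8>);

/// Wrapper whose identity is the address of the shared allocation.
#[derive(Clone, Debug)]
struct RootCertificates(Arc<Vec<Certificate>>);

impl RootCertificates {
    /// Address of the `Arc` payload, used as the identity of this value.
    fn ptr_value(&self) -> usize {
        Arc::as_ptr(&self.0) as usize
    }
}

impl PartialEq for RootCertificates {
    fn eq(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for RootCertificates {}

impl Hash for RootCertificates {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.ptr_value().hash(state);
    }
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = RootCertificates(Arc::new(vec![Certificate(vec![1, 2, 3])]));
    let b = a.clone(); // shares the allocation with `a`
    let c = RootCertificates(Arc::new(vec![Certificate(vec![1, 2, 3])]));
    println!("first certificate has {} bytes", a.0[0].0.len());

    assert_eq!(a, b); // same pointer, so equal with equal hashes
    assert_eq!(hash_of(&a), hash_of(&b));
    assert_ne!(a, c); // same bytes, different allocation, so not equal
}
```

The comparison is conservative: two separately built lists with identical contents count as different, which for a cache means an extra miss rather than a wrong hit.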

@@ -82,6 +83,79 @@ pub struct CloudOptions {
#[cfg(feature = "cloud")]
#[cfg_attr(feature = "serde", serde(deserialize_with = "deserialize_or_default"))]
pub(crate) credential_provider: Option<PlCredentialProvider>,
#[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))]
#[cfg_attr(feature = "serde", serde(skip))]
pub(crate) client_options: Option<PlClientOptions>,
Collaborator

Can you add a note above:

/// Note: This is mainly used by Rust users. Python client options go through `CloudConfig`

@PrettyWood
Author

@nameexhaustion Thank you so much for your valuable remarks! 🙏 Just pushed a new commit. Feedback welcome

Contributor

@lukapeschke lukapeschke left a comment

Just a small remark from me, thank you!

Comment on lines +67 to +69
    for certificate in pl_opts.root_certificates.0.iter() {
        opts = opts.with_root_certificate(certificate.clone());
    }
Contributor

Since you're taking ownership of pl_opts, could you skip the .clone()?

Suggested change
-   for certificate in pl_opts.root_certificates.0.iter() {
-       opts = opts.with_root_certificate(certificate.clone());
-   }
+   for certificate in pl_opts.root_certificates.0 {
+       opts = opts.with_root_certificate(certificate);
+   }

Author

I can't do what you're suggesting because it's behind an Arc<…>, so I have to use .0.iter(). This gives me a reference to Certificate, which does not implement Copy, so I have to clone it.
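
To make the constraint concrete, here is a small self-contained sketch using stand-in types: `Certificate` and `ClientOptions` below are hypothetical simplifications, not the object_store types, and the second function is only one possible alternative, not what this PR does.

```rust
use std::sync::Arc;

/// Hypothetical stand-in: cloneable but not `Copy`.
#[derive(Clone, Debug, PartialEq)]
struct Certificate(Vec<u8>);

struct RootCertificates(Arc<Vec<Certificate>>);

/// Hypothetical stand-in for a builder-style options type.
#[derive(Debug, Default)]
struct ClientOptions {
    root_certificates: Vec<Certificate>,
}

impl ClientOptions {
    fn with_root_certificate(mut self, certificate: Certificate) -> Self {
        self.root_certificates.push(certificate);
        self
    }
}

/// Borrow through the `Arc` and clone each element.
/// Writing `for certificate in certs.0` does not compile here, because the
/// `Vec` cannot be moved out of a (possibly shared) `Arc`.
fn apply_by_clone(certs: RootCertificates, mut opts: ClientOptions) -> ClientOptions {
    for certificate in certs.0.iter() {
        opts = opts.with_root_certificate(certificate.clone());
    }
    opts
}

/// Alternative: take the `Vec` out when this is the only reference,
/// and fall back to cloning the whole list when it is shared.
fn apply_owned_if_unique(certs: RootCertificates, mut opts: ClientOptions) -> ClientOptions {
    let owned = Arc::try_unwrap(certs.0).unwrap_or_else(|shared| (*shared).clone());
    for certificate in owned {
        opts = opts.with_root_certificate(certificate);
    }
    opts
}

fn main() {
    let certs = || RootCertificates(Arc::new(vec![Certificate(vec![1]), Certificate(vec![2])]));

    let cloned = apply_by_clone(certs(), ClientOptions::default());
    let moved = apply_owned_if_unique(certs(), ClientOptions::default());

    assert_eq!(cloned.root_certificates, moved.root_certificates);
    println!("{} certificates applied", cloned.root_certificates.len());
}
```

The second variant only avoids the clone when no other `Arc` handle exists, so for options that are typically shared the simple `.iter()` plus `.clone()` loop does the same work with less code.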

Contributor

Ah, all right, never mind

@@ -83,6 +89,9 @@ fn url_and_creds_to_key(url: &Url, options: Option<&CloudOptions>) -> Vec<u8> {
config: Option<CloudConfig>,
#[cfg(feature = "cloud")]
credential_provider: usize,
#[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))]
#[cfg_attr(feature = "serde", serde(skip))]
client_options: Option<PlClientOptions>,
Collaborator

@nameexhaustion nameexhaustion Feb 11, 2025

Instead of serde(skip), can you derive Serialize/Deserialize for the ClientOptions? If the Certificate type doesn't allow for serialization, you can manually implement serde for RootCertificates:

  • In Serialize, check that self.0 is empty; otherwise panic with a message explaining that it cannot be serialized.
  • In Deserialize, just return an empty vec.

Labels
enhancement New feature or an improvement of an existing feature rust Related to Rust Polars

4 participants