-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(rust): Allow setting custom object_store
client/cloud options
#21007
base: main
Are you sure you want to change the base?
Conversation
d6de45d
to
e71cff5
Compare
26b7468
to
ad25eb8
Compare
ad25eb8
to
b2ac3e9
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #21007 +/- ##
==========================================
+ Coverage 79.34% 79.88% +0.53%
==========================================
Files 1579 1594 +15
Lines 224319 227689 +3370
Branches 2573 2600 +27
==========================================
+ Hits 177976 181879 +3903
+ Misses 45755 45213 -542
- Partials 588 597 +9 ☔ View full report in Codecov by Sentry. |
@nameexhaustion: Any thoughts on this one? I imagine you may already have an approach in mind, or have an opinion how it might best integrate with all the (very useful!) related work you've been doing :) |
object_store
client/cloud options
I'm ok with having this PR. It is relatively simple, and currently it is not possible to set these options from the Rust side.
We should ensure the Hash/PartialEq take all fields into account, otherwise we may end up with caching bugs. If the underlying struct does not support this, I can recommend taking an approach similar to what we do with |
@@ -82,6 +83,79 @@ pub struct CloudOptions { | |||
#[cfg(feature = "cloud")] | |||
#[cfg_attr(feature = "serde", serde(deserialize_with = "deserialize_or_default"))] | |||
pub(crate) credential_provider: Option<PlCredentialProvider>, | |||
#[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))] | |||
#[cfg_attr(feature = "serde", serde(skip))] | |||
pub(crate) client_options: Option<PlClientOptions>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a note above:
/// Note: This is mainly used by Rust users. Python client options go through `CloudConfig`
@nameexhaustion Thank you so much for your valuable remarks! 🙏 Just pushed a new commit. Feedback welcome |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small remark from me, thank you!
for certificate in pl_opts.root_certificates.0.iter() { | ||
opts = opts.with_root_certificate(certificate.clone()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're taking ownership of pl_opts
, could you skip the .clone()
?
for certificate in pl_opts.root_certificates.0.iter() { | |
opts = opts.with_root_certificate(certificate.clone()); | |
} | |
for certificate in pl_opts.root_certificates.0 { | |
opts = opts.with_root_certificate(certificate); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't do what you're suggesting because it's behind an Arc<…>, so I have to use .0.iter(). This gives me a reference to Certificate, which does not implement Copy, so I have to clone it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah all right, nevermind
@@ -83,6 +89,9 @@ fn url_and_creds_to_key(url: &Url, options: Option<&CloudOptions>) -> Vec<u8> { | |||
config: Option<CloudConfig>, | |||
#[cfg(feature = "cloud")] | |||
credential_provider: usize, | |||
#[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))] | |||
#[cfg_attr(feature = "serde", serde(skip))] | |||
client_options: Option<PlClientOptions>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of serde(skip)
, can you derive Serialize/Deserialize for the ClientOptions? If the Certificate
type doesn't allow for serialization, you can manually implement serde for RootCertificates
:
- In
Serialize
, check thatself.0
is empty inSerialize
, otherwise panic explaining that it cannot be serialized - In
Deserialize
, just return an empty vec.
We currently use Polars in our Rust web service and need to support custom certificates in our file connector.
While this functionality already exists in
object_store
via theClientOptions
struct (which allows setting root certificates through thewith_root_certificate
method), Polars does not yet expose this struct.This PR aims to add this capability to Polars.
Polars
CloudOptions
implementsPartialEq
,Eq
andHash
, which is not the case ofClientOptions
inobject_store
. Furthermore the defaults in polars forClientOptions
were not the same asobject_store
(see
get_client_options
function)So I decided to create a new type
PlClientOptions
with those defaults and only add what we want to set: the 3 previous fields (pure refactoring) + certificates (feature)Some questions:
PartialEq
,Eq
andHash
? If yes do you prefer the current approach with custom implems or the use ofderivative
crate with#[derivative(PartialEq="ignore")]
, which is more explicit and readable but adds a new dependency.#[cfg(any(feature = "aws", feature = "gcp", feature = "azure", feature = "http"))]
with#[cfg(feature = "cloud")]
?We can test this PR on AWS but not for GCP and Azure but since the code is the same, this test should be enough!
Thank you so much :)