Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add attachment support #160

Merged
merged 3 commits into from
Dec 1, 2021
Merged

Add attachment support #160

merged 3 commits into from
Dec 1, 2021

Conversation

mythmon
Copy link
Contributor

@mythmon mythmon commented Nov 23, 2021

This is still in progress. It's not documented well, or tested at all (even manual testing). That being said, I think it would be valuable to get feedback at this point.

@leplatrem Does this fit will into the library? Should I move it around or change the format?

@Dexterp37 Does this look useful for integrating into Merino?

Fixes #145

Copy link
Collaborator

@leplatrem leplatrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 🐎

It seems like a great starting point 👌

I liked how you exposed both metadata and the attachment from a record.

let bytes = record.get_attachment(client).await;
let metadata = match record.attachment {... };

FWIW on Firefox Desktop we have a different model, where the record is passed to a downloader, attached to the client.

let bytes = await client.attachments.download(record, options);
let metadata = record.attachment;

There may be pros and cons for each approach. I'm happy if we pick the one that matches your use case better. Ideally, we just want to be able to override all the parts that are not related to the interaction with the server and its responses. My first impression is that the composition of a downloader on the client makes it easier, but I could be convinced otherwise :)

@@ -28,6 +29,7 @@ tokio = { version = "1.8.2", features = ["macros"] }
remote-settings-client = { path = ".", features = ["viaduct_client"] }

[dependencies]
anyhow = "1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/client.rs Outdated Show resolved Hide resolved
src/client.rs Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated
where T: TryFrom<Vec<u8>, Error = E>,
E: 'static + Send + Sync + std::error::Error
{
let key = format!("attachment:{}", metadata.hash);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the hash seems a reasonable approach to lookup the local cache 👌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think there is any need to delete old attachments? Merino isn't going to keep a cache that lives longer than a single node, so I'm not worried about it for that case.

src/client.rs Outdated
}

Err(err) => Err(ClientError::StorageError(err)),
}?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! I meant to ask: Is there a preferred way to calculate that hash already in RS? If not I can pull in something from crates.io easily enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After implementing this, I think we shouldn't verify size, only the hash. Attachments of different sizes will (be likely to) have different hashes, and Merino wants to be able to make all content checks optional. The alternative would be something like having the verifier check the size, or just check the entire content. Since checking size along side signatures is such a weak check, I don't think that's worth the extra complexity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For us it was only a way to bypass hash computation if sizes differ.

Copy link
Collaborator

@Dexterp37 Dexterp37 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the above changes, I'm able to pull attachments from remote settings from Merino.

src/client.rs Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
@Dexterp37
Copy link
Collaborator

@mythmon after your last commits and after rebasing this PR, I'm able to correctly use the remote-settings-client in Merino. See mozilla-services/merino#232. All the tests pass locally.

@mythmon mythmon marked this pull request as ready for review November 29, 2021 22:52
@mythmon
Copy link
Contributor Author

mythmon commented Nov 29, 2021

Rebased onto master.

Copy link
Collaborator

@leplatrem leplatrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!
Made a few comments, but could not spot blockers :)

Regarding the storage of attachments. What if we prefix the key with the bucket/collection? (eg. attachment:main/quicksuggest:<hash>). This way if we want to purge all attachments of a specific collection, we can delete attachment:main/quicksuggest:*.

Also what would be a good reason for not enabling the attachments feature?
We don't introduce crazy dependencies, maybe we could just always ship it? What do you think?

src/client.rs Outdated
use url::Url;

#[cfg(feature = "attachments")]
use anyhow::{anyhow, Context};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would enabling anyhow for the whole crate be an issue right now? (ref #143)

src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
src/client.rs Outdated Show resolved Hide resolved
@mythmon
Copy link
Contributor Author

mythmon commented Nov 30, 2021

Regarding the storage of attachments. What if we prefix the key with the bucket/collection? (eg. attachment:main/quicksuggest:). This way if we want to purge all attachments of a specific collection, we can delete attachment:main/quicksuggest:*.

That's a good idea. Originally I left it unprefixed so we could de-duplicate attachments in the rare case they are in multiple collections, but that is a quite weak reason. Having the extra collection metadata is a much better reason to influence the format of the key.

Also what would be a good reason for not enabling the attachments feature?
We don't introduce crazy dependencies, maybe we could just always ship it? What do you think?

Sounds good to me!

@mythmon mythmon changed the title Add optional attachment support Add attachment support Nov 30, 2021
@leplatrem leplatrem merged commit 6bb0b0d into master Dec 1, 2021
@leplatrem leplatrem deleted the attachments branch December 1, 2021 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for downloading attachments
3 participants