-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parse: add feature to parse and upload msg files #233
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work getting this up and running, it looks excellent to me. Honestly I don't think there's anything stopping us from just backporting this code into the api, so we can support msg
files on the server side there (if we actually want to do that).
Nothing from me - just style/nits which I've pushed review commits for
cli/src/commands/parse/msgs.rs
Outdated
clean_headers_string = CONTENT_TYPE_MIME_HEADER_RX | ||
.clone() | ||
.replace(&headers_string, "") | ||
.to_string(); | ||
|
||
clean_headers_string = CONTENT_TRANSFER_ENCODING_MIME_HEADER_RX | ||
.clone() | ||
.replace(&clean_headers_string, "") | ||
.to_string(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need .clone()
here - .replace(...)
returns a Cow
which when you call to_string()
, will only call clone()
internally if necessary.
clean_headers_string = CONTENT_TYPE_MIME_HEADER_RX | |
.clone() | |
.replace(&headers_string, "") | |
.to_string(); | |
clean_headers_string = CONTENT_TRANSFER_ENCODING_MIME_HEADER_RX | |
.clone() | |
.replace(&clean_headers_string, "") | |
.to_string(); | |
clean_headers_string = CONTENT_TYPE_MIME_HEADER_RX | |
.replace(&headers_string, "") | |
.to_string(); | |
clean_headers_string = CONTENT_TRANSFER_ENCODING_MIME_HEADER_RX | |
.replace(&clean_headers_string, "") | |
.to_string(); |
cli/src/commands/parse/msgs.rs
Outdated
.filter_map(|path| { | ||
if let Ok(path) = path { | ||
if !path.path().extension().is_some_and(|msg| msg == "msg") { | ||
None | ||
} else { | ||
Some(path) | ||
} | ||
} else { | ||
None | ||
} | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Written slightly strangely - nested if
normally indicates you can use a combinator, in this case .ok()
. Also you negate the if statement for some reason, which makes it harder to read?
.filter_map(|path| { | |
if let Ok(path) = path { | |
if !path.path().extension().is_some_and(|msg| msg == "msg") { | |
None | |
} else { | |
Some(path) | |
} | |
} else { | |
None | |
} | |
}) | |
.filter_map(|path| { | |
let path = path.ok()?; | |
if path.path().extension().is_some_and(|msg| msg == "msg") { | |
Some(path) | |
} else { | |
None | |
} | |
}) |
cli/src/commands/parse/msgs.rs
Outdated
upload_batch_of_documents( | ||
client, | ||
&source, | ||
&documents, | ||
transform_tag, | ||
*no_charge, | ||
&statistics, | ||
)?; | ||
documents.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you call this more than once (again after the loop), you can easily pull this out into a closure:
let send = |documents: &mut Vec<Document>| -> Result<()> {
upload_batch_of_documents(
client,
&source,
documents,
transform_tag,
*no_charge,
&statistics,
)?;
documents.clear();
Ok(())
};
cli/src/commands/parse/msgs.rs
Outdated
compound_file: &mut CompoundFile<File>, | ||
) -> Result<AttachmentMetadata> { | ||
let mut attachment_name_path = attachment_path.clone(); | ||
attachment_name_path.push(STREAM_PATH_ATTACHMENT_FILENAME.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't have to clone here
attachment_name_path.push(STREAM_PATH_ATTACHMENT_FILENAME.clone()); | |
attachment_name_path.push(&*STREAM_PATH_ATTACHMENT_FILENAME); |
*
dereferences your Lazy
instance to the inner value, then &
borrows the PathBuf
into a Path
(which is all you need for push
here)
cli/src/commands/parse/msgs.rs
Outdated
} | ||
|
||
fn read_unicode_stream_to_string( | ||
stream_path: PathBuf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and in a few other places you don't actually need a PathBuf
, just a Path
, which allows you to remove a bunch of clone()
s elsewhere.
Adds a feature to parse unicode msg files and upload them directly to a source. Extracts the minimum information we require from the msg using
cfb
crate