Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-12709 Added ability to get more attributes for zip files as well as created new attributes to get for both tar and zip files. #9122

Closed
wants to merge 8 commits into from

Conversation

dan-s1
Copy link
Contributor

@dan-s1 dan-s1 commented Jul 29, 2024

Summary

Three attributes file.lastMetadataChange, file.lastAccessTime and file.size have been added and other values for existing attributes have been made available (file.creationTime and file.permissions) . Below is a chart of all attributes the UnpackContent processor now has and the ones in bold are the ones which have been added. For tar files the file.creationTime before this PR used the file.lastModifiedTime but now if the actual file.creationTime is available it is used and only defaults to using the file.lastModifiedTime if the file.creationTime is not available (this same logic is now employed for unencrypted zip files). For enrypted zip files the value for file.lastModifiedTime is used for the attribute file.lastModifiedTime and file.creationTime.

Java Variable Attribute Name Tar Unencrypted Zip Encrypted Zip
FILE_LAST_MODIFIED_TIME_ATTRIBUTE file.lastModifiedTime Y Y Y
FILE_CREATION_TIME_ATTRIBUTE file.creationTime Y Y Y
FILE_LAST_METADATA_CHANGE_ATTRIBUTE file.lastMetadataChange Y N N
FILE_LAST_ACCESS_TIME_ATTRIBUTE file.lastAccessTime Y Y N
FILE_OWNER_ATTRIBUTE file.owner Y N N
FILE_GROUP_ATTRIBUTE file.group Y N N
FILE_SIZE_ATTRIBUTE file.size Y Y Y
FILE_PERMISSIONS_ATTRIBUTE file.permissions Y Y N
FILE_ENCRYPTION_METHOD_ATTRIBUTE file.encryptionMethod N Y Y

Currently Zip4j the implementation used for encrypted zip files does not have access methods for file.permissions file.creationTime and file.lastAccessTime. A request has been made to add these.

NIFI-12709

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvements @dan-s1. The changes look straightforward in general, I just noted one recommendation to make the code a bit clearer.

One additional note, it looks like it should be possible to update the unit test to check for additional attributes, such as file.size, and some of the TAR attributes.

@dan-s1 dan-s1 requested a review from exceptionfactory August 9, 2024 20:38
Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @dan-s1. This looks closer to completion. In the course of closer review, it seems worth reconsidering the introduction of the new record for metadata. Instead of creating a new object for every entry, what do you think about creating separate methods for creating the new Map of attributes? Common properties could be passed as arguments, and then the standard ZipEntry could set additional values, like file mode and access time, avoiding the intermediate object creation for every entry.

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the adjustments @dan-s1, the latest version looks good. +1 merging

@dan-s1
Copy link
Contributor Author

dan-s1 commented Aug 16, 2024

@exceptionfactory Thanks for accepting this PR and merging. I just wanted to note I was careful to make sure to instantiate one java.util.Map for creating attributes for both tar and zip entries to be mindful of the issue brought up in NIFI-13288

@exceptionfactory
Copy link
Contributor

@exceptionfactory Thanks for accepting this PR and merging. I just wanted to note I was careful to make sure to instantiate one java.util.Map for creating attributes for both tar and zip entries to be mindful of the issue brought up in NIFI-13288

Thanks, I noted that when reviewing the implementation, that's helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants