Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/Replace Compression & Decompression Filters #195

Merged
merged 30 commits into from
Dec 4, 2024

Conversation

gsteel
Copy link
Member

@gsteel gsteel commented Nov 8, 2024

Q A
Documentation not yet
BC Break yes

Description

The old Compress and Decompress filters were a mess. Different adapters had different or unknown capabilities. Disregarding the now removed formats such as rar and lzf, the Tar and Zip adapters cannot simply compress arbitrary strings but can of course be used to compress and decompress files/archives, and the BZ2 and GZ adapters aren't suited to working with archives but are capable of compressing arbitary strings.

Instead of attempting to do all of these things with all the adapters, I've split the adapters into 2 interfaces:

  • ArchiveAdapterInterface
  • StringCompressionAdapterInterface

and replaced the Compress and Decompress filters with

  • CompressString
  • CompressToArchive
  • DecompressString
  • DecompressArchive

The file related filter will now accept paths, directories, PSR uploads and PHP $_FILES arrays.

The tools added here will also be useful for the other file related filters.

I've introduced the concept of an ArchiveAdapterResolver, shipping concrete implementations by default, that first scan the mime-type of the file to see if we have a matching adapter and then guess the filetype from the filename extension.

TODO:

  • Documentation and migration
  • A PR for the deprecations

@gsteel
Copy link
Member Author

gsteel commented Nov 8, 2024

@froschdesign - I'd appreciate a review on the general direction of this patch when you get the chance, before I do more work on the outstanding items.

Reviews welcome from anyone else too :)

cc @alexmerlin, @ramchale, @Xerkus, @samsonasik

src/Compress/Bz2.php Outdated Show resolved Hide resolved
Copy link
Contributor

@ramchale ramchale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good

Added a couple of suggestions that I think mitigate the need to add the dev dependency for laminas/laminas-diactoros, but it depends how much mocking you want to use.

test/File/FileInformationTest.php Show resolved Hide resolved
test/CompressToArchiveTest.php Show resolved Hide resolved
Adds a new interface for string compression adapters with implementations for GZ and BZ2 along with dedicated string compression and decompression filters.

Further refactoring of related filters will include more adapters and filters for dealing with files explicitly

Signed-off-by: George Steel <[email protected]>
Also enables removal of the legacy `Compression Algorithm` interface and abstract

Signed-off-by: George Steel <[email protected]>
Signed-off-by: George Steel <[email protected]>
Signed-off-by: George Steel <[email protected]>
- Bz2 -> Bz2Adapter
- Gz -> GzAdapter
- Tar -> TarAdapter
- Zip -> ZipAdapter

Signed-off-by: George Steel <[email protected]>
Signed-off-by: George Steel <[email protected]>
This is so that matchers have more information to use to determine an appropriate adapter

Signed-off-by: George Steel <[email protected]>
In order to recognise the type of archive when it's an uploaded file, and does not have a file name extension, the mime type matcher can be used to guess the mime type and match to the appropriate adapter.

The aggregate matcher allows chaining of matchers so that multiple strategies can be attempted.

Signed-off-by: George Steel <[email protected]>
@gsteel gsteel force-pushed the de-compress-refactor branch from 7589e98 to 5d5ddf8 Compare November 19, 2024 21:59
@gsteel gsteel marked this pull request as ready for review December 4, 2024 10:27
@gsteel gsteel requested a review from froschdesign December 4, 2024 10:27
@gsteel gsteel force-pushed the de-compress-refactor branch from 1f0b00f to 4e70264 Compare December 4, 2024 10:29
Copy link
Member

@froschdesign froschdesign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvements!

Signed-off-by: George Steel <[email protected]>
@gsteel gsteel force-pushed the de-compress-refactor branch from 4e70264 to 5d3e4e3 Compare December 4, 2024 21:35
Copy link
Member Author

@gsteel gsteel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some notes to the PR for posterity!

use function array_values;
use function sprintf;

final class AggregateArchiveAdapterResolver implements ArchiveAdapterResolverInterface
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "Aggregate Adapter Resolver" can be used to compose a chain of concrete adapter resolvers. The first successful resolver wins.

* blocksize?: int<1, 9>|null,
* }
*/
final class Bz2Adapter implements StringCompressionAdapterInterface
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BZ2 adapter now only supports string content.

In future, it will be possible to introduce a FileCompressionInterface that this adapter could also implement so that we can filter an arbitrary string, or a single un-compressed file to somefile.bz2


use const PATHINFO_EXTENSION;

final class FileExtensionArchiveAdapterResolver implements ArchiveAdapterResolverInterface
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply matches whatever.zip to use the Zip archive adapter based on filename extension. By default it is composed last in the chain, preferring mime-magic based mime detection.

* mode?: 'deflate'|'compress',
* }
*/
final class GzAdapter implements StringCompressionAdapterInterface
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the BZ2 adapter, the GZ adapter now only supports string content.

In future, it will be possible to introduce a FileCompressionInterface that this adapter could also implement so that we can filter an arbitrary string, or a single un-compressed file to somefile.gz


use function sprintf;

final class MimeTypeArchiveAdapterResolver implements ArchiveAdapterResolverInterface
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map the detected mime-type of a file to an archive adapter - the default method of adapter selection


use const FILEINFO_MIME_TYPE;

final class FileInformation
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General purpose file information utility. This would have been marked as @internal, but we use it to match unknown files to archive adapters in the interface. Marking it as @internal would have been a pain in the ass for anyone who wants to implement a different detection strategy

@gsteel gsteel self-assigned this Dec 4, 2024
@gsteel gsteel merged commit 46b5d30 into laminas:3.0.x Dec 4, 2024
16 of 17 checks passed
@gsteel gsteel deleted the de-compress-refactor branch December 4, 2024 22:14
This was referenced Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants