-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Add buffers as input #74
base: main
Are you sure you want to change the base?
Conversation
t.is(diffPercentage, 2.85952484323); | ||
}); | ||
|
||
test.skip("Accepts buffer as input for base and compare image at the same time", async (t) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is currently failing, as we can't distinguish from nodes stdin stream, where the first image ends and the second one starts.
@@ -21,7 +21,7 @@ | |||
"scripts": { | |||
"run": "esy x ODiffBin", | |||
"test": "esy x RunTests.exe", | |||
"test-js": "esy ava", | |||
"test-js": "esy ava --verbose", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the verbose flag to better see, which tests are failing and which are passing.
Also, without this flag, ava does not report that some tests timed out.
/** The image type of the base image. This has to be set to the corresponding image format when using a buffer as input */ | ||
baseImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath'; | ||
/** The image type of the compare image. This has to be set to the corresponding image format when using a buffer as input */ | ||
compareImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like, that you are currently able to provide a buffer as input and at the same time add "filepath" as the image type.
Maybe this can be prevented with some typescript magic, but I am really not experienced in ts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way too quickly read codec info from buffer? I understand why this is done looks like it may hurt performance to add a lookup over all the codecs.
I think this can be done by simply duplicating overrides like this
declare function compare(
baseImage: Buffer,
compareImage: buffer,
baseCodec: "..",
compareCodec: ".."
)
declare function compare(
baseImage: string,
compareImage: string,
)
and then in function check the type of baseImage && compareImage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can of course try to read in the first 4 bytes and identify the magic number. Here is a list of the valid magic numbers for different image formats: https://gist.github.com/leommoore/f9e57ba2aa4bf197ebc5.
TIFF and JPG do seem to have multiple valid ones, But I think it would be simple (and probably reasonably fast) to check it.
@@ -10,15 +26,55 @@ let diffPath = | |||
let base = | |||
Arg.( | |||
value | |||
& pos(0, file, "") | |||
& info([], ~docv="BASE", ~doc="Path to base image") | |||
& pos(0, underscore_or(non_dir_file), "_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are now able to add either a file or an _
(underscore) to accept a raw image buffer.
I explicitly added the non_dir_file
converter, so directories are not a valid input.
let baseType = | ||
Arg.( | ||
value | ||
& opt(enum([("auto", `auto), ...supported_formats]), `auto) | ||
& info( | ||
["base-type"], | ||
~docv="FORMAT", | ||
~doc= | ||
Printf.sprintf( | ||
"The type of the base image (required to be not auto when a buffer is used as input).\nSupported values are: auto,%s", | ||
supported_formats |> List.map(fst) |> String.concat(","), | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are now able to explicitly set the type of the provided input. This is required for buffer inputs.
We could also try to read the type from the images header, but I think this would be too much work (and it would hurt the performance).
/* We use 65536 because that is the size of OCaml's IO buffers. */ | ||
let chunk_size = 65536; | ||
let buffer = Buffer.create(chunk_size); | ||
let rec loop = () => { | ||
Buffer.add_channel(buffer, stdin, chunk_size); | ||
loop(); | ||
}; | ||
try(loop()) { | ||
| End_of_file => Buffer.contents(buffer) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are reading from stdin until there is no more data provided.
If we wanted to support both base and compare image as buffer inputs, we would need to implement a check for some kind of separator in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is basically why I didn't implement this from scratch. Looks incredibly tough to correctly read stdin and avoid memory copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the performance is not bad. The only issue is knowing when the first image ends and the second one starts.
I will need to see what I can come up with to solve this problem.
@dmtrKovalenko |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your design is really good, I am just wondering about reading stdin into ocaml buffer which involves a very big memory chunk being copied and leaved for ocaml GC which may impact performance
module IO2 = (val getIOModule(img2Path)); | ||
let img1Type = | ||
switch (img1Type) { | ||
| `auto when img1 == "_" => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be named stdin instead of auto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto
is the one which automatically gets the type from the filename (the currently published behaviour). The explicit types (png
, jpg
, ...) are the types given with the stdin input.
I think we can get rid of this though, if we are reading in the format from the magic number.
/** The image type of the base image. This has to be set to the corresponding image format when using a buffer as input */ | ||
baseImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath'; | ||
/** The image type of the compare image. This has to be set to the corresponding image format when using a buffer as input */ | ||
compareImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way too quickly read codec info from buffer? I understand why this is done looks like it may hurt performance to add a lookup over all the codecs.
I think this can be done by simply duplicating overrides like this
declare function compare(
baseImage: Buffer,
compareImage: buffer,
baseCodec: "..",
compareCodec: ".."
)
declare function compare(
baseImage: string,
compareImage: string,
)
and then in function check the type of baseImage && compareImage
/* We use 65536 because that is the size of OCaml's IO buffers. */ | ||
let chunk_size = 65536; | ||
let buffer = Buffer.create(chunk_size); | ||
let rec loop = () => { | ||
Buffer.add_channel(buffer, stdin, chunk_size); | ||
loop(); | ||
}; | ||
try(loop()) { | ||
| End_of_file => Buffer.contents(buffer) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is basically why I didn't implement this from scratch. Looks incredibly tough to correctly read stdin and avoid memory copying.
Makes sense, but I would really check if there would be possible to read images directly from stdin instead of creating additional ocaml buffer. |
Fixes #53