Draft: Add buffers as input #74

eWert-Online · 2022-12-21T20:28:12Z

Fixes #53

eWert-Online · 2022-12-23T22:11:11Z

test/node-binding.test.cjs

+  t.is(diffPercentage, 2.85952484323);
+});
+
+test.skip("Accepts buffer as input for base and compare image at the same time", async (t) => {


This test is currently failing because, in the stdin stream coming from Node, we can't tell where the first image ends and the second one starts.

eWert-Online · 2022-12-23T22:13:08Z

package.json

@@ -21,7 +21,7 @@
  "scripts": {
    "run": "esy x ODiffBin",
    "test": "esy x RunTests.exe",
-    "test-js": "esy ava",
+    "test-js": "esy ava --verbose",


Added the verbose flag to make it easier to see which tests are failing and which are passing.
Also, without this flag, ava does not report that some tests timed out.

eWert-Online · 2022-12-23T22:17:44Z

bin/node-bindings/odiff.d.ts

+  /** The image type of the base image. This has to be set to the corresponding image format when using a buffer as input */
+  baseImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath';
+  /** The image type of the compare image. This has to be set to the corresponding image format when using a buffer as input */
+  compareImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath';


I don't really like that you can currently provide a buffer as input and, at the same time, set "filepath" as the image type.
Maybe this can be prevented with some TypeScript magic, but I am not really experienced in TS.

Is there any way to quickly read codec info from a buffer? I understand why this is done; it looks like it may hurt performance to add a lookup over all the codecs.

I think this can be done by simply duplicating overloads like this

declare function compare(baseImage: Buffer, compareImage: Buffer, baseCodec: "..", compareCodec: "..")
declare function compare(baseImage: string, compareImage: string)

and then, inside the function, check the types of baseImage and compareImage.
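The overload idea above could be sketched roughly as follows. This is only an illustration, not odiff's actual API: `describeCompare`, `CompareRequest`, and `ImageFormat` are hypothetical names, and `Uint8Array` stands in for Node's `Buffer` (which is a `Uint8Array` subclass) so the snippet has no Node dependency.

```typescript
// Hypothetical names throughout; this is not odiff's actual API.
type ImageFormat = "jpg" | "png" | "bmp" | "tiff";

interface CompareRequest {
  base: string | Uint8Array;
  compare: string | Uint8Array;
  baseFormat: ImageFormat | undefined;
  compareFormat: ImageFormat | undefined;
}

// Overload 1: both images are file paths, so no format argument is accepted.
function describeCompare(baseImage: string, compareImage: string): CompareRequest;
// Overload 2: both images are in-memory buffers, so both formats are required.
function describeCompare(
  baseImage: Uint8Array,
  compareImage: Uint8Array,
  baseFormat: ImageFormat,
  compareFormat: ImageFormat
): CompareRequest;
function describeCompare(
  baseImage: string | Uint8Array,
  compareImage: string | Uint8Array,
  baseFormat?: ImageFormat,
  compareFormat?: ImageFormat
): CompareRequest {
  // Runtime check for callers that bypass the type checker (plain JS, `any`).
  const usesBuffer =
    typeof baseImage !== "string" || typeof compareImage !== "string";
  if (usesBuffer && (baseFormat === undefined || compareFormat === undefined)) {
    throw new TypeError("A format is required for every buffer input");
  }
  return { base: baseImage, compare: compareImage, baseFormat, compareFormat };
}
```

With these overloads, a call that passes buffers without formats, or a path together with a format, is rejected at compile time, which would remove the need for a 'filepath' member in the type union.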

We can of course try to read in the first 4 bytes and identify the magic number. Here is a list of the valid magic numbers for different image formats: https://gist.github.com/leommoore/f9e57ba2aa4bf197ebc5.

TIFF and JPG do seem to have multiple valid ones, but I think it would be simple (and probably reasonably fast) to check them.
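A minimal sketch of such a magic-number check, assuming only the four formats mentioned in this PR. `detectImageFormat` and `SIGNATURES` are hypothetical names, and the check is written in TypeScript for illustration; in odiff itself it would presumably live on the OCaml side.

```typescript
type ImageFormat = "jpg" | "png" | "bmp" | "tiff";

// Leading bytes for each format. TIFF has a little-endian ("II") and a
// big-endian ("MM") variant; JPEG files start with FF D8 FF regardless of
// which marker follows.
const SIGNATURES: ReadonlyArray<readonly [ImageFormat, readonly number[]]> = [
  ["png", [0x89, 0x50, 0x4e, 0x47]],
  ["jpg", [0xff, 0xd8, 0xff]],
  ["bmp", [0x42, 0x4d]],
  ["tiff", [0x49, 0x49, 0x2a, 0x00]],
  ["tiff", [0x4d, 0x4d, 0x00, 0x2a]],
];

// Returns the detected format, or undefined if no signature matches.
function detectImageFormat(data: Uint8Array): ImageFormat | undefined {
  for (const [format, signature] of SIGNATURES) {
    const matches =
      data.length >= signature.length &&
      signature.every((byte, i) => data[i] === byte);
    if (matches) {
      return format;
    }
  }
  return undefined;
}
```

The lookup compares at most four bytes against five short signatures, so its cost should be negligible next to decoding the image.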

eWert-Online · 2022-12-23T22:20:31Z

bin/ODiffBin.re

@@ -10,15 +26,55 @@ let diffPath =
 let base =
  Arg.(
    value
-    & pos(0, file, "")
-    & info([], ~docv="BASE", ~doc="Path to base image")
+    & pos(0, underscore_or(non_dir_file), "_")


You are now able to add either a file or an _ (underscore) to accept a raw image buffer.
I explicitly added the non_dir_file converter, so directories are not a valid input.

eWert-Online · 2022-12-23T22:24:17Z

bin/ODiffBin.re

+let baseType =
+  Arg.(
+    value
+    & opt(enum([("auto", `auto), ...supported_formats]), `auto)
+    & info(
+        ["base-type"],
+        ~docv="FORMAT",
+        ~doc=
+          Printf.sprintf(
+            "The type of the base image (required to be not auto when a buffer is used as input).\nSupported values are: auto,%s",
+            supported_formats |> List.map(fst) |> String.concat(","),
+          ),
+      )


You are now able to explicitly set the type of the provided input. This is required for buffer inputs.
We could also try to read the type from the image's header, but I think this would be too much work (and it would hurt performance).

eWert-Online · 2022-12-23T22:25:53Z

bin/Main.re

+  /* We use 65536 because that is the size of OCaml's IO buffers. */
+  let chunk_size = 65536;
+  let buffer = Buffer.create(chunk_size);
+  let rec loop = () => {
+    Buffer.add_channel(buffer, stdin, chunk_size);
+    loop();
+  };
+  try(loop()) {
+  | End_of_file => Buffer.contents(buffer)
+  };


We are reading from stdin until there is no more data provided.
If we wanted to support both base and compare image as buffer inputs, we would need to implement a check for some kind of separator in here.

Yeah, that is basically why I didn't implement this from scratch. Looks incredibly tough to correctly read stdin and avoid memory copying.

I mean the performance is not bad. The only issue is knowing when the first image ends and the second one starts.
I will need to see what I can come up with to solve this problem.
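One possible way to mark where the first image ends, sketched here purely as an assumption and not as what this PR implements, is to prefix each image with its byte length instead of searching for a separator. `frameImages` and `unframeImages` are hypothetical helpers; the decoding half is shown in TypeScript only to illustrate the wire format, since in odiff it would have to be done in OCaml.

```typescript
// Writer side: prefix each image with its byte length as a 4-byte
// big-endian unsigned integer, then concatenate everything.
function frameImages(images: Uint8Array[]): Uint8Array {
  const total = images.reduce((sum, img) => sum + 4 + img.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  let offset = 0;
  for (const img of images) {
    view.setUint32(offset, img.length); // DataView is big-endian by default
    out.set(img, offset + 4);
    offset += 4 + img.length;
  }
  return out;
}

// Reader side: walk the stream, reading one length prefix and one payload
// at a time. Payloads are returned as views, so no image bytes are copied.
function unframeImages(stream: Uint8Array): Uint8Array[] {
  const view = new DataView(stream.buffer, stream.byteOffset, stream.byteLength);
  const images: Uint8Array[] = [];
  let offset = 0;
  while (offset < stream.length) {
    if (offset + 4 > stream.length) {
      throw new RangeError("Truncated length prefix");
    }
    const length = view.getUint32(offset);
    const start = offset + 4;
    if (start + length > stream.length) {
      throw new RangeError("Truncated image payload");
    }
    images.push(stream.subarray(start, start + length));
    offset = start + length;
  }
  return images;
}
```

A length prefix avoids the main weakness of a separator, namely that arbitrary separator bytes can also occur inside compressed image data.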

eWert-Online · 2023-02-21T14:04:16Z

@dmtrKovalenko
What do you think of the design so far? I may have some time to work on this this week, but I want to make sure I am on the right path before continuing 🙂

dmtrKovalenko

Your design is really good. I am just wondering about reading stdin into an OCaml buffer: it involves copying a very large chunk of memory that is then left for the OCaml GC, which may impact performance.

dmtrKovalenko · 2023-02-21T14:05:57Z

bin/Main.re

-  module IO2 = (val getIOModule(img2Path));
+  let img1Type =
+    switch (img1Type) {
+    | `auto when img1 == "_" =>


I think it should be named stdin instead of auto

auto is the one that automatically gets the type from the filename (the currently published behaviour). The explicit types (png, jpg, ...) are the ones given together with the stdin input.
I think we can get rid of this, though, if we read the format from the magic number.
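The distinction between auto and the explicit types can be sketched as below. This is an illustration under assumptions, not the PR's code: `resolveImageFormat` is a hypothetical name, the extension mapping is assumed, and raising an error for auto combined with the "_" placeholder is a guess at the intended behaviour.

```typescript
type ImageFormat = "jpg" | "png" | "bmp" | "tiff";
type ImageType = "auto" | ImageFormat;

// Placeholder path that means "read the image from stdin" in this PR.
const STDIN_PLACEHOLDER = "_";

function resolveImageFormat(path: string, type: ImageType): ImageFormat {
  // An explicit type always wins, whether the input is a file or stdin.
  if (type !== "auto") {
    return type;
  }
  // "auto" needs a filename to inspect, which stdin input does not have.
  if (path === STDIN_PLACEHOLDER) {
    throw new Error("An explicit image type is required when reading from stdin");
  }
  // Assumed mapping from file extension to format.
  const extension = path.slice(path.lastIndexOf(".") + 1).toLowerCase();
  switch (extension) {
    case "jpg":
    case "jpeg":
      return "jpg";
    case "png":
      return "png";
    case "bmp":
      return "bmp";
    case "tif":
    case "tiff":
      return "tiff";
    default:
      throw new Error(`Unsupported image extension: ${extension}`);
  }
}
```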

dmtrKovalenko · 2023-02-21T14:10:55Z

bin/node-bindings/odiff.d.ts

+  /** The image type of the base image. This has to be set to the corresponding image format when using a buffer as input */
+  baseImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath';
+  /** The image type of the compare image. This has to be set to the corresponding image format when using a buffer as input */
+  compareImageType?: 'filepath' | 'jpg' | 'png' | 'bmp' | 'tiff' = 'filepath';


Is there any way to quickly read codec info from a buffer? I understand why this is done; it looks like it may hurt performance to add a lookup over all the codecs.

I think this can be done by simply duplicating overloads like this

declare function compare(baseImage: Buffer, compareImage: Buffer, baseCodec: "..", compareCodec: "..")
declare function compare(baseImage: string, compareImage: string)

and then, inside the function, check the types of baseImage and compareImage.

dmtrKovalenko · 2023-02-21T14:14:05Z

bin/Main.re

+  /* We use 65536 because that is the size of OCaml's IO buffers. */
+  let chunk_size = 65536;
+  let buffer = Buffer.create(chunk_size);
+  let rec loop = () => {
+    Buffer.add_channel(buffer, stdin, chunk_size);
+    loop();
+  };
+  try(loop()) {
+  | End_of_file => Buffer.contents(buffer)
+  };


Yeah, that is basically why I didn't implement this from scratch. Looks incredibly tough to correctly read stdin and avoid memory copying.

dmtrKovalenko · 2023-02-21T14:40:01Z

Makes sense, but I would really check whether it is possible to read images directly from stdin instead of creating an additional OCaml buffer.

WIP: add buffers as input

424d537

eWert-Online marked this pull request as draft December 21, 2022 20:28

read buffer input from stdin

19517f9
