RERO 1 (Olive) - Incorrect coordinates to rescale #126
Comments
After further investigation, the following conclusions have been reached: for each title, the issues were separated into two groups: issues to be rescaled, and issues to investigate further. Based on each title's problems, the following issues (originally part of the issues to investigate further) will be patched:
In particular, another problem has been identified that concerns several titles, but especially EXP. As a result, rescaling the coordinates of the more recent EXP issues would not help fix the coordinate problems present.
After further discussion, and identifying also significant issues in the LES data, it has been decided that
However, for LES, it has also been identified that the full text of some articles is missing all its spaces.
This issue is part of the various patches planned and done as part of the March-April 2024 release.
More info on the patches can be found here, in issue #117, issue #74 and here.
For the RERO 1 (Olive) data, it has been found that a number of issues presented wrongly scaled coordinates, as described here.
On closer inspection, the problem was found to originate in the conversion of the image files to JPEG 2000.
Based on the available images, several strategies existed, among them the 'png_highest' strategy, which selects the image with the highest resolution among the available options, the resolution being encoded in the filename (e.g. ['1/Img/Pg001.png', '1/Img/Pg001_157.png', '1/Img/Pg001_180.png', '2/Img/Pg002.png', '2/Img/Pg002_157.png', '2/Img/Pg002_180.png'] for LCS-1830-08-02-a).
Unfortunately, in some cases the selected image was NOT the one with the highest resolution, leading to the wrong scaling.
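To make the 'png_highest' idea concrete, here is a minimal sketch of selecting the highest-resolution image per page from filenames like the ones above. It is not the code used in the conversion: the function name `pick_highest_png`, the filename pattern, and the choice to treat files without a resolution suffix as resolution 0 are all assumptions made for illustration.

```python
import re

# Assumed filename pattern, inferred from the example list above:
# '<n>/Img/Pg<page>.png' or '<n>/Img/Pg<page>_<resolution>.png'
PAGE_IMG = re.compile(r"^(\d+)/Img/Pg(\d+)(?:_(\d+))?\.png$")


def pick_highest_png(filenames):
    """Return {page number: filename with the highest resolution suffix}.

    Hypothetical helper: files without a suffix are treated as
    resolution 0 (an assumption, their real resolution is unknown here).
    """
    best = {}
    for name in filenames:
        match = PAGE_IMG.match(name)
        if match is None:
            continue  # not a page image, ignore it
        page = int(match.group(2))
        # Compare resolutions as integers, not as strings.
        res = int(match.group(3)) if match.group(3) else 0
        if page not in best or res > best[page][0]:
            best[page] = (res, name)
    return {page: name for page, (_, name) in best.items()}


files = [
    "1/Img/Pg001.png", "1/Img/Pg001_157.png", "1/Img/Pg001_180.png",
    "2/Img/Pg002.png", "2/Img/Pg002_157.png", "2/Img/Pg002_180.png",
]
selected = pick_highest_png(files)
```

With the example list for LCS-1830-08-02-a, this selects the `_180` file for both pages.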
The chosen fixing approach was thus to:
- identify the issues for which the resolution used (recorded in the [issue-id]-image-info.json file created during scaling) was NOT the highest one available (in the corresponding issue's Document.zip archive containing all the images used as source);
- compute the rescaling factor (dest_res/curr_res), where dest_res was the smaller resolution (used to create the jp2 files) and curr_res was the largest one available;
- rescale the coordinates, which correspond to the largest resolution (curr_res), which is not the one used to generate the jp2 image, and according to which they need to be rescaled (dest_res).

However, upon implementation of this approach it has been found that:
- some issues lack their [issue-id]-image-info.json, detailing which image resolution was actually used.

Currently, the titles for which we know that coordinates need rescaling are: DLE, EXP, LBP, LES, LTF, LCG.
The titles which have issues with missing information that might or might not need rescaling are:
DLE, EXP, LBP, LES, LTF, LCG, LNF, LSE, LCR, LCS, JDF, LVE.
Note that for the titles for which we do not know for sure (not part of the first 5), no example of an issue requiring rescaling has yet been found. However, examples of issues that do not need coordinate rescaling have been found among the first 5.
Investigation into which rescaling could be applied in the uncertain cases is ongoing.
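As an illustration of the rescaling described above, the following sketch applies the dest_res/curr_res factor to a set of coordinates. It is only a sketch under assumptions: the function names are hypothetical, coordinates are assumed to be a flat list of numbers (for example [x, y, width, height]), and results are rounded to integers, none of which is confirmed by this issue.

```python
def needs_rescaling(dest_res, curr_res):
    """True when the resolution used for the jp2 (dest_res) is lower
    than the largest one available (curr_res)."""
    return dest_res < curr_res


def rescale_coords(coords, curr_res, dest_res):
    """Rescale coordinates expressed at curr_res to dest_res.

    Hypothetical helper: assumes `coords` is a flat list of numbers
    and that integer output is wanted.
    """
    if curr_res <= 0:
        raise ValueError("curr_res must be positive")
    factor = dest_res / curr_res  # the dest_res/curr_res ratio
    return [round(c * factor) for c in coords]


# Illustrative values only, not taken from any real issue.
box = [100, 200, 50, 20]
rescaled = rescale_coords(box, curr_res=180, dest_res=90)
```

When the two resolutions are equal the factor is 1 and the coordinates are returned unchanged, so the function can be applied uniformly once dest_res and curr_res are known for an issue.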