Ja/ignore tmp dir #985

Closed
wants to merge 38 commits into from
aurambaj
Collaborator

No description provided.

hylstonnb and others added 30 commits August 14, 2023 13:33
…pi. Bumped acu4j from v62.1 (which comes with okapi v0.36) to v64.2 which is the same version defined in webapp

typo fix
Introduce a delta-pull option to the third party sync command. When enabled, text unit imports only occur for batches that contain updates. The implementation stores the checksum of a translated file pulled from a third party provider for a locale; on a subsequent sync, it compares the checksum of the current file with the stored checksum.

If the checksums match, no changes have occurred and processing exits early for that locale.
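
The checksum comparison described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the class name `DeltaPullSketch`, the in-memory map standing in for persistent storage, and the choice of SHA-256 are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a per-locale checksum check; not Mojito's actual classes. */
final class DeltaPullSketch {
  // Stands in for whatever persistent store the real implementation uses (assumption).
  private final Map<String, String> checksumByLocale = new HashMap<>();

  /** Returns true when the pulled file differs from the last one seen for the locale. */
  boolean shouldImport(String locale, String fileContent) {
    String current = checksum(fileContent);
    // put() returns the previously stored checksum, or null on the first pull.
    String previous = checksumByLocale.put(locale, current);
    return !current.equals(previous);
  }

  /** Hex-encoded SHA-256 of the content; the digest algorithm is an assumption. */
  static String checksum(String content) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

In this sketch a locale whose file is unchanged returns `false`, which is the point where processing for that locale would exit early.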
* Added a config value to enable/disable the Quartz scheduler. Added a docker compose cluster to test the API & worker node config
This PR adds functionality to allow the configuration of multiple Quartz schedulers and introduces configuration properties to allow jobs to be scheduled to a specified Quartz scheduler.

Updates the docker-compose file docker-compose-api-worker.yml to spin up a single db & mojito api container alongside two mojito worker containers used to execute the Quartz jobs which can be utilised for testing.
…rsion supporting java 8 (#12)

extractNoteFromXMLComment fix - replaced textUnit.getProperty with the textUnitUtils.getNote method

getNote fix - return textUnit.getProperty(NoteAnnotation.LOC_NOTE) if the property is not null

adding empty line at the end of the yaml files. The source file has the empty line at the end, so okapi version upgrade should have fixed it.

XLIFFNoteAnnotation was deprecated and removed. NoteAnnotation class came as a replacement for the old XLIFFNoteAnnotation class.

XLIFFNote was deprecated and removed. Note class came as a replacement for the old XLIFFNote class.

Property.NOTE string constant was deprecated and removed. It was replaced by NoteAnnotation.LOC_NOTE. The value of the string constant is no longer the same. Property.NOTE value was 'note' and NoteAnnotation.LOC_NOTE value is 'developer'
If the text unit dto cache contains text units with the same md5 then the map creation will fail on duplicated keys.

The cache is now considered corrupted, and a valid list of text units is fetched from the database and persisted in the cache.

This is an edge case, but duplicate md5s can happen when restoring a database snapshot from prod to a dev environment where the text unit ids have diverged.
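
A minimal sketch of the failure mode and the fallback, under stated assumptions: `Md5CacheSketch`, its `TextUnit` stand-in, and the `database` supplier are hypothetical names, and writing the reloaded list back to the cache is left out. What is certain is that `Collectors.toMap` without a merge function throws `IllegalStateException` on duplicate keys.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical sketch; the real DTO and cache classes are not shown in this PR page. */
final class Md5CacheSketch {
  /** Minimal stand-in for a text unit DTO. */
  static final class TextUnit {
    final long id;
    final String md5;

    TextUnit(long id, String md5) {
      this.id = id;
      this.md5 = md5;
    }
  }

  /** Indexes cached text units by md5, reloading from the database on duplicates. */
  static Map<String, TextUnit> byMd5(List<TextUnit> cached, Supplier<List<TextUnit>> database) {
    try {
      return index(cached);
    } catch (IllegalStateException duplicateKey) {
      // Duplicate md5s: treat the cache as corrupted and rebuild from the database.
      return index(database.get());
    }
  }

  private static Map<String, TextUnit> index(List<TextUnit> units) {
    // toMap without a merge function throws IllegalStateException on duplicate keys.
    return units.stream().collect(Collectors.toMap(u -> u.md5, u -> u));
  }
}
```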
This should have been removed before pushing the change
This comment is not relevant anymore, removing.
…ptor circular dependency fix (#20)

moving method getURIForResource to RestTemplateUtil new class
Measure the time from job scheduled to execution, from execution to finish and from scheduled to finish
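
As an illustration of the three intervals named above, here is a small sketch using `java.time`. The class and field names are hypothetical, and how the commit actually records or reports these values is not shown on this page.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical holder for the three intervals described in the commit message. */
final class JobTimingSketch {
  final Duration scheduledToExecution;
  final Duration executionToFinish;
  final Duration scheduledToFinish;

  JobTimingSketch(Instant scheduled, Instant started, Instant finished) {
    // Time the job spent waiting before it started executing.
    this.scheduledToExecution = Duration.between(scheduled, started);
    // Time the job spent executing.
    this.executionToFinish = Duration.between(started, finished);
    // End-to-end latency, equal to the sum of the two intervals above.
    this.scheduledToFinish = Duration.between(scheduled, finished);
  }
}
```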
Measure the time it takes to get the branch tm text unit ids, and the time it takes to get the text unit dtos

log if the branch is larger than 100 text units.
This tracks the number of tasks that are marked as zombie tasks
The goal of these generators is to generate stable ids that point to the same source content between asset revisions. Specifically, when only the asset structure is modified or small source edits are made, this avoids generating a brand-new set of text units that would require full re-processing.

They should also reference translations in a stable way, and in case of duplicate sources allow for different translations to be provided.

CompareGeneratorsTest showcases different attributes of the generators.
Can test the extraction process in isolation
getSourceAsCodedHtml() takes a text unit and returns a string where the placeholders are replaced by HTML elements. The encoded string is meant to be provided to translation services (MT or human translation) to preserve the original code.

fromCodedHTML() is the opposite operation: it re-creates a text fragment with proper codes from a translation that contains HTML placeholders and the text unit from which it reads the original source and codes.
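
The round trip can be illustrated with plain strings. This is a sketch under loud assumptions: the real methods operate on Okapi text units and fragments, whereas here a `{name}` placeholder syntax and an `<x id="N"/>` element are invented purely for illustration, and HTML escaping of the surrounding text is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical string-based illustration; not the Okapi-based implementation. */
final class CodedHtmlSketch {
  // Assumed placeholder syntax for this sketch only: {name}.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\{[^{}]*\\}");
  // Assumed HTML element standing in for a code: <x id="N"/>.
  private static final Pattern CODED = Pattern.compile("<x id=\"(\\d+)\"/>");

  /** Replaces each placeholder with an HTML element numbered in source order. */
  static String toCodedHtml(String source) {
    Matcher m = PLACEHOLDER.matcher(source);
    StringBuffer out = new StringBuffer();
    int id = 0;
    while (m.find()) {
      m.appendReplacement(out, Matcher.quoteReplacement("<x id=\"" + id++ + "\"/>"));
    }
    m.appendTail(out);
    return out.toString();
  }

  /** Restores the original codes in a translation, reading them from the source. */
  static String fromCodedHtml(String translation, String source) {
    List<String> codes = new ArrayList<>();
    Matcher sourceMatcher = PLACEHOLDER.matcher(source);
    while (sourceMatcher.find()) {
      codes.add(sourceMatcher.group());
    }
    Matcher m = CODED.matcher(translation);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      // The id indexes into the codes collected from the source, so codes may be reordered.
      int id = Integer.parseInt(m.group(1));
      m.appendReplacement(out, Matcher.quoteReplacement(codes.get(id)));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

Because each element carries an id, a translator or MT engine can reorder the placeholders and the original codes are still restored in the right positions.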
…steps

- Changes to JSON filter to support setting code finder, and to enable HTML codes
- Use JSON filter to test the change to AbstractMd5ComputationStep and TranslateStep
- Add HtmlFilter, which extends the Okapi filter. Support text unit name generation using near stable ids. Also use the new HTML coded placeholder to support nested images in the paragraphs.
- This is in alpha to be able to make changes without having to support backward compatibility. Register the file type as alpha and use the filter config id override to load the HTML filter
- Use locale in the file name for the localized files
…he HTML filter

- Add a generic mechanism to localize properties of a document part. The filter adds the annotation DocumentPartPropertyAnnotation to mark a property as localizable, and to get it processed in the extraction and translation steps.
- Use it in the HTML filter to process the "image src attribute". That attribute does not require translation but might need adaptation per locale. This allows doing it in Mojito without any other form of post-processing on the HTML. A comment is used to indicate that the text unit should not be translated. We can have an improved cue later depending on the needs.
- Add filter options to HTML filter. Use it to enable/disable "image src" attribute processing.
This is not used anymore and is a remnant of a previous implementation.
maallen and others added 8 commits October 2, 2023 16:14
…ty sources

In this context, empty source means the empty string or a nbsp character. The "isTranslatable" field is set on
the text units that have an empty source by the HTML filter. The side effect is that those text units
will be skipped during the extraction process.

The option is "emptyAndNbspNotTranslatable" and can be passed as a regular filter option. It is "true" by default,
meaning the text unit with an empty source will be skipped.
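
The decision this option controls can be sketched as a single predicate. The class and method names are hypothetical, and the sketch assumes "a nbsp character" means the single character U+00A0; the actual filter code may test for it differently.

```java
/** Hypothetical sketch of the "emptyAndNbspNotTranslatable" decision. */
final class EmptySourceSketch {
  /**
   * Returns false (skip during extraction) when the option is enabled and the
   * source is the empty string or a single non-breaking space (assumed U+00A0).
   */
  static boolean isTranslatable(String source, boolean emptyAndNbspNotTranslatable) {
    boolean emptyOrNbsp = source.isEmpty() || source.equals("\u00A0");
    return !(emptyAndNbspNotTranslatable && emptyOrNbsp);
  }
}
```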
Keep only the new notification code moving forward. If old branches remain, notifications won't be sent for them anymore
setting indent xml value to 0 as the default value changed from 0 to 4 in java 11
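
One common way to pin the indent amount with the JDK's built-in transformer is sketched below. Whether this commit uses this exact property is an assumption: the page does not show the diff, and `XmlIndentSketch` is a hypothetical name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: serialize XML with line breaks but no leading indentation. */
final class XmlIndentSketch {
  static String serialize(String xml) {
    try {
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      transformer.setOutputProperty(OutputKeys.INDENT, "yes");
      // Set the amount explicitly so output does not depend on the JDK default.
      transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "0");
      StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Setting the value explicitly keeps generated files identical across Java versions, which matters when the output is compared against stored expectations.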
This will help debugging by quickly looking at the input of jobs
@aurambaj aurambaj closed this Jan 10, 2024

3 participants