Support workflow to facilitate translation of Metanorma documents #349
Comments
Professional translators do start with automated tools, it is true; but (a) they don't finish with them, and (b) they want a workflow where they can see both source and target. They're going to want to use their workflow tools; you can't just dump gibberish Japanese into an XML document and have them sort it out later in ASCII. We can, per what you suggest, insert Japanese for cleanup as duplications of context clauses. But this is a very big ask, and you should be talking to the professional translator who's actually going to do this, to work out reasonable tool support. |
And someone else is going to have to work on this. The text interleaving would be duplicating tags, using the |
"Professional" or not, this is something that is needed by someone who is translating the language. The label "professional" is a distraction. In this case here, we are talking about standards authoring. The workflow tool for authoring standards documents by "professional standards authors" is Metanorma. The Japanese author should be able to use Metanorma to:
|
Ronald, you are not understanding what I am saying. Professional translation is emphatically NOT a distraction, if the workflow is that of a translator using machine translation as a starting point for bulk translation. Such translators will use a translation workbench tool, such as (to pick the first instance I've googled) https://www.memsource.com/translation-software/ . Such a tool will include memorised custom equivalents that the translator has keyed in, templates, technical dictionaries, and whatever else the translator has put in place to make their life easier. A professional translator's environment is going to be that workbench. That is the environment they are going to work in. Metanorma is NOT a translation environment, it is an authoring tool, and their translation environment is going to have to integrate with Metanorma, in some way you will need to work out.
What you are proposing is to drop machine translation into Metanorma XML outside of the translator's workbench tool, and make them do all their refinements manually. I am telling you, professional translators will not find that adequate: you will be taking them away from their shortcuts and their technical dictionaries, which are normally integrated into their editor. So you will need to investigate further how translators go about translating marked-up documents while preserving markup in their tools.
I think it is quite likely that this is a solved problem for their workbench tools; and if it is a solved problem, that is all the more reason for us to use the existing tools' way of solving the problem, rather than imposing our own solution on them. I think doing our own solution is going to duplicate existing effort, and do a bad job of it, one that such translators cannot use. And that is why I make a point of saying "professional" translators: translators that routinely use translation workbench tools.
A non-professional translator, a subject matter expert for example, will quite happily follow the workflow you propose, of refining a machine translation manually, since they don't have existing workbench tools; they'll be quite happy, for that matter, to eyeball original and machine translated target in two separate windows of an editor, rather than a more integrated environment, where they could do things like mouseover words to get dictionary lookup. And for all I know, OGC may be translating their documents in such an ad hoc way. But if an SDO employs a professional translator, using translation tools, to do translations, then Metanorma will need to integrate with their workflow. And:
|
OK, given that the workflow envisioned is not one of a professional translator using a workbench:
|
I respectfully disagree:
i.e. We should use the Metanorma Semantic XML for translation purposes. |
The talk about "professional translators" is irrelevant to our task at hand right now. Here are the facts:
We just have to do whatever possible with these. |
Google will skip HTML markup, but not non-HTML XML markup (behaviour varies between languages). Serialising the Asciidoctor parse tree into pseudo-HTML is itself a major venture, requiring a new parser, and the Asciidoctor parse tree cannot be relied on as stable. The alternative is likely going to be quite lossy:
source Asciidoctor > source Metanorma XML > source Metanorma Pseudo-HTML (substituting arbitrary HTML tags for Metanorma tags) > translated Metanorma Pseudo-HTML > translated Metanorma XML > translated Asciidoctor
Indeed, it'll be lossy enough that any translator is going to need two text windows side by side, source Asciidoctor and output Asciidoctor, and they're going to have to do a lot of repair of the latter, copying from the former. If the document is clean (not much markup), this might be good enough. It's not a given that it will be. In XML, the provisos above become:
|
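To make the "substituting arbitrary HTML tags for Metanorma tags" step concrete, here is a minimal sketch of one possible round trip. It is not an existing Metanorma feature: the `data-mn-tag` attribute and both function names are hypothetical, the input is assumed to be namespace-free XML, and the round trip is only lossless if the translation engine returns every tag and attribute intact, which the comment above argues cannot be taken for granted.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute that remembers the original Metanorma tag name.
ORIG_TAG_ATTR = "data-mn-tag"


def to_pseudo_html(xml_text: str) -> str:
    """Rename every element to <span>, stashing the real tag in an attribute."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        el.set(ORIG_TAG_ATTR, el.tag)
        el.tag = "span"
    return ET.tostring(root, encoding="unicode")


def from_pseudo_html(html_text: str) -> str:
    """Restore original tag names from the stashed attribute, where present."""
    root = ET.fromstring(html_text)
    for el in root.iter():
        original = el.attrib.pop(ORIG_TAG_ATTR, None)
        if original is not None:
            el.tag = original
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '<clause id="a"><title>Scope</title><p>See <em>this</em>.</p></clause>'
    pseudo = to_pseudo_html(sample)
    print(pseudo)                    # what would be sent to the translation engine
    print(from_pseudo_html(pseudo))  # what comes back if no tags were damaged
```

Any element whose attribute is dropped by the engine stays a `<span>` after the return trip, which is one place where the manual repair described above would be needed.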
Unassigning myself, I won't have time to do this, and I've outlined what needs doing |
I found that LibreTranslate is a pretty good model that can be run locally. |
DeepL also |
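For reference, a minimal sketch of calling a locally running LibreTranslate instance over HTTP. It assumes the server listens on `http://localhost:5000` and exposes the `/translate` endpoint with the `q`, `source`, `target` and `format` fields as described in the LibreTranslate API documentation; the helper names are made up for illustration, and nothing here has been checked against a Metanorma document.

```python
import json
import urllib.request

# Assumption: a LibreTranslate server is running locally on its default port.
LIBRETRANSLATE_URL = "http://localhost:5000/translate"


def build_translate_request(text, source="en", target="ja", url=LIBRETRANSLATE_URL):
    """Build (but do not send) a JSON POST request for the /translate endpoint."""
    payload = {
        "q": text,
        "source": source,
        "target": target,
        "format": "html",  # ask the engine to treat the input as markup
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, **kwargs):
    """Send the request and return the translated string from the response."""
    with urllib.request.urlopen(build_translate_request(text, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))["translatedText"]
```

Building the request separately from sending it makes it possible to inspect the payload without a server running.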
Discussed with OGC Staff on 2023-11-06. More research needed before identifying a path forward. |
Processing the input text in Asciidoctor format using coradoc is a more effective way forward. |
@opoudjis to look into providing an example from some prior work. |
The work is the samples of metanorma-jis that we have done, just to show that we support i18n for Japanese. The documents are jis-z-5999 and jis-z-8301-2019. Gobe would like to show these to his Japanese colleagues as proof of concept, but only if they are public documents. @ronaldtse please clarify status of documents. |
Compiled an OGC standard using the JIS flavour of Metanorma with Japanese language for metalanguage, and sent to @ghobona as proof of concept. |
Deprioritise, will depend on new infrastructure emerging, including translation tools. |
OGC wishes to produce a Japanese translation of the CityGML 2.0 document encoded in Metanorma. (metanorma/ogc-citygml2#1).
I thought about it and the following workflow makes most sense. The challenge is to only translate "content", not syntax.
This is a preliminary workflow that nonetheless requires some thinking to realize.
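As a rough illustration of translating only "content" and not syntax, the sketch below walks an XML tree and passes just the text nodes to a translation callback, leaving tag names, attributes and the contents of selected elements untouched. The function name, the `translate` callback and the choice of skipped tags (`sourcecode`, `stem`) are assumptions made for the example, not part of the proposed workflow; the input is assumed to be namespace-free.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of elements whose text should never be translated.
SKIP_TAGS = frozenset({"sourcecode", "stem"})


def translate_content(xml_text, translate, skip_tags=SKIP_TAGS):
    """Apply `translate` to text nodes only; markup and skipped elements are kept."""
    root = ET.fromstring(xml_text)

    def visit(el, inside_skipped):
        skipped = inside_skipped or el.tag in skip_tags
        if el.text and el.text.strip() and not skipped:
            el.text = translate(el.text)
        for child in el:
            visit(child, skipped)
            # A child's tail is text that belongs to the parent element.
            if child.tail and child.tail.strip() and not skipped:
                child.tail = translate(child.tail)

    visit(root, False)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = '<clause id="scope"><title>Scope</title><p>Use <tt>x</tt> here.</p></clause>'
    # str.upper stands in for a real translation call.
    print(translate_content(doc, str.upper))
```

Translating each text node in isolation splits sentences at inline markup, so a real engine would lose sentence context; the sketch only shows where the content/syntax boundary sits.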