Support Japanese numerals #228
Comments
Very interesting to see the vertical layout. Thanks for all the work on this, @Intelligent2013! I don't work with vertical layout much, but the third image above looks more correct than the second image. The layout of the kanji numbers in the first image appears correct for the main clause numbers, but with the sub-clause numbering, the vertical style of ' |
Thank you @ReesePlews! Yes, you are right that the Japanese "e-Gov" website has all the Japanese laws. For example, this is the Constitution of Japan: For vertical layout, they have 3 options: 1 column, 2 columns and 4 columns. This is the law that establishes JIS: To save space, this is a screenshot of the 4-column layout (so it's not too tall to show here). It uses the list style:
The list style only uses a single full width space indentation to separate list levels. UPDATE: It seems that when Paragraphs are labeled, in the e-Gov website the paragraph label for the first paragraph is omitted, and subsequent paragraph labels exist. Not sure why the list item "1" is missing though. This doesn't seem to be an East Asian tradition. |
The 1st post updated - added 'edition number'. |
There are two elements to this. The first is to support Japanese numerals, and I can do that, sure: that's merely
The second is to work out where to use Japanese numerals instead of Arabic numerals. This should not be done on an ad hoc basis, and it should not be done independently in HTML and PDF: there needs to be a rule as to where it happens, and it needs to be done in Presentation XML. I have the bad feeling that this is going to end up as a document attribute. |
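A minimal sketch of the first element (rendering a number in Japanese numerals), written in Python for illustration only. The function name `to_kanji` and the 0 to 9999 range are assumptions made here; this is not the project's actual implementation:

```python
# Hypothetical helper: positional kanji numerals for 0..9999.
DIGITS = "〇一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # units, tens, hundreds, thousands


def to_kanji(n: int) -> str:
    """Render n in Japanese numerals, e.g. 22 -> 二十二."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return DIGITS[0]
    parts = []
    for power in range(3, -1, -1):
        digit = (n // 10**power) % 10
        if digit == 0:
            continue
        # 一 is dropped before 十, 百 and 千 (十, not 一十), but kept in the units place.
        if digit != 1 or power == 0:
            parts.append(DIGITS[digit])
        parts.append(UNITS[power])
    return "".join(parts)
```

The harder question raised in the comment, namely where such numerals should replace Arabic ones, is not something a conversion helper can answer.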
You mean the specification of list bullet styles per level being configurable? I'd (everyone would) love that. |
I don't even know if I can do that in HTML. Not without a lot of pain. And you need to say a lot more about where Japanese numbers are meant to show up. Numbering is done in code; I can make the xref counter output Japanese instead of Arabic numerals, but that means initialising each counter instance in isodoc, one for every block type and clause (figures, tables, requirements, etc etc etc). Without a coherent statement, you are not getting anything. |
@Intelligent2013 I just noticed this since @opoudjis raised it. They are meant to be in Japanese numerals too. |
PER LEVEL?! No you are not getting random list level specification PER LEVEL. ISO HTML CSS has 30 lines of custom code just to insert ")" after list numbers. No, what you're going to get is:
Ordered lists will rely on the Presentation XML feature of
I am considering this nothing more than a proof of concept. |
I'm going to realise this with the document attribute
|
@ronaldtse wants to generalise this to Arabic, Chinese, and Amharic. I have little inclination to do so, and this does not address the very real problem of what types of block are going to be Arabic and what local. But:
The nightmare scenario is:
I will not be implementing that. |
To make counters more configurable, I'm going to eventually set up configuration of all counters—starting value and style. But for now, I'm only going to expose that for clauses and lists. |
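To illustrate what a counter with a configurable starting value and style could look like, here is a hypothetical Python sketch. The `Counter` class, its `start` and `style` fields, and the `STYLES` table are invented for illustration and do not reflect the actual isodoc classes:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def arabic(n: int) -> str:
    return str(n)


def alphabetic(n: int) -> str:
    """1 -> a, 2 -> b, ... 26 -> z (no wrap-around in this sketch)."""
    if not 1 <= n <= 26:
        raise ValueError("this sketch only covers a..z")
    return chr(ord("a") + n - 1)


# Hypothetical registry of numbering styles; more could be added here.
STYLES: Dict[str, Callable[[int], str]] = {
    "arabic": arabic,
    "alphabetic": alphabetic,
}


@dataclass
class Counter:
    start: int = 1          # configurable starting value
    style: str = "arabic"   # configurable numbering style
    _next: int = field(init=False)

    def __post_init__(self) -> None:
        self._next = self.start

    def increment(self) -> str:
        """Return the label for the current item, then advance."""
        label = STYLES[self.style](self._next)
        self._next += 1
        return label
```

Keeping the style as a lookup key rather than hard-coding it means one counter class can serve clauses and lists with different configurations.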
I've got a problem: I want to assign config to counter classes based on config in the xref class (which knows about numbering styles from the Presentation XML metadata), but I don't want to redefine all the classes invoking them. So to exploit inheritance, I'm going to have to define these counter classes with methods invoked from the xref class. |
Not working yet... |
Also we need to support Japanese numerals in the publication date. I've updated the initial post. |
I am providing Japanese numbering in the Presentation XML, but there is a nightmare scenario where you provide Japanese numbering for page numbers. If you do need them, and if XSL-FO is not clever enough to do that automatically, I'll need to dump the numbers 1–1,000 in the localization strings. Let's not action that yet though... I'd be surprised if XSL-FO doesn't provide that natively somewhere. |
@opoudjis Apache FOP has the extension |
We need to localise the clause number delimiter, from half-width to full-width full stop, if Japanese numbering is used. And I'm going to use this as the opportunity to implement a fix to CJK punctuation called for in relaton/relaton-render#52, which I have not implemented to date because of @ronaldtse's indefensible notion that
is desirable punctuation. It is not: I reject with utmost vehemence any claim that it is (and so has Reese), and I am pressing ahead with the correct solution. Regardless of the document's main language, punctuation localisation will convert punctuation from half-width to full-width only if the characters on either side are CJK. So:
I am also going to bite the bullet and move Japanese number rendering to isodoc for xref counters; they already support Roman at top level. |
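The contextual punctuation rule described in this comment can be sketched roughly as follows, in Python for illustration. This is not the relaton-render code: the character ranges treated as CJK, the punctuation mapping, and the reading that both neighbours must be CJK are all assumptions made for this sketch:

```python
import re

# Assumed CJK coverage: hiragana, katakana and the main kanji block only.
CJK = r"\u3040-\u30ff\u4e00-\u9fff"

# Assumed half-width -> full-width mapping; only "." -> "。" appears in the thread.
FULLWIDTH = {".": "。", ",": "、", ":": "："}

# Match punctuation only when the characters on both sides are CJK.
PATTERN = re.compile(rf"(?<=[{CJK}])[.,:](?=[{CJK}])")


def localise_punctuation(text: str) -> str:
    """Replace half-width punctuation that sits between two CJK characters."""
    return PATTERN.sub(lambda m: FULLWIDTH[m.group()], text)
```

Because the lookbehind and lookahead consume no characters, consecutive delimiters such as those in a three-level clause number are each evaluated against their own neighbours, and Latin or numeric context is left untouched without needing any language tag.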
@Intelligent2013 The edition numbering works in testing, so I will need to investigate that. The list numbering will also be complicated. |
Reese, the point of what I have written is the following:
二.二 => 二。二 (although it looks like I will need to override this with middle-dot anyway) |
@opoudjis the Japanese "middle dot" delimiter is not the "full stop", they are different symbols. |
No, that's not what I asked for. The default for bibliographic entries is to be rendered in a suitable style, i.e. English in English, Japanese in Japanese. We could have Japanese in English or English in Japanese but that should not be the default. |
Bibliographic entries will routinely be mixed-language, with things like Japanese authors and English titles. The notion of a bibliographic entry being "just Japanese" or "just English" is naive and inflexible. It is also a nuisance on top of trying to work out what the language of a bibliographic entry is to begin with. (You think users are going to be marking it up as [lang=ja]? And then mark up titles individually as exceptions? When we can work out the script automatically through regex?) That's why working out whether to apply CJK punctuation contextually, rather than based solely on a language tag, has ALWAYS been the right way to proceed, and I am proceeding with it. Rereading, the default is indeed going to be CJK, but it will be overridden when the immediate context shows that full-width punctuation makes no sense (the surrounding characters are Latin). And I simply cannot trust users to exhaustively mark up references (let alone individual bits of references) to indicate language explicitly. |
As I have just acknowledged, which is why I am doing the refactoring. |
You're looking at the wrong file: I am generating
in the Japanese numbering version. You'll have a refresh soon. |
…se numbering in Japanese dates: #228
This is an update to JIS. JIS has Alphabetic numbering on its first level of ordered lists, and Arabic numbering on subsequent levels. I don't know what the provenance of the PDF sample is, and I do not care: I am not overriding JIS list numbering for some unasked-for proof of concept. I am implementing Japanese numbering to replace Arabic numbering in ordered lists ONLY where JIS sanctions that. |
As warned: HTML right now has no idea what to do with custom list labels. @Intelligent2013 The following should now have everything you need for this proof of concept. |
Ok. please note I need just |
Yuck, that's really ad hoc. OK... |
@Intelligent2013 Here you go. |
@opoudjis the edition number is OK also. Thanks! I've updated the initial post for note and example numbers:
|
I will not be actioning this at this time, because I need evidence that clients actually want this behaviour, and I am reasonably sure they won't be consistent about it. |
So, rather than get into a protracted discussion: I am closing this ticket as complete. The additional requirement stated for custom numbering of notes, examples, requirements, formulas, term notes, term examples, annexes, admonitions, ordered lists (as distinct from list items), definition lists, figures, subfigures, tables, could be satisfied in one of two ways:
The second approach is the only respectful way to engage with customers. It is also 200-300 lines of code for what is, at this stage, a proof of concept that nobody external has actually asked for, and that no external agency is exercising QA over. It is therefore not going to be a priority for me to work on until some agency actually does ask for it, and can articulate authoritatively whether they want each of their notes, examples, requirements, formulas, term notes, term examples, annexes, admonitions, ordered lists (as distinct from list items), definition lists, figures, subfigures, tables to be numbered Japanese or Arabic. I will create a ticket for this, and I will demote it to medium priority. |
Source issue: #226
Support Japanese numerals in
clause numbers
Example:
ordered list items
Example:
edition number
currently, there are two elements in the Presentation XML:
Example: 令和元年七月二十二日
Current Presentation XML:
<date type="published">令和元年7月22日</date>
If this task is complicated, then I'll find a way to do this via XSLT extensions in Java.
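For illustration, the date conversion requested above (令和元年7月22日 to 令和元年七月二十二日) could be sketched like this in Python. The helper names are hypothetical, and the sketch assumes that only one- and two-digit numbers (era year, month, day) need converting; it is not the Presentation XML or XSLT implementation:

```python
import re

DIGITS = "〇一二三四五六七八九"


def kanji_under_100(n: int) -> str:
    """Render 1..99 in Japanese numerals, e.g. 22 -> 二十二, 10 -> 十."""
    tens, ones = divmod(n, 10)
    lead = DIGITS[tens] if tens >= 2 else ""  # 二十, 三十, ... but plain 十 for 10..19
    ten = "十" if tens else ""
    unit = DIGITS[ones] if ones else ""
    return lead + ten + unit


def japanese_date(text: str) -> str:
    """Replace each run of ASCII digits (1..99) with Japanese numerals."""
    def convert(match: re.Match) -> str:
        n = int(match.group())
        # Larger numbers (e.g. a Western year) are left untouched in this sketch.
        return kanji_under_100(n) if 1 <= n <= 99 else match.group()

    return re.sub(r"[0-9]+", convert, text)
```

Text that is already in kanji, such as 元年, passes through unchanged because only digit runs are matched.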
@ronaldtse do we need to support two number formats - Arabic (1, 2, 3, ...) for usual documents and Japanese (一, ...) for vertical layout documents? Or only Japanese numbers?
Note: I don't know the reason, but the note numbers should be Arabic:
UPDATE after the comment