Special latex characters in alt-text #804
Well, tilde and superscript are a bit special, but besides this:
OK, I tried |
Arguably, standard commands like |
Some publishers now require adding alternative texts in their tex files, and authors who don't care much about tags write just Alternative
where Another idea that maybe |
hm, no. |
There are a few dozen such commands, but for what it is worth they are standard input methods, and if they appear in, say, a heading they should be replaced by something suitable when the text is moved to the bookmark, for example. So I think there is some argument for having purify handle them. On the other hand, it is a noticeable overhead for a marginal use case. If you could get all special chars using |
I can certainly adjust. A related issue for me is that currently |
TLC3 I-768 to I-776 is what is documented as encoding-specific commands. It's a mouthful already, plus packages (e.g. babel) might add more, so you would also need an interface to add to whatever list is automatically handled. Redefining all of them is not feasible. Instead, I think what should be done is to define a PU encoding and then provide definitions in that encoding, and during purifying you change to that encoding. That then uses the encoding-change approach to avoid doing all the redefinitions and only does them on the fly when they actually show up in the input (handwaving, may not work with the purify approach easily)
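A minimal sketch of the encoding-based idea described above (the sample commands and their replacements are purely illustrative, not the actual puenc.def or purify implementation):

```latex
% Sketch only: declare purify-time replacements once, as
% encoding-specific definitions in a dedicated encoding.
\DeclareFontEncoding{PU}{}{}
\DeclareTextCommand{\ss}{PU}{ss}
\DeclareTextCommand{\ae}{PU}{ae}
% At purify time the current encoding would be switched to PU; an
% encoding-specific command then picks up its PU definition on the
% fly, so only the commands that actually appear in the input are
% ever redefined.
```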
@FrankMittelbach Surely that's not much worse than loading |
Much worse. puenc.def is loaded once, but the redefinitions for the encoding-specific commands would happen each time purify is done. In contrast, a text encoding-specific command checks if it already has a definition suitable for the current encoding and if not changes to the one in the right encoding. But that happens only for those encoding-specific commands that actually show up, so only a few (if any), not a few hundred each time, and it is all done expandably.
Note that special chars are already covered:

```latex
\text_declare_purify_equivalent:Nn \\ { }
\tl_map_inline:nn
  { \{ \} \# \$ \% \_ }
  { \text_declare_purify_equivalent:Ne #1 { \cs_to_str:N #1 } }
```
But this is what I mean: you do these mappings each time, even if none of the commands show up. Not a problem for 5, but a bit different if it's a few hundred. In contrast, if we are in a PU encoding, the expansion of \{ would check, find it is T1, change to \PU\{, and run that
@FrankMittelbach I have a feeling we are talking at cross-purposes here! When you pass something like |
ah ok, so your purify does something similar to what the encoding specific command mechanism does (makes me wonder if it could have used that mechanism in the first place -- probably not as you have to get rid of other stuff) |
No, it's more-or-less the same as PU. There, we have 100s of \DeclareTextCommand ... which store the data (once) and then are looked up in the hash table. For |
Yes, very similar to encoding and of course even more similar to what |
I don't think that we need to cover all LaTeX input in alt text. TeX in |
In alt text we need some encoding where we can hide any character from TeX (like HTML, which has entities so that every character can be typed via a Unicode reference).
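As a sketch of what such an entity-like escape could look like in expl3, the name \altchar is hypothetical (not an existing interface), and it assumes \codepoint_str_generate:n from l3unicode:

```latex
\ExplSyntaxOn
% Hypothetical entity-style escape: \altchar{007B} would produce the
% literal character U+007B ("{") as a harmless string character,
% analogous to HTML's &#x7B; reference.
\cs_new:Npn \altchar #1 { \codepoint_str_generate:n { "#1 } }
\ExplSyntaxOff
```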
For example, what happens if I want to describe brace characters? I can hide one with `\{`, but in alt text the backslash is useless and needs to be stripped. Now I tried directly adding common special characters and I see that:
- `~` is changed to a space
- `%` is gone (of course)
- `#` is doubled

example:
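A hedged sketch of how some remaining specials could be mapped with the same purify mechanism quoted earlier; whether characters like `~` and `%` even survive to the purify stage depends on how the alt text is read in, so this may not work in practice:

```latex
\ExplSyntaxOn
% Sketch: give the command forms of further specials literal string
% equivalents at purification time, alongside the existing mappings
% for \{ \} \# \$ \% \_ .
\text_declare_purify_equivalent:Nn \~ { \c_tilde_str }
\text_declare_purify_equivalent:Nn \& { \c_ampersand_str }
\ExplSyntaxOff
```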