Switching to UTF-8 #29
-
I think moving this repo to UTF-8 right away is a good move. Beta8 is not far away.
-
UTF-8 Conversion: Done!
@thoni56, the conversion worked out really nicely. Now both the English and Spanish libraries (and any other ALAN files) are all in UTF-8. For the ISO to UTF-8-BOM conversion I used PowerGREP 5, a commercial GUI tool which I purchased some time ago, and which I just discovered can also handle encoding conversions and validations. I couldn't find any free tool that can add a BOM while converting (iconv doesn't support adding a BOM in conversions to UTF).
If you want me to tweak your Swedish branch, I can convert all ALAN files to UTF-8-BOM in a breeze, since the tool has already been set up for ALAN-specific operations. In that case, I'd be adding a new commit to your dev branch (and could also check if I can manage to delete the ...). Let me know.
PS: This change from ISO to UTF-8 sources is a monumental step in ALAN history, and it's so nice to take part in it.
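For anyone without a tool like PowerGREP, the ISO to UTF-8-BOM step can probably be approximated in a few lines of Ruby. This is only a sketch, not the procedure used above: it assumes "ISO" means ISO-8859-1, and `iso_to_utf8_bom` and `convert_file` are hypothetical helper names.

```ruby
# Sketch: re-encode ISO-8859-1 bytes as UTF-8 and prepend a BOM.
# Assumption: the "ISO" sources are ISO-8859-1 (Latin-1).
# Helper names are hypothetical, not part of any ALAN tooling.

UTF8_BOM = "\xEF\xBB\xBF".b.freeze

# Takes raw bytes and returns raw bytes (binary string) with a UTF-8 BOM.
def iso_to_utf8_bom(bytes)
  # Tag the raw bytes as Latin-1, then transcode them to UTF-8.
  text = bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  UTF8_BOM + text.b
end

# Reads src as raw bytes and writes the converted bytes to dst.
def convert_file(src, dst)
  File.binwrite(dst, iso_to_utf8_bom(File.binread(src)))
end
```

Note that the sketch does not detect the input encoding, so running it twice on the same file would double-encode any non-ASCII characters.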
-
@thoni56, I've finally managed to create a Ruby method that invokes ARun and generates a transcript with the same filename as the solution file (instead of the storyfile) by redirecting ARun's output to a file.
The method strips the BOM from the solution file before feeding it to ARun, so the previous problem of the BOM leaking into the generated transcript is solved, and without extra file operations. Basically, Ruby abhors the BOM: it can strip any BOM at read time (and doesn't even natively support writing a BOM to file), so I didn't really have to do much. I just read the solution and pipe it to ARun; the rest is done with the invocation parameters and the redirection.
Now, I've updated the project Rakefile locally to use this new method, which makes it better and even slimmer. The problem is that I had to implement the ISO version of the method, since this project is still using ISO encoded files.
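The approach described above could look roughly like the following. This is a sketch under assumptions, not the actual method from the Rakefile: the helper names are hypothetical, the `.a3t` transcript extension is an assumed default, and the ARun command line is passed in by the caller rather than guessed here. It uses Ruby's `BOM|UTF-8` read mode to drop a leading BOM and `Open3.capture2` to pipe the text to the interpreter.

```ruby
require 'open3'

# Hypothetical helper: read a UTF-8 file, dropping a leading BOM if present.
def read_without_bom(path)
  File.read(path, mode: 'r:BOM|UTF-8')
end

# Hypothetical helper: same directory and basename as the solution file,
# with a different extension.
def transcript_path(solution_path, ext)
  dir = File.dirname(solution_path)
  base = File.basename(solution_path, '.*')
  File.join(dir, base + ext)
end

# command: the interpreter invocation as an array, supplied by the caller
#          (the exact ARun options are not assumed here).
# ext:     transcript extension; '.a3t' is an assumption.
# Returns the transcript path and whether the command exited successfully.
def run_with_solution(command, solution_path, ext: '.a3t')
  input = read_without_bom(solution_path)
  # Feed the BOM-free solution on stdin and capture everything on stdout.
  output, status = Open3.capture2(*command, stdin_data: input)
  out = transcript_path(solution_path, ext)
  File.write(out, output)
  [out, status.success?]
end
```

An ISO variant would only need a different read mode in `read_without_bom`, which is one reason to keep the file reading in its own helper.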
I was thinking that we should migrate all ALAN files to UTF-8, so the project will be ready for the upcoming ALAN release, which supports UTF-8 — surely ALAN Beta8 will be released before any of these libraries reaches v1.0.0, and switching to UTF-8 will allow us to start using the same Rake modules in other repositories too (starting with the StdLib repo, now that I finally solved the transcripts problem).
I've also moved the various ALAN and AsciiDoc helpers to external Ruby files, so they can be shared with other repos, independently of their project-specific Rakefile (updating these modules will only require copying them to each repo, which is not much work really).
Are you OK with moving on to using UTF-8 here? If you give me the OK, I'll update the repository configurations and re-encode all the ALAN sources and solutions in the repository, but then we'll have to do the same in our dev branches for the Italian and Swedish libraries too (since after rebasing on main they would most likely stop working).