Add feature to parse text in the lesson definition. #202

Open
wants to merge 4 commits into
base: master

odeleongt

Backward-compatible changes that let a lesson definition request that some marker (e.g. -_-) be replaced with text generated by an R expression when the lesson is loaded.

Written to fix an issue due to locale-specific translation of character strings with strptime.

Fixes #201.

The lesson specification should be written as in odeleongt/swirl_courses@53ff493.

@odeleongt
Author

I suppose using markers such as _1_, _2_ ... _n_ would be interesting, resulting in:

  • No need to define the Parse marker in the metadata.
  • The ability to define up to n markers for a single question and replace them with one of 1..n elements defined in the question's Parse field.

It could even be implemented with a named vector of replacements in the question's Parse field, which would make question definitions easier to read and the replacements easier to track during design, like:

# The question definition would include a Parse field with an associative array
# of the replacement markers (keys) and the expressions to use as replacement
# (values), such as "Parse: {two: 2, one: 1}" (inline, or as block)
temp <- yaml::yaml.load("- Class: cmd_question\n  Output: Test\n  Parse: \n    two: sqrt(4)\n    one: 3-2\n")
expressions <- temp[[1]]$Parse

# Evaluate the expressions
expressions <- sapply(expressions, function(x) eval(parse(text=x)))

# Each field would be searched for the replacement markers
text <- "This is test number _one_, this one is number _two_. _one_ comes again."
text

# Find the markers
found <- gregexpr('_([^_]*)_', text)
markers <- gsub('_', '', unlist(regmatches(text, found)), fixed=TRUE)

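# Fill in with the results of the expressions, converted explicitly to
# character strings before they are written back into the text
parsed <- as.character(expressions[markers])
regmatches(text, found) <- list(parsed)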

text

Probably _ is not the best pick for a marker delimiter, but I can't come up with a better one right now. Regardless, it could be customized in the Parse field in the metadata.
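
A minimal sketch of how a customizable delimiter might look, assuming the delimiters are passed as plain strings. The function name replace_markers and its arguments are hypothetical and not part of swirl or of this pull request; unknown markers are left untouched rather than replaced with NA.

```r
# Hypothetical helper: replace <open>name<close> markers in `text`
# with the matching element of the named vector `values`.
replace_markers <- function(text, values, open = "_", close = "_") {
  # \Q...\E makes PCRE treat the delimiters literally, so "{{" or "$" work too
  pattern <- paste0("\\Q", open, "\\E([A-Za-z0-9]+)\\Q", close, "\\E")
  found <- gregexpr(pattern, text, perl = TRUE)
  matched <- regmatches(text, found)
  regmatches(text, found) <- lapply(matched, function(m) {
    if (length(m) == 0) return(character(0))
    # Strip the delimiters to recover the marker names
    keys <- substr(m, nchar(open) + 1, nchar(m) - nchar(close))
    # Keep markers without a defined value as they are
    ifelse(keys %in% names(values), as.character(values[keys]), m)
  })
  text
}

values <- c(two = sqrt(4), one = 3 - 2)
replace_markers("Number _one_, then number _two_.", values)
replace_markers("Number {{one}}, then {{three}}.", values, open = "{{", close = "}}")
```

Using a two-character delimiter such as {{ and }} would reduce the chance of colliding with underscores that appear in ordinary lesson text.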

@seankross
Member

I like this idea; however, I think we should use whisker as the template engine.

We could specify a localization file that would look something like this:

- locale: en_US.UTF-8
  hello: Good morning!

- locale: de_DE.UTF-8
  hello: Guten Morgen!

And then the question in the lesson.yaml file would look like this:

- Class: text
  Output: {{{hello}}}

We would render the lesson.yaml file with the correct localization info right before we load the lesson into swirl.
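
A rough sketch of that rendering step, assuming the whisker and yaml packages are installed. The helper pick_locale and the way the locale entry is chosen are assumptions made for illustration; nothing here is existing swirl code.

```r
library(whisker)
library(yaml)

# Localization data in the shape proposed above (inlined here for the example)
locale_yaml <- "
- locale: en_US.UTF-8
  hello: Good morning!
- locale: de_DE.UTF-8
  hello: Guten Morgen!
"

# The lesson is kept as raw text until the template has been rendered
lesson_template <- "- Class: text\n  Output: {{{hello}}}\n"

locales <- yaml.load(locale_yaml)

# Hypothetical selection rule: exact match on the locale name,
# falling back to the first entry when nothing matches
pick_locale <- function(locales, wanted) {
  hit <- Filter(function(entry) identical(entry$locale, wanted), locales)
  if (length(hit) > 0) hit[[1]] else locales[[1]]
}

strings <- pick_locale(locales, "de_DE.UTF-8")

# Triple braces tell whisker not to HTML-escape the inserted text
rendered <- whisker.render(lesson_template, data = strings)
lesson <- yaml.load(rendered)
lesson[[1]]$Output
```

Rendering before parsing matters here: the template is treated as plain text first, and only the rendered result is read as YAML.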

@ncarchedi
Member

@odeleongt Thanks for the pull request! I think this is a really cool idea, but I agree with @seankross that it's worth using the whisker package if we're going to do it.

@odeleongt
Author

Yeah, I thought about whisker, but I have never used it, so I just threw this code together to record the idea.

Whisker looks great for managing translations (e.g. it seems easy enough to extract the language from the locale), but it looks like it would be difficult to design a lesson that has to account for every locale.

I think it is really out of scope to try to provide static, comprehensive localization info, for instance to work with dates/times, currency or numeric representations (there are many options related to ?locale, including system-specific naming conventions). That is why I thought of using a locale-independent representation (e.g. x = as.POSIXlt("1986-10-17 08:24")) to define the value, and letting the system generate the locale-specific representation (e.g. format(x, format = "%B %d, %Y %H:%M")) at load time.
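
As a small illustration of that idea, using only base R and the two expressions quoted above (the variable name shown is just for the example):

```r
# Locale-independent definition of the value
x <- as.POSIXlt("1986-10-17 08:24")

# Locale-specific representation, produced when the lesson is loaded.
# %B expands to the full month name according to the session's LC_TIME setting.
shown <- format(x, format = "%B %d, %Y %H:%M")
shown

# The current LC_TIME setting, which decides the month name used in `shown`
Sys.getlocale("LC_TIME")
```

The lesson author writes only the expression; the month name that the learner sees is whatever their own session produces.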

If this is to be implemented, whisker seems like the correct choice for templating, but I do think it would be useful to include the option to parse code at startup in addition to defining locale/language specific values.

@odeleongt
Author

I updated the pull request to use whisker to process the lesson definition file.

It checks if a locale.yaml file exists within the lesson path (code in R/Menu.R/loadLesson.default). If not, it continues as normal.

If the file exists, it loads the file, loads the lesson, replaces the text, saves the lesson as a temporary file and returns a new file path for the lesson (code in R/parse_content.R/localize_lesson). Then it continues as normal.
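
The flow described above could be sketched roughly as follows. This is not the code from the pull request: the function name localize_lesson_sketch is made up, locale.yaml is assumed to be a flat mapping of marker names to replacement text, and the whisker and yaml packages are assumed to be installed.

```r
# Hypothetical sketch of the load-time localization step
localize_lesson_sketch <- function(lesson_path) {
  locale_path <- file.path(dirname(lesson_path), "locale.yaml")

  # No localization file: continue as normal with the original lesson
  if (!file.exists(locale_path)) return(lesson_path)

  # Load the replacements and the raw lesson text
  replacements <- yaml::yaml.load_file(locale_path)
  template <- paste(readLines(lesson_path, warn = FALSE), collapse = "\n")

  # Replace the markers and save the result as a temporary lesson file
  rendered <- whisker::whisker.render(template, data = replacements)
  out_path <- tempfile(fileext = ".yaml")
  writeLines(rendered, out_path)

  # The caller then loads the lesson from this new path
  out_path
}

# Usage with a throwaway lesson directory
lesson_dir <- tempfile("lesson_")
dir.create(lesson_dir)
lesson_path <- file.path(lesson_dir, "lesson.yaml")
writeLines(c("- Class: text", "  Output: {{{hello}}}"), lesson_path)
writeLines("hello: Hola", file.path(lesson_dir, "locale.yaml"))
readLines(localize_lesson_sketch(lesson_path))
```

Writing to a temporary file leaves the original lesson.yaml untouched, so the rest of the loading code can stay unchanged.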

There is only a placeholder for processing the localization file. How would you go about it? Defining text replacements based on Sys.getlocale(category = "LC_ALL") or even Sys.getenv(x = "LANGUAGE") seems cumbersome and prone to failure, due to the variety of locale names used on different systems.
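
To illustrate why this gets cumbersome, here is a hedged sketch of a helper that guesses a language code from a locale name. The function guess_language and its lookup table are hypothetical and deliberately incomplete; the example locale strings are only typical shapes, and real systems use more variants than this handles.

```r
# Hypothetical helper: guess a language code from a locale string
guess_language <- function(locale = Sys.getlocale("LC_CTYPE")) {
  # POSIX-style names such as "es_GT.UTF-8" or "de_DE"
  if (grepl("^[a-z]{2,3}(_[A-Z]{2})?([.@].*)?$", locale)) {
    return(sub("^([a-z]{2,3}).*$", "\\1", locale))
  }
  # Windows-style names such as "Spanish_Guatemala.1252" need a lookup table
  known <- c(English = "en", Spanish = "es", German = "de")
  name <- sub("[_.].*$", "", locale)
  if (name %in% names(known)) return(unname(known[name]))
  # "C", "POSIX" or anything unrecognized
  NA_character_
}

guess_language("es_GT.UTF-8")
guess_language("Spanish_Guatemala.1252")
guess_language("C")
```

Every naming convention needs its own branch or table, which is the maintenance burden described above.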

@odeleongt
Author

I suppose that the first element of the locale.yaml file could be used to define rules on how to select the rest of the elements. Another option (the one I like most) is to use each element for one marker to replace, and include marker-specific rules to pick a value.

Given the variety of locale naming conventions, I think that most uses would come from evaluating R expressions rather than defining localization rules. I think it would be a nice feature (and would help solve the problem of localization by defining locale-independent expressions which evaluate to the correct locale-dependent representation), but I don't know if it is within the goals of swirl and worth the hassle.

@ncarchedi
Member

@odeleongt Thanks again for putting this together. I have a full plate at the moment and may not have a chance to fully review/test this for another week or two. Perhaps @seankross and @WilCrofter could give their blessing in the meantime.

@WilCrofter
Contributor

@ncarchedi Haven't been following but will do what I can as time permits.

@ncarchedi
Member

Where do we stand on this? I apologize for being so out of the loop.

@odeleongt
Author

Thanks for the follow-up. I've been thinking about this, and it seems that the work I proposed would be useful only for minor tweaking of a lesson as it is displayed to the user.

What happens if someone needs to set up a swirl training environment in a non-English-speaking location? For instance, my employer is requesting some very specific R/epi training modules, and I would like to use swirl to provide self-paced practice materials and also enable them to access the suggested courses; the problem is that the audience exclusively speaks Spanish.

I'm planning to translate the suggested courses (no big deal) and would prepare the specific training modules in Spanish, but given that the intended audience is also new to R / scripting I think that even presenting the messages and menus in English could deter them from effectively following the course.

To fix this, two things would be needed:

  • Provide for automatic selection of alternative lesson definition files based on the session locale (or user-defined language options, which could be pre-configured if needed). That would allow the suggested lessons to be included in multiple languages.
  • Internationalization of the swirl package. I think this would be a lot more cumbersome given the variety of ways that swirl passes messages to the console.
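
The first point could be sketched roughly like this. The file naming scheme (lesson_<language>.yaml next to lesson.yaml), the function name select_lesson_file and the option name swirl_language_sketch are all assumptions for illustration; swirl does not define any of them.

```r
# Hypothetical: choose a language-specific lesson file when one exists
select_lesson_file <- function(lesson_dir,
                               language = getOption("swirl_language_sketch", "en")) {
  candidate <- file.path(lesson_dir, paste0("lesson_", language, ".yaml"))
  # Fall back to the default lesson definition when there is no translation
  if (file.exists(candidate)) candidate else file.path(lesson_dir, "lesson.yaml")
}

# Usage with a throwaway lesson directory holding a default and a Spanish file
lesson_dir <- tempfile("course_")
dir.create(lesson_dir)
file.create(file.path(lesson_dir, c("lesson.yaml", "lesson_es.yaml")))

basename(select_lesson_file(lesson_dir, language = "es"))
basename(select_lesson_file(lesson_dir, language = "fr"))
```

A pre-configured option would let a training environment force Spanish without depending on how each machine names its locale.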

If you are interested, and we can work out an organized plan to do it, I could get my employer to allow me some time to work on it.

@WilCrofter
Contributor

Earlier in development, we considered internationalization of swirl menus etc. The prototype is in swirl's languages branch. The effort has not been a priority for a very long time, and I doubt the branch is current. I could try to update it if anyone is interested. Let me know.

The general idea was that menu content be specified as key:value pairs in yaml files. Keys were the beginnings of the English (en) values, long enough to identify the specific item. I used Google Translate to create a Spanish yaml file which, I'm sure, is inadequate but demonstrated the idea.
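
A minimal sketch of that lookup, assuming the translation table has already been loaded into a named list. The strings and the function name translate are invented for the example and are not taken from the languages branch.

```r
# Hypothetical translation table: each key is the beginning of an English message
es <- list(
  "Please choose a course" = "Por favor, elija un curso.",
  "Please choose a lesson" = "Por favor, elija una lección."
)

# Return the translation whose key starts the message, or the message itself
translate <- function(message, table) {
  keys <- names(table)
  if (is.null(keys)) return(message)
  hits <- keys[startsWith(message, keys)]
  if (length(hits) == 0) return(message)
  # Prefer the longest key when several prefixes match
  table[[hits[which.max(nchar(hits))]]]
}

translate("Please choose a course, or type 0 to exit swirl.", es)
translate("An untranslated message", es)
```

Falling back to the English text keeps menus usable while a translation file is still incomplete.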

Development

Successfully merging this pull request may close these issues.

Issue due to locale-specific translation of character strings with strptime
4 participants