diff --git a/usage-rmd-and-github.Rmd b/usage-rmd-and-github.Rmd index 5224e40..97753bb 100644 --- a/usage-rmd-and-github.Rmd +++ b/usage-rmd-and-github.Rmd @@ -1,6 +1,8 @@ # Test drive R Markdown {#rmd-test-drive} -We will author an R Markdown document and render it to HTML. We discuss how to keep the intermediate Markdown file, the figures, and what to commit to Git and push to GitHub. If GitHub is the primary venue, we render directly to GitHub-flavored markdown and never create HTML. +We will author an R Markdown document and render it to HTML. +We discuss how to keep the intermediate Markdown file, the figures, and what to commit to Git and push to GitHub. +If GitHub is the primary venue, we render directly to GitHub-flavored markdown and never create HTML. Here is the official R Markdown documentation: @@ -10,24 +12,39 @@ We'll practice with RStudio's boilerplate R Markdown document. Launch RStudio in a Project that is a Git repo that is connected to a GitHub repo. -We are modelling "walk before you run" here. It is best to increase complexity in small increments. We test our system's ability to render the ["hello world"](http://en.wikipedia.org/wiki/%22Hello,_world!%22_program) of R Markdown documents before we muddy the waters with our own, probably buggy, documents. +We are modelling "walk before you run" here. +It is best to increase complexity in small increments. +We test our system's ability to render the ["hello world"](http://en.wikipedia.org/wiki/%22Hello,_world!%22_program) of R Markdown documents before we muddy the waters with our own, probably buggy, documents. Do this: *File > New File > R Markdown ...* - - Give it an informative title. This will appear in the document but does not necessarily have anything to do with the file's name. But the title and filename should be related! Why confuse yourself? The title is for human eyeballs, so it can contain spaces and punctuation. The filename is for humans and computers, so it should have similar words in it but no spaces and no punctuation. - - Accept the default Author or edit if you wish. - - Accept the default output format of HTML. - - Click OK. - -Save this document to a reasonable filename and location. The filename should end in `.Rmd` or `.rmd`. Save in the top-level of this RStudio project and Git repository, that is also current working directory. Trust me on this and do this for a while. - -You might want to commit here. So you can see what's about to change ... - -Click on "Knit HTML" or do *File > Knit Document*. RStudio should display a preview of the resulting HTML. Also look at the file browser. You should see the R Markdown document, i.e. `foo.Rmd` AND the resulting HTML `foo.html`. +* Give it an informative title. This will appear in the document but does not + necessarily have anything to do with the file's name. But the title and + filename should be related! Why confuse yourself? The title is for human + eyeballs, so it can contain spaces and punctuation. The filename is for humans + and computers, so it should have similar words in it but no spaces and no + punctuation. +* Accept the default Author or edit if you wish. +* Accept the default output format of HTML. +* Click OK. + +Save this document to a reasonable filename and location. +The filename should end in `.Rmd` or `.rmd`. +Save in the top-level of this RStudio project and Git repository, that is also current working directory. +Trust me on this and do this for a while. + +You might want to commit at this point. +That will help you see exactly what's happening with your files, because this will appear as a "diff" in the Git pane. +Making change very visible is one of the big benefits of using Git. + +Click on "Knit HTML" or do *File > Knit Document*. +RStudio should display a preview of the resulting HTML. +Also look at the file browser. +You should see the original R Markdown document, i.e. `foo.Rmd` AND the resulting HTML `foo.html`. Congratulations, you've just made your first reproducible report with R Markdown. -You might want to commit here. +This is another good time to commit changes. ## Push to GitHub @@ -35,60 +52,97 @@ Push the current state to GitHub. Go visit it in the browser. -Do you see the new files? An R Markdown document and the associated HTML? Visit both in the browser. Verify this: +Do you see the new files? +An R Markdown document and the associated HTML? +Visit both in the browser. +Verify this: - * Rmd is quite readable. But the output is obviously not there. - * HTML is ugly. +* Rmd is quite readable. But the output is obviously not there. +* HTML is ugly. ## Output format -Do you really want HTML? Do you only want HTML? If so, you can skip this step! +Do you really want HTML? +Do you only want HTML? +Are you absolutely sure? +If so, you can skip this step! -The magical process that turns your R Markdown to HTML is like so: `foo.Rmd --> foo.md --> foo.html`. Note the intermediate markdown, `foo.md`. By default RStudio discards this, but you might want to hold on to that markdown. +The magical process that turns your R Markdown to HTML is like so: -Why? GitHub gives very special treatment to markdown files. They are rendered in an almost HTML-like way. This is great because it preserves all the charms of plain text but gives you a pseudo-webpage for free when you visit the file in the browser. In contrast, HTML is rendered as plain text on GitHub and you'll have to take special measures to see it the way you want. +``` +foo.Rmd --> foo.md --> foo.html +``` +Note the intermediate markdown, `foo.md`. +By default RStudio discards this, but you might want to hold on to that markdown file! -In many cases, you *only want the markdown*. In that case, we switch the output format to `github_document`. This means render will be `foo.Rmd --> foo.md`, where `foo.md` is GitHub-flavored markdown. If you still want the HTML *but also the intermediate markdown*, there's a way to request that too. +Why? +GitHub gives very special treatment to markdown files. +They are rendered in an almost HTML-like way. +This is great because it preserves all the charms of plain text, but gives you a pseudo-webpage for free when you visit the file in the browser. +In contrast, HTML is rendered as plain text on GitHub and you'll have to take special measures to see it the way you want. -**Output format** is one of the many things we can control in the YAML frontmatter -- the text at the top of your file between leading and trailing lines of `---`. +In many cases, you *only want the markdown*. +In that case, we switch the output format to `github_document`. +This means rendering look like this: -You can make some changes via the RStudio IDE: click on the "gear" in the top bar of the source editor, near the "Knit HTML" button. Select "Output options" and go to the Advanced tab and check "Keep markdown source file." Your YAML should now look more like this: +``` +foo.Rmd --> foo.md +``` + +where `foo.md` is GitHub-flavored markdown. +If you still want the HTML *but also the intermediate markdown*, there's a way to request that too. + +This point we're making about the importance of `.md` files is why so many R packages have a `NEWS.md` file and `README.md`, often generated from `README.Rmd`. + +**Output format** is one of the many things we can control in the YAML frontmatter of `.Rmd` documents, i.e. the text at the top of your file between leading and trailing lines of `---`. + +You can make some YAML changes via the RStudio IDE: click on the "gear" in the top bar of the source editor, near the "Knit HTML" button. +Select "Output options" and go to the Advanced tab and check "Keep markdown source file." +Your YAML should now look more like this: ``` yaml - --- - title: "Something fascinating" - author: "Jenny Bryan" - date: "`r format(Sys.Date())`" - output: - html_document: - keep_md: true - --- +--- +title: "Something fascinating" +author: "Jenny Bryan" +date: "`r format(Sys.Date())`" +output: + html_document: + keep_md: true +--- ``` -You should have gained the line `keep_md: true`. You can also simply edit the file yourself to achieve this. +You should have gained the line `keep_md: true`. +You can also simply edit the file yourself to achieve this. +The IDE only exposes a small fraction of what's possible to configure in the YAML. -In fact this hand-edit is necessary if you want to keep only markdown and get GitHub-flavored markdown. In that case, make your YAML look like this: +In fact, a hand-edit is necessary if you want to keep only markdown and get GitHub-flavored markdown. +In that case, make your YAML look like this: ``` yaml - --- - title: "Something fascinating" - author: "Jenny Bryan" - date: "`r format(Sys.Date())`" - output: github_document - --- +--- +title: "Something fascinating" +author: "Jenny Bryan" +date: "`r format(Sys.Date())`" +output: github_document +--- ``` Save! -You might want to commit here. +You might want to commit at this point. Render via "Knit HTML" button. -Now revisit the file browser. In addition to `foo.Rmd`, you should now see `foo.md`. If there are R chunks that make figures, the usage of markdown output formats will also cause those figure files to be left behind in a sensibly named sub-directory, `foo_files`. +Now revisit the file browser. +In addition to `foo.Rmd`, you should now see `foo.md`. +If there are R chunks that make figures, the usage of markdown output formats will also cause those figure files to be left behind in a sensibly named sub-directory, such as `foo_files`. If you commit and push `foo.md` and everything inside `foo_files`, then anyone with permission to view your GitHub repo can see a decent-looking version of your report. -If your output format is `html_document`, you should still see `foo.html`. If your output format is `github_document` and you see `foo.html`, that's leftover from earlier experiments. Delete that. It will only confuse you later. +If your output format is `html_document`, you should still see `foo.html`. +If your output format is `github_document` and you see `foo.html`, that's leftover from earlier experiments. +Delete that. +It will only confuse you later. You might want to commit here. @@ -98,26 +152,31 @@ Push the current state to GitHub. Go visit it in the browser. -Do you see the modifications and new file(s)? Your Rmd should be modified (the YAML frontmatter changed). And you should have gained at least, the associated markdown file. +Do you see the modifications and new file(s)? +Your `.Rmd` should be modified, i.e. you should see the changes you made to the YAML frontmatter. +And you should have gained, at least, the associated markdown file, `foo.md`. - * Visit the markdown file and compare to our previous HTML. - * Do you see how the markdown is much more useful on GitHub? Internalize that. +* Visit the markdown file and compare to our previous HTML. +* Do you see how the markdown is much more directly useful on GitHub? + Internalize this lesson. ## Put your stamp on it Select everything but the YAML frontmatter and ... delete it! -Write a single English sentence. +Write a single sentence. -Insert an empty R chunk, via the "Chunk" menu in upper right of source editor or with corresponding keyboard shortcut. +Insert an empty R chunk, via the "Chunk" menu in upper right of source editor or with the corresponding keyboard shortcut. - - -
```{r}
+````
+```{r, eval=TRUE}`r ''`
 ## insert your brilliant WORKING code here
-```
+``` +```` -Insert 1 to 3 lines of functioning code that begin the task at hand. "Walk through" and run those lines using the "Run" button or the corresponding keyboard shortcut. You MUST make sure your code actually works! +Insert 1 to 3 lines of functioning code that's relevant to you or the project where you're experimenting. +"Walk through" and run those lines using the "Run" button or the corresponding keyboard shortcut. +You MUST make sure your code actually works! Satisfied? Save! @@ -126,73 +185,145 @@ You might want to commit here. Now render the whole document via "Knit HTML." VoilĂ ! You might want to commit here. +And push. +And admire your evolving progress on GitHub. ## Develop your report -In this incremental manner, develop your report. Add code to this chunk. Refine it. Add new chunks. Go crazy! But keep running the code "manually" to make sure it works. +In this incremental manner, develop your report. +Add code to this chunk. +Refine it. +Add new chunks. +Go wild! +But keep running the code "manually" to make sure it actually works. -If it doesn't work with you babysitting it, I can guarantee you it will fail, in a more spectacular and cryptic way, when run at arms-length via "Knit HTML" or `rmarkdown::render()`. +If the code doesn't work with you babysitting it, I can guarantee you it will fail, in a more spectacular and cryptic way, when run at arms-length via "Knit HTML" or `rmarkdown::render()`. -Clean out your workspace and restart R and re-run everything periodically, if things get weird. There are lots of chunk menu items and keyboard shortcuts to accelerate this workflow. Render the whole document often to catch errors when they're easy to pinpoint and fix. Save often and commit every time you reach a point that you'd like as a "fall back" position. +Clean out your workspace and restart R and re-run everything periodically, if things get weird. +There are lots of chunk menu items and keyboard shortcuts to accelerate this workflow. +Render the whole document often to catch errors when they're easy to pinpoint and fix. +Save often and commit every time you reach a point that you'd like as a "fall back" position. You'll develop your own mojo soon, but this should give you your first successful R Markdown experience. ## Publish your report -If you've been making HTML, you can put that up on the web somewhere, email to your collaborator, whatever. +If you've been making HTML, you can put that up on the web somewhere, email it to your collaborator, whatever. -No matter what, technically you can publish this report merely by pushing a rendered version to GitHub. However, certain practices make this effort at publishing more satisfying for your audience. +No matter what, technically you can publish this report merely by pushing a rendered version to GitHub. +However, certain practices make this effort at publishing more satisfying for your audience. Here are two behaviors I find very frustrating: - * "Here is my code. Behold." This is when someone only pushes their source, i.e. R Markdown or R code AND they want other people to look at their "product". The implicit assumption is that target audience will download code and run it. Sometimes the potential payoff simply does not justify this effort. - * "Here is my HTML. Behold." This is when someone doesn't bother to edit the default output format and accepts HTML only. What am I supposed to do with HTML on GitHub? - -Creating, committing, and pushing markdown is a very functional, lighweight publishing strategy. Use `output: github_document` or `keep_md: true` if output is `html_document`. In both cases, it is critical to also commit and push everything inside `foo_files`. Now people can visit and consume your work like any other webpage. - -This is (sort of) another example of keeping things machine- and human-readable, which is bliss. By making `foo.Rmd` available, others can see and run your __actual code__. By sharing `foo.md` and/or `foo.html`, others can casually browse your end product and decide if they even want to bother. +* "Here is my code. Behold." This is when someone only pushes their source, i.e. + R Markdown or R code, AND they really want other people to appreciate their + "product". The implicit assumption is that the target audience will download + all of the data and code and execute it locally. +* "Here is my HTML. Behold." This is when someone accepts the default HTML-only + output. Remember, HTML files are GitHub are not readable by humans. Therefore, + the implicit assumption is that the target audience will download the repo + and point their browser at this HTML file, in order to see it. + HTML on GitHub? It's not readable by humans. + +Sometimes it's just very unrealistic to expect your audience to take the extra steps described above. +Often, with a very small change on your end, you can create an artefact on GitHub that your target audience can immediately appreciate. + +Creating, committing, and pushing markdown (i.e., `.md` files) is a very functional, lighweight publishing strategy. +Use `output: github_document` or, if output is `html_document`, add `keep_md: true`. +In both cases, it is critical to also commit and push everything inside `foo_files`, i.e. any figures that have been created. +Now people can visit and consume your work on GitHub, like any other webpage. + +This is (sort of) another example of a generally worthy principle, which is keeping things machine- and human-readable, whenever possible. +By making `foo.Rmd` available, others can see and run your __actual code__. +By also sharing `foo.md` and/or `foo.html`, others can casually browse your end product and decide if they want to obtain and run the code. ## HTML on GitHub -HTML files, such as `foo.html`, are not immediately useful on GitHub (though your local versions are easily viewable). Visit one and you'll see the raw HTML. Yuck. But there are ways to get a preview: such as . Expect much pain with HTML files inside private repos (hence the recommendations above to emphasize markdown). When it becomes vital for the whole world to see proper HTML in its full glory, it's time to use a more sophisticated web publishing strategy. +HTML files, such as `foo.html`, are not immediately useful on GitHub (though your local versions are easily viewable). +Visit one and you'll see the raw HTML. +Yuck. +But there are ways to get a preview: such as . Expect much pain with HTML files inside private repos (hence the recommendations above to emphasize markdown). +When it becomes vital for the whole world to see proper HTML in its full glory, it's time to use a more sophisticated web publishing strategy. I have more [general ideas](#workflows-browsability) about how to make a GitHub repo function as a website. ## Troubleshooting {#rmd-troubleshooting} -__Make sure RStudio and the `rmarkdown` package (and its dependencies) are up-to-date.__ In case of catastrophic failure to render the boilerplate R Markdown document, consider that your software may be too old. Details on the system used to render this document and how to check your setup: - - * rmarkdown version `r packageVersion("rmarkdown")`. Use `packageVersion("rmarkdown")` to check yours. - * `r R.version.string`. Use `R.version.string` to check yours. - * RStudio IDE 1.2.1555. Use *RStudio > About RStudio* or `RStudio.Version()$version` to check yours. - -__Get rid of your `.Rprofile`__, at least temporarily. I have found that a "mature" `.Rprofile` that has accumulated haphazardly over the years can cause trouble. Specifically, if you've got anything in there relating to `knitr`, `markdown`, `rmarkdown` and RStudio itself, it may be preventing the installation or usage of the most recent goodies. Comment the whole file out or rename it something else and relaunch or even re-install RStudio. - -__"I have ignored your advice and dumped a bunch of code in at once. Now my Rmd does not render."__ If you can't figure out what's wrong by reading the error messages, pick one: - - * Back out of these changes, get back to a functional state (possibly with no code), and restore them gradually. Run your code interactively to make sure it works. Render the entire document frequently. Commit after each successful addition! When you re-introduce the broken code, now it will be part of a small change and the root problem will be much easier to pinpoint and fix. - * Tell knitr to soldier on, even in the presence of errors. Some problems are easier to diagnose if you can execute specific R statements during rendering and leave more evidence behind for forensic examination. - - Insert this chunk near the top of your `.Rmd` document: -
```{r setup, include = FALSE, cache = FALSE}
-knitr::opts_chunk$set(error = TRUE)
-```
- - If it's undesirable to globally accept errors, you can still do this for a specific chunk like so: -
```{r wing-and-a-prayer, error = TRUE}
-## your sketchy code goes here ;)
-```
- * Adapt the ["git bisect" strategy](http://webchick.net/node/99): - - Put `knitr::knit_exit()` somewhere early in your `.Rmd` document, either in inline R code or in a chunk. Keep moving it earlier until things work. Now move it down in the document. Eventually you'll be able to narrow down the location of your broken code well enough to find the line(s) and fix it. - -__Check your working directory.__ It's going to break your heart as you learn how often your mistakes are really mundane and basic. Ask me how I know. When things go wrong consider: - - * What is the working directory? - * Is that file I want to read/write actually where I think it is? +__Make sure RStudio and the rmarkdown package (and its dependencies) are up-to-date.__ +In case of catastrophic failure to render the boilerplate R Markdown document, consider that your software may be too old. +Details on the system used to render this document and how to check your setup: + +* rmarkdown version `r packageVersion("rmarkdown")`. + Use `packageVersion("rmarkdown")` to check yours. +* `r R.version.string`. Use `R.version.string` to check yours. +* RStudio IDE 2021.9.0.341 ("Ghost Orchid" Preview). + Use *RStudio > About RStudio* or `RStudio.Version()$version` to check yours. + +__Get rid of your `.Rprofile`__, at least temporarily. +I have found that a "mature" `.Rprofile` that has accumulated haphazardly over the years can cause trouble. +Specifically, if you've got anything in there relating to knitr, markdown, rmarkdown, or RStudio itself, it may be preventing the installation or usage of the most recent goodies. +Comment the whole file out or rename it to something else and relaunch or even re-install RStudio. + +__"I have ignored your advice and dumped a bunch of code in at once. Now my Rmd does not render."__ +If you can't figure out what's wrong by reading the error messages, pick one: + +* Back out of these changes, get back to a functional state (possibly with no + code), and restore them gradually. Run your code interactively to make sure it + works. Render the entire document frequently. Commit after each successful + addition! When you re-introduce the broken code, now it will be part of a + small change and the root problem will be much easier to pinpoint and fix. +* Tell knitr to soldier on, even in the presence of errors. Some problems are + easier to diagnose if you can execute specific R statements during rendering + and leave more evidence behind for forensic examination. + - Insert this chunk near the top of your `.Rmd` document: + + ```` + ```{r setup, include = FALSE, cache = FALSE}`r ''` + knitr::opts_chunk$set(error = TRUE) + ``` + ```` + + - If it's undesirable to globally accept errors, you can still specify + `error = TRUE` for a specific chunk like so: + + ```` + ```{r wing-and-a-prayer, error = TRUE}`r ''` + ## your sketchy code goes here ;) + ``` + ```` + +* Adapt the ["git bisect" strategy](http://webchick.net/node/99): + - Put `knitr::knit_exit()` somewhere early in your `.Rmd` document, either in + inline R code or in a chunk. + Keep moving it earlier until things work. + Now move it down in the document. + Eventually you'll be able to narrow down the location of your broken code + well enough to find the line(s) and fix it. + +__Check your working directory.__ +It's going to break your heart as you learn how often your mistakes are really mundane and basic. +Ask me how I know. +When things go wrong consider: + +* What is the working directory? +* Is that file I want to read/write actually where I think it is? Drop these commands into R chunks to check the above: - * `getwd()` will display working directory at __run time__. If you monkeyed around with working directory with, e.g., the mouse, maybe it's set to one place for your interactive development and another when "Knit HTML" takes over? - * `list.files()` will list the files in working directory. Is the file you want even there? - -__Don't try to change working directory within an R Markdown document__. Just don't. See [knitr FAQ #5](https://yihui.name/knitr/faq/). That is all. - -__Don't be in a hurry to create a complicated sub-directory structure.__ RStudio/`knitr`/`rmarkdown` (which bring you the "Knit HTML" button) are rather opinionated about the working directory being set to the `.Rmd` file's location and about all files living together in one big happy directory. This can all be worked around. For example, I [recommend the here package](https://github.com/jennybc/here_here#readme) for building file paths, once you require sub-directories. But don't do this until you really need it. +* `getwd()` will display working directory at __run time__. + If you monkeyed around with working directory with, e.g., the mouse, maybe + it's set to one place for your interactive development and another when + "Knit HTML" takes over? +* `list.files()` will list the files in working directory. + Is the file you want even there? + +__Don't try to change working directory within an R Markdown document__. +Just don't. +See [knitr FAQ #5](https://yihui.name/knitr/faq/). +That is all. + +__Don't be in a hurry to create a complicated sub-directory structure.__ +RStudio/knitr/rmarkdown (which bring you the "Knit HTML" button) are rather opinionated about the working directory being set to the `.Rmd` file's location and about all files living together in one big happy directory. +This can all be worked around. +For example, I [recommend the here package](https://github.com/jennybc/here_here#readme) for building file paths, once you require sub-directories. +But don't do this until you really need it.