forked from hadley/r-pkgs
# Function documentation {#sec-man}
```{r, echo = FALSE}
source("common.R")
status("restructuring")
library(stringr) # for links
```
## Introduction
Documentation is one of the most important aspects of a good package: without it, users won't know how to use your package!
Documentation is also useful for future-you (so you remember what your functions were supposed to do) and for developers extending your package.
In this chapter, you'll learn about function documentation, as accessed by `?` or `help()`.
Function documentation works like a dictionary: it's helpful if you want to know what a function does, but it won't help you find the right function for a new situation.
That's one of the jobs of vignettes, which you'll learn about in the next chapter.
In this chapter we'll focus on documenting functions, but the same ideas apply to documenting datasets, classes and generics, and packages.
You can learn more about those important topics in `vignette("rd-other", package = "roxygen2")`.
Base R provides a standard way of documenting a package where each documentation **topic** corresponds to an `.Rd` file in the `man/` directory.
These files use a custom syntax, loosely based on LaTeX, that is rendered to HTML, plain text, or PDF, as needed, for viewing.
We are not going to use these files directly.
Instead, we'll use the roxygen2 package to generate them from specially formatted comments.
There are a few advantages to using roxygen2:
- Code and documentation are intermingled so that when you modify your code, it's easy to remember to also update your documentation.
- You can write using markdown, rather than learning a new text formatting syntax.
- `.Rd` boilerplate is automated away.
- It provides a number of tools for sharing content between documentation topics and even between topics and vignettes.
You'll see the generated `.Rd` files when you work with them in git, but you'll otherwise rarely need to look at them.
## roxygen2 basics
To get started, we'll work through the basic roxygen2 workflow and discuss the overall structure of roxygen2 comments which are organised into blocks and tags.
### The documentation workflow {#man-workflow}
The documentation workflow starts when you add roxygen comments, comments that start with `#'`, to your source file.
Here's a simple example:
```{r}
#' Add together two numbers
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of `x` and `y`.
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```
Then you'll press Ctrl/Cmd + Shift + D or type `devtools::document()`.
This runs `roxygen2::roxygenise()`, which generates a `man/add.Rd` that looks like this:
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/across.R
\name{add}
\alias{add}
\title{Add together two numbers}
\usage{
add(x, y)
}
\arguments{
\item{x}{A number.}
\item{y}{A number.}
}
\value{
The sum of \code{x} and \code{y}.
}
\description{
Add together two numbers
}
\examples{
add(1, 1)
add(10, 1)
}
If you've used LaTeX before, this should look familiar since the `.Rd` format is loosely based on it, and if you're interested you can read more about it in [*R extensions*](https://cran.r-project.org/doc/manuals/R-exts.html#Rd-format).
Otherwise you won't need to look at it except to check it in to git.
When you use `?add`, `help("add")`, or `example("add")`, R looks for an `.Rd` file containing `\alias{add}`.
It then parses the file, converts it into HTML, and displays it.
Here's what the result looks like in RStudio:
```{r, echo = FALSE}
knitr::include_graphics("images/man-add.png", dpi = 220)
```
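If you're curious, you can approximate the parse-and-convert step yourself with the tools package that ships with R.
This is only a sketch under a couple of assumptions: the `.Rd` content below is a trimmed-down, hand-written stand-in for the generated `man/add.Rd`, and the real help system does more work than this (such as locating the right file from its alias).

```r
# Write a minimal, hand-written .Rd file to a temporary location
rd_path <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{add}",
  "\\alias{add}",
  "\\title{Add together two numbers}",
  "\\usage{add(x, y)}",
  "\\arguments{",
  "  \\item{x}{A number.}",
  "  \\item{y}{A number.}",
  "}",
  "\\value{The sum of \\code{x} and \\code{y}.}",
  "\\description{Add together two numbers}"
), rd_path)

# Parse the file, then convert the parsed object to HTML
rd <- tools::parse_Rd(rd_path)
html_path <- tempfile(fileext = ".html")
tools::Rd2HTML(rd, out = html_path)

# Read the HTML back in, then clean up the temporary files
html <- readLines(html_path)
unlink(c(rd_path, html_path))
```

`tools::Rd2txt()` works the same way if you'd rather see the plain text rendering.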
To preview the development documentation, devtools uses some tricks to override the usual help functions so they know where to look in your source packages.
To activate these tricks, you need to run `devtools::load_all()` once.
So if the development documentation doesn't appear, you may need to load your package first.
To summarize, there are four steps in the basic roxygen2 workflow:
1. Add roxygen2 comments to your `.R` files.
2. Press Ctrl/Cmd + Shift + D or type `devtools::document()` to convert roxygen2 comments to `.Rd` files.
3. Preview documentation with `?`.
4. Rinse and repeat until the documentation looks the way you want.
### roxygen2 comments, blocks, and tags {#roxygen-comments}
Now that you understand the basic workflow, let's talk a little more about the syntax.
roxygen2 comments start with `#'` and the set of all roxygen2 comments preceding a function is called a **block**.
Blocks are broken up by **tags**, which look like `@tagName tagValue`.
The content of a tag extends from the end of the tag name to the start of the next tag[^man-1].
A block can contain text before the first tag which is called the **introduction**.
By default, each roxygen2 block will generate a single documentation **topic**, i.e. one `.Rd` file[^man-2] in the `man/` directory.
[^man-1]: Or the end of the block, if it's the last tag.
[^man-2]: The name of the file is automatically derived from the object you're documenting.
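To make the block structure concrete, here's a toy parser written in base R.
It is not how roxygen2 is implemented, and `parse_block()` is a hypothetical helper; it only illustrates the rule that the introduction is everything before the first tag, and that each tag's content runs until the next tag starts.

```r
# Toy parser: an illustration of the block structure, not roxygen2's real code
parse_block <- function(lines) {
  # Keep only roxygen comments and strip the leading "#' " marker
  lines <- lines[grepl("^#'", lines)]
  text <- sub("^#' ?", "", lines)

  # A new tag starts on any line that begins with "@tagName"
  is_tag <- grepl("^@\\w+", text)
  group <- cumsum(is_tag)  # 0 = introduction, 1..n = one group per tag

  pieces <- split(text, group)
  intro <- if (any(group == 0)) pieces[["0"]] else character()
  tags <- unname(pieces[names(pieces) != "0"])

  list(
    introduction = intro,
    tags = lapply(tags, function(x) {
      name <- sub("^@(\\w+).*", "\\1", x[1])
      x[1] <- sub("^@\\w+ ?", "", x[1])
      list(tag = name, value = trimws(paste(x, collapse = "\n")))
    })
  )
}

block <- c(
  "#' Add together two numbers",
  "#'",
  "#' @param x A number.",
  "#' @param y A number.",
  "#' @return The sum of `x` and `y`.",
  "add <- function(x, y) x + y"
)

parsed <- parse_block(block)
parsed$introduction
tag_names <- vapply(parsed$tags, function(t) t$tag, character(1))
tag_names
```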
Throughout this chapter I'm going to show you roxygen2 comments from real tidyverse packages, focusing on stringr since the functions there tend to be fairly straightforward leading to documentation that is easier to excerpt.
Here's a simple first example: the documentation for `stringr::str_unique()`.
```{r}
#' Remove duplicated strings
#'
#' `str_unique()` removes duplicated values, with optional control over
#' how duplication is measured.
#'
#' @param string A character vector to return unique entries.
#' @param ... Other options used to control matching behavior between duplicate
#'   strings. Passed on to [stringi::stri_opts_collator()].
#' @returns A character vector.
#' @seealso [unique()], [stringi::stri_unique()] which this function wraps.
#' @examples
#' str_unique(c("a", "b", "c", "b", "a"))
#'
#' # Use ... to pass additional arguments to stri_unique()
#' str_unique(c("motley", "mötley", "pinguino", "pingüino"))
#' str_unique(c("motley", "mötley", "pinguino", "pingüino"), strength = 1)
#' @export
str_unique <- function(string, ...) {
  ...
}
```
Here the introduction includes the title ("Remove duplicated strings") and a basic description of what the function does.
It's followed by six tags: two `@param`s, one `@returns`, one `@seealso`, one `@examples`, and one `@export`.
Note that I've wrapped each line of the roxygen2 block 80 characters wide, to match the wrapping of my code, and I've indented the second and subsequent lines of the long `@param` tag so it's easier to scan.
You can get more roxygen2 style advice in the [tidyverse style guide](https://style.tidyverse.org/documentation.html).
The following sections will work through the most important tags.
We'll start with the introduction which provides the title, description, and details, then we'll cover the inputs (the function arguments), outputs (the return value), and examples.
We'll then discuss links and cross-references, and finish off with some techniques to share documentation between topics.
## Title, description, details
The block introduction provides a title, description, and, optionally, details, for the function:
- The **title** is taken from the first sentence.
It should be written in sentence case, not end in a full stop, and be followed by a blank line.
The title is shown in various function indexes and is what the user will see when browsing functions.
- The **description** is taken from the next paragraph.
It comes first in the documentation and should briefly describe the most important features of the function.
- Additional **details** are anything after the description.
Details are optional, but can be any length so are useful if you want to dig deep into some important aspect of the function.
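The convention can be sketched in a few lines of base R.
`split_intro()` is a hypothetical helper, not part of roxygen2, and it ignores explicit tags such as `@description`; it just splits an introduction on blank lines and, as in the `add()` example earlier where the title was reused as the description, falls back to the title when there is no second paragraph.

```r
# Toy illustration of the title/description/details convention
split_intro <- function(intro) {
  # Paragraphs are separated by one or more blank lines
  is_blank <- trimws(intro) == ""
  para_id <- cumsum(is_blank)[!is_blank]
  paras <- unname(vapply(
    split(intro[!is_blank], para_id),
    paste, character(1), collapse = " "
  ))

  list(
    title = paras[1],
    # With no second paragraph, reuse the title as the description
    description = if (length(paras) >= 2) paras[2] else paras[1],
    details = if (length(paras) > 2) paras[-(1:2)] else character()
  )
}

intro <- c(
  "Remove duplicated strings",
  "",
  "`str_unique()` removes duplicated values, with optional control over",
  "how duplication is measured."
)
res <- split_intro(intro)
res$title
res$description
```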
The following sections describe each component in more detail, and then discuss a few useful related tags.
### Title
When figuring out what to use as a title, I think it's most important to consider the functions in your package holistically.
When the user is skimming the index, how will they find the function to solve their current problem?
What do functions have in common that doesn't need to be repeated in every title?
What is unique to that function and should be highlighted?
As an example, take the titles of some of the key dplyr functions[^man-3]:
[^man-3]: Like all the examples, these might have changed a bit since we wrote this book, because we're constantly striving to do better.
You might compare what's in the book to what we now use, and consider whether you think it's an improvement.
- `mutate()`: Create, modify, and delete columns.
- `summarise()`: Summarise each group to fewer rows.
- `filter()`: Subset rows using column values.
- `select()`: Subset columns using their names and types.
- `arrange()`: Arrange rows by column values.
Here we've tried to succinctly describe what the function does, making sure to describe whether it affects rows, columns, or groups.
We do our best to use synonyms, instead of repeating the function name, to hopefully give folks another chance to understand the intent of the function.
At the time we wrote this, I don't think the function titles for stringr were that successful.
But they provide a useful negative case study:
- `str_detect()`: Detect the presence or absence of a pattern in a string.
- `str_extract()`: Extract matching patterns from a string.
- `str_locate()`: Locate the position of patterns in a string.
- `str_match()`: Extract matched groups from a string.
There's a lot of repetition ("pattern", "from a string") and the verb used for the function name is repeated in the title, so if you don't understand the function already, the title seems unlikely to help much.
(In hindsight, it also seems like the function names could have been better chosen.) Hopefully we'll have improved those titles by the time you read this.
### Description
The purpose of the description is to summarize the goal of the function, usually in under a paragraph.
This can be challenging for simple functions, because it might feel like you're repeating the title of the function.
But it's okay for the description to be a little duplicative of the rest of the documentation; it's often useful for the reader to see the same thing expressed in two different ways.
It's a little extra work keeping it all up to date, but the extra effort is often worth it.
```{r}
#' Detect the presence/absence of a pattern
#'
#' `str_detect()` returns a logical vector: `TRUE` if `pattern` is found within
#' each element of `string`, or `FALSE` if not. It's equivalent to
#' `grepl(pattern, string)`.
```
If you want to use multiple paragraphs or a bulleted list, you can use the explicit `@description` tag[^man-4].
Here's an example from `stringr::str_like()`, which mimics the `LIKE` operator from SQL:
[^man-4]: You can also use explicit `@title` and `@details` tags if needed, but we don't generally recommend them because they add extra noise to the docs without enabling any extra functionality.
```{r}
#' Detect a pattern in the same way as `SQL`'s `LIKE` operator
#'
#' @description
#' `str_like()` follows the conventions of the SQL `LIKE` operator:
#'
#' * Must match the entire string.
#' * `_` matches a single character (like `.`).
#' * `%` matches any number of characters (like `.*`).
#' * `\%` and `\_` match literal `%` and `_`.
#' * The match is case insensitive by default.
```
Finally, it's often particularly hard to write a good description if you've just written the function because the purpose seems so intuitively obvious.
Do your best, and then come back in a couple of months when you've forgotten exactly what the function does, and re-write the description to jog your memory.
### Details
The "details" are just any additional details or explanation that you think your function needs.
Most functions don't need details, but some functions need a lot.
If you have a lot of information to convey, I recommend using markdown headings to break the documentation up into sections.
Here's an example from `dplyr::mutate()`.
We've elided some of the details to keep this example short, but you should still get a sense of how we used headings to break up the content into skimmable chunks:
```{r}
#' Create, modify, and delete columns
#'
#' `mutate()` adds new variables and preserves existing ones;
#' `transmute()` adds new variables and drops existing ones.
#' New variables overwrite existing variables of the same name.
#' Variables can be removed by setting their value to `NULL`.
#'
#' # Useful mutate functions
#'
#' * [`+`], [`-`], [log()], etc., for their usual mathematical meanings
#'
#' ...
#'
#' # Grouped tibbles
#'
#' Because mutating expressions are computed within groups, they may
#' yield different results on grouped tibbles. This will be the case
#' as soon as an aggregating, lagging, or ranking function is
#' involved. Compare this ungrouped mutate:
#'
#' ...
```
Note that even though these headings come immediately after the description they are shown much later (after the function arguments and return value) in the rendered documentation.
In older code, you might also see the use of `@section title:` which was used to create sections before roxygen2 fully supported markdown.
You can now move these below the description and turn them into markdown headings.
## Arguments
For most functions, the bulk of your work will go towards documenting how each argument affects the output of the function.
For this purpose, you'll use `@param` (short for parameter, a synonym of argument) followed by the argument name and a description of its action.
The most important job of the description is to provide a succinct summary of the allowed inputs and what the parameter does.
For example, here's `str_detect()`:
```{r}
#' @param string Input vector. Either a character vector, or something
#'   coercible to one.
```
And here are three of the arguments to `str_flatten()`:
```{r}
#' @param collapse String to insert between each piece. Defaults to `""`.
#' @param last Optional string to use in place of the final separator.
#' @param na.rm Remove missing values? If `FALSE` (the default), the result
#'   will be `NA` if any element of `string` is `NA`.
```
Note that `@param collapse` and `@param na.rm` describe their default values.
This is good practice because the function usage (which shows the default values) and the argument description are often quite far apart.
The primary downside is that introducing this duplication means that you'll need to update the docs if you change the default value; we believe this small amount of extra work is worth it to make the life of the user easier.
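When you're checking the prose against the code, `formals()` gives you the same defaults that appear in the usage section.
The sketch below uses `flatten_stub()`, a hypothetical stand-in with the arguments described above; the `NULL` default for `last` is an assumption made for illustration.

```r
# Stand-in with the arguments described above; the body is irrelevant here
flatten_stub <- function(string, collapse = "", last = NULL, na.rm = FALSE) {
  NULL
}

# formals() exposes the defaults that the usage section displays
defaults <- formals(flatten_stub)
str(defaults[c("collapse", "last", "na.rm")])
```

Comparing this output to your `@param` text is a quick way to catch a default that changed in the code but not in the docs.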
If an argument has a fixed set of possible values, you should list them.
If they're simple, you can just list them in a sentence, like in `str_trim()`:
```{r}
#' @param side Side on which to remove whitespace: `"left"`, `"right"`, or
#'   `"both"` (the default).
```
If they need more explanation, you might use a bulleted list, as in `str_wrap()`:
```{r}
#' @param whitespace_only A boolean.
#'   * `TRUE` (the default): wrapping will only occur at whitespace.
#'   * `FALSE`: can break on any non-word character (e.g. `/`, `-`).
```
The documentation for most arguments tends to be relatively short, often one or two sentences.
But you should take as much space as you need, and you'll see some examples of multi-paragraph argument documentation shortly.
### Multiple arguments
If the behavior of multiple arguments is tightly coupled, you can document them together by separating the names with commas (with no spaces).
For example, in `str_equal()` `x` and `y` are interchangeable, so they're documented together:
```{r}
#' @param x,y A pair of character vectors.
```
In `str_sub()` `start` and `end` define the range of characters to replace, and you can use just `start` if you pass in a two-column matrix.
So it makes sense to document them together:
```{r}
#' @param start,end Two integer vectors. `start` gives the position
#'   of the first character (defaults to first), `end` gives the position
#'   of the last (defaults to last character). Alternatively, pass a two-column
#'   matrix to `start`.
#'
#'   Negative values count backwards from the last character.
```
In `str_wrap()` `indent` and `exdent` define the indentation for the first line and all subsequent lines respectively:
```{r}
#' @param indent,exdent A non-negative integer giving the indent for the
#'   first line (`indent`) and all subsequent lines (`exdent`).
```
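One way to see how comma-separated names map onto a function's arguments is to pull the names out of the `@param` lines and compare them to `formals()`.
This is a rough base R sketch: `param_names()` and `sub_stub()` are hypothetical, and the stub's defaults are assumptions that only loosely mirror the description above.

```r
# Extract documented argument names, expanding "start,end" into two names
param_names <- function(block) {
  lines <- grep("^#' @param ", block, value = TRUE)
  spec <- sub("^#' @param (\\S+).*", "\\1", lines)
  unlist(strsplit(spec, ",", fixed = TRUE))
}

# Stand-in function; only the argument names matter here
sub_stub <- function(string, start = 1L, end = -1L) {
  NULL
}

block <- c(
  "#' @param string Input vector.",
  "#' @param start,end Two integer vectors.",
  "#'   Negative values count backwards from the last character."
)

documented <- param_names(block)
documented

# Arguments that exist in the function but have no @param entry
undocumented <- setdiff(names(formals(sub_stub)), documented)
undocumented
```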
### Inheriting arguments
If your package contains many closely related functions, it's common for them to have arguments that share the same name and meaning.
It would be annoying and error prone to copy and paste the same `@param` documentation to every function so roxygen2 provides `@inheritParams` which allows you to inherit argument documentation from another function.
stringr uses `@inheritParams` extensively because most functions have `string` and `pattern` arguments.
So `str_detect()` documents them in detail:
```{r}
#' @param string Input vector. Either a character vector, or something
#'   coercible to one.
#' @param pattern Pattern to look for.
#'
#'   The default interpretation is a regular expression, as described in
#'   `vignette("regular-expressions")`. Control options with [regex()].
#'
#'   Match a fixed string (i.e. by comparing only bytes), using
#'   [fixed()]. This is fast, but approximate. Generally,
#'   for matching human text, you'll want [coll()] which
#'   respects character matching rules for the specified locale.
#'
#'   Match character, word, line and sentence boundaries with
#'   [boundary()]. An empty pattern, "", is equivalent to
#'   `boundary("character")`.
```
Then the other stringr functions use `@inheritParams str_detect` to get detailed documentation for `string` and `pattern` without having to duplicate that text.
`@inheritParams` only inherits docs for arguments that aren't already documented, so you can document some arguments and inherit others.
`str_match()` uses this to inherit its standard `string` argument but document its unusual `pattern` argument:
```{r}
#' @inheritParams str_detect
#' @param pattern Unlike other stringr functions, `str_match()` only supports
#'   regular expressions, as described in `vignette("regular-expressions")`.
#'   The pattern should contain at least one capturing group.
```
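The "only fill in what's missing" rule can be sketched with named character vectors.
`inherit_params()` is a hypothetical helper written for illustration, not roxygen2's implementation, and the documentation strings are abbreviated.

```r
# Toy model of the rule: keep your own docs, inherit only what's missing
inherit_params <- function(own, parent, args) {
  # Arguments of this function that are not yet documented
  needed <- setdiff(args, names(own))
  # Take those from the parent, if the parent documents them
  inherited <- parent[intersect(needed, names(parent))]
  c(own, inherited)
}

str_detect_docs <- c(
  string = "Input vector.",
  pattern = "Pattern to look for."
)
str_match_docs <- c(
  pattern = "Unlike other stringr functions, only supports regular expressions."
)

merged <- inherit_params(
  own = str_match_docs,
  parent = str_detect_docs,
  args = c("string", "pattern")
)
merged
```

Here `pattern` keeps the text written for `str_match()`, while `string` is filled in from `str_detect()`.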
You can inherit documentation from a function in another package by using the standard `::` notation, i.e. `@inheritParams anotherpackage::function`.
This does introduce one small annoyance: now the documentation for your package is no longer self-contained and the version of `anotherpackage` can affect the generated docs.
Beware of spurious diffs caused by contributors with different installed versions.
## Return value
As important as a function's inputs are its outputs.
Documenting the outputs is the job of the `@returns`[^man-5] tag.
Here the goal of the docs is not to describe exactly how the values are computed (which is the job of the description and details), but to roughly describe the overall "shape" of the output, i.e. what sort of object it is, and its dimensions (if that makes sense).
For example, if your function returns a vector you might describe its type and length, or if your function returns a data frame you might describe the names and types of the columns and the expected number of rows.
[^man-5]: For historical reasons, you can also use `@return`, but I think you should use `@returns` because it reads a little nicer.
The return documentation for functions in stringr is straightforward because almost all functions return some type of vector with the same length as one of the inputs.
For example, here's `str_like()`:
```{r}
#' @returns A logical vector the same length as `string`.
```
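A useful sanity check is that a "shape" description like this can be restated as assertions.
Here's a small sketch using `has_substring()`, a hypothetical function built on base R's `grepl()` rather than anything from stringr:

```r
#' Detect a fixed substring
#'
#' @param string A character vector.
#' @param pattern A single string to look for.
#' @returns A logical vector the same length as `string`.
has_substring <- function(string, pattern) {
  grepl(pattern, string, fixed = TRUE)
}

string <- c("apple", "banana", "pear")
out <- has_substring(string, "an")

# The @returns text, restated as checks on the output
is.logical(out)
length(out) == length(string)
```

If you can't write checks like these from your `@returns` text, it's probably too vague.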
A more complicated case is the joint documentation for `str_locate()` and `str_locate_all()`[^man-6].
`str_locate()` returns an integer matrix, and `str_locate_all()` returns a list of matrices, so the text needs to describe what defines the rows and columns.
[^man-6]: We'll come back to how to document multiple functions in one topic in @sec-multiple-functions.
```{r}
#' @return `str_locate()` returns an integer matrix with two columns and
#'   one row for each element of `string`. The first column, `start`,
#'   gives the position at the start of the match, and second column, `end`,
#'   gives the position of the end.
#'
#'   `str_locate_all()` returns a list of integer matrices as above, but
#'   the matrices have one row for each match in the corresponding element
#'   in `string`.
```
In other cases it can be easier to figure out what to describe by thinking about the set of functions and how they differ.
For example, most dplyr functions return data frames, so just saying `@return A data frame` is not very useful.
Instead we sat down and thought about exactly what makes each function different.
We decided it makes sense to describe each function in terms of how it affects the rows, the columns, the groups, and the attributes.
For example, here's `dplyr::filter()`:
```{r}
#' @returns
#' An object of the same type as `.data`. The output has the following properties:
#'
#' * Rows are a subset of the input, but appear in the same order.
#' * Columns are not modified.
#' * The number of groups may be reduced (if `.preserve` is not `TRUE`).
#' * Data frame attributes are preserved.
```
`@returns` is also a good place to describe any important warnings or errors that the user might see here.
For example `readr::read_csv()`:
```{r}
#' @returns A [tibble()]. If there are parsing problems, a warning will alert you.
#'   You can retrieve the full details by calling [problems()] on your dataset.
```
::: callout-warning
## Submitting to CRAN
For your initial CRAN submission, all functions must document their return value.
This is not required for subsequent submissions, but it's still good practice.
There's currently no way to check that you've documented the return value of every function (we're [working on it](https://github.com/r-lib/roxygen2/issues/1334)) which is why you'll notice some tidyverse functions lack output documentation.
:::
## Links and cross-references
- Regular markdown to link to web pages: [`https://r-project.org`](https://r-project.org) or `[The R Project](https://r-project.org)`.
- To link to a function we slightly abuse markdown syntax: `[function()]` or `[pkg::function()]`. To link to non-function documentation just omit the `()`: `[topic]`, `[pkg::topic]`.
Useful tags:
- `@seealso` allows you to point to other useful resources, either on the web, in your package `[functionname()]`, or another package `[pkg::function()]`.
- If you have a family of related functions where every function should link to every other function in the family, use `@family`.
The value of `@family` should be plural.
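Here's a sketch of how these pieces might look together.
`shout()` and `whisper()` are hypothetical functions invented for this example; the link targets are real functions, and the family name "case converters" is simply a plural label of our choosing.

```r
#' Convert a string to upper case
#'
#' @param x A character vector.
#' @returns A character vector the same length as `x`.
#' @seealso [toupper()] which this function wraps, and
#'   [stringr::str_to_upper()] for a locale-aware alternative.
#' @family case converters
#' @export
shout <- function(x) {
  toupper(x)
}

#' Convert a string to lower case
#'
#' @inheritParams shout
#' @returns A character vector the same length as `x`.
#' @family case converters
#' @export
whisper <- function(x) {
  tolower(x)
}
```

Because both blocks share the same `@family` value, each topic should gain a link to the other without you having to maintain the list by hand.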
When you start using links (and images), you'll also need to use a new documentation workflow, as the workflow described above does not show images or links between topics.
If you'd like to also see links, you can use this slower but more comprehensive workflow:
1. Re-document your package with Ctrl/Cmd + Shift + D.
2. Build and install your package by clicking ![](images/build-reload.png){width="91"} in the build pane or by pressing Ctrl/Cmd + Shift + B.
This installs it in your regular library, then restarts R and reloads your package.
3. Preview documentation with `?`.
## Examples {#sec-examples}
Describing how a function works is great, but *showing* how it works is even better.
That's the role of the `@examples` tag, which uses executable R code to show what a function does.
Unlike other parts of the documentation where we've focused mainly on what you should write, here we'll briefly give some content advice and then focus mainly on the mechanics.
The mechanics of examples are complex because they must not error, and they're run in four different situations:
- Interactively using the `example()` function.
- During `R CMD check` on your computer, or another computer you control (e.g. GitHub action).
- During `R CMD check` run by CRAN.
- When building your pkgdown website.
After discussing what to put in your examples, we'll talk about keeping your examples self-contained, how to display errors if needed, handling dependencies, running examples conditionally, and intermixing examples with text.
### Contents
Use examples to show the basic operation of the function, and then to highlight any particularly important properties.
For example, `str_detect()` starts by showing a few simple variations and then highlights a property you might easily miss from reading the docs: as well as passing a vector of strings and one pattern, you can also pass one string and a vector of patterns.
```{r}
#' @examples
#' fruit <- c("apple", "banana", "pear", "pineapple")
#' str_detect(fruit, "a")
#' str_detect(fruit, "^a")
#' str_detect(fruit, "a$")
#'
#' # Also vectorised over pattern
#' str_detect("aecfg", letters)
```
Try to stay focused on the most important features without getting into the weeds of every last edge case: if you make the examples too long, it becomes hard for the user to find the key application that they're looking for.
If you find yourself writing very long examples, it may be a sign that you should write a vignette instead.
There aren't any formal ways to break up your examples into sections but you can use sectioning comments that use many `===` or `---` to create a visual breakdown.
Here's an example from `tidyr::chop()`:
```{r}
#' @examples
#' # Chop ==============================================================
#' df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)
#' # Note that we get one row of output for each unique combination of
#' # non-chopped variables
#' df %>% chop(c(y, z))
#' # cf nest
#' df %>% nest(data = c(y, z))
#'
#' # Unchop ============================================================
#' df <- tibble(x = 1:4, y = list(integer(), 1L, 1:2, 1:3))
#' df %>% unchop(y)
#' df %>% unchop(y, keep_empty = TRUE)
#'
#' # Incompatible types -------------------------------------------------
#' # If the list-col contains types that can not be natively
#' df <- tibble(x = 1:2, y = list("1", 1:3))
#' try(df %>% unchop(y))
```
Strive to keep the examples focused on the specific function that you're documenting.
If you can make the point with a familiar built-in dataset, like `iris`, do so.
If you find yourself needing to do a bunch of setup to create a dataset or object to use in the example, it may be a sign that you need to create a package dataset.
See @sec-data for details.
### Pack it in; pack it out
Keep your examples as self-contained as possible.
For example, this means:
- If you modify `options()`, reset them at the end of the example.
- If you create a file, create it somewhere in `tempdir()` and make sure to delete it at the end of the example.
- Don't change the working directory.
- Don't write to the clipboard.
- Avoid accessing websites in examples. If the website is down, your example will fail and hence `R CMD check` will error.
Unfortunately due to the way that examples are run during `R CMD check` there's no way to use familiar tools like withr to enforce these constraints.
Instead you'll need to do it by hand.
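Doing it by hand might look something like the following sketch.
It's shown as plain R so you can run it directly; in a package it would sit under `@examples` with each line prefixed by `#'`.
The choice of option and dataset is arbitrary.

```r
# Options: options() returns the previous values, so save and restore them
old <- options(digits = 3)
print(pi)
options(old)

# Files: write to a temporary file and delete it when you're done
path <- tempfile(fileext = ".csv")
write.csv(head(mtcars), path)
read.csv(path)
unlink(path)
```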
These constraints are often in tension with good documentation if you're trying to document a function that somehow changes the state of the world.
So if you're finding it really hard to follow these rules, this might be another sign to switch to a vignette.
::: callout-warning
## Submitting to CRAN
Many of these constraints are also mentioned in the [CRAN repository policy](https://cran.r-project.org/web/packages/policies.html), which you must adhere to when submitting to CRAN.
Use find in page to search for "malicious or anti-social" to see the details.
:::
Additionally, you want your examples to send the user on a short walk, not a long hike.
Examples need to execute relatively quickly so users can quickly see the results, it doesn't take ages to build your website, automated checks happen quickly, and it doesn't take up computing resources when submitting to CRAN.
::: callout-warning
## Submitting to CRAN
All examples must run in under 10 minutes.
:::
### Errors
What can you do if you want to include code that causes an error for the purposes of teaching?
There are two basic options:
- You can wrap the code in `try()` so that the error is shown, but doesn't stop execution of the example.
- You can wrap the code in `\dontrun{}`[^man-7] so it is never run by `example()`.
[^man-7]: You used to be able to use `\donttest{}` for a similar purpose, but we no longer recommend it because CRAN sets a special flag that causes it to be executed.
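Both options are sketched below for `log_number()`, a hypothetical function made up for this illustration.
The code after the function shows what `try()` gives you when it's run outside of an example.

```r
#' Take the logarithm of a number
#'
#' @param x A number.
#' @returns The natural logarithm of `x`.
#' @examples
#' log_number(10)
#'
#' # try() shows the error without stopping the rest of the examples
#' try(log_number("a"))
#'
#' \dontrun{
#' # Code in here is never run by example()
#' log_number("a")
#' }
log_number <- function(x) {
  stopifnot(is.numeric(x))
  log(x)
}

# try() captures the error as an object instead of halting execution
result <- try(log_number("a"), silent = TRUE)
inherits(result, "try-error")

# Later code still runs
log_number(1)
```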
### Dependencies and conditional execution
You can only use packages in examples that your package depends on (i.e. that appear in `Imports` or `Suggests`).
Example code is run in the user's environment, not the package environment, so you'll have to either explicitly attach the package with `library()` or refer to each function with `::`.
In the past, we recommended only using code from suggested packages inside an if block that used `if (requireNamespace("suggested_package", quietly = TRUE))`.
Today, we no longer recommend that technique because:
- We expect that suggested packages are installed when running `R CMD check`[^man-8].
- The cost of wrapping code in `{}` is high: you can no longer see intermediate results. The cost of a package not being installed is low: users can usually recognize the package not loaded error and can resolve it themselves.
[^man-8]: This is certainly true for CRAN and is true in most other automated checking scenarios.
In other cases, your example code may depend on something other than a package being installed.
For example, if your examples talk to a web API, you probably only want to run them if the user is authenticated, and want to avoid such code being run on CRAN.
In this case you can use `@examplesIf` instead of `@examples`.
The code in an `@examplesIf` block will only be executed if some condition is `TRUE`:
```{r}
#' @examplesIf some_condition()
#' some_other_function()
#' some_more_functions()
```
googledrive uses `@examplesIf` in almost every function because the examples can only work if you have an authenticated and active connection to Google Drive as judged by `googledrive::drive_has_token()`.
For example, here's `googledrive::drive_publish()`:
```{r}
#' @examplesIf drive_has_token()
#' # Create a file to publish
#' file <- drive_example_remote("chicken_sheet") %>%
#' drive_cp()
#'
#' # Publish file
#' file <- drive_publish(file)
#' file$published
```
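You can write a similar predicate for your own package. The sketch below is hypothetical throughout: `has_api_key()`, `fetch_record()`, and the `MYPKG_API_KEY` environment variable are invented names. Because example code runs in the user's environment, the predicate presumably needs to be exported, as `drive_has_token()` is.

```r
#' Is an API key available?
#'
#' @export
has_api_key <- function() {
  # TRUE only if the (hypothetical) environment variable is set and non-empty
  nzchar(Sys.getenv("MYPKG_API_KEY"))
}

#' Fetch a record from a web API
#'
#' @param id Record identifier.
#' @examplesIf has_api_key()
#' fetch_record(1)
#' @export
fetch_record <- function(id) {
  if (!has_api_key()) {
    stop("Set the MYPKG_API_KEY environment variable first.")
  }
  # A real implementation would make the HTTP request here
  list(id = id)
}
```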
::: callout-warning
## Submitting to CRAN
For initial CRAN submission of your package, all functions must contain some runnable examples (i.e. there must be examples and they must not all be wrapped in `\dontrun{}`).
:::
### Intermixing examples and text
An alternative to examples is to use RMarkdown's code blocks, either ```` ```R ```` if you just want to show some code or ```` ```{r} ```` if you want the code to be run.
These can be effective techniques but there are downsides to each:
- The code in ```` ```R ```` blocks is never run; this means it's easy to accidentally introduce syntax errors or to forget to update it when your package changes.
- The code in ```` ```{r} ```` blocks is run every time you document the package. This has the nice advantage of including the output in the documentation (unlike examples), but the code can't take very long to run or your iterative documentation workflow will become quite painful.
## Re-using documentation
roxygen2 provides a number of features that allow you to reuse documentation across topics.
They are documented in `vignette("reuse", package = "roxygen2")` so here we'll focus on the three most important:
- Documenting multiple functions in one topic.
- Inheriting documentation from another topic.
- Using child documents to share prose between topics, or to share between documentation topics and vignettes.
### Multiple functions in one topic {#sec-multiple-functions}
By default, each function gets its own documentation topic, but if two functions are very closely connected you can combine the documentation for multiple functions into a single topic.
For example, take `str_length()` and `str_width()` which provide two different ways of computing the size of a string.
As you can see from the description, both functions are documented together, because this makes it easy to see how they differ:
```{r}
#' The length/width of a string
#'
#' @description
#' `str_length()` returns the number of codepoints in a string. These are
#' the individual elements (which are often, but not always letters) that
#' can be extracted with [str_sub()].
#'
#' `str_width()` returns how much space the string will occupy when printed
#' in a fixed width font (i.e. when printed in the console).
#'
#' ...
str_length <- function(string) {
...
}
```
To merge the two topics, `str_width()` uses `@rdname str_length` to add its documentation to an existing topic:
```{r}
#' @rdname str_length
str_width <- function(string) {
...
}
```
This technique is best used for functions that have not just similar arguments, but also a similar return value and related examples; the next section covers functions that are less closely connected.
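To see the whole pattern in one place, here is a self-contained sketch with two hypothetical functions, `add_one()` and `subtract_one()`, which share arguments, return value, and examples in a single topic:

```r
#' Add or subtract one
#'
#' @description
#' `add_one()` adds 1 to each element of `x`.
#'
#' `subtract_one()` subtracts 1 from each element of `x`.
#'
#' @param x A numeric vector.
#' @returns A numeric vector the same length as `x`.
#' @examples
#' add_one(1:3)
#' subtract_one(1:3)
add_one <- function(x) {
  x + 1
}

#' @rdname add_one
subtract_one <- function(x) {
  x - 1
}
```

Only the first function carries the title, description, and shared tags; the second contributes nothing but its usage.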
### Inheriting documentation
In other cases, functions in a package might share many related behaviors, but aren't closely enough connected that you want to document them together.
Instead, you can use `@inherit`, which generalizes `@inheritParams`, to inherit any component of the documentation from another topic.
There are three useful inherit tags:
- `@inherit source_function` will inherit all supported components from `source_function`.
You can choose to only inherit selected components by listing them after the function name, e.g. `@inherit source_function return details`.
The complete list of currently supported components is `r paste0("\x60", roxygen2:::inherit_components, "\x60", collapse = ", ")`.
- `@inheritSection source_function Section title` will inherit the single section with title "Section title" from `source_function()`.
- `@inheritDotParams` automatically generates parameter documentation for `...` for the common case where you pass `...` on to another function.
Because you often override some arguments, it comes with a flexible specification for argument selection:
- `@inheritDotParams foo` takes all parameters from `foo()`.
- `@inheritDotParams foo a b e:h` takes parameters `a`, `b`, and all parameters between `e` and `h`.
- `@inheritDotParams foo -x -y` takes all parameters except for `x` and `y`.
All of these tags also work to inherit documentation from functions in another package by using `pkg::source_function`.
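The following sketch shows how these tags might be combined. All three functions are hypothetical and exist only to illustrate the tags:

```r
#' Compute the mean of a numeric vector
#'
#' @param x A numeric vector.
#' @param na.rm Should missing values be removed?
#' @returns A single number.
my_mean <- function(x, na.rm = FALSE) {
  mean(x, na.rm = na.rm)
}

#' Compute a rounded mean
#'
#' @inheritParams my_mean
#' @inherit my_mean return
#' @param digits Number of decimal places to keep.
my_rounded_mean <- function(x, na.rm = FALSE, digits = 1) {
  round(my_mean(x, na.rm = na.rm), digits)
}

#' Compute the mean of absolute values
#'
#' @param x A numeric vector.
#' @inheritDotParams my_mean -x
my_abs_mean <- function(x, ...) {
  # `...` is passed on to my_mean(), so its documentation is inherited
  my_mean(abs(x), ...)
}
```

Here `my_rounded_mean()` documents only the one argument that is new, and `my_abs_mean()` excludes `x` from the inherited `...` documentation because it documents `x` itself.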
### Child documents
Finally, you can use the same `.Rmd` or `.md` document in the documentation, `README.Rmd`, and vignettes by using RMarkdown child documents.
The syntax looks like this:
```{r child = "common.Rmd"}`r ''`
```
The included Rmd file can have roxygen Markdown-style links to other help topics.
E.g. `[roxygen2::roxygenize()]` will link to the manual page of the `roxygenize` function in roxygen2.
See `vignette("rd-formatting")` for details.
If the Rmd file contains roxygen (Markdown-style) links to other help topics, then some care is needed, as those links will not work in Rmd files by default.
A workaround is to specify external HTML links for them.
These external locations will *not* be used for the manual which instead always links to the help topics in the manual.
Example:
See also the [roxygen2::roxygenize()] function.
[roxygen2::roxygenize()]: https://roxygen2.r-lib.org/reference/roxygenize.html
This example will link to the supplied URL in HTML / Markdown files and to the `roxygenize` help topic in the manual.
Note that if you add external link targets like these, then roxygen will emit a warning about these link references being defined multiple times (once externally, and once to the help topic).
This warning originates in Pandoc, and it is harmless.