-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.rmd
executable file
·93 lines (73 loc) · 2.39 KB
/
README.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# generalconference
[General Conference](https://www.churchofjesuschrist.org/study/general-conference?lang=eng) is a semi-annual event where members of The Church of Jesus Christ of Latter-day
Saints gather to listen to church prophets, apostles, and other leaders.
This package both scrapes General Conference talks and provides all talks
in a data package for analysis in R.
## Install the package
```{r, eval=F}
# install.packages('devtools')
devtools::install_github("bryanwhiting/generalconference")
```
Load the package:
```{r, message = FALSE}
library(generalconference)
```
Load the General Conference corpus, which is a tibble with nested data
for each conference, session, talk, and paragraph.
```{r}
data("genconf")
head(genconf)
```
## Getting Started
Unnest it to analyze individual talks, which can be unnested further to
the paragraph level.
```{r}
library(dplyr)
library(tidyr)
genconf %>%
tidyr::unnest(sessions) %>%
tidyr::unnest(talks) %>%
head()
```
Analyze individual paragraphs that contain the word "faith":
```{r}
library(gt)
genconf %>%
# unpack/unnest the dataframe, which is a tibble of lists
tidyr::unnest(sessions) %>%
tidyr::unnest(talks) %>%
tidyr::unnest(paragraphs) %>%
# extract just the date, title, author and paragraph
# date, title, and author will be repeated fields, with paragraph unique
select(date, title1, author1, paragraph) %>%
# Filter to just the paragraphs that mention the word "faith"
filter(stringr::str_detect(paragraph, "faith")) %>%
# take top 5 records
head(5) %>%
# convert into a gt() table with row groups for date/title/author
# (use row groups since these data are replicated by paragraph)
group_by(date, title1, author1) %>%
gt() %>%
tab_options(
row_group.background.color = 'lightgray'
) %>%
tab_header(
title='Paragraphs on Faith',
subtitle='Grouped by talk'
)
```
* See [Example Analysis](articles/example-analysis.html) for more examples on how to analyze the data.
* See [genconf](reference/genconf.html) for a schema of the data.
See documentation for scrapers, if you need them. But you shouldn't need them since all the data is available.