Update to sync with Python 1st ed #576

Merged: 185 commits, Dec 23, 2023
Commits
944ea69
Add python link to readme
trevorcampbell Sep 12, 2023
e2a2e25
added link to py version
trevorcampbell Sep 12, 2023
8309582
get rid of random emph note and replace with a regular sentence
trevorcampbell Sep 12, 2023
d134275
update workflow to run on PR with stale dockerfile check
trevorcampbell Sep 13, 2023
8cfdf14
Merge pull request #531 from UBC-DSCI/workflow-on-pr
trevorcampbell Sep 13, 2023
44b29ee
bool -> string check in workflow
trevorcampbell Sep 23, 2023
b01e87c
added analytics file
trevorcampbell Sep 24, 2023
ba04459
fix M->Malignant, B->Benign causing missing rendered numbers
trevorcampbell Sep 25, 2023
10038b3
workflow improvements (auto pr / main deploy builds, PR commenting) f…
trevorcampbell Sep 25, 2023
9cff7d0
Merge pull request #542 from UBC-DSCI/workflow-improvements
trevorcampbell Sep 25, 2023
0b84c0f
Merge pull request #540 from UBC-DSCI/analytics
trevorcampbell Sep 25, 2023
0b24720
remove latex formatting unnecessary
trevorcampbell Sep 25, 2023
647fccc
always run build env to allow deploy pr to trigger
trevorcampbell Sep 25, 2023
ac47f2b
fix inconsistent subheading in scraping/api
trevorcampbell Sep 25, 2023
bbad134
remove redundant text
trevorcampbell Sep 25, 2023
7b2515c
minor wordsmith fwd sel start
trevorcampbell Sep 25, 2023
e5cb4a2
fix broken html syntax in scraping
trevorcampbell Sep 25, 2023
9cbe2a6
added canada wiki page html for reproducible scraping
trevorcampbell Sep 25, 2023
41754e7
make html scraping section reproducible
trevorcampbell Sep 25, 2023
f2187d3
convert twitter API to nasa API
trevorcampbell Sep 25, 2023
57c2b88
minor improvement to formatting variable names in clustering
trevorcampbell Sep 27, 2023
0b10d8e
add gcf ack
trevorcampbell Sep 27, 2023
acfb8d2
mlee comments addressed
trevorcampbell Sep 27, 2023
9caf0ca
remove unnecessary heading in parameters png nasa
trevorcampbell Sep 27, 2023
8b4ca4f
recrop rho oph img
trevorcampbell Sep 27, 2023
08295e9
Merge pull request #541 from UBC-DSCI/minor-fixes
trevorcampbell Sep 28, 2023
8826eed
add new workflows to readme
trevorcampbell Sep 28, 2023
dd5a76d
Merge pull request #547 from UBC-DSCI/trevorcampbell-patch-1
trevorcampbell Sep 28, 2023
0f36968
Update README.md
trevorcampbell Sep 28, 2023
9b4219a
add build_html.sh change trigger to build_book workflow
trevorcampbell Nov 8, 2023
0bd647f
add build_html.sh change trigger to deploy_main workflow
trevorcampbell Nov 8, 2023
9f952f0
added dockerfile trigger to pr deploy
trevorcampbell Nov 8, 2023
27f7c6f
Update deploy_pr_preview.yml
trevorcampbell Nov 8, 2023
b320b35
only deploy prs targeting main
trevorcampbell Nov 8, 2023
80269d0
accuracy -> RMSPE in reg1
trevorcampbell Nov 10, 2023
51c5b22
added py version quantile comment
trevorcampbell Nov 10, 2023
0b37739
fix dollar sign typesetting in inference
trevorcampbell Nov 10, 2023
7bdb564
dollar sign fixes in reg1
trevorcampbell Nov 10, 2023
7a38ab5
dollar sign fixes in reg2
trevorcampbell Nov 10, 2023
b9e56ea
better path section in reading
trevorcampbell Nov 10, 2023
b0bebda
fix dataset
trevorcampbell Nov 10, 2023
edac32a
Merge pull request #553 from UBC-DSCI/reg1-accuracy-fix
trevorcampbell Nov 10, 2023
53a7fc8
Merge branch 'main' into dollar-sign-fix
trevorcampbell Nov 10, 2023
fc3bf2a
Merge pull request #555 from UBC-DSCI/dataset-data-set
trevorcampbell Nov 10, 2023
66a36b9
typo fix
trevorcampbell Nov 10, 2023
178d846
Merge pull request #554 from UBC-DSCI/dollar-sign-fix
trevorcampbell Nov 11, 2023
bc659ef
Merge pull request #556 from UBC-DSCI/path-gps
trevorcampbell Nov 11, 2023
fa8b892
true to actual
trevorcampbell Nov 11, 2023
cc7d936
added better barplot discussion
trevorcampbell Nov 11, 2023
18b7caa
landmass caption improvement
trevorcampbell Nov 11, 2023
4ebb927
tru -> actual in clsfn1
trevorcampbell Nov 11, 2023
c8fda62
Merge pull request #557 from UBC-DSCI/truly-actually
trevorcampbell Nov 11, 2023
2c1ea4c
fix caption
trevorcampbell Nov 11, 2023
dadd05b
improved text about barplot refine in viz
trevorcampbell Nov 11, 2023
bbb06c9
added barplot title
trevorcampbell Nov 11, 2023
e564139
smaller font plot title
trevorcampbell Nov 11, 2023
fc78bb6
shortened plot title in barplot
trevorcampbell Nov 11, 2023
b0b9b93
proper dockerfile diff tracking in update build env
trevorcampbell Nov 11, 2023
d99aca7
typo fix in update build env
trevorcampbell Nov 11, 2023
d4d9123
revert buggy exit code checks in update_build_env
trevorcampbell Nov 11, 2023
c6714d1
fix update build environment triggering
trevorcampbell Nov 12, 2023
d66be90
trying to keep size 12 font in barplot title for consistency
trevorcampbell Nov 12, 2023
5cf4118
font size back to 10 on barplot with title
trevorcampbell Nov 12, 2023
a134490
Update viz.Rmd
trevorcampbell Nov 12, 2023
c9b277a
Merge pull request #558 from UBC-DSCI/bar-mean
trevorcampbell Nov 12, 2023
23e651d
working on adding colns in db using mutate
trevorcampbell Nov 12, 2023
ef466c0
minor ed
trevorcampbell Nov 12, 2023
b4df504
minor fix
trevorcampbell Nov 12, 2023
22baaf4
Merge pull request #559 from UBC-DSCI/db-create-column
trevorcampbell Nov 12, 2023
67baa3e
improved clustering usage of data files / minor bugfixes
trevorcampbell Nov 12, 2023
3f8ca6c
points overtop lines in the cluster distance diagrams
trevorcampbell Nov 12, 2023
6fb43f8
renaming existing data files, removing unused
trevorcampbell Nov 12, 2023
2f32135
fix order of warning/df in reading
trevorcampbell Nov 12, 2023
f91c839
minor aesthetic improvement -- keep coln ordering the same
trevorcampbell Nov 12, 2023
fbcf2db
Merge pull request #560 from UBC-DSCI/penguins-simpler-data
trevorcampbell Nov 12, 2023
ad273ca
remove extra stuff about confusion matrix (we now have a better intro…
trevorcampbell Nov 13, 2023
bb240f1
Merge pull request #561 from UBC-DSCI/python-synch-bestparams-fit
trevorcampbell Nov 13, 2023
ad07ec5
empty commit
trevorcampbell Nov 13, 2023
154a7fb
evaluating on the test set in clsfcn2
trevorcampbell Nov 14, 2023
0ee258a
fixing inconsistent train/test split in reg1,2
trevorcampbell Nov 14, 2023
20301d8
seed hacking to get reg1 and reg2 story to align with py
trevorcampbell Nov 14, 2023
7f4ebb9
lobjs intro
trevorcampbell Nov 14, 2023
a4113c6
learning objs reading
trevorcampbell Nov 14, 2023
bc51506
more discussion of prec/rec; robustifying the cv5 vs 10
trevorcampbell Nov 15, 2023
f42a768
revert 50fold removal; now with less seed hacking needed
trevorcampbell Nov 15, 2023
2b821ff
Update source/regression2.Rmd
trevorcampbell Nov 15, 2023
6c3df20
Merge pull request #562 from UBC-DSCI/train-test-improvements
trevorcampbell Nov 15, 2023
e33fe32
learning objs cls1
trevorcampbell Nov 15, 2023
f288f4a
learning objs cls2
trevorcampbell Nov 15, 2023
05c2de9
knn uniformization
trevorcampbell Nov 15, 2023
a393846
minor ed cls2 lobjs
trevorcampbell Nov 15, 2023
46df334
remove trailing whitespaces
trevorcampbell Nov 15, 2023
cf8a274
Merge branch 'main' into learning-objectives
trevorcampbell Nov 15, 2023
5e73d67
reg1 reg2 learning objs
trevorcampbell Nov 16, 2023
ebb5733
added under/overfitting to lobjs in reg1, cls2
trevorcampbell Nov 16, 2023
41b687e
clustering lobjs
trevorcampbell Nov 16, 2023
964dd83
viz lobjs
trevorcampbell Nov 16, 2023
c6a13dd
wrangling lobjs
trevorcampbell Nov 16, 2023
547ed73
uniformize K-NN
trevorcampbell Nov 16, 2023
8bdc75e
more uniformization
trevorcampbell Nov 16, 2023
aaf4928
KNN to K-NN
trevorcampbell Nov 16, 2023
a298b3f
stars to dashes lobjs
trevorcampbell Nov 16, 2023
8ee0821
Merge pull request #563 from UBC-DSCI/learning-objectives
trevorcampbell Nov 16, 2023
6e39efb
empty commit
trevorcampbell Nov 16, 2023
44fd827
move citation out of caption for pdf build sync with python
trevorcampbell Nov 16, 2023
a321df2
intro index
trevorcampbell Nov 16, 2023
e4b5de2
Reading index
trevorcampbell Nov 16, 2023
3a602f8
index wrangling
trevorcampbell Nov 16, 2023
2cabe8e
viz index
trevorcampbell Nov 16, 2023
fbbd73d
jupyter vsnctl idcs
trevorcampbell Nov 16, 2023
e869d58
setup index
trevorcampbell Nov 16, 2023
6a7057a
index inference
trevorcampbell Nov 16, 2023
948ef88
clustering index
trevorcampbell Nov 16, 2023
11256f4
cls1 index
trevorcampbell Nov 16, 2023
39b41a3
cls2 index
trevorcampbell Nov 16, 2023
e564234
regresin1 index
trevorcampbell Nov 16, 2023
fdd9f34
reg2 indx
trevorcampbell Nov 16, 2023
8569651
bugfix index cls1
trevorcampbell Nov 17, 2023
e076a5a
bugfix index viz
trevorcampbell Nov 17, 2023
9e3a0dd
bugfixing index
trevorcampbell Nov 17, 2023
4e49b01
index bugfixes
trevorcampbell Nov 17, 2023
d73ade7
Merge pull request #564 from UBC-DSCI/index-update
trevorcampbell Nov 17, 2023
ad3ebe4
empty commit
trevorcampbell Nov 17, 2023
e75f9d2
use default colors in inference
trevorcampbell Nov 20, 2023
c08434c
consistent font label clustering elbow
trevorcampbell Nov 20, 2023
b523442
consistent cluster centre style in clustering
trevorcampbell Nov 20, 2023
a066346
classification2 new graphics
trevorcampbell Nov 21, 2023
c737dff
added source files for cls2 new graphics
trevorcampbell Nov 21, 2023
6c81a90
orange2 -> darkorange; steelblue2 -> steelblue
trevorcampbell Nov 21, 2023
29ef8bf
landmass bar colors steelblue,darkorange
trevorcampbell Nov 21, 2023
072a8a7
steelblue/orange in cls2 predictor selection irrelevant plot
trevorcampbell Nov 21, 2023
71a0f67
dotted to dashed vert rule in reg1; thinner default dash in viz
trevorcampbell Nov 21, 2023
610ae81
improvements to consistency in visualizations in reg1,2
trevorcampbell Nov 22, 2023
3d4ec8b
Steelblue, darkorange consistency with prev chps
trevorcampbell Nov 22, 2023
a42ed8f
consistent style in inference
trevorcampbell Nov 22, 2023
7f7db50
centering all figs; new figs in wrangling (just names; not committing…
trevorcampbell Nov 22, 2023
d7ce717
change jpegs to pngs in intro chp
trevorcampbell Nov 25, 2023
a14b97c
graphic design: ch3
trevorcampbell Nov 25, 2023
be05f02
fix 1-6
trevorcampbell Nov 26, 2023
ec97921
version control graphics
trevorcampbell Dec 10, 2023
6e2b6ff
frontmatter figure chapter overview
trevorcampbell Dec 20, 2023
8e365c2
pop vs sample figure inference
trevorcampbell Dec 20, 2023
d4ac466
reading chp file tree fig update
trevorcampbell Dec 20, 2023
df6fd34
Added canada mapto ch1
trevorcampbell Dec 20, 2023
500904a
Merge pull request #565 from UBC-DSCI/graphic-design
trevorcampbell Dec 20, 2023
64087e1
bug hunt init
trevorcampbell Dec 20, 2023
7f147d0
intro bugs from py issue
trevorcampbell Dec 20, 2023
72e0f20
ch3 bug hunt from py issue
trevorcampbell Dec 20, 2023
a265822
bug hunt ch4 py issue
trevorcampbell Dec 20, 2023
4bccdbc
ch13 bug hunt py issue
trevorcampbell Dec 20, 2023
cf3656d
bug hunt r issue chapter 2
trevorcampbell Dec 20, 2023
2f5a023
minor bug hunt vsn ctl py issue
trevorcampbell Dec 21, 2023
b268308
minor ed to fix figure description vsn ctl graphic update
trevorcampbell Dec 21, 2023
8b6be75
added jupyter cite to jupyter chapter bug hunt py issue
trevorcampbell Dec 21, 2023
919e5bb
minor ed caption cls1 bug hunt py issue
trevorcampbell Dec 21, 2023
dd03028
cls1 bug hunt py issue wordsmith
trevorcampbell Dec 21, 2023
9355afd
minor fix bughunt reg1 py issue
trevorcampbell Dec 21, 2023
2cfa0d8
remove bias py bug hunt
trevorcampbell Dec 21, 2023
dd15f33
better 10.1 fig caption py bug hunt
trevorcampbell Dec 22, 2023
139e1c0
typo fix py bug hunt
trevorcampbell Dec 22, 2023
7d5eb3b
typo fix bug hunt inf py issue
trevorcampbell Dec 22, 2023
19d0f48
bugfix ml paradigm fig cls2
trevorcampbell Dec 22, 2023
7a85e8d
period in formula py bug hunt
trevorcampbell Dec 22, 2023
4484bc9
minor update cls2 rnd nums py bug hunt
trevorcampbell Dec 22, 2023
1184617
randomness cls2 py bug hunt
trevorcampbell Dec 22, 2023
f147f40
minor axis label fix cls2 py bug hunt
trevorcampbell Dec 22, 2023
c6e2cf4
remove broken emph refs bib
trevorcampbell Dec 22, 2023
a66fd2f
source img files import
trevorcampbell Dec 22, 2023
9411ecd
intro bug hunt r
trevorcampbell Dec 23, 2023
e77700f
restore bookdown yml
trevorcampbell Dec 23, 2023
39bf197
remove grey bar generate pat
trevorcampbell Dec 23, 2023
a585883
swap filled/empty circle in jupyter
trevorcampbell Dec 23, 2023
ce7e7d9
remove broken md syntax link
trevorcampbell Dec 23, 2023
f587a05
fix 'shown below' in vsn ctl
trevorcampbell Dec 23, 2023
9c4c7cf
sync viz with python book; remove old style syntax fig
trevorcampbell Dec 23, 2023
54e6688
fix version control orange arrow
trevorcampbell Dec 23, 2023
03e8e69
wrangling bugfixes
trevorcampbell Dec 23, 2023
d88a686
fix bugs in pivot func figs wrangling
trevorcampbell Dec 23, 2023
3faebd4
upload sources for prev bugfixed figs
trevorcampbell Dec 23, 2023
6eef695
vsn ctl update fig desc
trevorcampbell Dec 23, 2023
ed6db0e
consistency unknown legend cls1 bughunt
trevorcampbell Dec 23, 2023
d510f31
fix axis label size cls1 zoom
trevorcampbell Dec 23, 2023
685a22d
added points to predsel subsec cls2 bughunt
trevorcampbell Dec 23, 2023
c7697f0
spacing out kwd arguments cls2 bughunt
trevorcampbell Dec 23, 2023
db6e782
Merge pull request #574 from UBC-DSCI/bug-hunt
trevorcampbell Dec 23, 2023
clustering index
trevorcampbell committed Nov 16, 2023
commit 948ef88c49b371f6b81e4467ef6383f46b48e238
18 changes: 3 additions & 15 deletions source/clustering.Rmd
@@ -164,7 +164,7 @@ library(tidyverse)
set.seed(1)
```

-Now we can load and preview the `penguins` data.
+Now we can load and preview the `penguins` data.\index{read function!read\_csv}

```{r message = FALSE, warning = FALSE}
penguins <- read_csv("data/penguins.csv")
@@ -639,7 +639,7 @@ in the fourth iteration; both the centers and labels will remain the same from t

### Random restarts

-Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart, nstart} can get "stuck" in a bad solution.
+Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart} can get "stuck" in a bad solution.
For example, Figure \@ref(fig:10-toy-kmeans-bad-init) illustrates an unlucky random initialization by K-means.

```{r 10-toy-kmeans-bad-init, echo = FALSE, warning = FALSE, message = FALSE, fig.height = 3.25, fig.width = 3.75, fig.pos = "H", out.extra="", fig.align = "center", fig.cap = "Random initialization of labels."}
@@ -910,7 +910,7 @@ set.seed(1)

We can perform K-means clustering in R using a `tidymodels` workflow similar
to those in the earlier classification and regression chapters.
-We will begin by loading the `tidyclust`\index{tidyclust} library, which contains the necessary
+We will begin by loading the `tidyclust`\index{K-means}\index{tidyclust} library, which contains the necessary
functionality.
```{r, echo = TRUE, warning = FALSE, message = FALSE}
library(tidyclust)
@@ -993,18 +993,6 @@ clustered_data <- kmeans_fit |>
clustered_data
```

-<!--
-If for some reason we need access to just the cluster assignments,
-we can extract those from the fit as a data frame using
-the `extract_cluster_assignment` function. Note that in this case,
-the cluster assignments variable is named `.cluster`, while the `augment`
-function earlier creates a variable named `.pred_cluster`.
-
-```{r 10-kmeans-extract-clusterasgn}
-extract_cluster_assignment(kmeans_fit)
-```
--->
-
Now that we have the cluster assignments included in the `clustered_data` tidy data frame, we can
visualize them as shown in Figure \@ref(fig:10-plot-clusters-2).
Note that we are plotting the *un-standardized* data here; if we for some reason wanted to