From 075690f9a2da342626c211b3c9ab0dc86d341616 Mon Sep 17 00:00:00 2001
From: PiotrTymoszuk <80723424+PiotrTymoszuk@users.noreply.github.com>
Date: Wed, 11 Oct 2023 15:38:48 +0200
Subject: [PATCH] Update README.md
Examples of basic usage
---
README.md | 991 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 989 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 1ea4954..1a222a0 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,11 @@
# caretExtra
-Tools for quality control, prediction and plotting for caret models
+Tools for quality control, prediction and plotting of caret models
## Description
-The `caretExtra` package includes a bunch of methods for user friendly extraction of the predictions in the train, cross-validation and test data, fit statistic calulation, diagnostic via residuals plots and graphical representation of the results as scatter, ROC and heat map plots. In addition, regression models may be calibrated by quantile GAM (generalized additive model) method.
+The `caretExtra` package includes a bunch of methods for user friendly extraction of predictions in the train, cross-validation and test data, fit statistic calulation, diagnostic via residuals plots and graphical representation of the results as scatter, ROC and heat map plots. In addition, regression models may be calibrated by the quantile GAM (generalized additive model) method.
Currently, it works only with cross-validated Caret models (CV or repreated CV) created from formula objects. Solution for bootstrap and holdout model construction methods are on the way.
@@ -39,3 +39,990 @@ The package maintainer is [Piotr Tymoszuk](mailto:piotr.s.tymoszuk@gmail.com).
`clustTools` uses tools provided by the [rlang](https://rlang.r-lib.org/), [tidyverse](https://www.tidyverse.org/), [caret](https://topepo.github.io/caret/), [coxed](https://cran.r-project.org/web/packages/coxed/index.html), [ggrepel](https://ggrepel.slowkow.com/), [generics](https://github.com/r-lib/generics), [DescTools](https://andrisignorell.github.io/DescTools/), [plotROC](https://cran.r-project.org/web/packages/plotROC/vignettes/examples.html), [qgam](https://mfasiolo.github.io/qgam/), and [survial](https://cran.r-project.org/web/packages/survival/index.html). Many thanks to their developers, maintainers and contributors.
+## Usage
+
+### Fitting of models
+
+
+
+In the example of basic usage of the `caretExtra` tools, I'll use three published data sets:
+
+* `Boston` provided by the R package `MASS` consisting of data on environmental, geographic and socioeconomic features of houses in Boston as well as their value. This data set will be used for modeling of the house value with regression
+
+* `biopsy` included in the `MASS` package consisting of nine pathological and cytologic properties of breast lesion biopsies rated by the pathologist with a 1 - 10 point scale. With this data set I'll construct a binary classifier predicting the biopsy sample classification as benign or malignant tissue
+
+* `wines` included in the `kohonen` package. and consisting of physical and chemical properties of various Italian wines. This data set will be used to model vintage class with a multi-category classifier
+
+For each data set, randomly selected two-thirds of observations will be used for tuning of the models with 5-repeats 5 fold cross-validation. The remaining observations will be put aside for the final evaluation of the model performance.
+
+Let's start with few packages which are required for the current analysis:
+
+```r
+ ## the 'biopsy' and `Boston` data sets provided by MASS
+
+ library(MASS)
+
+ ## the 'wines' data set provided by kohonen
+
+ library(kohonen)
+
+ ## packages used for analysis
+
+ library(tidyverse)
+ library(caret)
+ library(caretExtra)
+ library(gbm)
+
+ ## doParallel for a parallel backend
+ ## used for tuning and cross-validation
+
+ library(doParallel)
+
+ ## patchwork for plot panels
+
+ library(patchwork)
+```
+Preprocessing is in many cases a tedious step preceding the 'real' analysis. For the `Boston` data set, this includes recoding of the dummy chas variable and normalization of numeric explanatory features as well as selection of the training and test subset observations with the `createDataPartition()` function from the `caret` package.
+
+```r
+ ## Boston: used for the regression model of the house price
+ ## normalization of numeric explanatory variables
+ ## adding the observation ID in the rownames
+
+ my_boston <- Boston %>%
+ filter(complete.cases(.)) %>%
+ mutate(chas = car::recode(chas,
+ "0 = 'no';
+ 1 = 'yes'"),
+ chas = factor(chas, c('no', 'yes')))
+
+ my_boston[, c("crim", "zn", "indus",
+ "nox", "rm", "age",
+ "dis", "tax", "rad",
+ "ptratio", "black", "lstat")] <-
+ my_boston[, c("crim", "zn", "indus",
+ "nox", "rm", "age",
+ "dis", "tax", "rad",
+ "ptratio", "black", "lstat")] %>%
+ map_dfc(~scale(.x)[, 1])
+
+ rownames(my_boston) <- paste0('obs_', 1:nrow(my_boston))
+
+ ## training and test portion at a 2 to 1 ratio
+
+ boston_ids <- createDataPartition(my_boston$medv, p = 2/3)[[1]]
+
+ my_boston <- list(train = my_boston[boston_ids, ],
+ test = my_boston[-boston_ids, ])
+```
+```r
+> my_boston %>% map(head)
+$train
+ crim zn indus chas nox rm age dis rad tax ptratio
+obs_2 -0.4169267 -0.48724019 -0.5927944 no -0.7395304 0.1940824 0.36680343 0.5566090 -0.8670245 -0.9863534 -0.3027945
+obs_6 -0.4166314 -0.48724019 -1.3055857 no -0.8344581 0.2068916 -0.35080997 1.0766711 -0.7521778 -1.1050216 0.1129203
+obs_7 -0.4098372 0.04872402 -0.4761823 no -0.2648919 -0.3880270 -0.07015919 0.8384142 -0.5224844 -0.5769480 -1.5037485
+obs_8 -0.4032966 0.04872402 -0.4761823 no -0.2648919 -0.1603069 0.97784057 1.0236249 -0.5224844 -0.5769480 -1.5037485
+obs_10 -0.4003331 0.04872402 -0.4761823 no -0.2648919 -0.3994130 0.61548134 1.3283202 -0.5224844 -0.5769480 -1.5037485
+obs_11 -0.3939564 0.04872402 -0.4761823 no -0.2648919 0.1314594 0.91389483 1.2117800 -0.5224844 -0.5769480 -1.5037485
+ black lstat medv
+obs_2 0.4406159 -0.49195252 21.6
+obs_6 0.4101651 -1.04229087 28.7
+obs_7 0.4263763 -0.03123671 22.9
+obs_8 0.4406159 0.90979986 27.1
+obs_10 0.3289995 0.62272769 18.9
+obs_11 0.3926395 1.09184562 15.0
+
+$test
+ crim zn indus chas nox rm age dis rad tax ptratio black
+obs_1 -0.4193669 0.28454827 -1.2866362 no -0.1440749 0.4132629 -0.1198948 0.140075 -0.9818712 -0.6659492 -1.4575580 0.4406159
+obs_3 -0.4169290 -0.48724019 -0.5927944 no -0.7395304 1.2814456 -0.2655490 0.556609 -0.8670245 -0.9863534 -0.3027945 0.3960351
+obs_4 -0.4163384 -0.48724019 -1.3055857 no -0.8344581 1.0152978 -0.8090878 1.076671 -0.7521778 -1.1050216 0.1129203 0.4157514
+obs_5 -0.4120741 -0.48724019 -1.3055857 no -0.8344581 1.2273620 -0.5106743 1.076671 -0.7521778 -1.1050216 0.1129203 0.4406159
+obs_9 -0.3955433 0.04872402 -0.4761823 no -0.2648919 -0.9302853 1.1163897 1.086122 -0.5224844 -0.5769480 -1.5037485 0.3281233
+obs_12 -0.4064448 0.04872402 -0.4761823 no -0.2648919 -0.3922967 0.5089051 1.154792 -0.5224844 -0.5769480 -1.5037485 0.4406159
+ lstat medv
+obs_1 -1.07449897 24.0
+obs_3 -1.20753241 34.7
+obs_4 -1.36017078 33.4
+obs_5 -1.02548665 36.2
+obs_9 2.41937935 16.5
+obs_12 0.08639286 18.9
+```
+
+Pre-processing of the `biopsy` data set is quite easy: I'm not going to normalize the numeric explanatory variables because they are anyway on the same scale. Still, I'm setting unique observation names and selecting the training and test observations with `createDataPartition()`:
+
+```r
+ ## biopsy: binary classification of benign and malignant samples
+ ## explanatory variables are not normalized, since they
+ ## are anyway in the same 1 - 10 scale
+ ## IDs in the rownames
+
+ my_biopsy <- biopsy %>%
+ filter(complete.cases(.))
+
+ rownames(my_biopsy) <-
+ paste(my_biopsy$ID, 1:nrow(my_biopsy), sep = '_sample_')
+
+ my_biopsy <- my_biopsy %>%
+ select(-ID)
+
+ ## training and test portion at a 2 to 1 ratio
+
+ biopsy_ids <- createDataPartition(my_biopsy$class, p = 2/3)[[1]]
+
+ my_biopsy <- list(train = my_biopsy[biopsy_ids, ],
+ test = my_biopsy[-biopsy_ids, ])
+```
+
+```r
+> my_biopsy %>% map(head)
+$train
+ V1 V2 V3 V4 V5 V6 V7 V8 V9 class
+1000025_sample_1 5 1 1 1 2 1 3 1 1 benign
+1002945_sample_2 5 4 4 5 7 10 3 2 1 benign
+1016277_sample_4 6 8 8 1 3 4 3 7 1 benign
+1017023_sample_5 4 1 1 3 2 1 3 1 1 benign
+1035283_sample_11 1 1 1 1 1 1 3 1 1 benign
+1036172_sample_12 2 1 1 1 2 1 2 1 1 benign
+
+$test
+ V1 V2 V3 V4 V5 V6 V7 V8 V9 class
+1015425_sample_3 3 1 1 1 2 2 3 1 1 benign
+1017122_sample_6 8 10 10 8 7 10 9 7 1 malignant
+1018099_sample_7 1 1 1 1 2 10 3 1 1 benign
+1018561_sample_8 2 1 2 1 2 1 3 1 1 benign
+1033078_sample_9 2 1 1 1 2 1 1 1 5 benign
+1033078_sample_10 4 2 1 1 2 1 2 1 1 benign
+```
+
+In the `wines` data set I need to rename variables to be compatible with R formulas, normalize explanatory variables and select the training and test observations in a similar way as for the remaining data sets:
+
+```r
+ ## wines: multi-level classification of vintages
+ ## normalization of explanatory variables
+ ## IDs in the rownames
+
+ data(wines)
+
+ my_wines <- wines %>%
+ as.data.frame %>%
+ filter(complete.cases(.)) %>%
+ map_dfc(~scale(.x)[, 1])
+
+ names(my_wines) <- make.names(names(my_wines))
+
+ my_wines$vintage <- vintages
+
+ my_wines <- as.data.frame(my_wines)
+
+ rownames(my_wines) <- paste0('wine_', 1:nrow(my_wines))
+
+ ## training - test split at a 2 to 1 ratio
+
+ wines_ids <- createDataPartition(my_wines$vintage, p = 2/3)[[1]]
+
+ my_wines <- list(train = my_wines[wines_ids, ],
+ test = my_wines[-wines_ids, ])
+
+```
+```r
+> my_wines %>% map(head)
+$train
+ alcohol malic.acid ash ash.alkalinity magnesium tot..phenols flavonoids non.flav..phenols proanth
+wine_1 0.2551008 -0.50020530 -0.8221529 -2.4930372 0.02909756 0.5710456 0.7375437 -0.8208101 -0.5370519
+wine_2 0.2056453 0.01796903 1.1045562 -0.2748590 0.09964918 0.8104843 1.2181890 -0.4999191 2.1399040
+wine_3 1.7016732 -0.34832662 0.4865552 -0.8144158 0.94626865 2.4865554 1.4685250 -0.9812556 1.0376281
+wine_6 1.7264010 -0.41979894 0.3047901 -1.4738742 -0.25310893 0.3316069 0.4972211 -0.4999191 0.6876992
+wine_7 1.3183934 -0.16964581 0.8864382 -0.5746128 1.51068164 0.4912327 0.4872077 -0.4196964 -0.5895412
+wine_8 2.2704111 -0.62528187 -0.7130939 -1.6537265 -0.18255731 0.8104843 0.9578395 -0.5801419 0.6876992
+ col..int. col..hue OD.ratio proline vintage
+wine_1 -0.290306650 0.4059482 1.1284966 0.9683055 Barolo
+wine_2 0.268966296 0.3186634 0.8023031 1.3970348 Barolo
+wine_3 1.181011408 -0.4232572 1.1994082 2.3338876 Barolo
+wine_6 0.083976014 0.2750210 1.3837785 1.7304908 Barolo
+wine_7 -0.002065978 0.4495905 1.3837785 1.7463697 Barolo
+wine_8 0.062465516 0.5368753 0.3484686 0.9524266 Barolo
+
+$test
+ alcohol malic.acid ash ash.alkalinity magnesium tot..phenols flavonoids non.flav..phenols proanth col..int.
+wine_4 0.3045563 0.2234520 1.8316163 0.4445501 1.2990268 0.8104843 0.6674496 0.2220856 0.40775610 -0.31611925
+wine_5 1.4914875 -0.5180734 0.3047901 -1.2940219 0.8757170 1.5607256 1.3683906 -0.1790282 0.67020276 0.72929095
+wine_11 1.3925766 -0.7682265 -0.1677989 -0.8144158 -0.3236606 -0.1472706 0.4071002 -0.8208101 -0.02965499 -0.02357648
+wine_16 1.6151262 -0.3751287 1.2863212 0.1447963 1.4401300 0.8104843 1.1180545 -0.2592509 0.67020276 0.49267547
+wine_24 0.6260168 -0.4734032 0.8864382 0.1447963 -0.2531089 0.3794946 0.5873421 -0.6603646 0.12781300 -0.66028721
+wine_26 0.4900143 -0.5091393 0.9227912 -1.0242435 -0.4647638 0.8902972 0.9177857 -0.1790282 -0.23961231 -0.10961847
+ col..hue OD.ratio proline vintage
+wine_4 0.3623058 0.46192721 -0.03206274 Barolo
+wine_5 0.4059482 0.34846859 2.23861439 Barolo
+wine_11 0.9296568 0.30592161 1.69873311 Barolo
+wine_16 0.4932329 0.06482205 1.69873311 Barolo
+wine_24 0.7114449 1.72415433 0.31727220 Barolo
+wine_26 -0.1614029 0.87321470 1.42879247 Barolo
+```
+
+`caret` models to be analyzed with the `caretExtra` tools need to meet few requirements. They need to be constructed with a formula, tuned in a cross-validation setting (single or repeated), contain the training data, final predictions and performance metrics in final resamples. Specifically for classification models, propabilities of class assignment need to be included in the `caret` object. To make sure that all those elements are included in modeling results, we customize the `trainControl` object as presented below:
+
+```r
+
+ train_control <- trainControl(method = 'repeatedcv',
+ number = 5,
+ repeats = 5,
+ savePredictions = 'final',
+ returnData = TRUE,
+ returnResamp = 'final',
+ classProbs = TRUE)
+```
+Setting `savePredictions = 'final'`, `returnData = TRUE`, `returnResamp = 'final'` and, for classification models, `classProbs = TRUE` is absolutely crucial for subsequent analysis. Calling `as_caretx()` for caret models without these components raises an error.
+
+Construction of `caret` models is done as usual with the `train()` function. Importantly, the models must be built with formulas and not with the `x` and `y` variable matrices, and the customized `trainControl` object needs to be passed to the `trControl` argument. In the current example, I will fit the requested models with the gradient boosted machine (GBM) algorithm and default tune grids. You are however welcome to test a richer set of combinations of the tuning parameters.
+
+```r
+ my_models <- list()
+
+ registerDoParallel(cores = 7)
+
+ set.seed(12345)
+
+ my_models$regression <- caret::train(form = medv ~ .,
+ data = my_boston$train,
+ metric = 'MAE',
+ method = 'gbm',
+ trControl = train_control)
+
+ my_models$binary <- caret::train(form = class ~ .,
+ data = my_biopsy$train,
+ metric = 'Kappa',
+ method = 'gbm',
+ trControl = train_control)
+
+ my_models$multi_class <- caret::train(form = vintage ~ .,
+ data = my_wines$train,
+ metric = 'Kappa',
+ method = 'gbm',
+ trControl = train_control)
+
+ stopImplicitCluster()
+
+```
+The final step of the model construction is simple: I'm just calling `as_caretx()` for the models. Of importance, the function returns objects of the `caretx` class, which inherit most of the methods from traditional `caret` models.
+
+```r
+
+ my_models <- my_models %>%
+ map(as_caretx)
+
+```
+
+My personal experience with the anyway excellent `caret` package was that regression and classification models required different and quite often project-specific approaches to diagnostic, performance evaluation and visualization. This was my prime motivation to develop the `caretExtra` package. Another motivation was to create a framework compatible with `tidyverse` environment and `ggplot` graphic interface. For this reasons, the package offers a relatively simple S3 method interface (`model.frame()`, `plot()`, `summary()`, `components()` etc.), returns numeric statistics in `tibble` form and generates `ggplot` graphical objects that can be easily customized by the user.
+
+For instance, `model.frame()` and `formula()` extract respectively the training data and the formula from the model.
+
+```r
+> head(model.frame(my_models$regression))
+
+ crim zn indus chas nox rm age dis rad tax ptratio
+obs_2 -0.4169267 -0.48724019 -0.5927944 no -0.7395304 0.1940824 0.36680343 0.5566090 -0.8670245 -0.9863534 -0.3027945
+obs_6 -0.4166314 -0.48724019 -1.3055857 no -0.8344581 0.2068916 -0.35080997 1.0766711 -0.7521778 -1.1050216 0.1129203
+obs_7 -0.4098372 0.04872402 -0.4761823 no -0.2648919 -0.3880270 -0.07015919 0.8384142 -0.5224844 -0.5769480 -1.5037485
+obs_8 -0.4032966 0.04872402 -0.4761823 no -0.2648919 -0.1603069 0.97784057 1.0236249 -0.5224844 -0.5769480 -1.5037485
+obs_10 -0.4003331 0.04872402 -0.4761823 no -0.2648919 -0.3994130 0.61548134 1.3283202 -0.5224844 -0.5769480 -1.5037485
+obs_11 -0.3939564 0.04872402 -0.4761823 no -0.2648919 0.1314594 0.91389483 1.2117800 -0.5224844 -0.5769480 -1.5037485
+ black lstat medv
+obs_2 0.4406159 -0.49195252 21.6
+obs_6 0.4101651 -1.04229087 28.7
+obs_7 0.4263763 -0.03123671 22.9
+obs_8 0.4406159 0.90979986 27.1
+obs_10 0.3289995 0.62272769 18.9
+obs_11 0.3926395 1.09184562 15.0
+
+> formula(my_models$binary)
+
+class ~ V1 + V2 + V3 + V4 + V5 + V6 + V7 + V8 + V9
+attr(,"variables")
+list(class, V1, V2, V3, V4, V5, V6, V7, V8, V9)
+attr(,"factors")
+ V1 V2 V3 V4 V5 V6 V7 V8 V9
+class 0 0 0 0 0 0 0 0 0
+V1 1 0 0 0 0 0 0 0 0
+V2 0 1 0 0 0 0 0 0 0
+V3 0 0 1 0 0 0 0 0 0
+V4 0 0 0 1 0 0 0 0 0
+V5 0 0 0 0 1 0 0 0 0
+V6 0 0 0 0 0 1 0 0 0
+V7 0 0 0 0 0 0 1 0 0
+V8 0 0 0 0 0 0 0 1 0
+V9 0 0 0 0 0 0 0 0 1
+attr(,"term.labels")
+[1] "V1" "V2" "V3" "V4" "V5" "V6" "V7" "V8" "V9"
+attr(,"order")
+[1] 1 1 1 1 1 1 1 1 1
+attr(,"intercept")
+[1] 1
+attr(,"response")
+[1] 1
+attr(,"predvars")
+list(class, V1, V2, V3, V4, V5, V6, V7, V8, V9)
+attr(,"dataClasses")
+ class V1 V2 V3 V4 V5 V6 V7 V8 V9
+ "factor" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
+
+```
+Model residuals are computed by `residuals()`, while `augment()` returns the actual outcome (`.outcome`) with predictions in the training data and out-of-fold predictions in resamples (`.fitted`) together with class assignment probabilities and the explanatory variables. By providing the `residuals()` and `augment()` functions with a test data set passed to the optional `newdata` argument, test set residuals and predictions can be calculated as well:
+
+```r
+> residuals(my_models$regression, newdata = my_boston$test)
+
+$train
+# A tibble: 338 × 8
+ .observation .outcome .fitted .resid .std.resid .sq.std.resid .candidate_missfit .expect.norm
+
+ 1 246 50 34.3 -15.7 -7.66 58.6 yes -2.97
+ 2 4 27.1 20.6 -6.52 -3.18 10.1 yes -2.62
+ 3 244 50 43.7 -6.25 -3.05 9.31 yes -2.44
+ 4 111 36.2 30.1 -6.11 -2.98 8.88 yes -2.31
+ 5 259 23.2 17.2 -5.96 -2.91 8.47 yes -2.22
+ 6 277 17.9 13.3 -4.57 -2.23 4.97 yes -2.14
+ 7 171 50 45.4 -4.56 -2.23 4.95 yes -2.07
+ 8 131 23.7 19.2 -4.49 -2.19 4.79 yes -2.01
+ 9 122 50 45.5 -4.46 -2.17 4.73 yes -1.96
+10 58 28.4 24.9 -3.45 -1.69 2.84 no -1.91
+# … with 328 more rows
+# ℹ Use `print(n = ...)` to see more rows
+
+$cv
+# A tibble: 1,690 × 9
+ .observation .outcome .fitted .resample .resid .std.resid .sq.std.resid .candidate_missfit .expect.norm
+
+ 1 246 50 23.5 Fold5.Rep3 -26.5 -7.52 56.6 yes -3.44
+ 2 246 50 24.9 Fold1.Rep5 -25.1 -7.12 50.7 yes -3.13
+ 3 246 50 25.5 Fold5.Rep4 -24.5 -6.94 48.2 yes -2.97
+ 4 246 50 25.7 Fold4.Rep2 -24.3 -6.88 47.4 yes -2.87
+ 5 246 50 27.1 Fold4.Rep1 -22.9 -6.49 42.2 yes -2.79
+ 6 244 50 31.2 Fold2.Rep4 -18.8 -5.34 28.5 yes -2.72
+ 7 244 50 32.6 Fold3.Rep1 -17.4 -4.92 24.2 yes -2.67
+ 8 245 50 34.1 Fold2.Rep4 -15.9 -4.49 20.2 yes -2.62
+ 9 245 50 34.6 Fold3.Rep1 -15.4 -4.36 19.0 yes -2.57
+10 245 50 36.8 Fold5.Rep3 -13.2 -3.73 13.9 yes -2.54
+# … with 1,680 more rows
+# ℹ Use `print(n = ...)` to see more rows
+
+$test
+# A tibble: 168 × 8
+ .observation .outcome .fitted .resid .std.resid .sq.std.resid .candidate_missfit .expect.norm
+
+ 1 127 50 30.4 -19.6 -5.13 26.4 yes -2.75
+ 2 74 37 30.2 -6.77 -1.75 3.06 no -2.37
+ 3 86 30.1 23.4 -6.72 -1.73 3.01 no -2.17
+ 4 87 50 43.7 -6.28 -1.62 2.62 no -2.04
+ 5 73 32 26.4 -5.55 -1.43 2.03 no -1.93
+ 6 160 19.1 14.3 -4.82 -1.23 1.52 no -1.84
+ 7 77 34.9 30.1 -4.77 -1.22 1.49 no -1.77
+ 8 100 29.1 24.4 -4.70 -1.20 1.44 no -1.70
+ 9 76 34.6 29.9 -4.66 -1.19 1.42 no -1.64
+10 81 50 45.6 -4.43 -1.13 1.28 no -1.58
+# … with 158 more rows
+# ℹ Use `print(n = ...)` to see more rows
+
+```
+```r
+> augment(my_models$multi_class)
+
+$train
+# A tibble: 119 × 19
+ .observ…¹ .outc…² .fitted Barbera Barolo Grign…³ alcohol malic…⁴ ash ash.a…⁵ magne…⁶ tot..…⁷ flavo…⁸ non.f…⁹ proanth col..i…˟
+
+ 1 1 Barolo Barolo 1.40e-7 1.00 4.03e-6 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537 -0.290
+ 2 2 Barolo Barolo 5.79e-6 1.00 1.95e-6 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14 0.269
+ 3 3 Barolo Barolo 3.83e-6 1.00 6.18e-6 1.70 -0.348 0.487 -0.814 0.946 2.49 1.47 -0.981 1.04 1.18
+ 4 4 Barolo Barolo 6.27e-7 1.00 1.97e-6 1.73 -0.420 0.305 -1.47 -0.253 0.332 0.497 -0.500 0.688 0.0840
+ 5 5 Barolo Barolo 1.37e-6 1.00 1.16e-6 1.32 -0.170 0.886 -0.575 1.51 0.491 0.487 -0.420 -0.590 -0.00207
+ 6 6 Barolo Barolo 4.92e-7 1.00 8.38e-6 2.27 -0.625 -0.713 -1.65 -0.183 0.810 0.958 -0.580 0.688 0.0625
+ 7 7 Barolo Barolo 9.03e-7 1.00 7.42e-5 1.07 -0.884 -0.350 -1.05 -0.112 1.10 1.13 -1.14 0.460 0.931
+ 8 8 Barolo Barolo 7.73e-6 1.00 5.34e-5 1.37 -0.161 -0.241 -0.455 0.382 1.05 1.30 -1.14 1.39 0.299
+ 9 9 Barolo Barolo 5.56e-7 1.00 2.52e-6 0.935 -0.545 0.159 -1.05 -0.747 0.491 0.738 -0.580 0.390 0.235
+10 10 Barolo Barolo 8.32e-6 1.00 2.83e-4 2.17 -0.545 0.0867 -2.43 -0.606 1.29 1.67 0.543 2.14 0.149
+# … with 109 more rows, 3 more variables: col..hue , OD.ratio , proline , and abbreviated variable names
+# ¹.observation, ².outcome, ³Grignolino, ⁴malic.acid, ⁵ash.alkalinity, ⁶magnesium, ⁷tot..phenols, ⁸flavonoids,
+# ⁹non.flav..phenols, ˟col..int.
+# ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
+
+$cv
+# A tibble: 595 × 20
+ .observa…¹ .outc…² .fitted .resa…³ Barbera Barolo Grign…⁴ alcohol malic…⁵ ash ash.a…⁶ magne…⁷ tot..…⁸ flavo…⁹ non.f…˟ proanth
+
+ 1 1 Barolo Barolo Fold5.… 4.01e- 7 1.00 1.77e-5 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537
+ 2 1 Barolo Barolo Fold3.… 2.07e- 7 1.00 1.42e-6 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537
+ 3 1 Barolo Barolo Fold5.… 7.50e-10 1.00 1.86e-7 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537
+ 4 1 Barolo Barolo Fold4.… 1.01e- 7 1.00 3.10e-5 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537
+ 5 1 Barolo Barolo Fold3.… 2.77e- 9 1.00 1.05e-6 0.255 -0.500 -0.822 -2.49 0.0291 0.571 0.738 -0.821 -0.537
+ 6 2 Barolo Barolo Fold1.… 2.27e- 6 1.00 1.12e-6 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14
+ 7 2 Barolo Barolo Fold1.… 1.25e- 6 1.00 1.64e-6 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14
+ 8 2 Barolo Barolo Fold2.… 5.31e- 7 1.00 3.59e-7 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14
+ 9 2 Barolo Barolo Fold3.… 2.19e- 5 1.00 8.17e-6 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14
+10 2 Barolo Barolo Fold5.… 2.74e- 6 1.00 2.02e-6 0.206 0.0180 1.10 -0.275 0.0996 0.810 1.22 -0.500 2.14
+# … with 585 more rows, 4 more variables: col..int. , col..hue , OD.ratio , proline , and abbreviated variable
+# names ¹.observation, ².outcome, ³.resample, ⁴Grignolino, ⁵malic.acid, ⁶ash.alkalinity, ⁷magnesium, ⁸tot..phenols, ⁹flavonoids,
+# ˟non.flav..phenols
+# ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
+
+```
+
+Many other elements of the model like tuning results, square distances to the outcome and confusion matrices can be extracted from the model with the `components()` method. As described above, if `newdata` is specified, the requested metrics and objects are generated not only for the training and cross-validaiton data sets but also for the test data:
+
+```r
+> components(my_models$multi_class, what = 'tuning')
+
+ shrinkage interaction.depth n.minobsinnode n.trees Accuracy Kappa AccuracySD KappaSD
+1 0.1 1 10 50 0.9549478 0.9318877 0.03798866 0.05703971
+4 0.1 2 10 50 0.9648870 0.9468514 0.03080134 0.04637411
+7 0.1 3 10 50 0.9598021 0.9389724 0.02532945 0.03850012
+2 0.1 1 10 100 0.9599478 0.9394119 0.03921166 0.05894816
+5 0.1 2 10 100 0.9664870 0.9492163 0.02368318 0.03580442
+8 0.1 3 10 100 0.9596029 0.9386257 0.03318834 0.05055427
+3 0.1 1 10 150 0.9682986 0.9519018 0.03001619 0.04548033
+6 0.1 2 10 150 0.9664870 0.9490494 0.03156632 0.04809489
+9 0.1 3 10 150 0.9613420 0.9414862 0.03433323 0.05161714
+
+```
+
+```r
+> components(my_models$binary, what = 'square_dist')
+
+$train
+# A tibble: 456 × 4
+ .observation .outcome .fitted square_dist
+
+ 1 1 benign benign 0.00000120
+ 2 2 benign benign 0.0519
+ 3 3 benign benign 0.0759
+ 4 4 benign benign 0.00000704
+ 5 5 benign benign 0.0000145
+ 6 6 benign benign 0.000000831
+ 7 7 malignant malignant 0.0232
+ 8 8 benign benign 0.000129
+ 9 9 malignant malignant 0.000000131
+10 10 benign benign 0.00000120
+# … with 446 more rows
+# ℹ Use `print(n = ...)` to see more rows
+
+$cv
+# A tibble: 2,280 × 5
+ .observation .resample .outcome .fitted square_dist
+
+ 1 1 Fold1.Rep3 benign benign 0.000000516
+ 2 1 Fold4.Rep1 benign benign 0.000000222
+ 3 1 Fold3.Rep2 benign benign 0.000000922
+ 4 1 Fold5.Rep4 benign benign 0.000000351
+ 5 1 Fold2.Rep5 benign benign 0.000000645
+ 6 2 Fold1.Rep4 benign malignant 0.980
+ 7 2 Fold4.Rep1 benign malignant 0.981
+ 8 2 Fold5.Rep2 benign malignant 0.946
+ 9 2 Fold3.Rep3 benign malignant 0.981
+10 2 Fold5.Rep5 benign malignant 0.984
+# … with 2,270 more rows
+# ℹ Use `print(n = ...)` to see more rows
+```
+```r
+> components(my_models$binary,
++ what = 'confusion',
++ newdata = my_biopsy$test)
+
+$train
+ .fitted
+.outcome benign malignant
+ benign 295 1
+ malignant 1 159
+
+$cv
+ .fitted
+.outcome benign malignant
+ benign 1440 40
+ malignant 30 770
+
+$test
+ .fitted
+.outcome benign malignant
+ benign 144 4
+ malignant 5 74
+```
+
+
+
+
+### Diagnostic plots and visualization of modeling results
+
+
+
+#### Regression models
+
+By calling `plot(type = 'diagnostic')` for a regression model, a list of `ggplot` objects with diagnostic plots of residuals is returned. Such list includes plots of residuals versus fitted (`resid_fitted`), standardized residuals versus fitted (`std.resid_fitted`), square residuals versus fitted (`sq.resid_fitted`) and quantile-quantile plot of residuals, which can be helpful at assessment of residuals' normality (`qq.std.resid`). Candidate outliers identified with the $2 \times SD$ cutoff are highlighted in red. As before, if `newdata` is specified, diagnostic plots will be generated for the test data set as well.
+
+```r
+ regression_resid_plots <- plot(my_models$regression,
+ newdata = my_boston$test,
+ type = 'diagnostic')
+
+ regression_resid_fitted_plots <- regression_resid_plots %>%
+ map(~.x$resid_fitted) %>%
+ map2(., c('Boston: training', 'Boston: CV', 'Boston: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'bottom'))
+
+ regression_qqplots <- regression_resid_plots %>%
+ map(~.x$qq.std.resid) %>%
+ map2(., c('Boston: training', 'Boston: CV', 'Boston: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'bottom'))
+```
+```r
+ regression_resid_fitted_plots$train +
+ regression_resid_fitted_plots$cv +
+ regression_resid_fitted_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/324d4988-a47c-405b-b573-709547a7c4ac)
+
+```r
+ regression_qqplots$train +
+ regression_qqplots$cv +
+ regression_qqplots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/b4954e3a-146f-463f-a864-8bec093f59e2)
+
+In many instances plots of predicted and observed values of the outcome variable are useful, e.g. for assessing model calibration. They are generated by calling `plot(type = 'fit')`. In such plots, the ideal calibration is represented by a slope 1 dashed line. By default, LOESS or GAM trends (`geom_smooth()` from `ggplot2`) are displayed as well; R^2 and RMSE values are displayed in the plot subtitle and numbers of complete observations are indicated in the plot tag:
+
+```r
+ regression_fit_predict <- plot(my_models$regression,
+ newdata = my_boston$test,
+ type = 'fit')
+
+ regression_fit_predict <- regression_fit_predict %>%
+ map2(., c('Boston: training', 'Boston: CV', 'Boston: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(plot.tag.position = 'bottom'))
+
+ regression_fit_predict$train +
+ regression_fit_predict$cv +
+ regression_fit_predict$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/aff30e66-4ac9-4429-9953-df9cad1e52b9)
+
+A graphical representation of the key performance statistics: R^2, RMSE and Spearman's $\rho$ coefficient of correlation between the outcome and predictions can be plotted with `plot(type = 'performance')`:
+
+```r
+ regression_performance_plots <- plot(my_models$regression,
+ newdata = my_boston$test,
+ type = 'performance')
+
+ regression_performance_plots +
+ scale_size_area(limits = c(0, 1))
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/46bab64a-ede7-4215-b019-25144e18dbcd)
+
+#### Binary classifiers
+
+Class assignment probability and square distance to the outcome may serve as quality measures for classification models. They can be obtained by calling `plot(type = 'class_p')`. In this case, a list of scatter plots of sorted class assignment probabilities (`winner_p`) and squared distance to the outcome (`square_dist`) is returned. Misclassified observations are highlighted in red. By default, overall accuracy, Cohen's $\kappa$ and Brier score (BS) are displayed in the plot subtitles.
+
+```r
+ binary_p_plots <- plot(my_models$binary,
+ newdata = my_biopsy$test,
+ type = 'class_p')
+
+ binary_sq_dist_plots <- binary_p_plots %>%
+ map(~.x$square_dist) %>%
+ map2(c('Biopsy: training', 'Biopsy: CV', 'Biopsy: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'bottom'))
+
+ binary_p_plots <- binary_p_plots %>%
+ map(~.x$winner_p) %>%
+ map2(c('Biopsy: training', 'Biopsy: CV', 'Biopsy: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'bottom'))
+```
+
+```r
+ binary_sq_dist_plots$train +
+ binary_sq_dist_plots$cv +
+ binary_sq_dist_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/9ec8d044-2e24-4b38-af3b-019f1ce4fd9f)
+
+```r
+ binary_p_plots$train +
+ binary_p_plots$cv +
+ binary_p_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/9cf40621-fb99-4dcd-b0bd-5324baf1de3b)
+
+Heat map representations of confusion matrices are fetched by `plot(type = 'confusion')`. By default, such heat maps of confusion matrices visualize observation counts. To make them show percentages, the user can specify `scale = 'percent'`:
+
+```r
+ binary_confusion_plots <- plot(my_models$binary,
+ newdata = my_biopsy$test,
+ type = 'confusion',
+ scale = 'percent')
+
+ binary_confusion_plots <- binary_confusion_plots %>%
+ map2(c('Biopsy: training', 'Biopsy: CV', 'Biopsy: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(plot.tag.position = 'bottom',
+ legend.position = 'none'))
+
+ binary_confusion_plots$train +
+ binary_confusion_plots$cv +
+ binary_confusion_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/460bfdf0-442e-45f3-9892-1e65b2946f47)
+
+Graphical representation of results of receiver operating characteristic, so called 'ROC curves', is a traditional way to visualize sensitivity, specificity and overall accuracy (so called area under the curve or AUC) of a classifier. Such plots are obtained with `plot(type = 'roc')`. Again, by specifying `newdata`, ROC curves will be plotted also for the test data set. Sensitivity (Se), specificity (Sp) at the `p = 0.5` class assignment probability cutoff as well as AUC are presented in the plots.
+
+```r
+ binary_roc_plots <- plot(my_models$binary,
+ newdata = my_biopsy$test,
+ type = 'roc')
+
+ binary_roc_plots <- binary_roc_plots %>%
+ map2(c('Biopsy: training', 'Biopsy: CV', 'Biopsy: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(plot.tag.position = 'bottom'))
+
+ binary_roc_plots$train +
+ binary_roc_plots$cv +
+ binary_roc_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/02488e87-ee6b-4398-9abf-c56b56b419aa)
+
+By calling `plot(type = 'performance')`, the most essential performance statistics, overall accuracy, Cohen's $\kappa$ and Brier score, can be displayed in a bubble plot. As with any `ggplot2` object returned by the package's tools, feel free to modify it!
+
+```r
+ binary_performance_plots <- plot(my_models$binary,
+ newdata = my_biopsy$test,
+ type = 'performance')
+
+ ## indicating kappa and Brier score values expected
+ ## for a dummy classifier
+
+ binary_performance_plots +
+ geom_hline(yintercept = 0.75, linetype = 'dashed') +
+ geom_vline(xintercept = 0, linetype = 'dashed') +
+ scale_size_area(limits = c(0.5, 1))
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/e69f4e9a-b795-4ff8-b175-85a9ceb4afa6)
+
+#### Multi-category classifiers
+
+In general, the repertoire of plots for multi-category classifiers is similar to binary classifer plots with exception of ROC curves which are available only for the later.
+
+```r
+ ## plots of squared distances
+ ## and class assignment probabilities
+
+ multi_p_plots <- plot(my_models$multi_class,
+ newdata = my_wines$test,
+ type = 'class_p')
+
+ multi_p_sq_dist_plots <- multi_p_plots %>%
+ map(~.x$square_dist) %>%
+ map2(., c('Wines: training', 'Wines: CV', 'Wines: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'bottom'))
+
+ multi_p_sq_dist_plots$train +
+ multi_p_sq_dist_plots$cv +
+ multi_p_sq_dist_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/579bfd08-ac5c-464a-bee4-23062bab9665)
+
+```r
+ ## confusion matrix plots
+
+ multi_confusion_plots <- plot(my_models$multi_class,
+ newdata = my_wines$test,
+ type = 'confusion')
+
+ multi_confusion_plots <- multi_confusion_plots %>%
+ map2(., c('Wines: training', 'Wines: CV', 'Wines: test'),
+ ~.x +
+ labs(title = .y) +
+ theme(legend.position = 'none',
+ plot.tag.position = 'bottom'))
+
+ multi_confusion_plots$train +
+ multi_confusion_plots$cv +
+ multi_confusion_plots$test +
+ plot_layout(ncol = 2)
+```
+![image](https://github.com/PiotrTymoszuk/caretExtra/assets/80723424/2d2637b7-2001-4a2e-b539-0f12698926e9)
+
+
+
+
+### Numeric statistics of model performance
+
+
+
+The most crucial model performance statistics can be computed for `caretex` models with the `summary()` method:
+
+* _regression models_: mean absolute error (`MAE`), mean square error (`MSE`), root mean square error (`RMSE`), pseudo-R^2 metric of explained variance (`rsq`), R^2 computed as square Pearson's correlation coefficient (`caret_rsq`) as well as coefficients of correlation between the outcome and predictions (Pearson's r: `pearson`, Spearman's $\rho$: `spearman`, Kendall's $\tau B$: `kendall`)
+
+* _binary classifiers_: Harrel's concordance index (`c_index`), log loss (`log_loss`), area under the ROC curve (`AUC`), area under the precision-recall curve (`prAUC`), overall accuracy (`correct_rate`), Cohen's $\kappa$ (`kappa`), F1 score (`F1`), sensitivity and specificity (`Se` and `Sp`), positive and negative prdiction value (`PPV` and `NPV`), precision, recall, detection rate (`detection_rate`), balanced accuaracy (`balanced_accuracy`), Brier score (`brier_score`), as well as mean assignment probabilty in the outcome and fitted classes (`class_p_outcome` and `class_p_fitted`)
+
+* _multi-category classifiers_: as above except of Harrel's concordance index. Note that ROC metrics are averaged over all classes. Additionally, Brier score is calculated in a bit different way for multi-category classification models as compared with binary classifiers (see: the note by Goldstein-Greenwood in the reference list)
+
+The function computes the performance statistics for predictions in the training data set and out-of-fold predictions. By specifying the optional `newdata` argument, evaluation of model performance in the test data set will be done as well.
+
+```r
+ regression_stats <- summary(my_models$regression,
+ newdata = my_boston$test)
+
+ binary_stats <- summary(my_models$binary,
+ newdata = my_biopsy$test)
+
+ multi_class_stats <- summary(my_models$multi_class,
+ newdata = my_wines$test)
+```
+
+```r
+> regression_stats
+
+$train
+# A tibble: 8 × 4
+ statistic estimate lower_ci upper_ci
+
+1 MAE 1.46 NA NA
+2 MSE 4.19 NA NA
+3 RMSE 2.05 NA NA
+4 rsq 0.949 NA NA
+5 caret_rsq 0.950 NA NA
+6 pearson 0.975 NA NA
+7 spearman 0.962 NA NA
+8 kendall 0.849 NA NA
+
+$cv
+# A tibble: 8 × 4
+ statistic estimate lower_ci upper_ci
+
+1 MAE 2.44 1.97 3.11
+2 MSE 12.4 6.32 22.5
+3 RMSE 3.45 2.51 4.74
+4 rsq 0.850 0.700 0.916
+5 caret_rsq 0.855 0.716 0.917
+6 pearson 0.924 0.846 0.958
+7 spearman 0.908 0.857 0.956
+8 kendall 0.762 0.701 0.829
+
+$test
+# A tibble: 8 × 4
+ statistic estimate lower_ci upper_ci
+
+1 MAE 2.46 NA NA
+2 MSE 14.3 NA NA
+3 RMSE 3.79 NA NA
+4 rsq 0.839 NA NA
+5 caret_rsq 0.838 NA NA
+6 pearson 0.916 NA NA
+7 spearman 0.903 NA NA
+8 kendall 0.756 NA NA
+```
+```r
+> binary_stats
+
+$train
+# A tibble: 18 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 c_index 0.995 NA NA
+ 2 log_loss 0.0244 NA NA
+ 3 AUC 1.00 NA NA
+ 4 prAUC 0.990 NA NA
+ 5 correct_rate 0.996 NA NA
+ 6 kappa 0.990 NA NA
+ 7 F1 0.994 NA NA
+ 8 Se 0.994 NA NA
+ 9 Sp 0.997 NA NA
+10 PPV 0.994 NA NA
+11 NPV 0.997 NA NA
+12 precision 0.994 NA NA
+13 recall 0.994 NA NA
+14 detection_rate 0.349 NA NA
+15 balanced_accuracy 0.995 NA NA
+16 brier_score 0.00527 NA NA
+17 class_p_outcome 0.983 NA NA
+18 class_p_fitted 0.983 NA NA
+
+$cv
+# A tibble: 18 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 c_index 0.968 0.926 1
+ 2 log_loss 0.128 0.0242 0.306
+ 3 AUC 0.990 0.973 1
+ 4 prAUC 0.929 0.782 0.975
+ 5 correct_rate 0.969 0.930 1
+ 6 kappa 0.933 0.846 1
+ 7 F1 0.957 0.901 1
+ 8 Se 0.962 0.913 1
+ 9 Sp 0.973 0.932 1
+10 PPV 0.952 0.884 1
+11 NPV 0.980 0.952 1
+12 precision 0.952 0.884 1
+13 recall 0.962 0.913 1
+14 detection_rate 0.338 0.321 0.352
+15 balanced_accuracy 0.968 0.926 1
+16 brier_score 0.0284 0.00516 0.0654
+17 class_p_outcome 0.983 0.970 0.990
+18 class_p_fitted 0.982 0.968 0.991
+
+$test
+# A tibble: 18 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 c_index 0.955 NA NA
+ 2 log_loss 0.0972 NA NA
+ 3 AUC 0.995 NA NA
+ 4 prAUC 0.984 NA NA
+ 5 correct_rate 0.960 NA NA
+ 6 kappa 0.912 NA NA
+ 7 F1 0.943 NA NA
+ 8 Se 0.937 NA NA
+ 9 Sp 0.973 NA NA
+10 PPV 0.949 NA NA
+11 NPV 0.966 NA NA
+12 precision 0.949 NA NA
+13 recall 0.937 NA NA
+14 detection_rate 0.326 NA NA
+15 balanced_accuracy 0.955 NA NA
+16 brier_score 0.0300 NA NA
+17 class_p_outcome 0.979 NA NA
+18 class_p_fitted 0.978 NA NA
+```
+
+```r
+> multi_class_stats
+
+$train
+# A tibble: 17 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 log_loss 0.000368 NA NA
+ 2 AUC 1 NA NA
+ 3 prAUC 0.974 NA NA
+ 4 correct_rate 1 NA NA
+ 5 kappa 1 NA NA
+ 6 F1 1 NA NA
+ 7 Se 1 NA NA
+ 8 Sp 1 NA NA
+ 9 PPV 1 NA NA
+10 NPV 1 NA NA
+11 precision 1 NA NA
+12 recall 1 NA NA
+13 detection_rate 0.333 NA NA
+14 balanced_accuracy 1 NA NA
+15 brier_score 0.00000195 NA NA
+16 class_p_outcome 1.00 NA NA
+17 class_p_fitted 1.00 NA NA
+
+$cv
+# A tibble: 17 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 log_loss 0.145 0.0103 0.533
+ 2 AUC 0.998 0.987 1
+ 3 prAUC 0.868 0.856 0.877
+ 4 correct_rate 0.968 0.915 1
+ 5 kappa 0.952 0.871 1
+ 6 F1 0.969 0.919 1
+ 7 Se 0.971 0.925 1
+ 8 Sp 0.984 0.957 1
+ 9 PPV 0.970 0.917 1
+10 NPV 0.984 0.955 1
+11 precision 0.970 0.917 1
+12 recall 0.971 0.925 1
+13 detection_rate 0.323 0.305 0.333
+14 balanced_accuracy 0.978 0.941 1
+15 brier_score 0.0596 0.00241 0.164
+16 class_p_outcome 0.975 0.945 0.995
+17 class_p_fitted 0.975 0.943 0.994
+
+$test
+# A tibble: 17 × 4
+ statistic estimate lower_ci upper_ci
+
+ 1 log_loss 0.00710 NA NA
+ 2 AUC 1 NA NA
+ 3 prAUC 0.947 NA NA
+ 4 correct_rate 1 NA NA
+ 5 kappa 1 NA NA
+ 6 F1 1 NA NA
+ 7 Se 1 NA NA
+ 8 Sp 1 NA NA
+ 9 PPV 1 NA NA
+10 NPV 1 NA NA
+11 precision 1 NA NA
+12 recall 1 NA NA
+13 detection_rate 0.333 NA NA
+14 balanced_accuracy 1 NA NA
+15 brier_score 0.00179 NA NA
+16 class_p_outcome 0.992 NA NA
+17 class_p_fitted 0.992 NA NA
+```
+
+
+
+### Evaluation of performance in data subsets
+
+
+
+There are cases, when you would like to have a more detailed look at predictions of a machine learnig model in a particular suset of subsets of the data. This can be conveniently done by `split()` applied to a `caretx` model. The splitting factor - a categorical variable present in the training and, optionally, test data set - is speficied by the `f` argument. The `split()` method returns a plain list of prediction objects, which can be plotted or evaluated with `plot()` and `summary()` as described above. In this particular example, we would like to know, how the Boston house price model wors for objects located at and beyond the Charles River bank (coded by the `chas` variable in the `Boston` data set):
+
+```r
+ regression_chas <- split(my_models$regression,
+ f = chas,
+ newdata = my_boston$test)
+
+ regression_chas_stats <- regression_chas %>%
+ map(summary)
+```
+
+```r
+ regression_chas_stats[c("cv.no", "cv.yes")] %>%
+ map(filter, statistic == 'MAE')
+
+$cv.no
+# A tibble: 1 × 4
+ statistic estimate lower_ci upper_ci
+
+1 MAE 2.46 2.00 3.06
+
+$cv.yes
+# A tibble: 1 × 4
+ statistic estimate lower_ci upper_ci
+
+1 MAE 2.18 0.897 4.96
+```
+
+
+## References
+1. Kuhn M. Building predictive models in R using the caret package. J Stat Softw (2008) 28:1–26. doi:10.18637/jss.v028.i05
+2. Friedman JH. Greedy function approximation: A gradient boosting machine. https://doi.org/101214/aos/1013203451 (2001) 29:1189–1232. doi:10.1214/AOS/1013203451
+3. Cohen J. A Coefficient of Agreement for Nominal Scales. Educ Psychol Meas (1960) 20:37–46. doi:10.1177/001316446002000104
+4. McHugh ML. Interrater reliability: the kappa statistic. Biochem Medica (2012) 22:276. doi:10.11613/bm.2012.031
+5. Brier GW. VERIFICATION OF FORECASTS EXPRESSED IN TERMS OF PROBABILITY. Mon Weather Rev (1950) 78:1–3. doi:10.1175/1520-0493(1950)078<0001:vofeit>2.0.co;2
+6. Goldstein-Greenwood J. A Brief on Brier Scores | UVA Library. (2021) Available at: https://library.virginia.edu/data/articles/a-brief-on-brier-scores [Accessed September 5, 2023]