-
Notifications
You must be signed in to change notification settings - Fork 5
/
README.Rmd
63 lines (46 loc) · 2.11 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# imbalance
[![Build Status](https://travis-ci.org/ncordon/imbalance.svg?branch=master)](https://travis-ci.org/ncordon/imbalance)
[![minimal R version](https://img.shields.io/badge/R%3E%3D-3.3.0-6666ff.svg)](https://cran.r-project.org/)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/imbalance)](https://cran.r-project.org/package=imbalance)
[![packageversion](https://img.shields.io/badge/Package%20version-1.0.1-orange.svg?style=flat-square)](https://github.com/ncordon/imbalance/commits/master)
`imbalance` provides a set of tools to work with imbalanced datasets: novel oversampling algorithms, filtering of instances and evaluation of synthetic instances.
## Installation
You can install imbalance from Github with:
```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("ncordon/imbalance")
```
## Examples
Run `pdfos` algorithm on `newthyroid1` imbalanced dataset and plot a comparison between attributes.
```{r example-pdfos, fig.width = 10, fig.height = 10}
library("imbalance")
data(newthyroid1)
newSamples <- pdfos(newthyroid1, numInstances = 80)
# Join new samples with old imbalanced dataset
newDataset <- rbind(newthyroid1, newSamples)
# Plot a visual comparison between both datasets
plotComparison(newthyroid1, newDataset, attrs = names(newthyroid1)[1:3], cols = 2, classAttr = "Class")
```
After filtering examples with `neater`:
```{r example-neater, fig.width = 10, fig.height = 10}
filteredSamples <- neater(newthyroid1, newSamples, iterations = 500)
filteredNewDataset <- rbind(newthyroid1, filteredSamples)
plotComparison(newthyroid1, filteredNewDataset, attrs = names(newthyroid1)[1:3])
```
Execute method `ADASYN` using the wrapper provided by the package, comparing imbalance ratios of the dataset before and after oversampling:
```{r example-adas}
imbalanceRatio(glass0)
newDataset <- oversample(glass0, method = "ADASYN")
imbalanceRatio(newDataset)
```