Fast and Numerically Stable Statistical Analysis Utilities
Perform fast and numerically stable statistical analysis using wink-statistics. It can handle a real-time stream of data and incrementally compute statistics that would usually require more than one pass over the data, such as standard deviation or simple linear regression.
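To illustrate what incremental computation means, the sketch below keeps a running mean and a running sum of squared deviations (Welford's update), so the standard deviation is available after every value without a second pass. It is a minimal illustration, not wink-statistics' own code; the name `runningStdev` is hypothetical.

```javascript
// Hypothetical helper, not part of wink-statistics: single-pass
// sample standard deviation using Welford's update.
function runningStdev() {
  var n = 0;     // number of values seen so far
  var mean = 0;  // running mean
  var m2 = 0;    // running sum of squared deviations from the mean
  return {
    compute: function ( x ) {
      n += 1;
      var delta = x - mean;
      mean += delta / n;
      m2 += delta * ( x - mean );
    },
    // Sample standard deviation (n - 1 in the denominator).
    result: function () {
      return ( n > 1 ) ? Math.sqrt( m2 / ( n - 1 ) ) : 0;
    }
  };
}

var sd = runningStdev();
[ 2, 4, 4, 4, 5, 5, 7, 9 ].forEach( function ( x ) { sd.compute( x ); } );
console.log( sd.result() ); // square root of 32 / 7, about 2.138
```

Because each update only touches three numbers, memory use stays constant no matter how long the stream runs.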
- Boxplot
- Covariance
- Difference with definable lag
- Five Number Summary
- Frequency Table
- Histogram
- Median Absolute Deviation (MAD)
- Maximum
- Mean
- Median
- Minimum
- Numerically stable sum
- Percentile
- Probability computation & CI from successes count
- Probability estimates aggregation
- Simple Linear Regression
- Standard Deviation
- Summary statistics
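The "numerically stable sum" entry deserves a brief illustration: naive floating-point addition can lose small values that sit next to very large ones. The sketch below uses Neumaier's compensated summation as one well-known remedy. Whether wink-statistics uses this exact technique is an assumption that is not confirmed here, and `stableSum` is a hypothetical name.

```javascript
// Hypothetical helper: Neumaier's compensated summation.
function stableSum( values ) {
  var sum = 0;
  var c = 0; // running compensation for lost low-order bits
  for ( var i = 0; i < values.length; i += 1 ) {
    var x = values[ i ];
    var t = sum + x;
    if ( Math.abs( sum ) >= Math.abs( x ) ) {
      c += ( sum - t ) + x; // low-order bits of x were lost
    } else {
      c += ( x - t ) + sum; // low-order bits of sum were lost
    }
    sum = t;
  }
  return sum + c;
}

var values = [ 1, 1e100, 1, -1e100 ];
var naive = values.reduce( function ( a, b ) { return a + b; }, 0 );
console.log( naive );               // 0, both 1s are lost
console.log( stableSum( values ) ); // 2
```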
Use npm to install:
npm install wink-statistics --save
Here is an example of computing slope, intercept, r2, etc. from a stream of (x, y) data in real time:
// Load wink-statistics.
var stats = require( 'wink-statistics' );
// Instantiate streaming simple linear regression
var regression = stats.streaming.simpleLinearRegression();
// Following would be ideally placed within a stream of data:
regression.compute( 10, 80 );
regression.compute( 15, 75 );
regression.compute( 16, 65 );
regression.compute( 18, 50 );
regression.compute( 21, 45 );
regression.compute( 30, 30 );
regression.compute( 36, 18 );
regression.compute( 40, 9 );
// Use result() method to access the outcome in real time.
regression.result();
// returns { slope: -2.3621,
// intercept: 101.4188,
// r: -0.9766,
// r2: 0.9537,
// se: 5.624,
// size: 8
// }
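For intuition about how slope and intercept can be updated one point at a time, here is a minimal sketch that maintains running means and co-moments. It assumes ordinary least squares and is not the library's implementation; `runningRegression` is a hypothetical name, and its unrounded output may differ from the rounded figures above in the last digit.

```javascript
// Hypothetical helper: single-pass ordinary least squares for (x, y) pairs.
function runningRegression() {
  var n = 0, mx = 0, my = 0; // count and running means
  var sxx = 0, sxy = 0;      // running sums of squared and cross deviations
  return {
    compute: function ( x, y ) {
      n += 1;
      var dx = x - mx;        // deviation from the previous mean of x
      mx += dx / n;
      my += ( y - my ) / n;
      sxx += dx * ( x - mx ); // uses the updated mean of x
      sxy += dx * ( y - my ); // uses the updated mean of y
    },
    // Meaningful once at least two distinct x values have been seen.
    result: function () {
      var slope = sxy / sxx;
      return { slope: slope, intercept: my - slope * mx, size: n };
    }
  };
}

var reg = runningRegression();
[
  [ 10, 80 ], [ 15, 75 ], [ 16, 65 ], [ 18, 50 ],
  [ 21, 45 ], [ 30, 30 ], [ 36, 18 ], [ 40, 9 ]
].forEach( function ( p ) { reg.compute( p[ 0 ], p[ 1 ] ); } );
console.log( reg.result() ); // slope is about -2.362, intercept about 101.418
```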
The functions under the data namespace require data in an array. Here is an example of boxplot analysis:
var boxplot = stats.data.boxplot;
var data = [
-12, 14, 14, 14, 16, 18, 20, 20, 21, 23, 27, 27, 27, 29, 31,
31, 32, 32, 34, 36, 40, 40, 40, 40, 40, 42, 51, 56, 60, 88
];
boxplot( data );
// returns {
// min: -12, q1: 20, median: 31, q3: 40, max: 88,
// iqr: 20, range: 100, size: 30,
// leftOutliers: { begin: 0, end: 0, count: 1, fence: 14 },
// rightOutliers: { begin: 29, end: 29, count: 1, fence: 60 },
// leftNotch: 25.230655727612252,
// rightNotch: 36.76934427238775
// }
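The fence and notch values in this output can be reproduced from the quartiles. The sketch below assumes Tukey's 1.5 * IQR rule for the fences and the conventional 1.58 * IQR / sqrt(n) notch half-width. These assumptions match the numbers above, but the library's defaults should be confirmed in its API documentation. The quartiles are copied from the output rather than recomputed.

```javascript
var sample = [
  -12, 14, 14, 14, 16, 18, 20, 20, 21, 23, 27, 27, 27, 29, 31,
  31, 32, 32, 34, 36, 40, 40, 40, 40, 40, 42, 51, 56, 60, 88
];
// Quartiles taken from the boxplot output above.
var q1 = 20, med = 31, q3 = 40;
var iqr = q3 - q1;          // 20
var lower = q1 - 1.5 * iqr; // -10
var upper = q3 + 1.5 * iqr; // 70
// Fences are the most extreme data values still inside [lower, upper].
var inside = sample.filter( function ( v ) { return v >= lower && v <= upper; } );
var leftFence = Math.min.apply( null, inside );  // 14
var rightFence = Math.max.apply( null, inside ); // 60
// Notches approximate a confidence interval around the median.
var half = 1.58 * iqr / Math.sqrt( sample.length );
var leftNotch = med - half;  // about 25.23
var rightNotch = med + half; // about 36.77
```

Values outside the interval from `lower` to `upper` (here -12 and 88) are the ones reported as outliers.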
wink-statistics can handle data in different formats to avoid pre-processing. For example, you can compute the median from an array of objects containing value:
var median = stats.data.median;
var data = [
{ value: 1 },
{ value: 1 },
{ value: 2 },
{ value: 2 },
{ value: 3 },
{ value: 3 },
{ value: 4 },
{ value: 4 }
];
// Use key name — `value` as the `accessor`
median( data, 'value' );
// returns 2.5
It also accepts functions as accessors to handle more complex data structures.
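As a rough illustration of the accessor idea, the sketch below resolves an accessor that is either a key name or a function and then takes the median of the extracted values. It is a self-contained stand-in, not the library's code; `resolve` and `medianOf` are hypothetical names, and the exact accessor signature wink-statistics expects should be taken from its API documentation.

```javascript
// Hypothetical helper: turn a key name or function into a getter.
function resolve( accessor ) {
  if ( typeof accessor === 'function' ) return accessor;
  if ( typeof accessor === 'string' ) {
    return function ( e ) { return e[ accessor ]; };
  }
  return function ( e ) { return e; }; // no accessor: use the element itself
}

// Hypothetical helper: median of the values picked out by the accessor.
function medianOf( data, accessor ) {
  var get = resolve( accessor );
  var v = data.map( get ).sort( function ( a, b ) { return a - b; } );
  var mid = Math.floor( v.length / 2 );
  return ( v.length % 2 ) ? v[ mid ] : ( v[ mid - 1 ] + v[ mid ] ) / 2;
}

var readings = [ { m: { temp: 3 } }, { m: { temp: 1 } }, { m: { temp: 2 } } ];
var nested = medianOf( readings, function ( e ) { return e.m.temp; } ); // 2
var keyed = medianOf( [ { value: 1 }, { value: 4 } ], 'value' );        // 2.5
```

A function accessor reaches nested fields that a plain key name cannot, which is what makes it useful for more complex records.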
Check out the statistics API documentation to learn more.
If you spot a bug and the same has not yet been reported, raise a new issue or consider fixing it and sending a pull request.
Wink is a family of open source packages for Statistical Analysis, Natural Language Processing and Machine Learning in Node.js. The code is thoroughly documented for easy human comprehension and has a test coverage of ~100% for reliability to build production grade solutions.
wink-statistics is copyright 2017-20 GRAYPE Systems Private Limited.
It is licensed under the terms of the MIT License.