HypothesisTests package
This package implements several hypothesis tests in Julia.
From 7d7fb4a5a204f30e2ccb347391a0b39b9ebeebb9 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"

Settings
This document was generated with Documenter.jl version 1.7.0 on Wednesday 2 October 2024. Using Julia version 1.10.5.

This page documents the generic …
Compute a confidence interval with coverage …
Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.
Compute the p-value for a given Fisher exact test. The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:
\[ \begin{align*}
p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\
p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i)
\end{align*}\]
For …
\[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]
Returns the string value, e.g. "Binomial test" or "Sign Test".

Perform a one sample Hotelling's $T^2$ test of the hypothesis that the vector of column means of …
Perform a paired Hotelling's $T^2$ test of the hypothesis that the vector of mean column differences between …
Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of …
Bartlett's test for equality of two covariance matrices is provided. This is equivalent to Box's $M$-test for two groups.
Perform Bartlett's test of the hypothesis that the covariance matrices of …
Bartlett's test is sensitive to departures from multivariate normality.
Perform a t-test for the hypothesis that $\text{Cor}(x,y) = 0$, i.e. the correlation of vectors …
Perform a t-test for the hypothesis that $\text{Cor}(x,y|Z=z) = 0$, i.e. the partial correlation of vectors …

Available are both one-sample and $k$-sample tests.
Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector …
Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors …

Perform a binomial test of the null hypothesis that the distribution from which …
Computed confidence intervals by default are Clopper-Pearson intervals. See the …

Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.
The contingency table is structured as: …

Available are an exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests.
Perform a one-sample exact Kolmogorov–Smirnov test of the null hypothesis that the data in vector …
Perform an asymptotic one-sample Kolmogorov–Smirnov test of the null hypothesis that the data in vector …
Perform an asymptotic two-sample Kolmogorov–Smirnov test of the null hypothesis that …

Perform Kruskal-Wallis rank sum test of the null hypothesis that the …
The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.
The p-value is computed using a $χ^2$ approximation to the distribution of the test statistic $H_c=\frac{H}{C}$:
\[ \begin{align*}
H & = \frac{12}{n(n+1)} \sum_{g ∈ \mathcal{G}} \frac{R_g^2}{n_g} - 3(n+1)\\
C & = 1-\frac{1}{n^3-n}\sum_{t ∈ \mathcal{T}} (t^3-t),
\end{align*}\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.

Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as …
The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, …
Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as …
When there are no tied ranks, the exact p-value is computed using the …
Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as …
The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic, with mean $μ$ and variance $σ^2$:
\[ \begin{align*}
μ & = \frac{n_x n_y}{2}\\
σ^2 & = \frac{n_x n_y}{12}\left(n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)}\right)\\
a & = \sum_{t \in \mathcal{T}} (t^3 - t)
\end{align*}\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position.

Perform a sign test of the null hypothesis that the distribution from which …

Perform the Wald-Wolfowitz (or Runs) test of the null hypothesis that the given data is random, or independently sampled. The data can come as many-valued or two-valued (Boolean). If many-valued, the sample is transformed by labelling each element as above or below the median.

Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of …
When there are no tied ranks and ≤50 samples, or tied ranks and ≤15 samples, …
Perform a Wilcoxon exact signed rank U test of the null hypothesis that the distribution of …
When there are no tied ranks, the exact p-value is computed using the …
Perform a Wilcoxon approximate signed rank U test of the null hypothesis that the distribution of …
The p-value is computed using a normal approximation to the distribution of the signed rank statistic, with mean $μ$ and variance $σ^2$:
\[ \begin{align*}
μ & = \frac{n(n + 1)}{4}\\
σ^2 & = \frac{n(n + 1)(2n + 1)}{24} - \frac{a}{48}\\
a & = \sum_{t \in \mathcal{T}} (t^3 - t)
\end{align*}\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position.

Perform a permutation test (a.k.a. randomization test) of the null hypothesis that …

Perform Fligner-Killeen median test of the null hypothesis that the …
This test is most robust against departures from normality, see references. It is a $k$-sample simple linear rank method that uses the ranks of the absolute values of the centered samples and weights
\[a_{N,i} = \Phi^{-1}\left(\frac{1}{2} + \frac{i}{2(N+1)}\right)\]
The version implemented here uses median centering in each of the samples.

Perform a Shapiro-Wilk test of the null hypothesis that the data in vector …
This implementation is based on the method by Royston (1992). The calculation of the p-value is exact for sample size …
Keyword arguments
The following keyword arguments may be passed.
As noted by Royston (1993), the (approximated) W-statistic will be accurate but returned p-values may not be reliable if either of these apply: …
References
Shapiro, S. S., & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52, 591–611. doi:10.1093/BIOMET/52.3-4.591
Royston, P. (1992). Approximating the Shapiro-Wilk W-test for non-normality. Statistics and Computing, 2(3), 117–119. doi:10.1007/BF01891203
Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. Journal of the Royal Statistical Society Series D (The Statistician), 42(1), 37–43. doi:10.2307/2348109
Royston, P. (1995). Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics), 44(4), 547–551. doi:10.2307/2986146

Perform a Power Divergence test.
Note that the entries of …
Computed confidence intervals by default are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the …
The power divergence test statistic is given by
\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij}/\hat{n}_{ij})^λ -1\right]\]
where $n_{ij}$ is the cell count in the $i$th row and $j$th column and $λ$ is a real number determining the nature of the test to be performed: …
Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.
Perform a Pearson chi-squared test (equivalent to a …
Perform a multinomial likelihood ratio test (equivalent to a …

Perform a one sample t-test of the null hypothesis that …
Perform a one sample t-test of the null hypothesis that the data in vector …
Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors …
This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.
Perform a two-sample t-test of the null hypothesis that samples …
Perform a two-sample t-test of the null hypothesis that …
Perform an unequal variance two-sample t-test of the null hypothesis that …
This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:
\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]

Perform a one sample z-test of the null hypothesis that …
Perform a one sample z-test of the null hypothesis that the data in vector …
Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors …
Perform a two-sample z-test of the null hypothesis that …
Perform an unequal variance two-sample z-test of the null hypothesis that …

Perform an F-test of the null hypothesis that two real-valued vectors …

Perform one-way analysis of variance test of the hypothesis that the …
The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.

Perform Levene's test of the hypothesis that the …
The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:
\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2}{\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]
where …
The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.

The Brown–Forsythe test is a statistical test for the equality of …
The Brown–Forsythe test is a modification of Levene's test that uses the median instead of the mean for computing the spread within each group.
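The Levene statistic $W$ and its Brown–Forsythe variant described above can be sketched with the Julia standard library alone. This is an illustrative sketch, not the package's implementation: the function name `levene_w` and the `center` keyword are hypothetical, and $Z_{ij}$ is assumed to be the absolute deviation of observation $j$ in group $i$ from that group's center (the usual definition, which the text above does not spell out).

```julia
using Statistics  # standard library: mean, median

# Levene-type statistic W for a collection of groups.
# `center` is the within-group centering function (an assumption of this sketch):
# `mean` corresponds to Levene's test, `median` to the Brown–Forsythe variant.
function levene_w(groups; center = mean)
    k = length(groups)                  # number of groups
    N = sum(length, groups)             # total number of observations
    # Absolute deviations Z_ij from each group's center
    Z = [abs.(g .- center(g)) for g in groups]
    Zi = [mean(z) for z in Z]           # group means of the deviations
    Zall = sum(sum, Z) / N              # grand mean of the deviations
    between = sum(length(z) * (zi - Zall)^2 for (z, zi) in zip(Z, Zi))
    within = sum(sum((z .- zi) .^ 2) for (z, zi) in zip(Z, Zi))
    return (N - k) / (k - 1) * between / within
end

groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
w_mean = levene_w(groups)                     # mean centering
w_median = levene_w(groups; center = median)  # median centering
```

Because both groups here are symmetric, the mean and the median coincide and the two variants give the same value; they differ once a group is skewed.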
Methods
This page documents the generic confint, pvalue and testname methods which are supported by most tests. Some particular tests support additional arguments: see the documentation for the relevant methods provided in sections covering these tests.

Confidence interval
StatsAPI.confint — Function
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
:clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
:wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
:waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
:wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
:jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
:agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
:arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.

confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).
Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)
Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:
:auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
:sison_glaz: Sison-Glaz intervals
:bootstrap: Bootstrap intervals
:quesenberry_hurst: Quesenberry-Hurst intervals
:gold: Gold intervals (asymptotic simultaneous intervals)

p-value
StatsAPI.pvalue — Function
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test. The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:
\[ \begin{align*}
p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\
p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i)
\end{align*}\]
For tail = :both, possible values for method are:
:central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
:minlike: Minimum likelihood interval, i.e. the p-value is computed by summing the probabilities of all tables with the same marginals that are equally or less probable:
\[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]

Test name
HypothesisTests.testname — Function
testname(::HypothesisTest)
Returns the string value, e.g. "Binomial test" or "Sign Test".
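To make the contrast between the :wald and :wilson binomial interval methods in the Confidence interval section concrete, here is a minimal sketch of the two textbook formulas in plain Julia. It is not the package's code: the names `wald_interval`, `wilson_interval` and `Z95` are hypothetical, and the 0.975 normal quantile is hard-coded, so the sketch covers only two-sided 95% intervals.

```julia
# Two-sided 95% intervals for a binomial proportion from x successes in n draws.
# Z95 is the 0.975 standard normal quantile, hard-coded to avoid dependencies.
const Z95 = 1.959963984540054

# Wald interval: normal approximation using the empirical standard error.
function wald_interval(x, n; z = Z95)
    p = x / n
    half = z * sqrt(p * (1 - p) / n)
    return (p - half, p + half)
end

# Wilson score interval: the standard error is evaluated at the candidate
# proportion rather than at the estimate, which shifts and reshapes the interval.
function wilson_interval(x, n; z = Z95)
    p = x / n
    denom = 1 + z^2 / n
    center = (p + z^2 / (2n)) / denom
    half = z / denom * sqrt(p * (1 - p) / n + z^2 / (4n^2))
    return (center - half, center + half)
end

wald = wald_interval(0, 10)      # collapses to a single point at zero
wilson = wilson_interval(0, 10)  # upper end equals z^2 / (n + z^2)
```

With zero successes the Wald interval has zero width, which illustrates the poor coverage near extreme success probabilities noted above, while the Wilson interval still has a positive upper end. Going by the signature documented above, the package equivalents would be selected with the method keyword of confint on a BinomialTest.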
Multivariate tests
Hotelling's $T^2$ test
HypothesisTests.OneSampleHotellingT2Test — Type
OneSampleHotellingT2Test(X::AbstractMatrix, μ₀=<zero vector>)
Perform a one sample Hotelling's $T^2$ test of the hypothesis that the vector of column means of X is equal to μ₀.
OneSampleHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix, μ₀=<zero vector>)
Perform a paired Hotelling's $T^2$ test of the hypothesis that the vector of mean column differences between X and Y is equal to μ₀.

HypothesisTests.EqualCovHotellingT2Test — Type
EqualCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)
Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, assuming that X and Y have equal covariance matrices.

HypothesisTests.UnequalCovHotellingT2Test — Type
UnequalCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)
Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, without assuming that X and Y have equal covariance matrices.

Equality of covariance matrices
Bartlett's test for equality of two covariance matrices is provided. This is equivalent to Box's $M$-test for two groups.
HypothesisTests.BartlettTest — Type
BartlettTest(X::AbstractMatrix, Y::AbstractMatrix)
Perform Bartlett's test of the hypothesis that the covariance matrices of X and Y are equal.
Bartlett's test is sensitive to departures from multivariate normality.

Correlation and partial correlation test
HypothesisTests.CorrelationTest — Type
CorrelationTest(x, y)
Perform a t-test for the hypothesis that $\text{Cor}(x,y) = 0$, i.e. the correlation of vectors x and y is zero.
CorrelationTest(x, y, Z)
Perform a t-test for the hypothesis that $\text{Cor}(x,y|Z=z) = 0$, i.e. the partial correlation of vectors x and y given the matrix Z is zero.
Implements pvalue for the t-test. Implements confint using an approximate confidence interval based on Fisher's $z$-transform.
See also partialcor from StatsBase.
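As a rough illustration of the correlation test described above, the sketch below computes the t statistic for $\text{Cor}(x,y) = 0$ and an approximate interval from Fisher's $z$-transform, for the case without conditioning variables. The helper names `cor_tstat` and `cor_confint` are hypothetical, the formulas are the standard textbook ones rather than a copy of the package's code, and the 0.975 normal quantile is hard-coded so that only a 95% interval is shown.

```julia
using Statistics  # standard library: cor

# t statistic for H0: Cor(x, y) = 0, with n - 2 degrees of freedom.
function cor_tstat(x, y)
    n = length(x)
    r = cor(x, y)
    dof = n - 2
    return (r = r, t = r * sqrt(dof / (1 - r^2)), dof = dof)
end

# Approximate interval via Fisher's z-transform: atanh(r) is treated as
# normal with standard error 1 / sqrt(n - 3), then mapped back with tanh.
function cor_confint(r, n; z = 1.959963984540054)  # z: 0.975 normal quantile
    half = z / sqrt(n - 3)
    return (tanh(atanh(r) - half), tanh(atanh(r) + half))
end

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
res = cor_tstat(x, y)
ci = cor_confint(res.r, length(x))
```

Turning the statistic into a p-value needs the Student t distribution, which is outside the standard library and therefore left out of the sketch.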
Nonparametric tests
Anderson-Darling test
Available are both one-sample and $k$-sample tests.
HypothesisTests.OneSampleADTest — Type
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue

HypothesisTests.KSampleADTest — Type
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)
Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.
The modified parameter enables a modified test calculation for samples whose observations do not all coincide.
If nsim is equal to 0 (the default), the asymptotic calculation of the p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed one. This proportion is reported as the p-value estimate.
Implements: pvalue
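The simulation procedure described for nsim > 0 (pool the data, split it at random into samples of the original sizes, recompute the statistic, report the proportion of splits at least as extreme as the observed one) can be sketched generically. The example below is only an illustration under loud assumptions: `simulated_pvalue` and `mean_gap` are hypothetical names, and the statistic is a simple stand-in (absolute difference of two sample means), not the Anderson–Darling statistic the package evaluates.

```julia
using Random      # standard library: shuffle!, MersenneTwister
using Statistics  # standard library: mean

# Stand-in two-sample statistic (NOT the Anderson–Darling statistic):
# absolute difference of the two sample means.
mean_gap(samples) = abs(mean(samples[1]) - mean(samples[2]))

# Estimate a p-value by re-splitting the pooled data `nsim` times.
function simulated_pvalue(stat, samples; nsim = 1000, rng = MersenneTwister(1))
    observed = stat(samples)
    pooled = reduce(vcat, samples)      # fresh vector, inputs stay untouched
    sizes = length.(samples)
    stops = cumsum(sizes)               # last index of each piece
    starts = stops .- sizes .+ 1        # first index of each piece
    hits = 0
    for _ in 1:nsim
        shuffle!(rng, pooled)
        # Cut the shuffled pool into pieces with the original sample sizes
        parts = [pooled[i:j] for (i, j) in zip(starts, stops)]
        hits += stat(parts) >= observed
    end
    return hits / nsim                  # proportion at least as extreme
end

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [101.0, 102.0, 103.0, 104.0, 105.0]
p = simulated_pvalue(mean_gap, [a, b]; nsim = 2000)
```

For these two well-separated samples only the original grouping and its mirror image reach the observed value, so the estimate should be small; the exact figure depends on the random number generator and the seed.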
Binomial test
HypothesisTests.BinomialTest — Type
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)
Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.
Computed confidence intervals by default are Clopper-Pearson intervals. See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.
Implements: pvalue, confint(::BinomialTest)

StatsAPI.confint — Method
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
:clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
:wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
:waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
:wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
:jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
:agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
:arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.

Fisher exact test
HypothesisTests.FisherExactTest — Type
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)
Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.
See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.
The contingency table is structured as:
-    X1   X2
Y1   a    b
Y2   c    d
The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.
Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)

StatsAPI.confint — Method
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).
Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

StatsAPI.pvalue — Method
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test. The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:
\[ \begin{align*}
p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\
p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i)
\end{align*}\]
Nonparametric tests
Anderson-Darling test
HypothesisTests.OneSampleADTest
— Type
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample Anderson-Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.KSampleADTest
— Type
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)
Perform a k-sample Anderson-Darling test of the null hypothesis that the data in the vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.
The modified parameter enables a modified test calculation for samples whose observations do not all coincide.
If nsim is equal to 0 (the default), the asymptotic calculation of the p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed one. This proportion is reported as the p-value estimate.
Implements: pvalue
Binomial test
HypothesisTests.BinomialTest
— Type
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)
Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.
See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.
Implements: pvalue, confint(::BinomialTest)
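As a rough illustration of how two of the interval methods documented under confint(::BinomialTest) differ, the sketch below computes the Wald and Wilson intervals by hand for 26 successes in 78 draws. It uses textbook formulas rather than the package's code; the helper names wald_interval and wilson_interval and the hard-coded normal quantile z are assumptions made for this example.

```julia
# Hand-computed 95% intervals for a binomial proportion (illustrative only).
# The default z is the 97.5% quantile of the standard normal distribution,
# hard-coded so that the sketch needs no packages.

# Wald interval: normal approximation with the empirical standard error.
function wald_interval(x::Integer, n::Integer; z = 1.959963984540054)
    phat = x / n
    half = z * sqrt(phat * (1 - phat) / n)
    return (phat - half, phat + half)
end

# Wilson score interval: the standard error is evaluated at the candidate
# proportion rather than at phat, which pulls the center towards 1/2.
function wilson_interval(x::Integer, n::Integer; z = 1.959963984540054)
    phat = x / n
    denom = 1 + z^2 / n
    center = (phat + z^2 / (2n)) / denom
    half = z / denom * sqrt(phat * (1 - phat) / n + z^2 / (4n^2))
    return (center - half, center + half)
end

wald = wald_interval(26, 78)
wilson = wilson_interval(26, 78)
println("Wald:   ", wald)
println("Wilson: ", wilson)
```

With the package loaded, the corresponding calls would be confint(BinomialTest(26, 78); method = :wald) and confint(BinomialTest(26, 78); method = :wilson). Note that the hand-written Wald interval collapses to a single point when x is 0 or n, which is one way the poor coverage near extreme probabilities shows up.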
StatsAPI.confint
— Method
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
:clopper_pearson (default): The Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
:wald: The Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
:waldcc: Wald interval with a continuity correction that extends the interval by 1/2n on both ends.
:wilson: The Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverage even for small numbers of draws and extreme success probabilities.
:jeffrey: The Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
:agresti_coull: The Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
:arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.
Fisher exact test
HypothesisTests.FisherExactTest
— Type
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)
Perform Fisher's exact test for the 2×2 contingency table given by a, b, c and d.
See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.
The contingency table is structured as:
-     X1   X2
Y1    a    b
Y2    c    d
The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.
Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)
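The two-sided p-value methods documented for pvalue(::FisherExactTest), :central and :minlike, can be worked through by hand for a small table. The sketch below does this under the null hypothesis (odds ratio 1), where Fisher's non-central hypergeometric distribution reduces to the ordinary hypergeometric distribution. The function names table_prob and fisher_pvalues are hypothetical, and capping the central p-value at 1 is an assumption of this sketch, not something stated in the text.

```julia
# Null-hypothesis (odds ratio 1) p-values for the 2×2 table
#        X1  X2
#   Y1   a   b
#   Y2   c   d
# computed from the ordinary hypergeometric distribution.

# Probability that the top-left cell equals i when all margins are fixed.
# Exact rational arithmetic avoids floating-point ties in the :minlike sum.
function table_prob(i, a, b, c, d)
    row1, col1, total = a + b, a + c, a + b + c + d
    return binomial(big(col1), i) * binomial(big(total - col1), row1 - i) //
           binomial(big(total), row1)
end

function fisher_pvalues(a, b, c, d)
    lo = max(0, (a + b) - (b + d))      # smallest feasible top-left count
    hi = min(a + b, a + c)              # largest feasible top-left count
    probs = Dict(i => table_prob(i, a, b, c, d) for i in lo:hi)
    left = sum(p for (i, p) in probs if i <= a)
    right = sum(p for (i, p) in probs if i >= a)
    # Central: twice the smaller one-sided p-value (capped at 1 here, an
    # assumption of this sketch).
    central = min(1, 2 * min(left, right))
    # Minimum likelihood: all tables that are equally or less probable.
    minlike = sum(p for p in values(probs) if p <= probs[a])
    return (left = Float64(left), right = Float64(right),
            central = Float64(central), minlike = Float64(minlike))
end

p = fisher_pvalues(3, 1, 1, 3)
println(p)
```

For this symmetric table the two methods agree; they generally differ when the margins are unbalanced.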
StatsAPI.confint
— Method
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).
StatsAPI.pvalue
— Method
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test.
For tail = :both
, possible values for method are:
:central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
:minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable.
Kolmogorov-Smirnov test
HypothesisTests.ExactOneSampleKSTest
— Type
ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform an exact one-sample Kolmogorov-Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateOneSampleKSTest
— Type
ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform an approximate one-sample Kolmogorov-Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateTwoSampleKSTest
— Type
ApproximateTwoSampleKSTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an approximate two-sample Kolmogorov-Smirnov test of the null hypothesis that x and y are drawn from the same distribution against the alternative hypothesis that they come from different distributions.
Implements: pvalue
Kruskal-Wallis rank sum test
HypothesisTests.KruskalWallisTest
— Type
KruskalWallisTest(groups::AbstractVector{<:Real}...)
Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups $\mathcal{G}$ come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.
Implements: pvalue
Mann-Whitney U test
HypothesisTests.MannWhitneyUTest
— Function
MannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
Depending on the sample size and the presence of tied ranks, MannWhitneyUTest performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest or ApproximateMannWhitneyUTest directly.
Implements: pvalue
HypothesisTests.ExactMannWhitneyUTest
— Type
ExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
When there are no tied ranks, the p-value is computed using the pwilcox function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
Implements: pvalue
HypothesisTests.ApproximateMannWhitneyUTest
— Type
ApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
Implements: pvalue
Sign test
HypothesisTests.SignTest
— Type
SignTest(x::AbstractVector{T<:Real}, median::Real = 0)
SignTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, median::Real = 0)
Perform a sign test of the null hypothesis that the distribution from which x (or x - y if y is provided) was drawn has median median against the alternative hypothesis that the median is not equal to median.
Wald-Wolfowitz independence test
HypothesisTests.WaldWolfowitzTest
— Type
WaldWolfowitzTest(x::AbstractVector{Bool})
WaldWolfowitzTest(x::AbstractVector{<:Real})
Implements: pvalue
Wilcoxon signed rank test
HypothesisTests.SignedRankTest
— Function
SignedRankTest(x::AbstractVector{<:Real})
SignedRankTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
Depending on the sample size and the presence of tied ranks, SignedRankTest performs an exact signed rank test. In all other cases, SignedRankTest performs an approximate signed rank test. Behavior may be further controlled by using ExactSignedRankTest or ApproximateSignedRankTest directly.
HypothesisTests.ExactSignedRankTest
— Type
ExactSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform an exact Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
When there are no tied ranks, the p-value is computed using the psignrank function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
HypothesisTests.ApproximateSignedRankTest
— Type
ApproximateSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform an approximate Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
Permutation test
HypothesisTests.ExactPermutationTest
— Function
ExactPermutationTest(x::Vector, y::Vector, f::Function)
Perform a permutation test of the null hypothesis that f(x) is equal to f(y). All possible permutations are sampled.
HypothesisTests.ApproximatePermutationTest
— Function
ApproximatePermutationTest([rng::AbstractRNG,] x::Vector, y::Vector, f::Function, n::Int)
Perform a permutation test of the null hypothesis that f(x) is equal to f(y). n of the factorial(length(x)+length(y)) permutations are sampled at random. A random number generator can optionally be passed as the first argument. The default generator is Random.default_rng().
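The random-sampling idea behind the approximate permutation test can be sketched with the standard library alone. The function permutation_pvalue below is a hypothetical helper written for this illustration; it reports a two-sided p-value based on |f(x) - f(y)|, which is one common convention and not necessarily how the package summarizes its samples.

```julia
using Random, Statistics

# Approximate two-sided permutation p-value for the difference f(x) - f(y).
# n random permutations of the pooled data are drawn; each one is split into
# pseudo-samples of the original sizes and the statistic is recomputed.
function permutation_pvalue(rng::AbstractRNG, x, y, f, n::Int)
    observed = abs(f(x) - f(y))
    pooled = vcat(x, y)
    nx = length(x)
    hits = 0
    for _ in 1:n
        shuffle!(rng, pooled)                       # one random permutation
        stat = abs(f(view(pooled, 1:nx)) - f(view(pooled, nx+1:length(pooled))))
        hits += stat >= observed                    # count extreme outcomes
    end
    return hits / n
end

rng = MersenneTwister(1)
x = [12.1, 11.8, 12.6, 12.9, 12.4]
y = [10.2, 10.9, 10.5, 11.1, 10.7]
p = permutation_pvalue(rng, x, y, mean, 2_000)
println("approximate two-sided p-value: ", p)
```

Because every value in x exceeds every value in y, only a handful of the possible splits are as extreme as the observed one, so the estimate should come out small.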
Fligner-Killeen test
HypothesisTests.FlignerKilleenTest
— Function
FlignerKilleenTest(groups::AbstractVector{<:Real}...)
Perform a Fligner-Killeen test of the null hypothesis that the groups have equal variances, a test for homogeneity of variances.
Implements: pvalue
Shapiro-Wilk test
HypothesisTests.ShapiroWilkTest
— Type
ShapiroWilkTest(X::AbstractVector{<:Real},
                swc::AbstractVector{<:Real}=shapiro_wilk_coefs(length(X));
                sorted::Bool=issorted(X),
                censored::Integer=0)
Perform a Shapiro-Wilk test of the null hypothesis that the data in vector X come from a normal distribution.
The calculation of the p-value is exact for sample size N = 3, and for ranges 4 ≤ N ≤ 11 and 12 ≤ N ≤ 5000 (Royston 1992) two separate approximations for p-values are used.
Keyword arguments:
sorted::Bool=issorted(X): to indicate that sample X is already sorted.
censored::Integer=0: to censor the largest samples from X (so called upper-tail censoring)
Implements: pvalue
Returned p-values may be unreliable if the sample size is large (N > 2000) or small (N < 20), or if too much data is censored (censored / N > 0.8).
When several tests are run on samples of the same size, construct swc = shapiro_wilk_coefs(length(X)) once and pass it to the test via ShapiroWilkTest(X, swc) for re-use.
For best performance, an already sorted X should be passed and indicated with the sorted=true keyword argument.
Parametric tests
Power divergence test
HypothesisTests.PowerDivergenceTest
— Type
PowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))
Perform a power divergence test.
If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.
If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x (and y if provided) must be non-negative integers.
See the confint(::PowerDivergenceTest) documentation for a list of supported methods to compute confidence intervals.
In the power divergence statistic, $n_{ij}$ is the cell count in the $i$th row and $j$th column and $λ$ is a real number determining the nature of the test to be performed; for example, $λ = 1$ gives the Pearson chi-squared test and $λ = 0$ the multinomial likelihood ratio test.
Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.
Implements: pvalue, confint(::PowerDivergenceTest)
StatsAPI.confint
— Method
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)
Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:
:auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
:sison_glaz: Sison-Glaz intervals
:bootstrap: Bootstrap intervals
:quesenberry_hurst: Quesenberry-Hurst intervals
:gold: Gold intervals (asymptotic simultaneous intervals)
Pearson chi-squared test
HypothesisTests.ChisqTest
— Function
ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])
Perform a Pearson chi-squared test (equivalent to a PowerDivergenceTest
with $λ = 1$).
If y
is not given and x
is a matrix with one row or column, or x
is a vector, then a goodness-of-fit test is performed (x
is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0
, or are all equal if theta0
is not given.
If only y
and x
are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, theta0
is calculated as the proportion of each individual value in y
. Here, the hypothesis tested is whether the two samples x
and y
come from the same population or not.
If x
is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x
and y
must be vectors of the same length. The contingency table is calculated using counts
function from the StatsBase
package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x
(and y
if provided) must be non-negative integers.
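For the goodness-of-fit case described above, the Pearson statistic can be worked out by hand. The sketch below is a plain-Julia illustration with a hypothetical helper pearson_statistic; it computes only the statistic and its degrees of freedom, not the p-value.

```julia
# Pearson chi-squared statistic for a one-dimensional contingency table:
# sum over cells of (observed - expected)^2 / expected, where the expected
# counts are n * theta0 under the null hypothesis.
function pearson_statistic(x::AbstractVector{<:Integer},
                           theta0 = fill(1 / length(x), length(x)))
    n = sum(x)
    expected = n .* theta0
    return sum((x .- expected) .^ 2 ./ expected)
end

observed = [18, 22, 20, 40]             # observed cell counts
stat = pearson_statistic(observed)      # equal probabilities under the null
df = length(observed) - 1               # degrees of freedom of the χ² reference
println("χ² = ", stat, " with ", df, " degrees of freedom")
```

Passing theta0 = [0.18, 0.22, 0.20, 0.40] instead makes the expected counts match the observed ones, so the statistic drops to (numerically) zero.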
Multinomial likelihood ratio test
HypothesisTests.MultinomialLRTest
— Function
MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])
Perform a multinomial likelihood ratio test (equivalent to a PowerDivergenceTest
with $λ = 0$).
If y
is not given and x
is a matrix with one row or column, or x
is a vector, then a goodness-of-fit test is performed (x
is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0
, or are all equal if theta0
is not given.
If x
is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x
and y
must be vectors of the same length. The contingency table is calculated using counts
function from the StatsBase
package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x
(and y
if provided) must be non-negative integers.
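t-test
HypothesisTests.OneSampleTTest
— Type
OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)
Perform a one sample t-test of the null hypothesis that n values with mean xbar and sample standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.
OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a one sample t-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.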
OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors x
and y
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.
HypothesisTests.EqualVarianceTTest
— Type
EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)
Perform a two-sample t-test of the null hypothesis that samples x and y described by the number of elements nx and ny, the mean mx and my, and variance vx and vy come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.
EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform a two-sample t-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceTTest
— Type
UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform an unequal variance two-sample t-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.
This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:
\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]
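To make the Welch-Satterthwaite equation concrete, the following sketch evaluates it for two samples, taking $k_i = 1/n_i$ and $ν_i = n_i - 1$, the usual choice for a two-sample comparison of means. The helper welch_df and the data are made up for illustration; this is a hand calculation, not the package's implementation.

```julia
using Statistics

# Welch-Satterthwaite degrees of freedom for two samples, with
# k_i = 1/n_i and ν_i = n_i - 1 in the equation above.
function welch_df(x, y)
    terms = [var(x) / length(x), var(y) / length(y)]   # k_i * s_i^2
    nus = [length(x) - 1, length(y) - 1]               # ν_i
    return sum(terms)^2 / sum(terms .^ 2 ./ nus)
end

x = [20.1, 22.3, 19.8, 21.5, 20.9, 23.0]
y = [18.2, 25.1, 16.4, 27.3, 19.9]
dof = welch_df(x, y)
# Welch t statistic for the hypothesis of equal means.
t = (mean(x) - mean(y)) / sqrt(var(x) / length(x) + var(y) / length(y))
println("t = ", t, ", approximate degrees of freedom = ", dof)
```

The result always lies between min(n₁, n₂) - 1 and n₁ + n₂ - 2; it reaches the upper bound when both samples have the same size and variance.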
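z-test
HypothesisTests.OneSampleZTest
— Type
OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)
Perform a one sample z-test of the null hypothesis that n values with mean xbar and population standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.
OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a one sample z-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.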
OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors x
and y
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
HypothesisTests.EqualVarianceZTest
— Type
EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform a two-sample z-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceZTest
— Type
UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform an unequal variance two-sample z-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.
F-test
HypothesisTests.VarianceFTest
— Type
VarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an F-test of the null hypothesis that two real-valued vectors x and y have equal variances.
Implements: pvalue
One-way ANOVA Test
HypothesisTests.OneWayANOVATest
— Function
OneWayANOVATest(groups::AbstractVector{<:Real}...)
Perform one-way analysis of variance test of the hypothesis that the groups means are equal.
The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.
Implements: pvalue
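The F-statistic described above, the ratio of the variance among the group means to the variance within the groups, can be computed by hand as follows. The function anova_f is a hypothetical helper for illustration, and the data are invented.

```julia
using Statistics

# F-statistic of a one-way ANOVA: between-group mean square divided by the
# within-group mean square.
function anova_f(groups::AbstractVector{<:Real}...)
    k = length(groups)                       # number of groups
    N = sum(length, groups)                  # total number of observations
    grand = sum(sum, groups) / N             # grand mean
    ss_between = sum(length(g) * (mean(g) - grand)^2 for g in groups)
    ss_within = sum(sum((g .- mean(g)) .^ 2) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (N - k))
end

g1 = [4.0, 5.0, 6.0]
g2 = [6.0, 7.0, 8.0]
g3 = [9.0, 10.0, 11.0]
F = anova_f(g1, g2, g3)
println("F = ", F, " on (2, 6) degrees of freedom")
```

Here the between-group sum of squares is 38 and the within-group sum of squares is 6, giving F = (38/2)/(6/6) = 19.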
Levene's Test
HypothesisTests.LeveneTest
— Function
LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)
Perform Levene's test of the hypothesis that the groups variances are equal. By default the mean statistic is used for centering in each of the groups, but other statistics are accepted: median or truncated mean, see BrownForsytheTest. By default the absolute value of the score difference, scorediff, is used, but other functions are accepted: x² or √|x|.
The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:
\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2}{\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]
where $k$ is the number of groups, $N_i$ is the number of observations in group $i$, $N$ is the total number of observations, $Z_{ij}$ is the transformed score difference between observation $j$ of group $i$ and the center of that group, $Z_{i\cdot}$ is the mean of the $Z_{ij}$ in group $i$, and $Z_{\cdot\cdot}$ is the mean of all $Z_{ij}$.
The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.
Implements: pvalue
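The definition of $W$ translates almost line by line into code. The sketch below is a hand-written illustration (levene_w is a hypothetical helper, not the package's implementation); its keyword names mirror scorediff and statistic from the signature above so that median centering can be tried as well.

```julia
using Statistics

# Levene's W with the default choices named in the text: centering by the
# group mean and absolute score differences.
function levene_w(groups::AbstractVector{<:Real}...;
                  scorediff = abs, statistic = mean)
    k = length(groups)
    Z = [scorediff.(g .- statistic(g)) for g in groups]   # Z_ij for each group
    N = sum(length, Z)
    Zi = mean.(Z)                                         # group means Z_i.
    Zall = sum(sum, Z) / N                                # overall mean Z_..
    between = sum(length(z) * (zi - Zall)^2 for (z, zi) in zip(Z, Zi))
    within = sum(sum((z .- zi) .^ 2) for (z, zi) in zip(Z, Zi))
    return (N - k) / (k - 1) * between / within
end

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, 3.0, 5.0, 7.0, 9.0]
W = levene_w(a, b)
W_median = levene_w(a, b; statistic = median)   # median centering
println("W = ", W, ", median-centered W = ", W_median)
```

Both samples are symmetric, so mean and median centering give the same value here; for skewed data the two variants differ.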
Brown-Forsythe Test
HypothesisTests.BrownForsytheTest
— Function
BrownForsytheTest(groups::AbstractVector{<:Real}...)
The Brown–Forsythe test is a statistical test for the equality of groups variances.
The Brown–Forsythe test is a modification of Levene's test with the median instead of the mean statistic for computing the spread within each group.
Implements: pvalue