HypothesisTests package
This package implements several hypothesis tests in Julia.
This document was generated with Documenter.jl version 1.7.0 on Wednesday 2 October 2024. Using Julia version 1.10.5.
This page documents the generic confint, pvalue, and testname methods, which are supported by most tests. Some tests support additional arguments; see the documentation for the relevant methods in the sections covering those tests.
StatsAPI.confint — Function
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
- :clopper_pearson (default): the Clopper-Pearson interval is based on the binomial distribution. Its empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
- :wald: the Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
- :waldcc: the Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
- :wilson: the Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverage even for small numbers of draws and extreme success probabilities.
- :jeffrey: the Jeffreys interval is a Bayesian credible interval obtained using a non-informative Jeffreys prior. It is very similar to the Wilson interval.
- :agresti_coull: the Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has equal or higher coverage.
- :arcsine: a confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.
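As a usage sketch (the counts here are illustrative, not from the source), the interval methods above can be compared on the same test:

```julia
using HypothesisTests

# 26 successes in 50 draws; illustrative counts
t = BinomialTest(26, 50)

# Default Clopper-Pearson interval, then two alternatives
ci_cp     = confint(t)                    # method = :clopper_pearson
ci_wilson = confint(t; method = :wilson)
ci_wald   = confint(t; method = :wald)
```

For small samples the Wald interval is typically narrower but less reliable than the Clopper-Pearson and Wilson intervals.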
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method currently implemented is the central interval (:central).
Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)
Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:
- :auto (default): if the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used; otherwise Sison-Glaz.
- :sison_glaz: Sison-Glaz intervals
- :bootstrap: bootstrap intervals
- :quesenberry_hurst: Quesenberry-Hurst intervals
- :gold: Gold intervals (asymptotic simultaneous intervals)
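A minimal sketch of obtaining such intervals (the counts and the use of ChisqTest as a constructor are illustrative assumptions, since ChisqTest returns a PowerDivergenceTest for one-sample count data):

```julia
using HypothesisTests

# Observed counts for three categories; illustrative data
counts = [30, 45, 25]

# ChisqTest on a count vector produces a PowerDivergenceTest,
# so the confint method above applies to its result
t = ChisqTest(counts)
ci = confint(t; method = :sison_glaz)  # one interval per category
```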
StatsAPI.pvalue — Function
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test.
The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:
\[
\begin{align*}
p_ω^{(\text{left})} &= \sum_{i ≤ a} f_ω(i)\\
p_ω^{(\text{right})} &= \sum_{i ≥ a} f_ω(i)
\end{align*}
\]
For tail = :both, possible values for method are:
- :central (default): central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
- :minlike: minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:
\[ p_ω = \sum_{f_ω(i) ≤ f_ω(a)} f_ω(i)\]
HypothesisTests.testname — Function
testname(::HypothesisTest)
Returns the name of the test as a string, e.g. "Binomial test" or "Sign Test".
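A minimal sketch (the counts are illustrative; the returned string for a binomial test is the one quoted above):

```julia
using HypothesisTests

t = BinomialTest(26, 50)
testname(t)  # "Binomial test"
```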
HypothesisTests.OneSampleHotellingT2Test — Type
OneSampleHotellingT2Test(X::AbstractMatrix, μ₀=<zero vector>)
Perform a one-sample Hotelling's $T^2$ test of the hypothesis that the vector of column means of X is equal to μ₀.
OneSampleHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix, μ₀=<zero vector>)
Perform a paired Hotelling's $T^2$ test of the hypothesis that the vector of mean column differences between X and Y is equal to μ₀.
HypothesisTests.EqualCovHotellingT2Test — Type
EqualCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)
Perform a two-sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, assuming that X and Y have equal covariance matrices.
HypothesisTests.UnequalCovHotellingT2Test — Type
UnequalCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)
Perform a two-sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, without assuming that X and Y have equal covariance matrices.
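A minimal usage sketch of the three constructors above (the data are illustrative random samples):

```julia
using HypothesisTests

# Two illustrative samples with 3 variables each
X = randn(20, 3)
Y = randn(25, 3) .+ 0.5

# One-sample test against the zero mean vector (the default μ₀)
t1 = OneSampleHotellingT2Test(X)

# Two-sample tests with and without the equal-covariance assumption
t2 = EqualCovHotellingT2Test(X, Y)
t3 = UnequalCovHotellingT2Test(X, Y)

pvalue(t2)
```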
Bartlett's test for equality of two covariance matrices is provided. This is equivalent to Box's $M$-test for two groups.
HypothesisTests.BartlettTest — Type
BartlettTest(X::AbstractMatrix, Y::AbstractMatrix)
Perform Bartlett's test of the hypothesis that the covariance matrices of X and Y are equal.
Bartlett's test is sensitive to departures from multivariate normality.
HypothesisTests.CorrelationTest — Type
CorrelationTest(x, y)
Perform a t-test for the hypothesis that $\text{Cor}(x,y) = 0$, i.e. that the correlation of vectors x and y is zero.
CorrelationTest(x, y, Z)
Perform a t-test for the hypothesis that $\text{Cor}(x,y|Z=z) = 0$, i.e. that the partial correlation of vectors x and y given the matrix Z is zero.
Implements pvalue for the t-test and confint using an approximate confidence interval based on Fisher's $z$-transform.
See also partialcor from StatsBase.
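A minimal sketch of the plain (non-partial) form (the data are illustrative correlated samples):

```julia
using HypothesisTests

x = randn(100)
y = 0.5 .* x .+ randn(100)  # correlated with x by construction

t = CorrelationTest(x, y)
pvalue(t)
confint(t)  # interval via Fisher's z-transform, as noted above
```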
Both one-sample and $k$-sample Anderson–Darling tests are available.
HypothesisTests.OneSampleADTest — Type
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.KSampleADTest — Type
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)
Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.
The modified parameter enables a modified test calculation for samples whose observations do not all coincide.
If nsim is equal to 0 (the default), the p-value is computed asymptotically. If it is greater than 0, the p-value is instead estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and reporting the proportion of simulated values that are greater than or equal to the observed one as the p-value estimate.
Implements: pvalue
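A minimal sketch of both p-value modes described above (the samples are illustrative):

```julia
using HypothesisTests

a = randn(50)
b = randn(60) .+ 0.3
c = randn(40)

# Asymptotic p-value (nsim = 0, the default)
t = KSampleADTest(a, b, c)
pvalue(t)

# Simulation-based p-value estimate from 1000 random splits
t_sim = KSampleADTest(a, b, c; nsim = 1000)
pvalue(t_sim)
```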
HypothesisTests.BinomialTest — Type
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)
Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p, against the alternative hypothesis that the success probability is not equal to p.
Computed confidence intervals are Clopper-Pearson intervals by default. See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.
Implements: pvalue, confint(::BinomialTest)
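A minimal sketch of both constructors above (the counts are illustrative):

```julia
using HypothesisTests

# 26 successes in 50 draws against p = 0.5
t = BinomialTest(26, 50, 0.5)
pvalue(t)
confint(t)  # Clopper-Pearson by default

# Equivalent construction from a Boolean vector
t2 = BinomialTest([trues(26); falses(24)], 0.5)
```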
StatsAPI.confint — Method
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
- :clopper_pearson (default): the Clopper-Pearson interval is based on the binomial distribution. Its empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
- :wald: the Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
- :waldcc: the Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
- :wilson: the Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverage even for small numbers of draws and extreme success probabilities.
- :jeffrey: the Jeffreys interval is a Bayesian credible interval obtained using a non-informative Jeffreys prior. It is very similar to the Wilson interval.
- :agresti_coull: the Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has equal or higher coverage.
- :arcsine: a confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.
HypothesisTests.FisherExactTest — Type
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)
Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is, that the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.
See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.
The contingency table is structured as:
   | X1 | X2
Y1 | a  | b
Y2 | c  | d
The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.
Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)
StatsAPI.confint — Method
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method currently implemented is the central interval (:central).
Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.
StatsAPI.pvalue — Method
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test.
The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:
\[
\begin{align*}
p_ω^{(\text{left})} &= \sum_{i ≤ a} f_ω(i)\\
p_ω^{(\text{right})} &= \sum_{i ≥ a} f_ω(i)
\end{align*}
\]
For tail = :both, possible values for method are:
- :central (default): central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
- :minlike: minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:
\[ p_ω = \sum_{f_ω(i) ≤ f_ω(a)} f_ω(i)\]
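A minimal sketch of the Fisher exact test with the p-value options above (the contingency-table entries are illustrative):

```julia
using HypothesisTests

# Contingency table entries a = 8, b = 2, c = 1, d = 5
t = FisherExactTest(8, 2, 1, 5)

pvalue(t)                     # central two-sided p-value (default)
pvalue(t; method = :minlike)  # minimum likelihood two-sided p-value
pvalue(t; tail = :right)      # one-sided p-value
confint(t)                    # central interval for the odds ratio
```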
An exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests are available.
HypothesisTests.ExactOneSampleKSTest — Type
ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample exact Kolmogorov–Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateOneSampleKSTest — Type
ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform an asymptotic one-sample Kolmogorov–Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateTwoSampleKSTest — Type
ApproximateTwoSampleKSTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an asymptotic two-sample Kolmogorov–Smirnov test of the null hypothesis that x and y are drawn from the same distribution against the alternative hypothesis that they come from different distributions.
Implements: pvalue
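A minimal sketch of the one- and two-sample variants (the data are illustrative; Distributions provides the Normal distribution type):

```julia
using HypothesisTests, Distributions

x = randn(100)

# One-sample tests against a standard normal
t_exact  = ExactOneSampleKSTest(x, Normal(0, 1))
t_approx = ApproximateOneSampleKSTest(x, Normal(0, 1))

# Two-sample test between samples with shifted means
y = randn(100) .+ 0.5
t2 = ApproximateTwoSampleKSTest(x, y)
pvalue(t2)
```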
HypothesisTests.KruskalWallisTest — Type
KruskalWallisTest(groups::AbstractVector{<:Real}...)
Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups $\mathcal{G}$ come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.
The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.
The p-value is computed using a $χ^2$ approximation to the distribution of the test statistic $H_c=\frac{H}{C}$:
\[
\begin{align*}
H & = \frac{12}{n(n+1)} \sum_{g ∈ \mathcal{G}} \frac{R_g^2}{n_g} - 3(n+1)\\
C & = 1-\frac{1}{n^3-n}\sum_{t ∈ \mathcal{T}} (t^3-t),
\end{align*}
\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.
Implements: pvalue
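A minimal sketch with three groups (the values are illustrative):

```julia
using HypothesisTests

g1 = [1.2, 3.4, 2.2, 5.1]
g2 = [2.9, 4.4, 6.1, 3.8]
g3 = [0.7, 1.1, 2.0, 1.5]

# Any number of groups may be passed as separate vector arguments
t = KruskalWallisTest(g1, g2, g3)
pvalue(t)
```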
HypothesisTests.MannWhitneyUTest — Function
MannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x, against the alternative hypothesis that these probabilities are not equal.
The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, MannWhitneyUTest performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest or ApproximateMannWhitneyUTest directly.
Implements: pvalue
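A minimal sketch of the dispatching behavior described above (the samples are illustrative):

```julia
using HypothesisTests

x = [1.3, 2.1, 3.7, 4.0, 5.2]
y = [2.8, 3.3, 4.9, 6.1, 7.7]

# No ties and small samples, so this dispatches to the exact test
t = MannWhitneyUTest(x, y)
pvalue(t)

# Force the approximate variant explicitly
t_approx = ApproximateMannWhitneyUTest(x, y)
```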
HypothesisTests.ExactMannWhitneyUTest — Type
ExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x, against the alternative hypothesis that these probabilities are not equal.
When there are no tied ranks, the exact p-value is computed using the pwilcox function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
Implements: pvalue
HypothesisTests.ApproximateMannWhitneyUTest — Type
ApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x, against the alternative hypothesis that these probabilities are not equal.
The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic:
\[
\begin{align*}
μ & = \frac{n_x n_y}{2}\\
σ & = \frac{n_x n_y}{12}\left(n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)}\right)\\
a & = \sum_{t \in \mathcal{T}} t^3 - t
\end{align*}
\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.
Implements: pvalue
References
External links
HypothesisTests.MannWhitneyUTest
— FunctionMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x
is greater than an observation drawn from the same population as y
is equal to the probability that an observation drawn from the same population as y
is greater than an observation drawn from the same population as x
against the alternative hypothesis that these probabilities are not equal.
The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, MannWhitneyUTest
performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest
performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest
or ApproximateMannWhitneyUTest
directly.
Implements: pvalue
HypothesisTests.ExactMannWhitneyUTest
— TypeExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x
is greater than an observation drawn from the same population as y
is equal to the probability that an observation drawn from the same population as y
is greater than an observation drawn from the same population as x
against the alternative hypothesis that these probabilities are not equal.
When there are no tied ranks, the exact p-value is computed using the pwilcox
function from the Rmath
package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
Implements: pvalue
HypothesisTests.ApproximateMannWhitneyUTest
— TypeApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x
is greater than an observation drawn from the same population as y
is equal to the probability that an observation drawn from the same population as y
is greater than an observation drawn from the same population as x
against the alternative hypothesis that these probabilities are not equal.
The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic:
\[ \begin{align*} μ & = \frac{n_x n_y}{2}\\ σ & = \frac{n_x n_y}{12}\left(n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)}\right)\\ a & = \sum_{t \in \mathcal{T}} t^3 - t \end{align*}\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position.
Implements: pvalue
HypothesisTests.SignTest
— TypeSignTest(x::AbstractVector{T<:Real}, median::Real = 0)
SignTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, median::Real = 0)
Perform a sign test of the null hypothesis that the distribution from which x
(or x - y
if y
is provided) was drawn has median median
against the alternative hypothesis that the median is not equal to median
.
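A minimal sketch of the paired form, with hypothetical before/after measurements:

```julia
using HypothesisTests

# Hypothetical paired measurements before and after a treatment.
before = [12.4, 11.8, 13.0, 12.1, 12.9, 11.5, 12.6]
after  = [12.0, 11.2, 12.5, 12.3, 12.1, 11.0, 12.2]

t = SignTest(before, after)   # H0: the median of before - after is 0
p = pvalue(t)
```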
HypothesisTests.WaldWolfowitzTest
— TypeWaldWolfowitzTest(x::AbstractVector{Bool})
WaldWolfowitzTest(x::AbstractVector{<:Real})
Perform the Wald-Wolfowitz (or Runs) test of the null hypothesis that the given data is random, or independently sampled. The data can come as many-valued or two-valued (Boolean). If many-valued, the sample is transformed by labelling each element as above or below the median.
Implements: pvalue
HypothesisTests.SignedRankTest
— FunctionSignedRankTest(x::AbstractVector{<:Real})
SignedRankTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of x
(or the difference x - y
if y
is provided) has zero median against the alternative hypothesis that the median is non-zero.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤15 samples, SignedRankTest
performs an exact signed rank test. In all other cases, SignedRankTest
performs an approximate signed rank test. Behavior may be further controlled by using ExactSignedRankTest
or ApproximateSignedRankTest
directly.
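A minimal one-sample sketch with a hypothetical vector of differences (no ties and n ≤ 50, so the exact variant is chosen):

```julia
using HypothesisTests

# Hypothetical differences between paired observations.
d = [0.9, -0.4, 1.7, 2.3, -0.1, 1.2, 0.6]

t = SignedRankTest(d)   # H0: the median of d is 0
p = pvalue(t)
```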
HypothesisTests.ExactSignedRankTest
— TypeExactSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform a Wilcoxon exact signed rank U test of the null hypothesis that the distribution of x
(or the difference x - y
if y
is provided) has zero median against the alternative hypothesis that the median is non-zero.
When there are no tied ranks, the exact p-value is computed using the psignrank
function from the Rmath
package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
HypothesisTests.ApproximateSignedRankTest
— TypeApproximateSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform a Wilcoxon approximate signed rank U test of the null hypothesis that the distribution of x
(or the difference x - y
if y
is provided) has zero median against the alternative hypothesis that the median is non-zero.
The p-value is computed using a normal approximation to the distribution of the signed rank statistic:
\[ \begin{align*} μ & = \frac{n(n + 1)}{4}\\ σ & = \frac{n(n + 1)(2n + 1)}{24} - \frac{a}{48}\\ a & = \sum_{t \in \mathcal{T}} t^3 - t \end{align*}\]
where $\mathcal{T}$ is the set of the counts of tied values at each tied position.
Implements: pvalue
HypothesisTests.ExactPermutationTest
— FunctionExactPermutationTest(x::Vector, y::Vector, f::Function)
Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x)
is equal to f(y)
. All possible permutations are evaluated.
HypothesisTests.ApproximatePermutationTest
— FunctionApproximatePermutationTest([rng::AbstractRNG,] x::Vector, y::Vector, f::Function, n::Int)
Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x)
is equal to f(y)
. n
of the factorial(length(x)+length(y))
permutations are sampled at random. A random number generator can optionally be passed as the first argument. The default generator is Random.default_rng()
.
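A minimal sketch of both permutation tests, comparing group means on hypothetical data (the sample sizes and permutation count below are arbitrary choices):

```julia
using HypothesisTests, Statistics, Random

# Hypothetical samples; H0: mean(x) == mean(y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.5, 3.5, 4.5, 5.5, 6.5]

exact  = ExactPermutationTest(x, y, mean)
# 1000 random permutations, made reproducible by passing an RNG.
approx = ApproximatePermutationTest(MersenneTwister(42), x, y, mean, 1000)
p1, p2 = pvalue(exact), pvalue(approx)
```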
HypothesisTests.FlignerKilleenTest
— FunctionFlignerKilleenTest(groups::AbstractVector{<:Real}...)
Perform the Fligner-Killeen median test of the null hypothesis that the groups
have equal variances, a test for homogeneity of variances.
This test is most robust against departures from normality, see references. It is a $k$-sample simple linear rank method that uses the ranks of the absolute values of the centered samples and weights
\[a_{N,i} = \Phi^{-1}(1/2 + (i/2(N+1)))\]
The version implemented here uses median centering in each of the samples.
Implements: pvalue
References
External links
HypothesisTests.ShapiroWilkTest
— TypeShapiroWilkTest(X::AbstractVector{<:Real},
swc::AbstractVector{<:Real}=shapiro_wilk_coefs(length(X));
sorted::Bool=issorted(X),
censored::Integer=0)
Perform a Shapiro-Wilk test of the null hypothesis that the data in vector X
come from a normal distribution.
This implementation is based on the method by Royston (1992). The calculation of the p-value is exact for sample size N = 3
, and for ranges 4 ≤ N ≤ 11
and 12 ≤ N ≤ 5000
(Royston 1992) two separate approximations for p-values are used.
Keyword arguments
The following keyword arguments may be passed.
sorted::Bool=issorted(X)
: to indicate that sample X
is already sorted.
censored::Integer=0
: to censor the largest samples from X
(so-called upper-tail censoring).
Implements: pvalue
As noted by Royston (1993), the (approximated) W-statistic will be accurate but returned p-values may not be reliable if either of these apply:
sample size is large (N > 2000
) or small (N < 20
)
there is substantial censoring (censored / N > 0.8
)
Implementation notes
If multiple Shapiro-Wilk tests are to be performed on samples of the same size, it is faster to construct swc = shapiro_wilk_coefs(length(X))
once and pass it to the test via ShapiroWilkTest(X, swc)
for re-use.
For maximal performance, a sorted X
should be passed and indicated with the sorted=true
keyword argument.
References
Shapiro, S. S., & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52, 591–611. doi:10.1093/BIOMET/52.3-4.591.
Royston, P. (1992). Approximating the Shapiro-Wilk W-test for non-normality. Statistics and Computing, 2(3), 117–119. doi:10.1007/BF01891203
Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. Journal of the Royal Statistical Society Series D (The Statistician), 42(1), 37–43. doi:10.2307/2348109
Royston, P. (1995). Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics), 44(4), 547–551. doi:10.2307/2986146.
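A minimal sketch on a fixed, hypothetical sample (N = 20, no censoring, already sorted so sorted = true applies):

```julia
using HypothesisTests

# Hypothetical pre-sorted sample.
X = [-1.6, -1.2, -0.9, -0.7, -0.5, -0.3, -0.2, -0.1, 0.1, 0.2,
      0.3, 0.5, 0.7, 0.8, 0.9, 1.1, 1.3, 1.5, 1.9, 2.1]

t = ShapiroWilkTest(X; sorted = true)   # H0: X comes from a normal distribution
p = pvalue(t)
```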
Settings
This document was generated with Documenter.jl version 1.7.0 on Wednesday 2 October 2024. Using Julia version 1.10.5.
HypothesisTests.PowerDivergenceTest
— TypePowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))
Perform a Power Divergence test.
If y
is not given and x
is a matrix with one row or column, or x
is a vector, then a goodness-of-fit test is performed (x
is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0
, or are all equal if theta0
is not given.
If x
is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x
and y
must be vectors of the same length. The contingency table is calculated using the counts
function from the StatsBase
package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x
(and y
if provided) must be non-negative integers.
By default, the computed confidence intervals are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the confint(::PowerDivergenceTest)
documentation for a list of supported methods to compute confidence intervals.
The power divergence test is given by
\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij}/\hat{n}_{ij})^λ - 1\right]\]
where $n_{ij}$ is the cell count in the $i$ th row and $j$ th column and $λ$ is a real number determining the nature of the test to be performed:
Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.
Implements: pvalue
, confint(::PowerDivergenceTest)
References
StatsAPI.confint
— Methodconfint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)
Compute a confidence interval with coverage level
for multinomial proportions using one of the following methods. Possible values for method
are:
:auto
(default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
:sison_glaz
: Sison-Glaz intervals
:bootstrap
: Bootstrap intervals
:quesenberry_hurst
: Quesenberry-Hurst intervals
:gold
: Gold intervals (asymptotic simultaneous intervals)
References
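A minimal sketch with hypothetical cell counts, requesting Sison-Glaz intervals explicitly:

```julia
using HypothesisTests

counts = [20, 30, 25, 25]                  # hypothetical one-dimensional table
t  = PowerDivergenceTest(counts; lambda = 1.0)
ci = confint(t; level = 0.95, method = :sison_glaz)  # one interval per cell
```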
HypothesisTests.ChisqTest
— FunctionChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])
Perform a Pearson chi-squared test (equivalent to a PowerDivergenceTest
with $λ = 1$).
If y
is not given and x
is a matrix with one row or column, or x
is a vector, then a goodness-of-fit test is performed (x
is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0
, or are all equal if theta0
is not given.
If only y
and x
are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, theta0
is calculated from the proportion of each individual value in y
. Here, the hypothesis tested is whether the two samples x
and y
come from the same population or not.
If x
is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x
and y
must be vectors of the same length. The contingency table is calculated using the counts
function from the StatsBase
package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x
(and y
if provided) must be non-negative integers.
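A minimal sketch covering both uses, with hypothetical counts:

```julia
using HypothesisTests

# Goodness of fit against the default uniform theta0.
gof = ChisqTest([20, 30, 25, 25])

# 2×2 contingency table: test independence of rows and columns.
table = [15 25;
         35 25]
ind = ChisqTest(table)
p1, p2 = pvalue(gof), pvalue(ind)
```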
HypothesisTests.MultinomialLRTest
— FunctionMultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])
Perform a multinomial likelihood ratio test (equivalent to a PowerDivergenceTest
with $λ = 0$).
If y
is not given and x
is a matrix with one row or column, or x
is a vector, then a goodness-of-fit test is performed (x
is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0
, or are all equal if theta0
is not given.
If x
is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x
and y
must be vectors of the same length. The contingency table is calculated using the counts
function from the StatsBase
package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
Note that the entries of x
(and y
if provided) must be non-negative integers.
HypothesisTests.OneSampleTTest
— TypeOneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)
Perform a one sample t-test of the null hypothesis that n
values with mean xbar
and sample standard deviation stddev
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a one sample t-test of the null hypothesis that the data in vector v
comes from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors x
and y
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.
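A minimal sketch of the one-sample vector form on hypothetical data, with a confidence interval for the mean:

```julia
using HypothesisTests

v = [5.1, 4.9, 6.0, 5.5, 5.8, 4.7, 5.3]   # hypothetical measurements
t = OneSampleTTest(v, 5.0)                 # H0: population mean is 5.0
p = pvalue(t)
lo, hi = confint(t; level = 0.95)
```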
HypothesisTests.EqualVarianceTTest
— TypeEqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)
Perform a two-sample t-test of the null hypothesis that samples x
and y
described by the number of elements nx
and ny
, the mean mx
and my
, and variance vx
and vy
come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.
EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform a two-sample t-test of the null hypothesis that x
and y
come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceTTest
— TypeUnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform an unequal variance two-sample t-test of the null hypothesis that x
and y
come from distributions with equal means against the alternative hypothesis that the distributions have different means.
This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:
\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]
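A minimal sketch contrasting the pooled and Welch variants on the same hypothetical samples:

```julia
using HypothesisTests

x = [18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1]
y = [17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7]

pooled = EqualVarianceTTest(x, y)    # assumes equal variances
welch  = UnequalVarianceTTest(x, y)  # Welch-Satterthwaite degrees of freedom
p1, p2 = pvalue(pooled), pvalue(welch)
```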
HypothesisTests.OneSampleZTest
— TypeOneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)
Perform a one sample z-test of the null hypothesis that n
values with mean xbar
and population standard deviation stddev
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a one sample z-test of the null hypothesis that the data in vector v
comes from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)
Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors x
and y
come from a distribution with mean μ0
against the alternative hypothesis that the distribution does not have mean μ0
.
HypothesisTests.EqualVarianceZTest
— TypeEqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform a two-sample z-test of the null hypothesis that x
and y
come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceZTest
— TypeUnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
Perform an unequal variance two-sample z-test of the null hypothesis that x
and y
come from distributions with equal means against the alternative hypothesis that the distributions have different means.
HypothesisTests.VarianceFTest
— TypeVarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an F-test of the null hypothesis that two real-valued vectors x
and y
have equal variances.
Implements: pvalue
References
External links
HypothesisTests.OneWayANOVATest
— FunctionOneWayANOVATest(groups::AbstractVector{<:Real}...)
Perform a one-way analysis of variance test of the hypothesis that the groups
means are equal.
The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.
Implements: pvalue
External links
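A minimal sketch with three hypothetical groups:

```julia
using HypothesisTests

g1 = [6.9, 5.4, 5.8, 4.6, 4.0]
g2 = [8.3, 6.8, 7.8, 9.2, 6.5]
g3 = [8.0, 10.5, 8.1, 6.9, 9.3]

t = OneWayANOVATest(g1, g2, g3)   # H0: all group means are equal
p = pvalue(t)
```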
HypothesisTests.LeveneTest
— FunctionLeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)
Perform Levene's test of the hypothesis that the groups
variances are equal. By default the mean statistic
is used for centering in each of the groups
, but other statistics are accepted: median or truncated mean, see BrownForsytheTest
. By default the absolute value of the score difference, scorediff
, is used, but other functions are accepted: x² or √|x|.
The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:
\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2}{\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]
where
The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.
Implements: pvalue
References
External links
HypothesisTests.BrownForsytheTest
— FunctionBrownForsytheTest(groups::AbstractVector{<:Real}...)
The Brown–Forsythe test is a statistical test for the equality of groups
variances.
The Brown–Forsythe test is a modification of Levene's test that uses the median instead of the mean for computing the spread within each group.
Implements: pvalue
References
External links
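A minimal sketch contrasting mean-centered Levene and median-centered Brown-Forsythe on the same hypothetical groups:

```julia
using HypothesisTests, Statistics

g1 = [6.9, 5.4, 5.8, 4.6, 4.0]
g2 = [8.3, 6.8, 7.8, 9.2, 6.5]

lev = LeveneTest(g1, g2; statistic = mean)  # classic mean-centered Levene
bf  = BrownForsytheTest(g1, g2)             # median-centered variant
p1, p2 = pvalue(lev), pvalue(bf)
```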