diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 9b4a880c..fedf6d9e 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-10-02T11:11:29","documenter_version":"1.7.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-10-02T11:51:33","documenter_version":"1.7.0"}}
\ No newline at end of file
diff --git a/dev/index.html b/dev/index.html
index db677f4b..002ceafd 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
-HypothesisTests package · HypothesisTests.jl

HypothesisTests package

This package implements several hypothesis tests in Julia.

+HypothesisTests package · HypothesisTests.jl

HypothesisTests package

This package implements several hypothesis tests in Julia.

diff --git a/dev/methods/index.html b/dev/methods/index.html
index df258fd3..95cb91cb 100644
--- a/dev/methods/index.html
+++ b/dev/methods/index.html
@@ -1,5 +1,5 @@
-Methods · HypothesisTests.jl

Methods

This page documents the generic confint, pvalue and testname methods which are supported by most tests. Some particular tests support additional arguments: see the documentation for the relevant methods provided in sections covering these tests.

Confidence interval

StatsAPI.confintFunction
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)

Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:

  • :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
  • :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
  • :waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
  • :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
  • :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
  • :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
  • :arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.

References

  • Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.
  • Pires, Ana & Amado, Conceição. (2008). Interval Estimators for a Binomial Proportion: Comparison of Twenty Methods. REVSTAT. 6. 10.57805/revstat.v6i2.63.

External links

source
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)

Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).

Note

Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)

Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:

  • :auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
  • :sison_glaz: Sison-Glaz intervals
  • :bootstrap: Bootstrap intervals
  • :quesenberry_hurst: Quesenberry-Hurst intervals
  • :gold: Gold intervals (asymptotic simultaneous intervals)

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
  • Sison, C.P and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
  • Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
  • Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
source

p-value

StatsAPI.pvalueFunction
pvalue(x::FisherExactTest; tail = :both, method = :central)

Compute the p-value for a given Fisher exact test.

The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:

\[ \begin{align*} +Methods · HypothesisTests.jl

Methods

This page documents the generic confint, pvalue and testname methods which are supported by most tests. Some particular tests support additional arguments: see the documentation for the relevant methods provided in sections covering these tests.

Confidence interval

StatsAPI.confintFunction
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)

Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:

  • :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
  • :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
  • :waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
  • :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
  • :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
  • :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
  • :arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.
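
The difference between the :wald and :wilson descriptions in the list above can be illustrated with a small hand-rolled sketch. This is not the package's implementation: the helper names `wald_interval` and `wilson_interval` are hypothetical, and the 95% normal quantile is hard-coded so that the snippet needs no extra dependencies.

```julia
# Sketch only: hand-rolled Wald and Wilson intervals for a binomial proportion.
# `wald_interval` and `wilson_interval` are hypothetical helpers, not package API.

# Two-sided 95% standard normal quantile, hard-coded (assumption: level = 0.95).
const Z95 = 1.959963984540054

function wald_interval(x::Integer, n::Integer; z::Real = Z95)
    phat = x / n
    half = z * sqrt(phat * (1 - phat) / n)   # standard error estimated at phat
    return (phat - half, phat + half)
end

function wilson_interval(x::Integer, n::Integer; z::Real = Z95)
    phat = x / n
    denom = 1 + z^2 / n
    center = (phat + z^2 / (2n)) / denom
    # The standard error is evaluated at the interval endpoints, not at phat,
    # which leads to this closed form after solving a quadratic.
    half = z / denom * sqrt(phat * (1 - phat) / n + z^2 / (4 * n^2))
    return (center - half, center + half)
end

wald = wald_interval(1, 20)
wilson = wilson_interval(1, 20)
println("Wald:   ", wald)
println("Wilson: ", wilson)
```

With 1 success in 20 draws, the Wald lower bound falls below zero while the Wilson interval stays inside (0, 1), which is the kind of behaviour the :wald entry warns about for extreme success probabilities.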

References

  • Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.
  • Pires, Ana & Amado, Conceição. (2008). Interval Estimators for a Binomial Proportion: Comparison of Twenty Methods. REVSTAT. 6. 10.57805/revstat.v6i2.63.

External links

source
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)

Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).

Note

Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)

Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:

  • :auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
  • :sison_glaz: Sison-Glaz intervals
  • :bootstrap: Bootstrap intervals
  • :quesenberry_hurst: Quesenberry-Hurst intervals
  • :gold: Gold intervals (asymptotic simultaneous intervals)

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
  • Sison, C.P and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
  • Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
  • Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
source

p-value

StatsAPI.pvalueFunction
pvalue(x::FisherExactTest; tail = :both, method = :central)

Compute the p-value for a given Fisher exact test.

The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:

\[ \begin{align*} p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\ p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i) - \end{align*}\]

For tail = :both, possible values for method are:

  • :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
  • :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:

    \[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source

Test name

+ \end{align*}\]

For tail = :both, possible values for method are:

  • :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
  • :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:

    \[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]
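
The two definitions above can be sketched in plain Julia for the null odds ratio $ω = 1$, where $f_ω$ reduces to the ordinary hypergeometric distribution. This is an illustration under stated assumptions rather than the package's code: the helper names `null_pmf`, `support`, `central_pvalue` and `minlike_pvalue` are hypothetical, and capping the central p-value at 1 is an assumption not stated in the text.

```julia
# Sketch only: two-sided Fisher exact p-values at odds ratio ω = 1.
# All function names here are hypothetical helpers, not part of the package.

# Null (central hypergeometric) probability of observing i in the top-left
# cell, given the margins of the table [a b; c d]. Rationals keep ties exact.
function null_pmf(i, a, b, c, d)
    n = a + b + c + d
    return (binomial(a + c, i) * binomial(b + d, a + b - i)) // binomial(n, a + b)
end

# Values of the top-left cell that are compatible with the fixed margins.
support(a, b, c, d) = max(0, a - d):min(a + b, a + c)

function central_pvalue(a, b, c, d)
    f = i -> null_pmf(i, a, b, c, d)
    left = sum(f(i) for i in support(a, b, c, d) if i <= a)
    right = sum(f(i) for i in support(a, b, c, d) if i >= a)
    return min(1, 2 * min(left, right))   # capping at 1 is an assumption
end

function minlike_pvalue(a, b, c, d)
    f = i -> null_pmf(i, a, b, c, d)
    # Sum every table that is at most as probable as the observed one.
    return sum(f(i) for i in support(a, b, c, d) if f(i) <= f(a))
end

println(central_pvalue(3, 1, 1, 3))
println(minlike_pvalue(3, 1, 1, 3))
```

For a table whose null distribution is symmetric, such as the one used here, the two methods agree; they can differ when the distribution is skewed.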

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source

Test name

diff --git a/dev/multivariate/index.html b/dev/multivariate/index.html
index 3ecfd39c..66106575 100644
--- a/dev/multivariate/index.html
+++ b/dev/multivariate/index.html
@@ -1,2 +1,2 @@
-Multivariate tests · HypothesisTests.jl

Multivariate tests

Hotelling's $T^2$ test

HypothesisTests.OneSampleHotellingT2TestType
OneSampleHotellingT2Test(X::AbstractMatrix, μ₀=<zero vector>)

Perform a one sample Hotelling's $T^2$ test of the hypothesis that the vector of column means of X is equal to μ₀.

source
OneSampleHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix, μ₀=<zero vector>)

Perform a paired Hotelling's $T^2$ test of the hypothesis that the vector of mean column differences between X and Y is equal to μ₀.

source
HypothesisTests.EqualCovHotellingT2TestType
EqualCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)

Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, assuming that X and Y have equal covariance matrices.

source
HypothesisTests.UnequalCovHotellingT2TestType
UnequalCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)

Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, without assuming that X and Y have equal covariance matrices.

source

Equality of covariance matrices

Bartlett's test for equality of two covariance matrices is provided. This is equivalent to Box's $M$-test for two groups.

HypothesisTests.BartlettTestType
BartlettTest(X::AbstractMatrix, Y::AbstractMatrix)

Perform Bartlett's test of the hypothesis that the covariance matrices of X and Y are equal.

Note

Bartlett's test is sensitive to departures from multivariate normality.

source

Correlation and partial correlation test

HypothesisTests.CorrelationTestType
CorrelationTest(x, y)

Perform a t-test for the hypothesis that $\text{Cor}(x,y) = 0$, i.e. the correlation of vectors x and y is zero.

CorrelationTest(x, y, Z)

Perform a t-test for the hypothesis that $\text{Cor}(x,y|Z=z) = 0$, i.e. the partial correlation of vectors x and y given the matrix Z is zero.

Implements pvalue for the t-test. Implements confint using an approximate confidence interval based on Fisher's $z$-transform.

See also partialcor from StatsBase.

External resources

source
+Multivariate tests · HypothesisTests.jl

Multivariate tests

Hotelling's $T^2$ test

HypothesisTests.OneSampleHotellingT2TestType
OneSampleHotellingT2Test(X::AbstractMatrix, μ₀=<zero vector>)

Perform a one sample Hotelling's $T^2$ test of the hypothesis that the vector of column means of X is equal to μ₀.

source
OneSampleHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix, μ₀=<zero vector>)

Perform a paired Hotelling's $T^2$ test of the hypothesis that the vector of mean column differences between X and Y is equal to μ₀.

source
HypothesisTests.EqualCovHotellingT2TestType
EqualCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)

Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, assuming that X and Y have equal covariance matrices.

source
HypothesisTests.UnequalCovHotellingT2TestType
UnequalCovHotellingT2Test(X::AbstractMatrix, Y::AbstractMatrix)

Perform a two sample Hotelling's $T^2$ test of the hypothesis that the difference in the mean vectors of X and Y is zero, without assuming that X and Y have equal covariance matrices.

source

Equality of covariance matrices

Bartlett's test for equality of two covariance matrices is provided. This is equivalent to Box's $M$-test for two groups.

HypothesisTests.BartlettTestType
BartlettTest(X::AbstractMatrix, Y::AbstractMatrix)

Perform Bartlett's test of the hypothesis that the covariance matrices of X and Y are equal.

Note

Bartlett's test is sensitive to departures from multivariate normality.

source

Correlation and partial correlation test

HypothesisTests.CorrelationTestType
CorrelationTest(x, y)

Perform a t-test for the hypothesis that $\text{Cor}(x,y) = 0$, i.e. the correlation of vectors x and y is zero.

CorrelationTest(x, y, Z)

Perform a t-test for the hypothesis that $\text{Cor}(x,y|Z=z) = 0$, i.e. the partial correlation of vectors x and y given the matrix Z is zero.

Implements pvalue for the t-test. Implements confint using an approximate confidence interval based on Fisher's $z$-transform.

See also partialcor from StatsBase.

External resources

source
diff --git a/dev/nonparametric/index.html b/dev/nonparametric/index.html
index 14fd12fa..5a8f52a7 100644
--- a/dev/nonparametric/index.html
+++ b/dev/nonparametric/index.html
@@ -1,24 +1,24 @@
-Nonparametric tests · HypothesisTests.jl

Nonparametric tests

Anderson-Darling test

Available are both one-sample and $k$-sample tests.

HypothesisTests.OneSampleADTestType
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source
HypothesisTests.KSampleADTestType
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)

Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.

The modified parameter enables a modified test calculation for samples whose observations do not all coincide.

If nsim is equal to 0 (the default), the asymptotic p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed value. This proportion is reported as the p-value estimate.

Implements: pvalue

References

  • F. W. Scholz and M. A. Stephens, K-Sample Anderson-Darling Tests, Journal of the American Statistical Association, Vol. 82, No. 399. (Sep., 1987), pp. 918-924.
source

Binomial test

HypothesisTests.BinomialTestType
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
-BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)

Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.

Computed confidence intervals by default are Clopper-Pearson intervals. See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.

Implements: pvalue, confint(::BinomialTest)

source
StatsAPI.confintMethod
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)

Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:

  • :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
  • :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
  • :waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
  • :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
  • :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
  • :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
  • :arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.

References

  • Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.
  • Pires, Ana & Amado, Conceição. (2008). Interval Estimators for a Binomial Proportion: Comparison of Twenty Methods. REVSTAT. 6. 10.57805/revstat.v6i2.63.

External links

source

Fisher exact test

HypothesisTests.FisherExactTestType
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)

Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.

See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.

The contingency table is structured as:

      X1   X2
Y1    a    b
Y2    c    d

Note

The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.

Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)

References

  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
StatsAPI.confintMethod
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)

Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).

Note

Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
StatsAPI.pvalueMethod
pvalue(x::FisherExactTest; tail = :both, method = :central)

Compute the p-value for a given Fisher exact test.

The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:

\[ \begin{align*} +Nonparametric tests · HypothesisTests.jl

Nonparametric tests

Anderson-Darling test

Available are both one-sample and $k$-sample tests.

HypothesisTests.OneSampleADTestType
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source
HypothesisTests.KSampleADTestType
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)

Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.

The modified parameter enables a modified test calculation for samples whose observations do not all coincide.

If nsim is equal to 0 (the default), the asymptotic p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed value. This proportion is reported as the p-value estimate.

Implements: pvalue

References

  • F. W. Scholz and M. A. Stephens, K-Sample Anderson-Darling Tests, Journal of the American Statistical Association, Vol. 82, No. 399. (Sep., 1987), pp. 918-924.
source

Binomial test

HypothesisTests.BinomialTestType
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
+BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)

Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.

Computed confidence intervals by default are Clopper-Pearson intervals. See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.

Implements: pvalue, confint(::BinomialTest)

source
StatsAPI.confintMethod
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)

Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:

  • :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
  • :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
  • :waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
  • :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
  • :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
  • :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
  • :arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.

References

  • Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.
  • Pires, Ana & Amado, Conceição. (2008). Interval Estimators for a Binomial Proportion: Comparison of Twenty Methods. REVSTAT. 6. 10.57805/revstat.v6i2.63.

External links

source

Fisher exact test

HypothesisTests.FisherExactTestType
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)

Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.

See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.

The contingency table is structured as:

      X1   X2
Y1    a    b
Y2    c    d

Note

The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.

Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)

References

  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
StatsAPI.confintMethod
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)

Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).

Note

Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source
StatsAPI.pvalueMethod
pvalue(x::FisherExactTest; tail = :both, method = :central)

Compute the p-value for a given Fisher exact test.

The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:

\[ \begin{align*} p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\ p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i) - \end{align*}\]

For tail = :both, possible values for method are:

  • :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
  • :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:

    \[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source

Kolmogorov-Smirnov test

Available are an exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests.

HypothesisTests.ExactOneSampleKSTestType
ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample exact Kolmogorov–Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source
HypothesisTests.ApproximateOneSampleKSTestType
ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform an asymptotic one-sample Kolmogorov–Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source

Kruskal-Wallis rank sum test

HypothesisTests.KruskalWallisTestType
KruskalWallisTest(groups::AbstractVector{<:Real}...)

Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups $\mathcal{G}$ come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.

The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.

The p-value is computed using a $χ^2$ approximation to the distribution of the test statistic $H_c=\frac{H}{C}$:

\[ \begin{align*} + \end{align*}\]

For tail = :both, possible values for method are:

  • :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
  • :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:

    \[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]

References

  • Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374, link
source

Kolmogorov-Smirnov test

Available are an exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests.

HypothesisTests.ExactOneSampleKSTestType
ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample exact Kolmogorov–Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source
HypothesisTests.ApproximateOneSampleKSTestType
ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform an asymptotic one-sample Kolmogorov–Smirnov test of the null hypothesis that the data in vector x comes from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

source

Kruskal-Wallis rank sum test

HypothesisTests.KruskalWallisTestType
KruskalWallisTest(groups::AbstractVector{<:Real}...)

Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups $\mathcal{G}$ come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.

The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.

The p-value is computed using a $χ^2$ approximation to the distribution of the test statistic $H_c=\frac{H}{C}$:

\[ \begin{align*} H & = \frac{12}{n(n+1)} \sum_{g ∈ \mathcal{G}} \frac{R_g^2}{n_g} - 3(n+1)\\ C & = 1-\frac{1}{n^3-n}\sum_{t ∈ \mathcal{T}} (t^3-t), \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.

Implements: pvalue

References

  • Meyer, J.P., Seaman, M.A., Expanded tables of critical values for the Kruskal-Wallis H statistic. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, April 2006.

External links

source
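The formulas for $H$ and $C$ can be checked by hand. The following is a minimal sketch that uses only Base Julia; `average_ranks` and `kruskal_wallis_hc` are hypothetical helper names for this example, not functions exported by the package.

```julia
# Average ranks: tied values share the mean of the positions they occupy.
average_ranks(v) = [count(<(vi), v) + (count(==(vi), v) + 1) / 2 for vi in v]

# Tie-corrected Kruskal-Wallis statistic H_c = H / C (illustrative helper).
function kruskal_wallis_hc(groups::AbstractVector{<:Real}...)
    pooled = vcat(groups...)
    n = length(pooled)
    ranks = average_ranks(pooled)

    # Sum of R_g^2 / n_g over groups, reading each group's ranks from the pooled vector.
    total = 0.0
    offset = 0
    for g in groups
        ng = length(g)
        Rg = sum(ranks[offset+1:offset+ng])
        total += Rg^2 / ng
        offset += ng
    end
    H = 12 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: t runs over the number of observations sharing each value.
    tie_counts = [count(==(u), pooled) for u in unique(pooled)]
    C = 1 - sum(t^3 - t for t in tie_counts) / (n^3 - n)
    return H / C
end

hc = kruskal_wallis_hc([1, 2, 3], [4, 5, 6], [7, 8, 9])
println("H_c = ", hc)
```

With no ties, $C = 1$ and $H_c$ reduces to $H$.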

Mann-Whitney U test

HypothesisTests.MannWhitneyUTestFunction
MannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.

When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, MannWhitneyUTest performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest or ApproximateMannWhitneyUTest directly.

Implements: pvalue

source
HypothesisTests.ExactMannWhitneyUTestType
ExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

When there are no tied ranks, the exact p-value is computed using the pwilcox function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.

Implements: pvalue

source
HypothesisTests.ApproximateMannWhitneyUTestType
ApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic:

\[ \begin{align*} μ & = \frac{n_x n_y}{2}\\ σ^2 & = \frac{n_x n_y}{12}\left(n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)}\right)\\ a & = \sum_{t \in \mathcal{T}} (t^3 - t) \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position.

Implements: pvalue

source
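A hand computation of the quantities in the normal approximation might look like the sketch below. It treats the tie-adjusted expression as the variance of $U$ and takes its square root to standardise, and it omits any continuity correction; both are assumptions of this example. `average_ranks` and `mann_whitney_normal` are hypothetical names.

```julia
# Average ranks: tied values share the mean of the positions they occupy.
average_ranks(v) = [count(<(vi), v) + (count(==(vi), v) + 1) / 2 for vi in v]

# Mann-Whitney U for x and the moments of its normal approximation (illustrative).
function mann_whitney_normal(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    nx, ny = length(x), length(y)
    pooled = vcat(x, y)
    ranks = average_ranks(pooled)

    # U is the rank sum of x minus its smallest possible value.
    U = sum(ranks[1:nx]) - nx * (nx + 1) / 2

    # Tie adjustment a = sum of t^3 - t over groups of tied values.
    tie_counts = [count(==(u), pooled) for u in unique(pooled)]
    a = sum(t^3 - t for t in tie_counts)

    μ = nx * ny / 2
    σ = sqrt(nx * ny / 12 * (nx + ny + 1 - a / ((nx + ny) * (nx + ny - 1))))
    return (U = U, μ = μ, σ = σ, z = (U - μ) / σ)
end

res = mann_whitney_normal([1, 2, 3], [4, 5, 6])
println(res)
```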

Sign test

HypothesisTests.SignTestType
SignTest(x::AbstractVector{T<:Real}, median::Real = 0)
SignTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, median::Real = 0)

Perform a sign test of the null hypothesis that the distribution from which x (or x - y if y is provided) was drawn has median median against the alternative hypothesis that the median is not equal to median.

Implements: pvalue, confint

source
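The idea behind the sign test can be sketched with Base Julia alone: under the null hypothesis, the number of observations above the hypothesised median follows a Binomial(n, 1/2) distribution. The helper names below are hypothetical, and two choices are assumptions of this example rather than documented package behaviour: observations equal to the median are dropped, and the two-sided p-value is taken as twice the smaller tail.

```julia
# P(X = k) for X ~ Binomial(n, 1/2).
binom_half_pmf(n, k) = binomial(n, k) / 2.0^n

# Two-sided sign test p-value (illustrative helper, not package API).
function sign_test_pvalue(x::AbstractVector{<:Real}, med::Real = 0)
    d = x .- med
    npos = count(>(0), d)
    nneg = count(<(0), d)
    n = npos + nneg                                     # values equal to med are dropped
    lower = sum(binom_half_pmf(n, k) for k in 0:npos)   # P(X ≤ npos)
    upper = sum(binom_half_pmf(n, k) for k in npos:n)   # P(X ≥ npos)
    return min(1.0, 2 * min(lower, upper))
end

x = [1.2, 0.8, 2.5, 3.1, 0.4, 1.9]
p = sign_test_pvalue(x, 0)
println("p = ", p)
```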

Wald-Wolfowitz independence test

HypothesisTests.WaldWolfowitzTestType
WaldWolfowitzTest(x::AbstractVector{Bool})
WaldWolfowitzTest(x::AbstractVector{<:Real})

Perform the Wald-Wolfowitz (or Runs) test of the null hypothesis that the given data is random, or independently sampled. The data can come as many-valued or two-valued (Boolean). If many-valued, the sample is transformed by labelling each element as above or below the median.

Implements: pvalue

source
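For a two-valued sample, the runs statistic and its usual normal approximation can be sketched as follows. The mean and variance of the number of runs are the standard textbook expressions; whether the package applies further corrections is not stated here, so treat `runs_test_z` (a hypothetical name) as an illustration only.

```julia
# Number of runs in a Boolean sequence and its standardised value (illustrative).
function runs_test_z(x::AbstractVector{Bool})
    n1 = count(x)                 # number of `true` values
    n2 = length(x) - n1           # number of `false` values
    n = n1 + n2
    # A new run starts wherever an element differs from its predecessor.
    nruns = 1 + count(i -> x[i] != x[i - 1], 2:n)
    μ = 2 * n1 * n2 / n + 1
    σ = sqrt((μ - 1) * (μ - 2) / (n - 1))
    return (nruns = nruns, μ = μ, σ = σ, z = (nruns - μ) / σ)
end

res = runs_test_z(Bool[1, 1, 0, 0, 1, 1, 0, 0])
println(res)
```

For many-valued data, the description above suggests first mapping each element to whether it lies above the median, then applying the same computation.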

Wilcoxon signed rank test

HypothesisTests.SignedRankTestFunction
SignedRankTest(x::AbstractVector{<:Real})
SignedRankTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

When there are no tied ranks and ≤50 samples, or tied ranks and ≤15 samples, SignedRankTest performs an exact signed rank test. In all other cases, SignedRankTest performs an approximate signed rank test. Behavior may be further controlled by using ExactSignedRankTest or ApproximateSignedRankTest directly.

Implements: pvalue, confint

source
HypothesisTests.ExactSignedRankTestType
ExactSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])

Perform an exact Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

When there are no tied ranks, the exact p-value is computed using the psignrank function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.

Implements: pvalue, confint

source
HypothesisTests.ApproximateSignedRankTestType
ApproximateSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])

Perform an approximate Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

The p-value is computed using a normal approximation to the distribution of the signed rank statistic:

\[ \begin{align*} μ & = \frac{n(n + 1)}{4}\\ σ^2 & = \frac{n(n + 1)(2n + 1)}{24} - \frac{a}{48}\\ a & = \sum_{t \in \mathcal{T}} (t^3 - t) \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position.

Implements: pvalue, confint

source
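The signed rank statistic and the moments of its normal approximation can be reproduced by hand. In this sketch the tie-adjusted expression is treated as the variance and its square root is used to standardise; zero differences are dropped and no continuity correction is applied. These are assumptions of the example, and `average_ranks` and `signed_rank_normal` are hypothetical names.

```julia
# Average ranks: tied values share the mean of the positions they occupy.
average_ranks(v) = [count(<(vi), v) + (count(==(vi), v) + 1) / 2 for vi in v]

# Signed rank statistic W (sum of positive ranks) and its normal approximation.
function signed_rank_normal(x::AbstractVector{<:Real})
    d = filter(!iszero, x)            # zero differences carry no sign
    n = length(d)
    absd = abs.(d)
    ranks = average_ranks(absd)
    W = sum(ranks[d .> 0])

    # Tie adjustment a = sum of t^3 - t over groups of tied absolute values.
    tie_counts = [count(==(u), absd) for u in unique(absd)]
    a = sum(t^3 - t for t in tie_counts)

    μ = n * (n + 1) / 4
    σ = sqrt(n * (n + 1) * (2n + 1) / 24 - a / 48)
    return (W = W, μ = μ, σ = σ, z = (W - μ) / σ)
end

res = signed_rank_normal([1.5, -0.5, 2.0, 3.5, -1.0])
println(res)
```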

Permutation test

HypothesisTests.ExactPermutationTestFunction
ExactPermutationTest(x::Vector, y::Vector, f::Function)

Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). All possible permutations are sampled.

source
HypothesisTests.ApproximatePermutationTestFunction
ApproximatePermutationTest([rng::AbstractRNG,] x::Vector, y::Vector, f::Function, n::Int)

Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). n of the factorial(length(x)+length(y)) permutations are sampled at random. A random number generator can optionally be passed as the first argument. The default generator is Random.default_rng().

source
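The sampling scheme of the approximate test can be sketched with the standard library. The helper below, `approx_permutation_pvalue`, is a hypothetical name; it assumes a two-sided comparison on `abs(f(x) - f(y))`, which may differ from how the package defines its tails.

```julia
using Random
using Statistics

# Fraction of random relabellings whose statistic is at least as extreme as observed.
function approx_permutation_pvalue(rng::AbstractRNG, x, y, f, n::Int)
    observed = abs(f(x) - f(y))
    pooled = vcat(x, y)
    nx = length(x)
    hits = 0
    for _ in 1:n
        p = shuffle(rng, pooled)                     # one random permutation of the pooled data
        stat = abs(f(p[1:nx]) - f(p[nx+1:end]))
        hits += stat >= observed
    end
    return hits / n
end

rng = MersenneTwister(1)                             # fixed seed for reproducibility
x = [12.1, 11.8, 12.6, 12.4]
y = [10.2, 10.9, 10.5, 10.1]
p = approx_permutation_pvalue(rng, x, y, mean, 2000)
println("approximate p-value: ", p)
```

Passing the generator explicitly, as in the signature above, makes the sampled permutations reproducible.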

Fligner-Killeen test

HypothesisTests.FlignerKilleenTestFunction
FlignerKilleenTest(groups::AbstractVector{<:Real}...)

Perform Fligner-Killeen median test of the null hypothesis that the groups have equal variances, a test for homogeneity of variances.

This test is most robust against departures from normality, see references. It is a $k$-sample simple linear rank method that uses the ranks of the absolute values of the centered samples and weights

\[a_{N,i} = \Phi^{-1}\left(\frac{1}{2} + \frac{i}{2(N+1)}\right)\]

The version implemented here uses median centering in each of the samples.

Implements: pvalue

References

  • Conover, W. J., Johnson, M. E., Johnson, M. M., A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics, 23, 351–361, 1980

External links

source

Shapiro-Wilk test

HypothesisTests.ShapiroWilkTestType
ShapiroWilkTest(X::AbstractVector{<:Real},
                 swc::AbstractVector{<:Real}=shapiro_wilk_coefs(length(X));
                 sorted::Bool=issorted(X),
                 censored::Integer=0)

Perform a Shapiro-Wilk test of the null hypothesis that the data in vector X come from a normal distribution.

This implementation is based on the method by Royston (1992). The calculation of the p-value is exact for sample size N = 3, and for ranges 4 ≤ N ≤ 11 and 12 ≤ N ≤ 5000 (Royston 1992) two separate approximations for p-values are used.

Keyword arguments

The following keyword arguments may be passed.

  • sorted::Bool=issorted(X): to indicate that sample X is already sorted.
  • censored::Integer=0: to censor the largest samples from X (so-called upper-tail censoring)

Implements: pvalue

Warning

As noted by Royston (1993), the (approximated) W-statistic will be accurate, but the returned p-values may not be reliable if either of these applies:

  • Sample size is large (N > 2000) or small (N < 20)
  • Too much data is censored (censored / N > 0.8)

Implementation notes

  • The current implementation DOES NOT implement p-values for censored data.
  • If multiple Shapiro-Wilk tests are to be performed on samples of the same size, it is faster to construct swc = shapiro_wilk_coefs(length(X)) once and pass it to the test via ShapiroWilkTest(X, swc) for re-use.
  • For maximal performance, a sorted X should be passed and indicated with the sorted=true keyword argument.

References

Shapiro, S. S., & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52, 591–611. doi:10.1093/BIOMET/52.3-4.591.

Royston, P. (1992). Approximating the Shapiro-Wilk W-test for non-normality. Statistics and Computing, 2(3), 117–119. doi:10.1007/BF01891203

Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. Journal of the Royal Statistical Society Series D (The Statistician), 42(1), 37–43. doi:10.2307/2348109

Royston, P. (1995). Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics), 44(4), 547–551. doi:10.2307/2986146.

source
diff --git a/dev/parametric/index.html b/dev/parametric/index.html index f0d920e4..53ad9dee 100644 --- a/dev/parametric/index.html +++ b/dev/parametric/index.html @@ -1,5 +1,5 @@ Parametric tests · HypothesisTests.jl

Parametric tests

Power divergence test

HypothesisTests.PowerDivergenceTestType
PowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))

Perform a Power Divergence test.

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Computed confidence intervals by default are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the confint(::PowerDivergenceTest) documentation for a list of supported methods to compute confidence intervals.

The power divergence test statistic is given by

\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij}/\hat{n}_{ij})^λ -1\right]\]

where $n_{ij}$ is the cell count in the $i$th row and $j$th column, $\hat{n}_{ij}$ is the corresponding expected cell count under the null hypothesis, and $λ$ is a real number determining the nature of the test to be performed:

  • $λ = 1$: equal to Pearson's chi-squared statistic
  • $λ \to 0$: converges to the likelihood ratio test statistic
  • $λ \to -1$: converges to the minimum discrimination information statistic (Gokhale and Kullback, 1978)
  • $λ = -2$: equals Neyman modified chi-squared (Neyman, 1949)
  • $λ = -1/2$: equals the Freeman-Tukey statistic (Freeman and Tukey, 1950).

Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.

Implements: pvalue, confint(::PowerDivergenceTest)

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
source
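To see how $λ$ selects the statistic, the sketch below evaluates the power divergence formula for a one-dimensional table and compares the $λ = 1$ case with Pearson's chi-squared statistic computed directly. `power_divergence` is a hypothetical helper, not the package's implementation, and it only covers values of $λ$ other than $0$ and $-1$, where the formula is defined as a limit.

```julia
# Power divergence statistic for observed counts and expected counts.
# Valid for λ not equal to 0 or -1, where the formula is defined by a limit.
function power_divergence(observed, expected, λ)
    return 2 / (λ * (λ + 1)) *
           sum(n * ((n / e)^λ - 1) for (n, e) in zip(observed, expected))
end

observed = [20, 30, 50]
expected = fill(sum(observed) / 3, 3)        # equal probabilities under the null

pd = power_divergence(observed, expected, 1.0)
pearson = sum((observed .- expected) .^ 2 ./ expected)
println("λ = 1: ", pd, "   Pearson: ", pearson)
```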
StatsAPI.confintMethod
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)

Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:

  • :auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
  • :sison_glaz: Sison-Glaz intervals
  • :bootstrap: Bootstrap intervals
  • :quesenberry_hurst: Quesenberry-Hurst intervals
  • :gold: Gold intervals (asymptotic simultaneous intervals)

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
  • Sison, C.P. and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
  • Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
  • Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
source

Pearson chi-squared test

HypothesisTests.ChisqTestFunction
ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a Pearson chi-squared test (equivalent to a PowerDivergenceTest with $λ = 1$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If only x and y are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, theta0 is calculated from the proportions of the individual values in y. Here, the hypothesis tested is whether the two samples x and y come from the same population or not.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Implements: pvalue, confint

source

Multinomial likelihood ratio test

HypothesisTests.MultinomialLRTestFunction
MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a multinomial likelihood ratio test (equivalent to a PowerDivergenceTest with $λ = 0$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Implements: pvalue, confint

source

t-test

HypothesisTests.OneSampleTTestType
OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that n values with mean xbar and sample standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
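For the summary-statistics form, the test statistic is the standardised distance between the sample mean and μ0, referred to a t distribution with n - 1 degrees of freedom. A minimal sketch, where `t_statistic` is a hypothetical helper name:

```julia
# One-sample t statistic from summary statistics (illustrative helper).
t_statistic(xbar, stddev, n, μ0 = 0) = (xbar - μ0) / (stddev / sqrt(n))

t = t_statistic(5.2, 1.5, 25, 5.0)   # sample mean 5.2, sample std 1.5, n = 25, μ0 = 5
println("t = ", t, " with ", 25 - 1, " degrees of freedom")
```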
OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

Note

This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.

source
HypothesisTests.EqualVarianceTTestType
EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)

Perform a two-sample t-test of the null hypothesis that samples x and y, described by the numbers of elements nx and ny, the means mx and my, and the variances vx and vy, come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.

Implements: pvalue, confint

source
EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample t-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

source
HypothesisTests.UnequalVarianceTTestType
UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample t-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:

\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]

Implements: pvalue, confint

source
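For two samples, the Welch-Satterthwaite equation is usually applied with $k_i = 1/n_i$ and $ν_i = n_i - 1$; that reading is an assumption of the sketch below, and `welch_df` is a hypothetical helper name.

```julia
using Statistics

# Welch-Satterthwaite degrees of freedom for two samples,
# assuming k_i = 1/n_i and ν_i = n_i - 1.
function welch_df(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    nx, ny = length(x), length(y)
    vx, vy = var(x) / nx, var(y) / ny        # the terms k_i * s_i^2
    return (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
end

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
ν = welch_df(x, y)
println("degrees of freedom ≈ ", ν)
```

The result is generally not an integer and lies between the smaller of $n_x - 1$ and $n_y - 1$ and the pooled value $n_x + n_y - 2$.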

z-test

HypothesisTests.OneSampleZTestType
OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that n values with mean xbar and population standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
HypothesisTests.EqualVarianceZTestType
EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample z-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

source
HypothesisTests.UnequalVarianceZTestType
UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample z-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

Implements: pvalue, confint

source

F-test

One-way ANOVA Test

HypothesisTests.OneWayANOVATestFunction
OneWayANOVATest(groups::AbstractVector{<:Real}...)

Perform a one-way analysis of variance test of the hypothesis that the group means are equal.

The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.

Implements: pvalue

External links

source
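The F-statistic described above can be written out directly as the ratio of the between-group mean square to the within-group mean square. This is a sketch using the standard library only; `anova_f` is a hypothetical helper name.

```julia
using Statistics

# One-way ANOVA F statistic: between-group variance over within-group variance.
function anova_f(groups::AbstractVector{<:Real}...)
    k = length(groups)                       # number of groups
    N = sum(length, groups)                  # total number of observations
    grand = mean(vcat(groups...))
    ss_between = sum(length(g) * (mean(g) - grand)^2 for g in groups)
    ss_within = sum(sum((g .- mean(g)) .^ 2) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (N - k))
end

F = anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
println("F = ", F)
```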

Levene's Test

HypothesisTests.LeveneTestFunction
LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)

Perform Levene's test of the hypothesis that the group variances are equal. By default the mean statistic is used for centering in each of the groups, but other statistics are accepted: median or truncated mean, see BrownForsytheTest. By default the absolute value of the score difference, scorediff, is used, but other functions are accepted: x² or √|x|.

The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:

\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2}{\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]

where

  • $k$ is the number of different groups to which the sampled cases belong,
  • $N_i$ is the number of cases in the $i$th group,
  • $N$ is the total number of cases in all groups,
  • $Y_{ij}$ is the value of the measured variable for the $j$th case from the $i$th group,
  • $Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, $\bar{Y}_{i\cdot}$ is a mean of the $i$th group,
  • $Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for group $i$,
  • $Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^k \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$.

The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.

Implements: pvalue

References

  • Levene, Howard, "Robust tests for equality of variances". In Ingram Olkin; Harold Hotelling; et al. (eds.). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press. pp. 278–292, 1960

External links

source

Brown-Forsythe Test

HypothesisTests.BrownForsytheTestFunction
BrownForsytheTest(groups::AbstractVector{<:Real}...)

The Brown–Forsythe test is a statistical test for the equality of group variances.

The Brown–Forsythe test is a modification of Levene's test that uses the median instead of the mean statistic for computing the spread within each group.

Implements: pvalue

References

  • Brown, Morton B.; Forsythe, Alan B., "Robust tests for the equality of variances". Journal of the American Statistical Association. 69: 364–367, 1974 doi:10.1080/01621459.1974.10482955.

External links

source
+ /\hat{n}_{ij})^λ -1\right]\]

where $n_{ij}$ is the cell count in the $i$ th row and $j$ th column and $λ$ is a real number determining the nature of the test to be performed:

Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.

Implements: pvalue, confint(::PowerDivergenceTest)

References

source
StatsAPI.confintMethod
confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)

Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:

  • :auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
  • :sison_glaz: Sison-Glaz intervals
  • :bootstrap: Bootstrap intervals
  • :quesenberry_hurst: Quesenberry-Hurst intervals
  • :gold: Gold intervals (asymptotic simultaneous intervals)

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
  • Sison, C.P and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
  • Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
  • Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
source

Pearson chi-squared test

HypothesisTests.ChisqTestFunction
ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a Pearson chi-squared test (equivalent to a PowerDivergenceTest with $λ = 1$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If only x and y are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, theta0 is calculated from the proportion of each distinct value in y. Here, the hypothesis tested is whether the two samples x and y come from the same population or not.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.
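For the goodness-of-fit case described above, the Pearson statistic can be sketched by hand. This is only an illustration of the textbook formula, not the package's implementation; the helper name `pearson_chisq` is hypothetical.

```julia
# Pearson chi-squared statistic for a one-dimensional table of counts `x`
# against null probabilities `theta0` (uniform when not given).
# `pearson_chisq` is an illustrative helper, not part of HypothesisTests.
function pearson_chisq(x::AbstractVector{<:Integer},
                       theta0::AbstractVector{<:Real} = fill(1 / length(x), length(x)))
    n = sum(x)                   # total number of observations
    expected = n .* theta0       # expected cell counts under the null
    return sum((x .- expected) .^ 2 ./ expected)
end

stat = pearson_chisq([10, 20, 30])   # uniform null over three cells
```

Under the null hypothesis the statistic is compared against a chi-squared distribution with `length(x) - 1` degrees of freedom.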

Implements: pvalue, confint

source

Multinomial likelihood ratio test

HypothesisTests.MultinomialLRTestFunction
MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a multinomial likelihood ratio test (equivalent to a PowerDivergenceTest with $λ = 0$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Implements: pvalue, confint

source

t-test

HypothesisTests.OneSampleTTestType
OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that n values with mean xbar and sample standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

Note

This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.

source
HypothesisTests.EqualVarianceTTestType
EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)

Perform a two-sample t-test of the null hypothesis that samples x and y, described by the number of elements nx and ny, the means mx and my, and the variances vx and vy, come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.

Implements: pvalue, confint

source
EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample t-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

source
HypothesisTests.UnequalVarianceTTestType
UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample t-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:

\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]
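As a rough sketch of the Welch-Satterthwaite equation for the two-sample case, the following assumes $k_i = 1/n_i$ and $ν_i = n_i - 1$ (the usual choice for Welch's t-test; the text above does not define them). The helper name `welch_dof` is hypothetical.

```julia
using Statistics

# Approximate degrees of freedom for two samples.
# Assumes k_i = 1/n_i and ν_i = n_i - 1; `welch_dof` is an illustrative helper.
function welch_dof(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    nx, ny = length(x), length(y)
    a = var(x) / nx              # k_1 * s_1^2
    b = var(y) / ny              # k_2 * s_2^2
    return (a + b)^2 / (a^2 / (nx - 1) + b^2 / (ny - 1))
end

ν = welch_dof([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

When both samples have the same size and the same sample variance, the result reduces to `nx + ny - 2`, the degrees of freedom of the equal variance t-test.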

Implements: pvalue, confint

source

z-test

HypothesisTests.OneSampleZTestType
OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that n values with mean xbar and population standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

source
HypothesisTests.EqualVarianceZTestType
EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample z-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

source
HypothesisTests.UnequalVarianceZTestType
UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample z-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

Implements: pvalue, confint

source

F-test

HypothesisTests.VarianceFTestType
VarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an F-test of the null hypothesis that two real-valued vectors x and y have equal variances.

Implements: pvalue

References

  • George E. P. Box, "Non-Normality and Tests on Variances", Biometrika 40 (3/4): 318–335, 1953.

External links

source

One-way ANOVA Test

HypothesisTests.OneWayANOVATestFunction
OneWayANOVATest(groups::AbstractVector{<:Real}...)

Perform a one-way analysis of variance test of the hypothesis that the group means are equal.

The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.
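The F-statistic described above can be sketched directly from its definition as the ratio of the between-group mean square to the within-group mean square. This is an illustrative calculation only; the helper name `anova_f` is hypothetical and not part of the package.

```julia
using Statistics

# One-way ANOVA F-statistic: between-group mean square over within-group mean square.
# `anova_f` is an illustrative helper, not the package's implementation.
function anova_f(groups::AbstractVector{<:Real}...)
    k = length(groups)                    # number of groups
    N = sum(length, groups)               # total number of observations
    grand = sum(sum, groups) / N          # grand mean
    ssb = sum(length(g) * (mean(g) - grand)^2 for g in groups)   # between groups
    ssw = sum(sum(abs2, g .- mean(g)) for g in groups)           # within groups
    return (ssb / (k - 1)) / (ssw / (N - k))
end

F = anova_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0])
```

Under the null hypothesis the statistic follows an F-distribution with `k - 1` and `N - k` degrees of freedom.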

Implements: pvalue

External links

source

Levene's Test

HypothesisTests.LeveneTestFunction
LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)

Perform Levene's test of the hypothesis that the group variances are equal. By default the mean statistic is used for centering in each of the groups, but other statistics are accepted: median or truncated mean, see BrownForsytheTest. By default the absolute value of the score difference, scorediff, is used, but other functions are accepted: x² or √|x|.

The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:

\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2}{\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]

where

  • $k$ is the number of different groups to which the sampled cases belong,
  • $N_i$ is the number of cases in the $i$th group,
  • $N$ is the total number of cases in all groups,
  • $Y_{ij}$ is the value of the measured variable for the $j$th case from the $i$th group,
  • $Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, $\bar{Y}_{i\cdot}$ is a mean of the $i$th group,
  • $Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for group $i$,
  • $Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^k \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$.

The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.
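The definition of $W$ can be followed term by term in a short sketch. It assumes the defaults named above (mean centering and absolute deviations); the helper name `levene_w` is hypothetical and this is not the package's implementation.

```julia
using Statistics

# Levene's W with mean centering and absolute deviations.
# `levene_w` is an illustrative helper, not part of HypothesisTests.
function levene_w(groups::AbstractVector{<:Real}...)
    k = length(groups)                          # number of groups
    Ni = [length(g) for g in groups]            # group sizes N_i
    N = sum(Ni)                                 # total sample size
    Z = [abs.(g .- mean(g)) for g in groups]    # Z_ij = |Y_ij - mean of group i|
    Zi = [mean(z) for z in Z]                   # group means of Z
    Zall = sum(sum(z) for z in Z) / N           # mean of all Z_ij
    between = sum(Ni[i] * (Zi[i] - Zall)^2 for i in 1:k)
    within = sum(sum((Z[i] .- Zi[i]) .^ 2) for i in 1:k)
    return (N - k) / (k - 1) * between / within
end

w = levene_w([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Replacing `mean(g)` in the centering step with `median(g)` gives the Brown–Forsythe variant of the statistic.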

Implements: pvalue

References

  • Levene, Howard, "Robust tests for equality of variances". In Ingram Olkin; Harold Hotelling; et al. (eds.). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press. pp. 278–292, 1960

External links

source

Brown-Forsythe Test

HypothesisTests.BrownForsytheTestFunction
BrownForsytheTest(groups::AbstractVector{<:Real}...)

The Brown–Forsythe test is a statistical test for the equality of group variances.

The Brown–Forsythe test is a modification of Levene's test that uses the median instead of the mean statistic for computing the spread within each group.

Implements: pvalue

References

  • Brown, Morton B.; Forsythe, Alan B., "Robust tests for the equality of variances". Journal of the American Statistical Association. 69: 364–367, 1974 doi:10.1080/01621459.1974.10482955.

External links

source
diff --git a/dev/time_series/index.html b/dev/time_series/index.html index 80ad9b5e..ba267869 100644 --- a/dev/time_series/index.html +++ b/dev/time_series/index.html @@ -1,2 +1,2 @@ -Time series tests · HypothesisTests.jl

Time series tests

Durbin-Watson test

HypothesisTests.DurbinWatsonTestType
DurbinWatsonTest(X::AbstractArray, e::AbstractVector; p_compute::Symbol = :ndep)

Compute the Durbin-Watson test for serial correlation in the residuals of a regression model.

X is the matrix of regressors from the original regression model and e the vector of residuals. Note that the Durbin-Watson test is not valid if X includes a lagged dependent variable. The test statistic is computed as

\[DW = \frac{\sum_{t=2}^n (e_t - e_{t-1})^2}{\sum_{t=1}^n e_t^2}\]

where n is the number of observations.

By default, the choice of approach to compute p-values depends on the sample size (p_compute=:ndep). For small samples (n<100), Pan's algorithm (Farebrother, 1980) is employed. For larger samples, a normal approximation is used (Durbin and Watson, 1950). To always use Pan's algorithm, set p_compute=:exact. p_compute=:approx will always use the normal approximation.

Default is a two-sided p-value for the alternative hypothesis of positive or negative serial correlation. One-sided p-values can be requested by calling pvalue(x::DurbinWatsonTest; tail=) with the options :left (negative serial correlation) and :right (positive serial correlation).

References

  • J. Durbin and G. S. Watson, 1951, "Testing for Serial Correlation in Least Squares Regression: II", Biometrika, Vol. 38, No. 1/2, pp. 159-177, http://www.jstor.org/stable/2332325.
  • J. Durbin and G. S. Watson, 1950, "Testing for Serial Correlation in Least Squares Regression: I", Biometrika, Vol. 37, No. 3/4, pp. 409-428, http://www.jstor.org/stable/2332391.
  • R. W. Farebrother, 1980, "Algorithm AS 153: Pan's Procedure for the Tail Probabilities of the Durbin-Watson Statistic", Journal of the Royal Statistical Society, Series C (Applied Statistics), Vol. 29, No. 2, pp. 224-227, http://www.jstor.org/stable/2986316.

External links

source

Box-Pierce and Ljung-Box tests

HypothesisTests.BoxPierceTestType
BoxPierceTest(y, lag, dof=0)

Compute the Box-Pierce Q statistic to test the null hypothesis of independence in a time series y.

lag specifies the number of lags used in the construction of Q. When testing the residuals of an estimated model, dof has to be set to the number of estimated parameters. E.g., when testing the residuals of an ARIMA(p,0,q) model, set dof=p+q.

External links

source
HypothesisTests.LjungBoxTestType
LjungBoxTest(y, lag, dof=0)

Compute the Ljung-Box Q statistic to test the null hypothesis of independence in a time series y.

lag specifies the number of lags used in the construction of Q. When testing the residuals of an estimated model, dof has to be set to the number of estimated parameters. E.g., when testing the residuals of an ARIMA(p,0,q) model, set dof=p+q.

External links

source

Breusch-Godfrey test

HypothesisTests.BreuschGodfreyTestType
BreuschGodfreyTest(X, e, lag, start0 = true)

Compute the Breusch-Godfrey test for serial correlation in the residuals of a regression model.

X is the matrix of regressors from the original model and e the vector of residuals. lag determines the number of lagged residuals included in the auxiliary regression. Set start0 to specify how the starting values for the lagged residuals are handled. start0 = true (default) sets them to zero (as in Godfrey, 1978); start0 = false uses the first lag residuals as starting values, i.e. shortening the sample by lag.

External links

source

Jarque-Bera test

HypothesisTests.JarqueBeraTestType
JarqueBeraTest(y::AbstractVector; adjusted::Bool=false)

When adjusted is false, compute the Jarque-Bera statistic to test the null hypothesis that a real-valued vector y is normally distributed.

Note that the approximation by the Chi-squared distribution does not work well and the speed of convergence is slow. In small samples, the test tends to be over-sized for nominal levels up to about 3% and under-sized for larger nominal levels (Mantalos, 2010).

When adjusted is true, compute the Adjusted Lagrangian Multiplier statistic to test the null hypothesis that a real-valued vector y is normally distributed.

Note that the use of Adjusted Lagrangian Multiplier is preferred over Jarque-Bera for small and medium sample sizes and it is a modification to the Jarque-Bera test (Urzua, 1996).

References

  • Panagiotis Mantalos, 2011, "The three different measures of the sample skewness and kurtosis and the effects to the Jarque-Bera test for normality", International Journal of Computational Economics and Econometrics, Vol. 2, No. 1, link.

  • Carlos M. Urzúa, "On the correct use of omnibus tests for normality", Economics Letters, Volume 53, Issue 3, link.

External links

source

Augmented Dickey-Fuller test

HypothesisTests.ADFTestType
ADFTest(y::AbstractVector{T}, deterministic::Symbol, lag::Int) where T<:Real

Compute the augmented Dickey-Fuller unit root test.

y is the time series to be tested, deterministic determines the deterministic terms (options: :none, :constant, :trend, :squared_trend) and lag the number of lagged first-differences included in the test regression, respectively.

Critical values and asymptotic p-values are computed based on response surface regressions following MacKinnon (2010) and MacKinnon (1994), respectively. These may differ slightly from those reported in other regression packages as different algorithms might be used.

References

  • James G. MacKinnon, 2010, "Critical values for cointegration tests," QED Working Paper No. 1227, 2010, link.
  • James G. MacKinnon, 1994, "Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests", Journal of Business & Economic Statistics, Vol. 12, No. 2, pp. 167-176, link.

External links

source

Clark-West test

HypothesisTests.ClarkWestTestType
ClarkWestTest(e1::AbstractVector{<:Real}, e2::AbstractVector{<:Real}, lookahead::Integer=1)

Perform the Clark-West test of equal performance of two nested prediction models, in terms of the out-of-sample mean squared prediction errors.

e1 is a vector of forecast errors from the smaller (nested) model, e2 is a vector of forecast errors from the larger model, and lookahead is the number of steps ahead of the forecast. Typically, the null hypothesis is that the two models perform equally well (a two-sided test), but sometimes we test whether the larger model performs better, which is indicated by a positive test statistic, for instance, above 1.645 for the 5% significance level (right tail test).

Implements: pvalue

References

  • Clark, T. E., West, K. D. 2006, Using out-of-sample mean squared prediction errors to test the martingale difference hypothesis. Journal of Econometrics, 135(1): 155–186.
  • Clark, T. E., West, K. D. 2007, Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1): 291–311.
source

Diebold-Mariano test

HypothesisTests.DieboldMarianoTestType
DieboldMarianoTest(e1::AbstractVector{<:Real}, e2::AbstractVector{<:Real}; loss=abs2, lookahead=1)

Perform the modified Diebold-Mariano test proposed by Harvey, Leybourne and Newbold (1997) of the null hypothesis that the two methods have the same forecast accuracy. loss is the loss function described in Diebold and Mariano (1995), and lookahead is the number of steps ahead of the forecast.

References

  • Diebold, F.X. and Mariano, R.S. (1995) Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253-263.

  • Harvey, D., Leybourne, S., & Newbold, P. (1997). Testing the equality of prediction mean squared errors. International Journal of forecasting, 13(2), 281-291.

source

White test

HypothesisTests.WhiteTestType
WhiteTest(X, e; type = :White)

Compute White's (or Breusch-Pagan's) test for heteroskedasticity.

X is a matrix of regressors and e is the vector of residuals from the original model. The keyword argument type is either :linear for the Breusch-Pagan/Koenker test, :linear_and_squares for White's test with linear and squared terms only (no cross-products), or :White (the default) for the full White's test (linear, squared and cross-product terms). X should include a constant and at least one more regressor, with observations in rows and regressors in columns. In some applications, X is a subset of the regressors in the original model, or just the fitted values. This saves degrees of freedom and may give a more powerful test. The lm (Lagrange multiplier) test statistic is T*R2 where R2 is from the regression of e^2 on the terms mentioned above. Under the null hypothesis it is distributed as Chisq(dof) where dof is the number of independent terms (not counting the constant), so the null is rejected when the test statistic is large enough.

Implements: pvalue

References

  • H. White, (1980): A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica, 48, 817-838.
  • T.S. Breusch & A.R. Pagan (1979), A simple test for heteroscedasticity and random coefficient variation, Econometrica, 47, 1287-1294
  • R. Koenker (1981), A note on studentizing a test for heteroscedasticity, Journal of Econometrics, 17, 107-112

External links

source
HypothesisTests.BreuschPaganTestFunction
BreuschPaganTest(X, e)

Compute Breusch-Pagan's test for heteroskedasticity.

X is a matrix of regressors from the original model and e the vector of residuals. This is equivalent to WhiteTest(X, e, type = :linear). See WhiteTest for further details.

source
+Time series tests · HypothesisTests.jl

Time series tests

Durbin-Watson test

HypothesisTests.DurbinWatsonTestType
DurbinWatsonTest(X::AbstractArray, e::AbstractVector; p_compute::Symbol = :ndep)

Compute the Durbin-Watson test for serial correlation in the residuals of a regression model.

X is the matrix of regressors from the original regression model and e the vector of residuals. Note that the Durbin-Watson test is not valid if X includes a lagged dependent variable. The test statistic is computed as

\[DW = \frac{\sum_{t=2}^n (e_t - e_{t-1})^2}{\sum_{t=1}^n e_t^2}\]

where n is the number of observations.
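The statistic itself is a one-liner; a minimal sketch, using a hypothetical helper name `durbin_watson` and computing only the statistic (not the p-value):

```julia
# Durbin-Watson statistic: sum of squared successive differences of the
# residuals divided by the sum of squared residuals.
# `durbin_watson` is an illustrative helper, not part of HypothesisTests.
durbin_watson(e::AbstractVector{<:Real}) = sum(abs2, diff(e)) / sum(abs2, e)

dw = durbin_watson([1.0, -1.0, 1.0, -1.0])   # strongly alternating residuals
```

Values near 2 indicate little first-order serial correlation; values toward 0 suggest positive and values toward 4 suggest negative serial correlation.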

By default, the choice of approach to compute p-values depends on the sample size (p_compute=:ndep). For small samples (n<100), Pan's algorithm (Farebrother, 1980) is employed. For larger samples, a normal approximation is used (Durbin and Watson, 1950). To always use Pan's algorithm, set p_compute=:exact. p_compute=:approx will always use the normal approximation.

Default is a two-sided p-value for the alternative hypothesis of positive or negative serial correlation. One-sided p-values can be requested by calling pvalue(x::DurbinWatsonTest; tail=) with the options :left (negative serial correlation) and :right (positive serial correlation).

References

  • J. Durbin and G. S. Watson, 1951, "Testing for Serial Correlation in Least Squares Regression: II", Biometrika, Vol. 38, No. 1/2, pp. 159-177, http://www.jstor.org/stable/2332325.
  • J. Durbin and G. S. Watson, 1950, "Testing for Serial Correlation in Least Squares Regression: I", Biometrika, Vol. 37, No. 3/4, pp. 409-428, http://www.jstor.org/stable/2332391.
  • R. W. Farebrother, 1980, "Algorithm AS 153: Pan's Procedure for the Tail Probabilities of the Durbin-Watson Statistic", Journal of the Royal Statistical Society, Series C (Applied Statistics), Vol. 29, No. 2, pp. 224-227, http://www.jstor.org/stable/2986316.

External links

source

Box-Pierce and Ljung-Box tests

HypothesisTests.BoxPierceTestType
BoxPierceTest(y, lag, dof=0)

Compute the Box-Pierce Q statistic to test the null hypothesis of independence in a time series y.

lag specifies the number of lags used in the construction of Q. When testing the residuals of an estimated model, dof has to be set to the number of estimated parameters. E.g., when testing the residuals of an ARIMA(p,0,q) model, set dof=p+q.

External links

source
HypothesisTests.LjungBoxTestType
LjungBoxTest(y, lag, dof=0)

Compute the Ljung-Box Q statistic to test the null hypothesis of independence in a time series y.

lag specifies the number of lags used in the construction of Q. When testing the residuals of an estimated model, dof has to be set to the number of estimated parameters. E.g., when testing the residuals of an ARIMA(p,0,q) model, set dof=p+q.
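The docstrings above do not spell out Q. As a sketch under the standard textbook definitions (an assumption here, not taken from the package source), with $\hat{ρ}_k$ the sample autocorrelation at lag $k$ of the demeaned series, the Box-Pierce statistic is $n \sum_k \hat{ρ}_k^2$ and the Ljung-Box statistic is $n(n+2) \sum_k \hat{ρ}_k^2/(n-k)$. All helper names below are hypothetical.

```julia
using Statistics

# Sample autocorrelation at lag k of the demeaned series (textbook definition).
function sample_autocor(y::AbstractVector{<:Real}, k::Integer)
    d = y .- mean(y)
    return sum(d[k+1:end] .* d[1:end-k]) / sum(abs2, d)
end

# Box-Pierce Q: n times the sum of squared autocorrelations up to `lag`.
box_pierce_q(y, lag) = length(y) * sum(sample_autocor(y, k)^2 for k in 1:lag)

# Ljung-Box Q: the same sum, with each term reweighted by (n + 2) / (n - k).
function ljung_box_q(y, lag)
    n = length(y)
    return n * (n + 2) * sum(sample_autocor(y, k)^2 / (n - k) for k in 1:lag)
end

y = [1.0, 2.0, 3.0, 4.0]
q_bp = box_pierce_q(y, 1)
q_lb = ljung_box_q(y, 1)
```

Under the null, Q is compared against a chi-squared distribution with `lag - dof` degrees of freedom, which is why dof must count the estimated parameters when residuals are tested.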

External links

source

Breusch-Godfrey test

HypothesisTests.BreuschGodfreyTestType
BreuschGodfreyTest(X, e, lag, start0 = true)

Compute the Breusch-Godfrey test for serial correlation in the residuals of a regression model.

X is the matrix of regressors from the original model and e the vector of residuals. lag determines the number of lagged residuals included in the auxiliary regression. Set start0 to specify how the starting values for the lagged residuals are handled. start0 = true (default) sets them to zero (as in Godfrey, 1978); start0 = false uses the first lag residuals as starting values, i.e. shortening the sample by lag.

External links

source

Jarque-Bera test

HypothesisTests.JarqueBeraTestType
JarqueBeraTest(y::AbstractVector; adjusted::Bool=false)

When adjusted is false, compute the Jarque-Bera statistic to test the null hypothesis that a real-valued vector y is normally distributed.

Note that the approximation by the Chi-squared distribution does not work well and the speed of convergence is slow. In small samples, the test tends to be over-sized for nominal levels up to about 3% and under-sized for larger nominal levels (Mantalos, 2010).

When adjusted is true, compute the Adjusted Lagrangian Multiplier statistic to test the null hypothesis that a real-valued vector y is normally distributed.

Note that the use of Adjusted Lagrangian Multiplier is preferred over Jarque-Bera for small and medium sample sizes and it is a modification to the Jarque-Bera test (Urzua, 1996).

References

  • Panagiotis Mantalos, 2011, "The three different measures of the sample skewness and kurtosis and the effects to the Jarque-Bera test for normality", International Journal of Computational Economics and Econometrics, Vol. 2, No. 1, link.

  • Carlos M. Urzúa, "On the correct use of omnibus tests for normality", Economics Letters, Volume 53, Issue 3, link.

External links

source

Augmented Dickey-Fuller test

HypothesisTests.ADFTestType
ADFTest(y::AbstractVector{T}, deterministic::Symbol, lag::Int) where T<:Real

Compute the augmented Dickey-Fuller unit root test.

y is the time series to be tested, deterministic determines the deterministic terms (options: :none, :constant, :trend, :squared_trend) and lag the number of lagged first-differences included in the test regression, respectively.

Critical values and asymptotic p-values are computed based on response surface regressions following MacKinnon (2010) and MacKinnon (1994), respectively. These may differ slightly from those reported in other regression packages as different algorithms might be used.

References

  • James G. MacKinnon, 2010, "Critical values for cointegration tests," QED Working Paper No. 1227, 2010, link.
  • James G. MacKinnon, 1994, "Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests", Journal of Business & Economic Statistics, Vol. 12, No. 2, pp. 167-176, link.

External links

source

Clark-West test

HypothesisTests.ClarkWestTestType
ClarkWestTest(e1::AbstractVector{<:Real}, e2::AbstractVector{<:Real}, lookahead::Integer=1)

Perform the Clark-West test of equal performance of two nested prediction models, in terms of the out-of-sample mean squared prediction errors.

e1 is a vector of forecast errors from the smaller (nested) model, e2 is a vector of forecast errors from the larger model, and lookahead is the number of steps ahead of the forecast. Typically, the null hypothesis is that the two models perform equally well (a two-sided test), but sometimes we test whether the larger model performs better, which is indicated by a positive test statistic, for instance, above 1.645 for the 5% significance level (right tail test).

Implements: pvalue

References

  • Clark, T. E., West, K. D. 2006, Using out-of-sample mean squared prediction errors to test the martingale difference hypothesis. Journal of Econometrics, 135(1): 155–186.
  • Clark, T. E., West, K. D. 2007, Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1): 291–311.
source

Diebold-Mariano test

HypothesisTests.DieboldMarianoTestType
DieboldMarianoTest(e1::AbstractVector{<:Real}, e2::AbstractVector{<:Real}; loss=abs2, lookahead=1)

Perform the modified Diebold-Mariano test proposed by Harvey, Leybourne and Newbold (1997) of the null hypothesis that the two methods have the same forecast accuracy. loss is the loss function described in Diebold and Mariano (1995), and lookahead is the number of steps ahead of the forecast.

References

  • Diebold, F.X. and Mariano, R.S. (1995) Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253-263.

  • Harvey, D., Leybourne, S., & Newbold, P. (1997). Testing the equality of prediction mean squared errors. International Journal of forecasting, 13(2), 281-291.

source

White test

HypothesisTests.WhiteTestType
WhiteTest(X, e; type = :White)

Compute White's (or Breusch-Pagan's) test for heteroskedasticity.

X is a matrix of regressors and e is the vector of residuals from the original model. The keyword argument type is either :linear for the Breusch-Pagan/Koenker test, :linear_and_squares for White's test with linear and squared terms only (no cross-products), or :White (the default) for the full White's test (linear, squared and cross-product terms). X should include a constant and at least one more regressor, with observations in rows and regressors in columns. In some applications, X is a subset of the regressors in the original model, or just the fitted values. This saves degrees of freedom and may give a more powerful test. The lm (Lagrange multiplier) test statistic is T*R2 where R2 is from the regression of e^2 on the terms mentioned above. Under the null hypothesis it is distributed as Chisq(dof) where dof is the number of independent terms (not counting the constant), so the null is rejected when the test statistic is large enough.
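For the :linear case, the T*R2 statistic can be sketched with an ordinary least-squares fit. This is an illustration only, assuming X already contains a constant column; the helper name `lm_statistic` is hypothetical and the squared and cross-product terms of the other variants are not constructed here.

```julia
using LinearAlgebra, Statistics

# LM statistic T * R^2 from regressing the squared residuals on X.
# Assumes X includes a constant column; `lm_statistic` is an illustrative helper.
function lm_statistic(X::AbstractMatrix{<:Real}, e::AbstractVector{<:Real})
    y = e .^ 2                       # squared residuals are the dependent variable
    β = X \ y                        # least-squares coefficients
    resid = y .- X * β
    R2 = 1 - sum(abs2, resid) / sum(abs2, y .- mean(y))
    return length(e) * R2            # T * R^2
end

X = [1.0 1.0; 1.0 2.0; 1.0 3.0; 1.0 4.0]   # constant plus one regressor
e = sqrt.([1.0, 2.0, 3.0, 4.0])            # squared residuals exactly linear in X
lm = lm_statistic(X, e)
```

In this contrived example the squared residuals are an exact linear function of the regressor, so R2 is 1 and the statistic equals the number of observations.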

Implements: pvalue

References

  • H. White, (1980): A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica, 48, 817-838.
  • T.S. Breusch & A.R. Pagan (1979), A simple test for heteroscedasticity and random coefficient variation, Econometrica, 47, 1287-1294
  • R. Koenker (1981), A note on studentizing a test for heteroscedasticity, Journal of Econometrics, 17, 107-112

External links

source
HypothesisTests.BreuschPaganTestFunction
BreuschPaganTest(X, e)

Compute Breusch-Pagan's test for heteroskedasticity.

X is a matrix of regressors from the original model and e the vector of residuals. This is equivalent to WhiteTest(X, e, type = :linear). See WhiteTest for further details.

source