From ef0f3060053b7cd06756d811ae1dec4f67c74d05 Mon Sep 17 00:00:00 2001 From: pcc-git Date: Sat, 22 Feb 2025 23:59:44 -0700 Subject: [PATCH] conf int edits --- .../002-Confidence_Intervals.qmd | 18 +++++++++--------- .../002-Confidence_Intervals.html | 16 ++++++++-------- docs/search.json | 8 ++++---- 3 files changed, 21 insertions(+), 21 deletions(-) diff --git a/4-Statistical_Tests_Part1/002-Confidence_Intervals.qmd b/4-Statistical_Tests_Part1/002-Confidence_Intervals.qmd index 9840b66..4e4cf1f 100644 --- a/4-Statistical_Tests_Part1/002-Confidence_Intervals.qmd +++ b/4-Statistical_Tests_Part1/002-Confidence_Intervals.qmd @@ -30,6 +30,7 @@ The two primary methods of statistical inference are: 1. Hypothesis Testing 2. Confidence Intervals +This chapter lays the foundation for confidence intervals. # Background @@ -87,7 +88,7 @@ Recall that the distribution of sample means is normal when: 2. The sample size, n, is sufficiently large ($n<30$ for this class) for the Central Limit Theorem to apply -__Thought Question__: If we have a good sample from the population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the mean? +__Thought Question__: If we have a good sample from a population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the true population mean? Remember, the standard deviation of $\bar x$ is $\frac{\sigma}{\sqrt{n}}$. For the variable $\bar x$, two standard deviations would be equal to $2 \frac{\sigma}{\sqrt{n}}$. @@ -143,9 +144,7 @@ $$( \bar x - m, ~ \bar x + m )$$ # Confidence Intervals -Confidence intervals are a way to estimate a population parameter without assuming a Null Hypothesis. - -Recall that it is only *approximately* 95% of the area under the curve within 2 standard deviations of the mean. 
It turns out that +Recall that it is only *approximately* 95% of the area under the curve within 2 standard deviations of the mean. We want to be more precise in our confidence intervals and may want to choose a level of confidence different from 95%. @@ -164,19 +163,20 @@ tibble(`Conf. Level` = c(0.99, 0.95, 0.90), `Z*` = c(2.576, 1.96, 1.645)) %>% pa ``` + **Confidence Level** is related to the probability of a Type I error, $\alpha$, in hypothesis testing. In fact, __Confidence Level = 1-$\alpha$__. -A 95% confidence interval will miss the true population mean 5% of the time. +A 95% confidence interval will miss the true population mean 5% of the time because 5% of the time you will get a mean in the one tail or the other of the sampling distribution *just by chance.* ## Interpretation -Confidence intervals are typically reported using the notation: (lower limit, upper limit) and are interpreted: We are $100*(1-\alpha)\%$ confident that the true population mean is between [lower limit] and [upper limit]. +Confidence intervals are typically reported using parentheses like: (lower limit, upper limit). We say that we are $100*(1-\alpha)\%$ confident that the true population mean is between [lower limit] and [upper limit]. ### Average GRE Scores of BYU-I Students -The published, population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is $\sigma=8.3$. +The published population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is $\sigma=8.3$. Suppose we take a random sample of $n=100$ BYU-I students who have taken the GRE and find that their average score was $\bar{x}=162.1$ @@ -194,7 +194,7 @@ Consider that the published population mean for GRE test-takers if 158. __QUESTION__: Does the true population mean of all test-takers fall inside our confidence interval? 
-Because 158 falls below our confidence interval, we might conclude that BYU-I students score higher, on average, than the general population. +Because 158 falls below our confidence interval, we conclude that BYU-I students score higher, on average, than the general population with 99% confidence. # Margin of Error @@ -203,4 +203,4 @@ __QUESTION__: What happens to the margin of error, $z^*\frac{\sigma}{\sqrt{n}}$ __QUESTION__: What happens to the margin of error, $z^*\frac{\sigma}{\sqrt{n}}$, as our confidence level increases? (see table above about Z* and confidence level) -Consider that if I make a wide enough interval, I can be 100% confident. But to get there, my interval is useless. For example, I can be 100% confident that the true population averge height of BYU-I students is between 2 feet and 100 feet. +Consider that if I make a wide enough interval, I can be 100% confident. But to get 100% confidence, my interval will be useless. For example, I can be 100% confident that the true population averge height of BYU-I students is between 2 feet and 100 feet. More confidence means we need a wider interval. diff --git a/docs/4-Statistical_Tests_Part1/002-Confidence_Intervals.html b/docs/4-Statistical_Tests_Part1/002-Confidence_Intervals.html index 12cfb40..aef4dc3 100644 --- a/docs/4-Statistical_Tests_Part1/002-Confidence_Intervals.html +++ b/docs/4-Statistical_Tests_Part1/002-Confidence_Intervals.html @@ -527,6 +527,7 @@

Lesson Outcomes

  • Hypothesis Testing
  • Confidence Intervals
  • +

    This chapter lays the foundation for confidence intervals.

    @@ -635,7 +636,7 @@

    Review
  • The underlying population is normally distributed
  • The sample size, n, is sufficiently large (\(n \ge 30\) for this class) for the Central Limit Theorem to apply
  • -

    Thought Question: If we have a good sample from the population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the mean?

    +

    Thought Question: If we have a good sample from a population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the true population mean?

    Remember, the standard deviation of \(\bar x\) is \(\frac{\sigma}{\sqrt{n}}\). For the variable \(\bar x\), two standard deviations would be equal to \(2 \frac{\sigma}{\sqrt{n}}\).

    ANSWER: If we collect a random sample from a population and \(\bar x\) is normally distributed, then about 95% of the time the sample mean \(\bar x\) will be less than \(2 \frac{\sigma}{\sqrt{n}}\) units away from the population mean \(\mu\). Notice that this is true, whether or not we know \(\mu\).
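The "about 95%" figure can be checked directly. The R sketch below computes the exact normal area within 2 standard deviations, then confirms it with a small simulation. The values of `mu`, `sigma`, and `n` are made up for illustration and are not from the lesson.

```r
# Exact area under a normal curve within 2 standard deviations of the mean
area_within_2sd <- pnorm(2) - pnorm(-2)
round(area_within_2sd, 4)

# Simulation with hypothetical population values (assumptions, not lesson data)
set.seed(1)
mu <- 50
sigma <- 10
n <- 25
std_error <- sigma / sqrt(n)  # standard deviation of the sample mean

# Draw 10,000 samples of size n and record each sample mean
sample_means <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

# Proportion of sample means less than 2 standard errors away from mu
prop_within <- mean(abs(sample_means - mu) < 2 * std_error)
prop_within
```

The exact area is about 0.9545, which is why "2 standard deviations" gives only an approximate 95% interval. The simulated proportion should land close to that value.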

    @@ -662,8 +663,7 @@

    An A

    Confidence Intervals

    -

    Confidence intervals are a way to estimate a population parameter without assuming a Null Hypothesis.

    -

    Recall that it is only approximately 95% of the area under the curve within 2 standard deviations of the mean. It turns out that

    +

    Recall that the area under the curve within 2 standard deviations of the mean is only approximately 95%.

    We want to be more precise in our confidence intervals and may want to choose a level of confidence different from 95%.

    The generalized formula for a confidence interval is

    \[ CI = \bar{x} \pm z^*\frac{\sigma}{\sqrt{n}}\]
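The critical value \(z^*\) for any confidence level can be obtained from the standard normal quantile function. This is a minimal sketch: `z_star` and `ci_mean` are hypothetical helper names, not functions from the course materials.

```r
# Critical value for a two-sided interval: leaves alpha / 2 in each tail
z_star <- function(conf_level) {
  alpha <- 1 - conf_level
  qnorm(1 - alpha / 2)
}

# Confidence interval for a mean when sigma is known
ci_mean <- function(x_bar, sigma, n, conf_level = 0.95) {
  margin <- z_star(conf_level) * sigma / sqrt(n)
  c(lower = x_bar - margin, upper = x_bar + margin)
}

# Critical values for 99%, 95%, and 90% confidence
round(z_star(c(0.99, 0.95, 0.90)), 3)
```

Rounded to three decimals, these are 2.576, 1.960, and 1.645. Note that the exact 95% value is 1.96 rather than 2.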

    @@ -700,13 +700,13 @@

    Confidence Intervals

    Confidence Level is related to the probability of a Type I error, \(\alpha\), in hypothesis testing. In fact, Confidence Level = 1-\(\alpha\).

    -

    A 95% confidence interval will miss the true population mean 5% of the time.

    +

    A 95% confidence interval will miss the true population mean 5% of the time because 5% of the time you will get a sample mean in one tail or the other of the sampling distribution just by chance.

    Interpretation

    -

    Confidence intervals are typically reported using the notation: (lower limit, upper limit) and are interpreted: We are \(100*(1-\alpha)\%\) confident that the true population mean is between [lower limit] and [upper limit].

    +

    Confidence intervals are typically reported using parentheses like: (lower limit, upper limit). We say that we are \(100*(1-\alpha)\%\) confident that the true population mean is between [lower limit] and [upper limit].

    Average GRE Scores of BYU-I Students

    -

    The published, population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is \(\sigma=8.3\).

    +

    The published population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is \(\sigma=8.3\).

    Suppose we take a random sample of \(n=100\) BYU-I students who have taken the GRE and find that their average score was \(\bar{x}=162.1\).

    We can calculate the 99% confidence interval:

    \[ 162.1 \pm 2.576\frac{8.3}{\sqrt{100}} = (159.96, 164.24)\]
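The arithmetic in this interval can be reproduced in R. The sketch below uses only the numbers given in the example. The variable names are illustrative.

```r
# Values from the GRE example
x_bar <- 162.1  # sample mean
sigma <- 8.3    # published population standard deviation
n <- 100        # sample size
conf <- 0.99    # confidence level

z_star <- qnorm(1 - (1 - conf) / 2)  # critical value, about 2.576
margin <- z_star * sigma / sqrt(n)   # margin of error
ci <- c(lower = x_bar - margin, upper = x_bar + margin)
round(ci, 2)
```

The margin of error is about 2.14, so the rounded limits are 159.96 and 164.24, which agrees with the interval above.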

    @@ -719,13 +719,13 @@

    Avera

    Relationship to Hypothesis Testing

    Consider that the published population mean for GRE test-takers is 158.

    QUESTION: Does the true population mean of all test-takers fall inside our confidence interval?

    -

    Because 158 falls below our confidence interval, we might conclude that BYU-I students score higher, on average, than the general population.

    +

    Because 158 falls below our confidence interval, we conclude with 99% confidence that BYU-I students score higher, on average, than the general population.

    Margin of Error

    QUESTION: What happens to the margin of error, \(z^*\frac{\sigma}{\sqrt{n}}\), as the sample size increases?

    QUESTION: What happens to the margin of error, \(z^*\frac{\sigma}{\sqrt{n}}\), as our confidence level increases? (see table above about Z* and confidence level)

    -

    Consider that if I make a wide enough interval, I can be 100% confident. But to get there, my interval is useless. For example, I can be 100% confident that the true population average height of BYU-I students is between 2 feet and 100 feet.

    +

    Consider that if I make a wide enough interval, I can be 100% confident. But to get 100% confidence, my interval will be useless. For example, I can be 100% confident that the true population average height of BYU-I students is between 2 feet and 100 feet. More confidence means we need a wider interval.
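The two margin of error questions can be explored numerically. In this sketch, `margin_of_error` is a hypothetical helper, \(\sigma = 8.3\) is reused from the GRE example, and the sample sizes and confidence levels are chosen only for illustration.

```r
# Margin of error for a mean with sigma known
margin_of_error <- function(sigma, n, conf_level) {
  qnorm(1 - (1 - conf_level) / 2) * sigma / sqrt(n)
}

sigma <- 8.3  # reused from the GRE example

# Larger samples shrink the margin of error (confidence fixed at 95%)
moe_by_n <- sapply(c(25, 100, 400), function(n) margin_of_error(sigma, n, 0.95))

# Higher confidence widens the margin of error (n fixed at 100)
moe_by_conf <- sapply(c(0.90, 0.95, 0.99), function(cl) margin_of_error(sigma, 100, cl))

round(moe_by_n, 3)
round(moe_by_conf, 3)
```

Because \(n\) enters the formula through \(\sqrt{n}\), quadrupling the sample size halves the margin of error. Raising the confidence level increases \(z^*\) and therefore widens the interval.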

    diff --git a/docs/search.json b/docs/search.json index 8036a09..8fa23fd 100644 --- a/docs/search.json +++ b/docs/search.json @@ -1306,14 +1306,14 @@ "href": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html", "title": "Confidence Intervals for a Mean", "section": "", - "text": "By the end of this lesson, you should be able to:\n\nRecognize when a one mean (sigma known) confidence interval is appropriate\nExplain the meaning of a level of confidence\nCreate a confidence interval for a single mean with \\(\\sigma\\) known using the following steps:\n\nFind the point estimate (\\(\\bar{x}\\))\nCalculate the margin of error for the given level of confidence\nCalculate a confidence interval from the point estimate and the margin of error\nInterpret the confidence interval\nCheck the requirements for the confidence interval\n\nExplain how the margin of error is affected by the sample size and level of confidence\n\nStatistical Inference is the practice of using data sampled from a population to make conclusions about population parameters.\nThe two primary methods of statistical inference are:\n\nHypothesis Testing\nConfidence Intervals" + "text": "By the end of this lesson, you should be able to:\n\nRecognize when a one mean (sigma known) confidence interval is appropriate\nExplain the meaning of a level of confidence\nCreate a confidence interval for a single mean with \\(\\sigma\\) known using the following steps:\n\nFind the point estimate (\\(\\bar{x}\\))\nCalculate the margin of error for the given level of confidence\nCalculate a confidence interval from the point estimate and the margin of error\nInterpret the confidence interval\nCheck the requirements for the confidence interval\n\nExplain how the margin of error is affected by the sample size and level of confidence\n\nStatistical Inference is the practice of using data sampled from a population to make conclusions about population parameters.\nThe two primary methods of statistical inference 
are:\n\nHypothesis Testing\nConfidence Intervals\n\nThis chapter lays the foundation for confidence intervals." }, { "objectID": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#lesson-outcomes", "href": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#lesson-outcomes", "title": "Confidence Intervals for a Mean", "section": "", - "text": "By the end of this lesson, you should be able to:\n\nRecognize when a one mean (sigma known) confidence interval is appropriate\nExplain the meaning of a level of confidence\nCreate a confidence interval for a single mean with \\(\\sigma\\) known using the following steps:\n\nFind the point estimate (\\(\\bar{x}\\))\nCalculate the margin of error for the given level of confidence\nCalculate a confidence interval from the point estimate and the margin of error\nInterpret the confidence interval\nCheck the requirements for the confidence interval\n\nExplain how the margin of error is affected by the sample size and level of confidence\n\nStatistical Inference is the practice of using data sampled from a population to make conclusions about population parameters.\nThe two primary methods of statistical inference are:\n\nHypothesis Testing\nConfidence Intervals" + "text": "By the end of this lesson, you should be able to:\n\nRecognize when a one mean (sigma known) confidence interval is appropriate\nExplain the meaning of a level of confidence\nCreate a confidence interval for a single mean with \\(\\sigma\\) known using the following steps:\n\nFind the point estimate (\\(\\bar{x}\\))\nCalculate the margin of error for the given level of confidence\nCalculate a confidence interval from the point estimate and the margin of error\nInterpret the confidence interval\nCheck the requirements for the confidence interval\n\nExplain how the margin of error is affected by the sample size and level of confidence\n\nStatistical Inference is the practice of using data sampled from a population to make conclusions about population 
parameters.\nThe two primary methods of statistical inference are:\n\nHypothesis Testing\nConfidence Intervals\n\nThis chapter lays the foundation for confidence intervals." }, { "objectID": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#point-estimators", @@ -1327,7 +1327,7 @@ "href": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#review-distribution-of-sample-means", "title": "Confidence Intervals for a Mean", "section": "Review: Distribution of Sample Means", - "text": "Review: Distribution of Sample Means\nConfidence intervals rely on the validity of the assumption that the distribution of the sample mean is normally distributed.\nRecall that the distribution of sample means is normal when:\n\nThe underlying population is normally distributed\nThe sample size, n, is sufficiently large (\\(n<30\\) for this class) for the Central Limit Theorem to apply\n\nThought Question: If we have a good sample from the population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the mean?\nRemember, the standard deviation of \\(\\bar x\\) is \\(\\frac{\\sigma}{\\sqrt{n}}\\). For the variable \\(\\bar x\\), two standard deviations would be equal to \\(2 \\frac{\\sigma}{\\sqrt{n}}\\).\nANSWER: If we collect a random sample from a population and \\(\\bar x\\) is normally distributed, then about 95% of the time the sample mean \\(\\bar x\\) will be less than \\(2 \\frac{\\sigma}{\\sqrt{n}}\\) units away from the population mean \\(\\mu\\). Notice that this is true, whether or not we know \\(\\mu\\).\n\n\n\n\n\n\n\n\n\nThis means the 95% of the time, we will get a sample mean within 2 Standard Deviations of the true population mean.\nFlipping this around, if we take our sample mean and make an interval 2 standard deviations above the mean and 2 below, the interval will overlap with the true population mean about 95% of the time." 
+ "text": "Review: Distribution of Sample Means\nConfidence intervals rely on the validity of the assumption that the distribution of the sample mean is normally distributed.\nRecall that the distribution of sample means is normal when:\n\nThe underlying population is normally distributed\nThe sample size, n, is sufficiently large (\\(n<30\\) for this class) for the Central Limit Theorem to apply\n\nThought Question: If we have a good sample from a population and can trust that the sampling distribution of the mean is approximately normal, how frequently would a sample mean be within 2 standard deviations from the true population mean?\nRemember, the standard deviation of \\(\\bar x\\) is \\(\\frac{\\sigma}{\\sqrt{n}}\\). For the variable \\(\\bar x\\), two standard deviations would be equal to \\(2 \\frac{\\sigma}{\\sqrt{n}}\\).\nANSWER: If we collect a random sample from a population and \\(\\bar x\\) is normally distributed, then about 95% of the time the sample mean \\(\\bar x\\) will be less than \\(2 \\frac{\\sigma}{\\sqrt{n}}\\) units away from the population mean \\(\\mu\\). Notice that this is true, whether or not we know \\(\\mu\\).\n\n\n\n\n\n\n\n\n\nThis means the 95% of the time, we will get a sample mean within 2 Standard Deviations of the true population mean.\nFlipping this around, if we take our sample mean and make an interval 2 standard deviations above the mean and 2 below, the interval will overlap with the true population mean about 95% of the time." 
}, { "objectID": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#an-approximate-95-confidence-interval", @@ -1341,7 +1341,7 @@ "href": "4-Statistical_Tests_Part1/002-Confidence_Intervals.html#interpretation", "title": "Confidence Intervals for a Mean", "section": "Interpretation", - "text": "Interpretation\nConfidence intervals are typically reported using the notation: (lower limit, upper limit) and are interpreted: We are \\(100*(1-\\alpha)\\%\\) confident that the true population mean is between [lower limit] and [upper limit].\n\nAverage GRE Scores of BYU-I Students\nThe published, population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is \\(\\sigma=8.3\\).\nSuppose we take a random sample of \\(n=100\\) BYU-I students who have taken the GRE and find that their average score was \\(\\bar{x}=162.1\\)\nWe can calculate the 99% confidence interval:\n\\[ 162.1 \\pm 2.576\\frac{8.3}{\\sqrt{100}} = (159.96, 164.24)\\]\nThe interpretation of the above confidence interval would be:\nI am 99% confident that the true population mean GRE score for BYU-I students is between 159.96 and 164.24." + "text": "Interpretation\nConfidence intervals are typically reported using parentheses like: (lower limit, upper limit). 
We say that we are \\(100*(1-\\alpha)\\%\\) confident that the true population mean is between [lower limit] and [upper limit].\n\nAverage GRE Scores of BYU-I Students\nThe published population standard deviation of the quantitative portion of the Graduate Record Examination (GRE) scores is \\(\\sigma=8.3\\).\nSuppose we take a random sample of \\(n=100\\) BYU-I students who have taken the GRE and find that their average score was \\(\\bar{x}=162.1\\)\nWe can calculate the 99% confidence interval:\n\\[ 162.1 \\pm 2.576\\frac{8.3}{\\sqrt{100}} = (159.96, 164.24)\\]\nThe interpretation of the above confidence interval would be:\nI am 99% confident that the true population mean GRE score for BYU-I students is between 159.96 and 164.24." }, { "objectID": "2-Tidy_Data/03-Select.html",