Unit 8: Inference for Categorical Data: Chi-Square

Chi-square tests for goodness of fit, homogeneity, and independence

📚Study Guide: Inference for Categorical Data: Chi-Square

Chi-square procedures extend inference to categorical variables with multiple categories. The chi-square goodness-of-fit test compares observed counts to expected counts based on a hypothesized distribution for a single categorical variable. The chi-square test for independence assesses whether two categorical variables are associated in a two-way table built from one sample, while the chi-square test for homogeneity compares the distribution of one categorical variable across multiple populations or treatment groups. All chi-square tests rely on a statistic that measures the discrepancy between observed and expected counts. Three conditions must be verified before carrying out any of these tests: Random (the data come from a random sample or a randomized experiment), Independence (when sampling without replacement, the sample is no more than 10% of the population), and Large Counts (all expected counts >= 5). Degrees of freedom are k - 1 for a goodness-of-fit test with k categories and (r - 1)*(c - 1) for a two-way table with r rows and c columns.

Key Concepts

  • Chi-Square Statistic: Chi-square = sum [(Observed - Expected)^2 / Expected]. Larger values indicate greater discrepancy.
  • Goodness-of-Fit: Tests whether a single categorical variable follows a specified distribution. df = number of categories - 1.
  • Test for Independence: Tests whether two categorical variables are associated in one population. df = (rows - 1)*(columns - 1).
  • Test for Homogeneity: Tests whether distributions of a categorical variable are the same across several populations. Same procedure as independence but different design and hypotheses.
  • Expected Counts: (row total * column total) / grand total for two-way tables; n * p_i for goodness-of-fit.
  • Conditions: Random sample or randomized experiment, all expected counts >= 5, and independent observations (sample no more than 10% of the population when sampling without replacement).
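
As a minimal sketch of the statistic and the goodness-of-fit expected counts (n * p_i) described above, the Python below uses made-up counts for 60 rolls of a die, with a null hypothesis that the die is fair. The function name `chi_square_statistic` and the data are illustrative assumptions, not part of any official AP material.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die; H0 says each face has probability 1/6.
observed = [8, 12, 9, 11, 10, 10]
n = sum(observed)
null_proportions = [1 / 6] * 6

# Expected count for each category is n * p_i.
expected = [n * p for p in null_proportions]

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1  # number of categories - 1

print(f"expected counts: {expected}")
print(f"chi-square = {chi_sq:.3f}, df = {df}")
```

Every expected count here is about 10, so the Large Counts condition would be met for this made-up data set.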

Vocabulary

  • Observed count: The actual number of observations in a category from the sample data.
  • Expected count: The number of observations expected in a category if the null hypothesis is true.
  • Chi-square distribution: A right-skewed distribution defined by degrees of freedom, used for categorical inference.
  • Goodness-of-fit: A test comparing sample data to a theoretical distribution.
  • Two-way table: A table displaying counts for combinations of two categorical variables.

Formulas

  • Chi-square = sum [ (O - E)^2 / E ]
  • Expected = (row total * column total) / grand total
  • Goodness-of-fit df = k - 1
  • Independence/Homogeneity df = (r - 1)*(c - 1)
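
The two-way formulas above can be sketched in a few lines of Python. The 2 x 3 table below is hypothetical, and `expected_counts` is an illustrative helper name rather than a standard library function.

```python
def expected_counts(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts: 2 rows (groups) by 3 columns (responses).
observed = [
    [20, 30, 50],
    [30, 20, 50],
]

expected = expected_counts(observed)

# Chi-square = sum of (O - E)^2 / E over every cell of the table.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = (r - 1) * (c - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(f"chi-square = {chi_sq:.3f}, df = {df}")
```

The same arithmetic applies to both independence and homogeneity; only the study design and the wording of the hypotheses differ.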

Common Mistakes

  • Using proportions or percentages instead of counts in chi-square calculations; the statistic must be computed from observed and expected counts.
  • Confusing independence and homogeneity; independence uses one sample classified by two variables, while homogeneity uses separate samples (or treatment groups) compared on one variable.
  • Checking observed counts >= 5 instead of expected counts >= 5; the condition applies to expected frequencies.
  • Interpreting a significant result as proving causation; chi-square tests show association, not cause.
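
To illustrate the first and third mistakes, here is a small sketch with made-up counts. It shows that feeding proportions into the formula shrinks the statistic by a factor of n, and that the Large Counts check looks at expected counts even when an observed count is small. All names and numbers are assumptions chosen for illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 observations; H0 says six equally likely categories.
observed_counts = [3, 12, 9, 11, 15, 10]
n = sum(observed_counts)
expected_counts = [n / 6] * 6

# Correct: compute the statistic from counts.
stat_from_counts = chi_square_statistic(observed_counts, expected_counts)

# Mistake: the same formula applied to proportions gives a value n times too small.
observed_props = [o / n for o in observed_counts]
expected_props = [e / n for e in expected_counts]
stat_from_props = chi_square_statistic(observed_props, expected_props)

# Large Counts is checked on EXPECTED counts; the observed count of 3 does not matter.
large_counts_ok = all(e >= 5 for e in expected_counts)

print(f"from counts:      {stat_from_counts:.3f}")
print(f"from proportions: {stat_from_props:.3f}")
print(f"Large Counts met: {large_counts_ok}")
```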

AP Exam Strategies

  • Always calculate and display expected counts in a table before computing the chi-square statistic on FRQs.
  • State the hypotheses in terms of the population or variables, not the sample counts.
  • For two-way tables, clearly specify whether you are testing independence or homogeneity and why.
  • When a result is significant, describe the nature of the association by comparing observed and expected counts in the cells that contribute most to the statistic.

Real-World Applications

  • Genetics: Goodness-of-fit tests verify whether offspring ratios match Mendelian predictions.
  • Market Research: Independence tests determine if product preference is associated with age group.
  • Public Health: Homogeneity tests compare disease incidence distributions across different hospitals or regions.
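
As a rough sketch of the genetics application, the code below tests made-up offspring counts against a 9:3:3:1 ratio, the classic Mendelian prediction for a dihybrid cross. The counts are hypothetical; the critical value 7.815 is the standard chi-square table value for df = 3 at the 0.05 level.

```python
# Hypothetical offspring counts for four phenotypes.
observed = [81, 36, 33, 10]
ratio = [9, 3, 3, 1]  # predicted ratio under H0

n = sum(observed)

# Expected count for each phenotype is n * p_i, with p_i = ratio_i / 16.
expected = [n * r / sum(ratio) for r in ratio]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_VALUE = 7.815  # chi-square table value for df = 3, alpha = 0.05
reject_h0 = chi_sq > CRITICAL_VALUE

print(f"expected counts: {expected}")
print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up counts the statistic falls well below the critical value, so the data would not give convincing evidence against the 9:3:3:1 model.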

🎥Free Video Lessons: Inference for Categorical Data: Chi-Square

AP Statistics Unit 8 Chi Square Tests Summary Review Video by Michael Porinchak - AP Statistics & AP Precalculus

Unit 8 Review: Inference for Categorical Data: Chi-Square (AP Statistics) by AI Podcasts & Videos (Mostly AP)

Chi-Square Tests: Crash Course Statistics #29 by CrashCourse

📄Cheat Sheet: Inference for Categorical Data: Chi-Square

Quick reference for Inference for Categorical Data: Chi-Square. Print this out and review before the exam!

Chi-Square Cheat Sheet

Essential Formulas

  • Chi-square = sum [(O - E)^2 / E]
  • Expected = (row total * column total) / grand total
  • GOF df = categories - 1
  • Independence/Homogeneity df = (r - 1)(c - 1)

Key Definitions

  • Observed: actual sample counts
  • Expected: counts predicted by H0
  • GOF: one categorical variable vs hypothesized distribution
  • Independence: association between two variables in one population
  • Homogeneity: distribution comparison across populations

Problem-Solving Steps

  1. Identify test type: GOF, independence, or homogeneity.
  2. Calculate expected counts; verify all >= 5.
  3. Compute chi-square statistic and df.
  4. Find p-value and conclude in context.
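
Steps 2 through 4 can be sketched end to end for a two-way table using only the Python standard library. Step 1, choosing the test type, depends on the study design and is not something code can decide. The names `chi2_sf` and `chi_square_two_way` are hypothetical helpers, the data are made up, and the p-value routine is a simple series approximation meant for illustration, not a replacement for a calculator or a statistics package.

```python
import math


def chi2_sf(x, df, max_terms=10_000):
    """Approximate upper-tail area P(X >= x) for a chi-square distribution.

    Uses a power series for the regularized lower incomplete gamma function;
    accuracy degrades for very small p-values.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= z / (a + k)
        total += term
        if term < 1e-15 * total:
            break
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_two_way(observed, alpha=0.05):
    """Run steps 2-4 for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Step 2: expected counts and the Large Counts check.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    large_counts_ok = all(e >= 5 for row in expected for e in row)

    # Step 3: chi-square statistic and degrees of freedom.
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Step 4: p-value and decision; the conclusion in context is still up to you.
    p_value = chi2_sf(stat, df)
    return {
        "expected": expected,
        "large_counts_ok": large_counts_ok,
        "chi_square": stat,
        "df": df,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
    }


# Hypothetical 2 x 3 table of observed counts.
result = chi_square_two_way([[20, 30, 50], [30, 20, 50]])
for key, value in result.items():
    print(f"{key}: {value}")
```

For a goodness-of-fit test the same pattern applies, with expected counts n * p_i and df = k - 1 in place of the two-way formulas.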

Calculator Tips

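  • On a TI-84, use chi-square GOF-Test for goodness of fit and chi-square Test for two-way tables (independence or homogeneity); some other TI models label the two-way procedure chi-square 2-Way Test.
  • For a two-way table, enter observed counts into a matrix; the calculator computes expected counts, chi-square, df, and the p-value. For goodness of fit, enter observed and expected counts into two lists.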
