📚 Study Guide: Inference for Quantitative Data: Slopes
Unit 9: Inference for Quantitative Data: Slopes
The final unit of AP Statistics extends inference to the slope of a least-squares regression line. You will construct confidence intervals and perform t-tests for the slope beta of the population regression line. The conditions (Linear, Independent, Normal, Equal SD, Random, abbreviated LINER) must be checked using scatterplots and plots of the residuals. The formula for the standard error of the slope is provided on the exam formula sheet, but you must understand its components and be able to interpret computer regression output. This unit ties together exploratory data analysis, probability, and inference into a cohesive framework for analyzing relationships between quantitative variables.
Key Concepts
- Inference for Slope: We test H0: beta = 0 (no linear relationship) against Ha: beta != 0, or construct a confidence interval for beta.
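- t-Test for Slope: t = (b - beta0) / SE_b with df = n - 2, where b is the sample slope, beta0 is the hypothesized slope (usually 0), and SE_b is the standard error of the slope.
- Conditions (LINER): Linear relationship (scatterplot/residual plot), Independent observations (when sampling without replacement, check the 10% condition: n is at most 10% of the population), Normal residuals (histogram/Normal probability plot of residuals), Equal standard deviation of y across x (residual plot shows random scatter with similar spread), Random data collection.
- Standard Error of Slope: SE_b = s / (s_x * sqrt(n-1)), where s is the standard deviation of the residuals, s = sqrt(sum of squared residuals / (n-2)), and s_x is the sample standard deviation of the x-values.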
- Computer Output: AP exams often provide regression output; identify the slope, its standard error, t-ratio, and p-value directly from the table.
- Confidence Interval for beta: b +/- t* * SE_b, where t* is the critical value from the t distribution with df = n - 2. The interval gives the plausible values for the true rate of change of y per unit of x.
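The formulas in the list above can be traced step by step with a short Python sketch. The data set and the function name `slope_inference` are hypothetical, invented only to illustrate the arithmetic; the sketch uses the standard library and stops at the t-statistic, since the p-value would come from a t-table or a calculator.

```python
import math
import statistics

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]


def slope_inference(x, y, beta0=0.0):
    """Return the sample slope, residual SD, SE of the slope, and t-statistic."""
    n = len(x)
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)

    # Least-squares slope b and intercept a.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar

    # Residual standard deviation s uses n - 2 in the denominator.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

    # SE_b = s / (s_x * sqrt(n - 1)), with s_x the sample SD of x.
    s_x = statistics.stdev(x)
    se_b = s / (s_x * math.sqrt(n - 1))

    # t = (b - beta0) / SE_b, with df = n - 2.
    t = (b - beta0) / se_b
    return {"b": b, "a": a, "s": s, "se_b": se_b, "t": t, "df": n - 2}


result = slope_inference(x, y)
for name, value in result.items():
    print(name, value)
```

Note that s and SE_b come out as different numbers: s describes typical prediction error in the units of y, while SE_b describes how much the slope itself would vary from sample to sample.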
Vocabulary
- Population slope (beta): The true rate of change in the response variable per unit change in the explanatory variable for the population.
- Sample slope (b): The estimated rate of change from the sample regression line.
- Standard error of the slope (SE_b): An estimate of the standard deviation of the sampling distribution of the sample slope; it measures how much b typically varies from sample to sample.
- Residual standard deviation (s): The standard deviation of the residuals, measuring typical prediction error.
- Regression output: Computer-generated tables showing coefficients, standard errors, t-values, and p-values.
Formulas
- t = (b - beta0) / SE_b
- CI: b +/- t* * SE_b
- df = n - 2 for regression inference
- SE_b = s / (s_x * sqrt(n - 1)); this formula is provided on the formula sheet
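When the slope and its standard error are read from computer output rather than computed from raw data, the test statistic and interval follow directly from these formulas. The sketch below assumes hypothetical output values (b = 2.5, SE_b = 0.8, n = 12) and takes t* = 2.228 from a t-table for 95% confidence with df = 10; the function names are illustrative, not part of any library.

```python
def t_statistic(b, se_b, beta0=0.0):
    """t = (b - beta0) / SE_b."""
    return (b - beta0) / se_b


def slope_confidence_interval(b, se_b, t_star):
    """b +/- t* * SE_b, returned as (lower, upper)."""
    margin = t_star * se_b
    return b - margin, b + margin


# Hypothetical values, as if read from a regression output table.
b = 2.5       # sample slope
se_b = 0.8    # standard error of the slope
n = 12        # sample size
df = n - 2    # degrees of freedom for regression inference

# Critical value from a t-table: 95% confidence, df = 10.
t_star = 2.228

t = t_statistic(b, se_b)
lower, upper = slope_confidence_interval(b, se_b, t_star)
print("df =", df)
print("t =", t)
print("95% CI for beta:", round(lower, 4), "to", round(upper, 4))
```

With these values the interval runs from about 0.72 to about 4.28. Because it does not contain 0, it agrees with rejecting H0: beta = 0 at the 5% level in a two-sided test.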
Common Mistakes
- Checking linearity with the original scatterplot but forgetting to examine the residual plot for patterns.
- Using the standard error of the estimate (s) instead of the standard error of the slope (SE_b) in calculations.
- Assuming a significant slope implies a strong relationship; a small p-value can occur even when the relationship is very weak (r close to 0), because a large n lets the slope be estimated precisely.
- Ignoring the equal SD condition, which is assessed by checking that residuals have roughly similar spread across all x-values.
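The difference between statistical significance and strength can be made concrete with the identity t = r * sqrt(n - 2) / sqrt(1 - r^2), which gives the same t-statistic as b / SE_b when testing H0: beta = 0. The sketch below is a minimal illustration; the correlation and sample sizes are hypothetical.

```python
import math


def t_from_r(r, n):
    """t-statistic for H0: beta = 0, computed from the correlation r and n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Same weak correlation (r = 0.1, so r^2 = 0.01), two hypothetical sample sizes.
r = 0.1
t_small_n = t_from_r(r, 20)
t_large_n = t_from_r(r, 1000)
print("n = 20:   t =", round(t_small_n, 3))
print("n = 1000: t =", round(t_large_n, 3))
```

With n = 20 the t-statistic is about 0.43, far from significant. With n = 1000 it is about 3.2, which is significant at the 5% level, even though x still explains only 1% of the variation in y.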
AP Exam Strategies
- Check LINER conditions explicitly; for linearity and equal SD, reference the residual plot; for normality, reference a histogram or normal probability plot of residuals.
- When interpreting computer output, clearly label which number is the slope, SE_b, t-statistic, and p-value.
- For confidence intervals, state the confidence level and interpret the slope in context: "We are C% confident that, for each additional unit of x, the predicted y changes by between [lower] and [upper] units."
- Remember df = n - 2 for regression t-procedures, not n - 1.
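The residuals behind a residual plot can be computed directly. The sketch below uses hypothetical data and adds a rough numeric comparison of residual spread in the lower and upper halves of the x-values. That comparison is only an informal aid assumed here for illustration; it is not an AP procedure and does not replace looking at the residual plot.

```python
import statistics

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]


def residuals(x, y):
    """Residuals (observed y minus predicted y) from the least-squares line."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def spread_by_half(x, resid):
    """SD of the residuals for the smaller x-values and for the larger x-values."""
    pairs = sorted(zip(x, resid))
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.pstdev(low), statistics.pstdev(high)


resid = residuals(x, y)
low_sd, high_sd = spread_by_half(x, resid)

# Each (x, residual) pair is one point on the residual plot.
for xi, e in zip(x, resid):
    print(xi, round(e, 3))
print("spread for low x:", round(low_sd, 3), " spread for high x:", round(high_sd, 3))
```

Residuals from a least-squares line always sum to zero, so what matters is their pattern: a curve suggests the Linear condition fails, and a fan shape (very different spreads at low and high x) suggests the Equal SD condition fails.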
Real-World Applications
- Economics: Testing whether advertising spending significantly predicts sales revenue.
- Environmental Science: Assessing whether CO2 concentration is a statistically significant predictor of temperature change.
- Sports Analytics: Determining if practice hours significantly predict athlete performance metrics.