Regression slope

Flipper length and body mass

For this example, we want to know if flipper length predicts body mass in penguins near Palmer Station, Antarctica. Here’s a scatterplot of the relationship:
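A plot along these lines could be drawn with {ggplot2}. This is only a minimal sketch: it assumes the `penguins` data frame that ships with R 4.5 or later (columns `flipper_len` and `body_mass`), and the object name `flipper_plot`, the axis labels, and the styling choices are illustrative rather than taken from the original figure.

```r
library(ggplot2)

# Assumption: datasets::penguins is available (R >= 4.5)
flipper_plot <- ggplot(
  datasets::penguins,
  aes(x = flipper_len, y = body_mass)
) +
  # One point per penguin; rows with missing values are dropped quietly
  geom_point(alpha = 0.6, na.rm = TRUE) +
  # Straight-line fit, to make the trend easier to see
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, na.rm = TRUE) +
  labs(x = "Flipper length (mm)", y = "Body mass (g)")

flipper_plot
```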

There’s a clear positive trend—penguins with longer flippers tend to be heavier. But is that relationship real, or could it just be due to random chance? Time for hypothesis testing!

First, we’ll load some packages and drop the penguins whose sex wasn’t recorded:

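library(tidyverse)   # data wrangling and plotting
library(infer)       # simulation-based inference
library(parameters)  # tidy model summaries

# Keep only penguins with a recorded sex (333 rows remain)
penguins <- penguins |> drop_na(sex)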

Null hypothesis inference with {infer}
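
A simulation-based version of this test might look like the sketch below. It assumes a permutation null (shuffling body mass so that it has no relationship with flipper length), 1,000 simulated null worlds, and an arbitrary seed. The names `penguins_complete`, `observed_slope`, and `null_slopes` are placeholders chosen for this example.

```r
library(tidyr)
library(infer)

# Assumption: datasets::penguins is available (R >= 4.5)
penguins_complete <- datasets::penguins |> drop_na(sex)

set.seed(1234)  # arbitrary seed so the shuffles are reproducible

# Step 1: the slope we actually observed
observed_slope <- penguins_complete |>
  specify(body_mass ~ flipper_len) |>
  calculate(stat = "slope")

# Step 2: slopes from 1,000 null worlds where the two variables are unrelated
null_slopes <- penguins_complete |>
  specify(body_mass ~ flipper_len) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "slope")

# Step 3: how often is a null slope at least as extreme as the observed one?
null_slopes |>
  get_p_value(obs_stat = observed_slope, direction = "two-sided")
```

With an observed slope this far from 0, it is likely that none of the shuffled slopes come close, in which case {infer} reports a p-value of 0 along with a warning. That is best read as "less than 1 in 1,000," not as a literal zero.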

Null hypothesis inference with lm()

In practice, most people do not simulate null worlds. Instead, they fit a regression model with lm(), which uses a t-distribution to approximate the null world mathematically and test whether each coefficient is different from 0. The intuition is the same: a p-value is still the probability of seeing a slope at least that extreme in a world where the true slope is 0.

model <- lm(body_mass ~ flipper_len, data = penguins)
summary(model)

Call:
lm(formula = body_mass ~ flipper_len, data = penguins)

Residuals:
     Min       1Q   Median       3Q      Max 
-1057.33  -259.79   -12.24   242.97  1293.89 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -5872.09     310.29  -18.93   <2e-16 ***
flipper_len    50.15       1.54   32.56   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 393.3 on 331 degrees of freedom
Multiple R-squared:  0.7621,    Adjusted R-squared:  0.7614 
F-statistic:  1060 on 1 and 331 DF,  p-value: < 2.2e-16

Buried in that output is the p-value for the flipper_len coefficient: p < 2.2e-16, or p < 2.2 × 10⁻¹⁶ (the coefficient table rounds this to <2e-16). That’s really tiny. In a world where flipper length had no relationship with body mass, it would be virtually impossible to see a slope at least as extreme as 50.15. We have enough evidence to declare that the relationship is statistically significant.
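
To see where that p-value comes from, the t-statistic and its two-sided p-value can be rebuilt by hand from the coefficient table. This is a sketch and not a replacement for summary(); the names `penguins_complete`, `coefs`, `t_value`, and `p_value` are illustrative, and it assumes the `penguins` data built into R 4.5 or later.

```r
# Assumption: datasets::penguins is available (R >= 4.5)
penguins_complete <- datasets::penguins[!is.na(datasets::penguins$sex), ]
model <- lm(body_mass ~ flipper_len, data = penguins_complete)

# Pull the estimate and standard error for the slope
coefs <- coef(summary(model))
estimate <- coefs["flipper_len", "Estimate"]
std_error <- coefs["flipper_len", "Std. Error"]

# t-statistic: how many standard errors the slope is from 0
t_value <- estimate / std_error

# Two-sided p-value from a t-distribution with the residual degrees of freedom
df_resid <- df.residual(model)
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

t_value
p_value
```

The hand-computed `t_value` should match the t value column in the summary() output, and `p_value` is the exact number that R abbreviates as <2e-16.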

If you don’t like all that text output, you can feed the model to the model_parameters() function from the {parameters} package:

model |>
  model_parameters() |>
  display(caption = "")

Parameter    Coefficient      SE                95% CI  t(331)       p
(Intercept)     -5872.09  310.29  (-6482.47, -5261.71)  -18.92  < .001
flipper len        50.15    1.54        (47.12, 53.18)   32.56  < .001
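
The 95% CI column can be reproduced with base R's confint(), or by hand as the estimate plus or minus a critical t-value times the standard error. The sketch below assumes the same model as above; `penguins_complete`, `ci_slope`, and `ci_by_hand` are names made up for this example.

```r
# Assumption: datasets::penguins is available (R >= 4.5)
penguins_complete <- datasets::penguins[!is.na(datasets::penguins$sex), ]
model <- lm(body_mass ~ flipper_len, data = penguins_complete)

# Built-in confidence interval for the slope
ci_slope <- confint(model, "flipper_len", level = 0.95)

# The same interval by hand: estimate +/- critical t * standard error
coefs <- coef(summary(model))
estimate <- coefs["flipper_len", "Estimate"]
std_error <- coefs["flipper_len", "Std. Error"]
t_crit <- qt(0.975, df = df.residual(model))
ci_by_hand <- estimate + c(-1, 1) * t_crit * std_error

ci_slope
ci_by_hand
```

Because the interval sits well away from 0, it tells the same story as the tiny p-value: a slope of 0 is not plausible given these data.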

Footnotes

  1. Kind of. In common law systems, defendants are presumed innocent until proven guilty, so if there’s not enough evidence to prove guilt, they are innocent by definition.