Day 14 - 09/20/2024

From last class

Assignment expectations.
- Model homework.
- Uncertainty estimates.
Resources on the statistical models we’re fitting
Rmd resources:
- Rmd cookbook [link]
- Rmd cheatsheet [link]
- R (tidyverse) YouTube guide [link]

Effects, interactions, notation

library(tidyverse)
url <- "https://raw.githubusercontent.com/jlacasa/stat705_fall2024/main/classes/data/lotus_part3.csv"
dd <- read.csv(url)

dd %>% 
  ggplot(aes(doy, crown_g))+
  geom_point(aes(fill = species), shape = 21, size =3.5)+
  scale_fill_manual(values = c("#DB504A", "#43AA8B", "#FAD4C0"))+
  labs(x = "Day of the Year", 
       y = "Crown biomass (g)", 
       fill = "Species")+
  theme_classic()+
  theme(aspect.ratio = 1)

Let’s fit the model \[y_{ij} \sim N(\mu_{ij}, \sigma^2),\]

\[\mu_{ij} = \beta_{0j} + \beta_{1j}x_{ij},\] where \(y_{ij}\) is the observation of crown biomass for the \(i\)th observation of the \(j\)th species, \(\mu{ij}\) is its mean and \(\sigma^2\) its variance. The means is described with \(\beta_{0j}\), the intercept of the \(j\)th species and \(\beta_{1j}\), the slope of the \(j\)th species

m <- lm(crown_g ~ species * doy, data = dd)
summary(m)


Call:
lm(formula = crown_g ~ species * doy, data = dd)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.29416 -0.09688 -0.00858  0.06046  0.73550 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -1.5809069  0.1438577 -10.989  < 2e-16 ***
speciesC      0.8764868  0.2034455   4.308 3.11e-05 ***
speciesD      0.6864100  0.2034455   3.374 0.000962 ***
doy           0.0103322  0.0007297  14.159  < 2e-16 ***
speciesC:doy -0.0054042  0.0010320  -5.237 5.96e-07 ***
speciesD:doy -0.0043320  0.0010320  -4.198 4.80e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1467 on 138 degrees of freedom
Multiple R-squared:  0.7207,    Adjusted R-squared:  0.7105 
F-statistic: 71.21 on 5 and 138 DF,  p-value: < 2.2e-16

It’s getting much harder to compute the E(y)s by hand!

beta_hat <- coef(m)
beta_hat

 (Intercept)     speciesC     speciesD          doy speciesC:doy speciesD:doy 
-1.580906870  0.876486785  0.686410011  0.010332195 -0.005404234 -0.004331957

What’s \(E(y|x=200, \text{Species A})\)?

Compute \(\beta_{0A} + 200 \cdot \beta_{1A}\) (or \(\beta_{01} + 200 \cdot \beta_{11}\)):

beta_hat[1] + 200*beta_hat[4]

(Intercept) 
  0.4855322

What’s \(E(y|x=200, \text{Species C})\)?
Compute \(\beta_{0C} + 200 \cdot \beta_{1C}\) (or \(\beta_{02} + 200 \cdot \beta_{12}\)):

(beta_hat[1] + beta_hat[2]) + 200*(beta_hat[4] + beta_hat[5])

(Intercept) 
  0.2811723

What’s \(E(y|x=200, \text{Species D})\)?
Compute \(\beta_{0D} + 200 \cdot \beta_{1D}\) (or \(\beta_{02} + 200 \cdot \beta_{12}\)):

(beta_hat[1] + beta_hat[3]) + 200*(beta_hat[4] + beta_hat[6])

(Intercept) 
  0.3055508

Can we use the predict function?

X_new <- data.frame(species = c("A", "C", "D"),
                    doy = 200)
predict(m, X_new, interval = "confidence")

        fit       lwr       upr
1 0.4855322 0.4430602 0.5280043
2 0.2811723 0.2387003 0.3236443
3 0.3055508 0.2630787 0.3480228

Hypothesis tests

In the summary, we’re testing if whatever coefficient \(\beta = 0\). That means, the null Hypothesis \(H_0 : \beta_0=0\), and the alternative Hypothesis \(H_1 : \beta_0\ne0\).

summary(m)


Call:
lm(formula = crown_g ~ species * doy, data = dd)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.29416 -0.09688 -0.00858  0.06046  0.73550 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -1.5809069  0.1438577 -10.989  < 2e-16 ***
speciesC      0.8764868  0.2034455   4.308 3.11e-05 ***
speciesD      0.6864100  0.2034455   3.374 0.000962 ***
doy           0.0103322  0.0007297  14.159  < 2e-16 ***
speciesC:doy -0.0054042  0.0010320  -5.237 5.96e-07 ***
speciesD:doy -0.0043320  0.0010320  -4.198 4.80e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1467 on 138 degrees of freedom
Multiple R-squared:  0.7207,    Adjusted R-squared:  0.7105 
F-statistic: 71.21 on 5 and 138 DF,  p-value: < 2.2e-16

model.matrix(m)

    (Intercept) speciesC speciesD doy speciesC:doy speciesD:doy
1             1        0        0 155            0            0
2             1        0        0 155            0            0
3             1        0        0 155            0            0
4             1        0        0 155            0            0
5             1        0        0 155            0            0
6             1        0        0 155            0            0
7             1        0        0 155            0            0
8             1        0        0 155            0            0
9             1        0        0 155            0            0
10            1        0        0 155            0            0
11            1        0        0 155            0            0
12            1        0        0 155            0            0
13            1        1        0 155          155            0
14            1        1        0 155          155            0
15            1        1        0 155          155            0
16            1        1        0 155          155            0
17            1        1        0 155          155            0
18            1        1        0 155          155            0
19            1        1        0 155          155            0
20            1        1        0 155          155            0
21            1        1        0 155          155            0
22            1        1        0 155          155            0
23            1        1        0 155          155            0
24            1        1        0 155          155            0
25            1        0        1 155            0          155
26            1        0        1 155            0          155
27            1        0        1 155            0          155
28            1        0        1 155            0          155
29            1        0        1 155            0          155
30            1        0        1 155            0          155
31            1        0        1 155            0          155
32            1        0        1 155            0          155
33            1        0        1 155            0          155
34            1        0        1 155            0          155
35            1        0        1 155            0          155
36            1        0        1 155            0          155
37            1        0        0 194            0            0
38            1        0        0 194            0            0
39            1        0        0 194            0            0
40            1        0        0 194            0            0
41            1        0        0 194            0            0
42            1        0        0 194            0            0
43            1        0        0 194            0            0
44            1        0        0 194            0            0
45            1        0        0 194            0            0
46            1        0        0 194            0            0
47            1        0        0 194            0            0
48            1        0        0 194            0            0
49            1        1        0 194          194            0
50            1        1        0 194          194            0
51            1        1        0 194          194            0
52            1        1        0 194          194            0
53            1        1        0 194          194            0
54            1        1        0 194          194            0
55            1        1        0 194          194            0
56            1        1        0 194          194            0
57            1        1        0 194          194            0
58            1        1        0 194          194            0
59            1        1        0 194          194            0
60            1        1        0 194          194            0
61            1        0        1 194            0          194
62            1        0        1 194            0          194
63            1        0        1 194            0          194
64            1        0        1 194            0          194
65            1        0        1 194            0          194
66            1        0        1 194            0          194
67            1        0        1 194            0          194
68            1        0        1 194            0          194
69            1        0        1 194            0          194
70            1        0        1 194            0          194
71            1        0        1 194            0          194
72            1        0        1 194            0          194
73            1        0        0 194            0            0
74            1        0        0 194            0            0
75            1        0        0 194            0            0
76            1        0        0 194            0            0
77            1        0        0 194            0            0
78            1        0        0 194            0            0
79            1        0        0 194            0            0
80            1        0        0 194            0            0
81            1        0        0 194            0            0
82            1        0        0 194            0            0
83            1        0        0 194            0            0
84            1        0        0 194            0            0
85            1        1        0 194          194            0
86            1        1        0 194          194            0
87            1        1        0 194          194            0
88            1        1        0 194          194            0
89            1        1        0 194          194            0
90            1        1        0 194          194            0
91            1        1        0 194          194            0
92            1        1        0 194          194            0
93            1        1        0 194          194            0
94            1        1        0 194          194            0
95            1        1        0 194          194            0
96            1        1        0 194          194            0
97            1        0        1 194            0          194
98            1        0        1 194            0          194
99            1        0        1 194            0          194
100           1        0        1 194            0          194
101           1        0        1 194            0          194
102           1        0        1 194            0          194
103           1        0        1 194            0          194
104           1        0        1 194            0          194
105           1        0        1 194            0          194
106           1        0        1 194            0          194
107           1        0        1 194            0          194
108           1        0        1 194            0          194
109           1        0        0 237            0            0
110           1        0        0 237            0            0
111           1        0        0 237            0            0
112           1        0        0 237            0            0
113           1        0        0 237            0            0
114           1        0        0 237            0            0
115           1        0        0 237            0            0
116           1        0        0 237            0            0
117           1        0        0 237            0            0
118           1        0        0 237            0            0
119           1        0        0 237            0            0
120           1        0        0 237            0            0
121           1        1        0 237          237            0
122           1        1        0 237          237            0
123           1        1        0 237          237            0
124           1        1        0 237          237            0
125           1        1        0 237          237            0
126           1        1        0 237          237            0
127           1        1        0 237          237            0
128           1        1        0 237          237            0
129           1        1        0 237          237            0
130           1        1        0 237          237            0
131           1        1        0 237          237            0
132           1        1        0 237          237            0
133           1        0        1 237            0          237
134           1        0        1 237            0          237
135           1        0        1 237            0          237
136           1        0        1 237            0          237
137           1        0        1 237            0          237
138           1        0        1 237            0          237
139           1        0        1 237            0          237
140           1        0        1 237            0          237
141           1        0        1 237            0          237
142           1        0        1 237            0          237
143           1        0        1 237            0          237
144           1        0        1 237            0          237
attr(,"assign")
[1] 0 1 1 2 3 3
attr(,"contrasts")
attr(,"contrasts")$species
[1] "contr.treatment"

We also tested \(H_0: \mu_1 = \mu_2\), \(H_1: \mu_1 \ne \mu_2\) using t-tests. What happens when we have more than 2 categories?

ANOVA

In ANOVA:
- \(H_0: \mu_1=\mu_2=\mu_3\) (all \(\mu\)s are the same)
- \(H_1\): at least one \(\mu\) is different.

`car::Anova` versus `anova` in R

library(car)
anova(m)

Analysis of Variance Table

Response: crown_g
             Df Sum Sq Mean Sq F value    Pr(>F)    
species       2 0.9100  0.4550  21.156 9.681e-09 ***
doy           1 6.0858  6.0858 282.964 < 2.2e-16 ***
species:doy   2 0.6614  0.3307  15.376 9.368e-07 ***
Residuals   138 2.9680  0.0215                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

car::Anova(m, type = 3)

Anova Table (Type III tests)

Response: crown_g
            Sum Sq  Df F value    Pr(>F)    
(Intercept) 2.5974   1 120.766 < 2.2e-16 ***
species     0.4419   2  10.272 6.937e-05 ***
doy         4.3120   1 200.491 < 2.2e-16 ***
species:doy 0.6614   2  15.376 9.368e-07 ***
Residuals   2.9680 138                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In class code.

For next week