Does the model seem to have the right form?

```{r echo=c(3,4)}
#The 'lines' command only works here because the observations are ordered
#If using Session window, may have to copy the next two lines together
plot(KYDerby23$speed~KYDerby23$Year)
lines(cbind(KYDerby23$Year, model2$fitted.values), col="red")
```

```{r}
#An alternative approach to show the model on the graph for visual inspection using tidyverse/ggplot. You can comment one out.
KYDerby23 %>%
  ggplot(aes(x = Year, y = speed)) +
  geom_point() +
  geom_line(aes(x = Year, y = model2$fitted.values), color = "red") +
  labs(title = "Quadratic model") +
  theme_bw()
```
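The comment above notes that `lines()` only draws a clean curve because the observations are already ordered by Year. If your data were not sorted, one way to handle it is to reorder both the predictor and the fitted values before plotting. The following is a minimal sketch using R's built-in `mtcars` data (not the Derby data); the names `fit` and `ord` are only for illustration.

```r
# Illustrative sketch on R's built-in mtcars data (not the Derby data).
# mtcars is not sorted by weight, so the fitted values must be reordered
# before lines() connects them from left to right.
fit <- lm(mpg ~ wt + I(wt^2), data = mtcars)

ord <- order(mtcars$wt)  # row indices that sort the predictor

plot(mpg ~ wt, data = mtcars)
lines(mtcars$wt[ord], fitted(fit)[ord], col = "red")
```

Without the reordering, `lines()` would connect the points in row order and the fitted "curve" would zigzag back and forth across the plot.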

Are the model conditions more adequately met?

```{r echo=c(2, 4)}
#Remember the first line, to produce the 2x2 display, is optional (graphs might end up too small in the Word file). Alternatively, view the graphs here and write your comments so the graphs don't need to be larger in the Word file, or just resize them in the Word file.
par(mfrow=c(2,2))
plot(model2)
#Can also produce residual plots directly, e.g., plot(model2$residuals ~ model2$fitted.values)
```

(b) Which of the regression model conditions/plots changed? Improvement?

#### Log transformation

If we consider the relationship "monotonic" with $Y$ increasing at a slower and slower rate, we can try a log transformation of the X variable (to "slow it down").

```{r echo=-6}
#Like many other packages "log" refers to natural log
log.year = log(KYDerby23$Year)
model3 = lm(speed ~ log.year, data = KYDerby23)
#May have to copy the next two lines together into the session window
par(mfrow=c(1,1))
plot(KYDerby23$speed~KYDerby23$Year)
lines(cbind(KYDerby23$Year, model3$fitted.values), col="green")
```

This model does not appear to be very helpful! The model we are fitting is curved, but not curved in the right place. We can often solve this by first shifting the data...

```{r}
#Let's make the first year = 1 (we could start at zero but then couldn't take the log)
shiftedyear = KYDerby23$Year - 1874
logx = log(shiftedyear)
model3b = lm(KYDerby23$speed~logx)
plot(KYDerby23$speed~logx)
```

(c) Is the association between speed and log(year) linear?
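To see why the unshifted log transformation was "not curved in the right place," it can help to check how much curvature each version of the log actually has over the years in the data. The sketch below assumes the years run from 1875 to 2023 with one race per year (an assumption based on the shift of 1874 and the dataset name); `year`, `cor_raw`, and `cor_shifted` are illustrative names.

```r
# Years assumed to run 1875-2023 (an assumption based on the shift of 1874
# and the dataset name), one observation per year.
year <- 1875:2023

cor_raw     <- cor(year, log(year))         # unshifted log
cor_shifted <- cor(year, log(year - 1874))  # log after shifting so first year = 1

round(c(raw = cor_raw, shifted = cor_shifted), 4)
```

Under that assumption, `log(Year)` is almost perfectly linear in `Year` over this range, so the unshifted model behaves essentially like a straight-line fit. After shifting, the log changes quickly in the early years and slowly in the later years, which is the kind of bend we were looking for.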

Does the model seem to have the right form?

```{r}
#May have to copy the next two lines together into the session window
plot(KYDerby23$speed~KYDerby23$Year)
lines(cbind(KYDerby23$Year, model3b$fitted.values), col="blue")

#tidyverse version
KYDerby23 %>%
  ggplot(aes(x = Year, y = speed)) +
  geom_point() +
  geom_line(aes(x = Year, y = model3b$fitted.values), color = "red") +
  labs(title = "Speed vs. log(Year - 1874)") +
  theme_bw()
```

(d) Are the model conditions more adequately met?

```{r}
par(mfrow=c(2,2))
plot(model3b)
```

#### Part of Quiz 2

```{r}
par(mfrow=c(1,1))
plot(KYDerby23$speed~KYDerby23$Year)
lines(cbind(KYDerby23$Year, model2$fitted.values), col="red")
lines(cbind(KYDerby23$Year, model3b$fitted.values), col="blue")
```

(e) Which model would you recommend and why?

(f) Which model