**Stat 414 – Review 1 Problems**

**The following are previous exam problems and application problems. The exam this quarter will also involve some more “conceptual” problems like those you have been seeing on the quizzes. I also expect interpretation of output I provide. You won’t be using R live on the exam, but I could ask you questions about R commands. You should assume all of the questions below have “Explain” after them.**

**1)** (a) I have fit a rather complicated single-level nonlinear model to these data (using days and group as explanatory variables). Assess the validity of my model. Be very clear how you are evaluating each assumption:

Form
of the model: Because the *residuals vs. fits* graph does not show any leftover pattern, the
form of the model I used appears to be adequate.

Independence: We have repeated observations on the same mouse so
independence is violated.

Normality:
The *normal
probability plot* looks reasonably linear, so the normality of the errors
condition is met.

Equal variance: The *residuals vs. fits* graph shows increasing variability in the residuals with increasing fitted values, indicating a violation of the equal-variance condition at each *x* (though not super severe).

(b) Which of the following would you consider doing next to improve the validity of the model? Briefly justify your choice(s).

- Transformation to improve linearity: No, the model form was fine.

- Quadratic model to improve linearity: No, the model form was fine.

- Transformation of the response to improve normality: No, normality was fine.

- Transformation of the explanatory variable to improve normality: No, normality was fine.

- Include *days* as a variance covariate: Yes, the variability in the residuals appears to increase with days.

- Include *group* as a variance covariate: No, the variability in the two treatment groups appears reasonably equal.

- Multilevel model using mouse as a grouping variable (Level 2 units): Yes, this will allow us to model the repeated observations over time as well as on each mouse (two knees).
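The last two choices could be sketched in `nlme` roughly as follows. This is a sketch only: the data frame `mice` and its columns `y`, `days`, `group`, and `mouse` are simulated stand-ins, not the actual mouse data.

```r
library(nlme)

# Simulated stand-in for the mouse data (names are hypothetical)
set.seed(414)
mice <- data.frame(mouse = factor(rep(1:10, each = 6)),
                   days  = rep(1:6, times = 10),
                   group = factor(rep(c("A", "B"), each = 30)))
mice$y <- 2 + 0.5 * mice$days + rep(rnorm(10, sd = 0.5), each = 6) +
  rnorm(60, sd = 0.2 * mice$days)

# days as a variance covariate: error SD modeled as proportional to days^delta
fit.var <- gls(y ~ days * group, data = mice,
               weights = varPower(form = ~ days))

# multilevel model: mouse as the Level 2 grouping variable (random intercepts)
fit.ml <- lme(y ~ days * group, random = ~ 1 | mouse, data = mice)
```

In practice you might combine the two, adding the `weights =` argument to the `lme()` call.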

**2)** Here is another model for the FEV data

(a) Interpret the interaction between age and height in this context.

1) The positive “effect” of age on
FEV is larger on average for taller individuals than for shorter individuals.
In other words, taller individuals increase in FEV at a faster rate than shorter
individuals.

2) The positive “effect” of height
on FEV is larger for older individuals than for younger individuals.

You can pick either interpretation. You could also give some numbers, like “when height = 0, the slope of age is 0.172, but when height = 10, the slope of age is 0.172 + 1.34 = 1.51” (ideally you would choose values that are more meaningful for your data, but the point is that because you have a second quantitative variable, rather than telling me the slope for each group, you just pick a few values of the quantitative variable).
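A quick sketch of that arithmetic. The interaction coefficient 0.134 is inferred from the 0.172 + 1.34 example and may not match the actual output:

```r
# Hypothetical coefficients: age slope 0.172, age:height interaction 0.134
b.age     <- 0.172
b.agehgt  <- 0.134
slope.age <- function(height) b.age + b.agehgt * height

slope.age(0)    # slope of age when height = 0: 0.172
slope.age(10)   # slope of age when height = 10: 0.172 + 1.34 = 1.512
```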

(b) How do you decide whether the interaction between age and height is statistically significant?

Because this corresponds to a
single term in the model, we can use the p-value shown, “< 2e-16” to decide
that the interaction is highly significant in this model.

(c) How do you decide whether the *association* between age and height is statistically significant?

You would actually have to do a separate analysis that looks at the correlation coefficient (or slope) p-value from regressing one of these variables on the other. That would tell you whether there was a significant linear association between the two variables.

(d) Smoker doesn't appear to be very significant in the above model. Can I just remove it from the model?

While the p-value for smoker is large (0.5892), smoker is involved in two interaction terms which are significant. So this isn’t a question about a single coefficient; the p-value shown only indicates that the intercepts are not significantly different *after adjusting for the other terms in the model.*

(e) State the null and alternative hypotheses for removing Smoker from the model. Is the p-value for this test in the above output?

H_{0}: β_{smoker} = β_{smoker×age} = β_{smoker×height} = 0 vs. H_{a}: at least one of these coefficients ≠ 0.

No, the p-value for this test is not in the above output. We would have to carry out a partial F-test comparing the above model to the reduced model that does not include these 3 terms.

(f) What do you learn from the output below?

This is comparing the models with
the smoker terms (3 of them as discussed in e) to the model that only has age,
height, and the interaction between them. The p-value is statistically
significant so at least one of the terms involving the smoking variable is
significant to the model and we should not remove the smoker variable from the
model.

Part of the reminder here is that a question like “can I remove smoker?” is not as simple as just taking out the one smoker term.
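That partial F-test could be sketched in R as follows, using simulated stand-in data (the variable names are assumed; the real analysis would use the actual FEV data frame):

```r
# Simulated stand-in for the FEV data (names are hypothetical)
set.seed(1)
fev.dat <- data.frame(age    = runif(200, 5, 19),
                      height = runif(200, 45, 75),
                      smoker = factor(sample(c("no", "yes"), 200, replace = TRUE)))
fev.dat$fev <- 0.1 * fev.dat$age + 0.05 * fev.dat$height + rnorm(200, sd = 0.4)

# full model: age, height, age:height, smoker, smoker:age, smoker:height
full    <- lm(fev ~ age * height + smoker * age + smoker * height, data = fev.dat)
# reduced model: drop all 3 smoker terms at once
reduced <- lm(fev ~ age * height, data = fev.dat)
anova(reduced, full)   # one partial F-test on the 3 smoker terms together
```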

**3)** Recall our Squid data

Squid$fMONTH = factor(Squid$MONTH)

plot(Testisweight ~ fMONTH,
data=Squid)

(a) Why did
we create fMONTH?

So
that R would treat month as a categorical variable (11 terms) rather than a
quantitative variable (one term, assuming a linear association)
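One way to see the difference is to count model matrix columns; here is a small stand-in for the Squid data:

```r
# Compare model terms: numeric month vs. factor month
squid <- data.frame(MONTH = rep(1:12, each = 3))
squid$fMONTH <- factor(squid$MONTH)

ncol(model.matrix(~ MONTH,  data = squid))   # 2: intercept + one linear term
ncol(model.matrix(~ fMONTH, data = squid))   # 12: intercept + 11 indicators
```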

(b) Is there
seasonality in the data? Does the variability in the response appear to vary by
month? Identify 3 months where you think our predictions of Testisweight
will be most accurate. Least?

We
do see evidence in the boxplots that the median Testisweight
varies noticeably across the months suggesting seasonality.

We
also see evidence in the boxplots that the box widths differ noticeably across
the months, suggesting unequal variances in the Testis weights among the
different months.

The graph
below shows the predicted values for each month (along with standard errors).

(c) If this
model was fit with indicator coding and fMONTH = 1 as
the reference group, is the coefficient of fMONTH2 positive or negative?

Because
the predicted value for fMonth1 is larger than the predicted value for fMonth2,
the coefficient of fMonth2 will be negative.

(d) If this
model was fit with effect coding, is the coefficient of fMONTH2 positive or
negative?

Because
the predicted value for fMonth2 appears to be above the overall average, the
coefficient of fMonth2 is positive.

(e) If
fMONTH1 is the missing category, will its coefficient be positive or negative?

Because
the predicted value for fMonth1 appears to be above the overall average, the
coefficient of fMonth1 is positive.
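A tiny numeric illustration of effect coding, with made-up data and 3 “months”: each coefficient is that level’s deviation from the grand mean of the group means, and the omitted level’s coefficient is the negative sum of the shown coefficients.

```r
# Made-up data: group means 10, 6, 2; grand mean 6
d <- data.frame(month = factor(rep(1:3, each = 2)),
                y = c(10, 10, 6, 6, 2, 2))
fit <- lm(y ~ month, data = d, contrasts = list(month = "contr.sum"))
coef(fit)              # intercept 6, month1 = 10 - 6 = +4, month2 = 6 - 6 = 0
-sum(coef(fit)[-1])    # omitted level's coefficient: 2 - 6 = -4
```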

These are
the fitted lines for the model that includes the interaction between fMONTH and DML

(f) How many
terms does including this interaction add to the model?

Multiplying
the quantitative DML term with the 11 indicator terms for the 12 months will
give us 11 interaction terms to add to the model.
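Counting columns in the model matrix confirms this (stand-in data again):

```r
# 12 months: 11 month indicators, 1 DML slope, 11 interaction columns
squid <- data.frame(fMONTH = factor(rep(1:12, each = 2)),
                    DML = runif(24, 100, 300))
X <- model.matrix(~ fMONTH * DML, data = squid)
ncol(X)   # 24 = intercept + 11 + 1 + 11
```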

(g) Will the
coefficient of fMONTH10*DML be positive or negative?

The purple line appears to have a larger slope with DML than the average slope for the 12 lines, and all slopes are positive, so I predict a positive coefficient on the interaction term.

But for addressing the unequal variance: we don't want to assume a "linear relationship" between the variability in the residuals and the month number, so we will estimate a separate variance for each month. We can do that by finding the sample variance for each month.
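In `nlme` this corresponds to a `varIdent` variance structure. A sketch on simulated stand-in data (the real model would use the Squid data frame and all 12 months; only 4 “months” are simulated here):

```r
library(nlme)

# Simulated stand-in: months 3 and 4 are noisier than months 1 and 2
set.seed(8)
squid <- data.frame(fMONTH = factor(rep(1:4, each = 20)),
                    DML = runif(80, 100, 300))
squid$Testisweight <- 0.02 * squid$DML +
  rnorm(80, sd = c(1, 1, 3, 5)[as.integer(squid$fMONTH)])

# one error SD per month, estimated as a multiplier of the reference month's SD
fit.wls <- gls(Testisweight ~ DML * fMONTH, data = squid,
               weights = varIdent(form = ~ 1 | fMONTH))
fit.wls$sigma   # residual SE corresponds to the reference month
```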

(h) Which
months do we want to 'downweight' in estimating the
model?

Going
back to the boxplots, we would like smaller weights on months 9 and 10 because
they have the largest sample variances.

(i) Conjecture
what changes you would expect to see in the previous two graphs in this
weighted regression model.

Now
we are going to let the variances vary by month, so the graph of the fitted
model would have much larger SEs for months 9 and 10.

Right
now, we are essentially fitting a separate line for each month, so the weighted
regression is really only expected to impact the
standard errors, so the interaction plot above should look largely the same.

(j) How do you expect the residual standard error to change?

We expect months 7 and 8 to have pretty small values, with the other months getting multipliers larger than one based on their larger month SDs. Because the reported residual standard error corresponds to the smallest-variability months, we expect it to be smaller than in the unweighted model.

**4)** Recall our Airbnb data:

- price = price for one night (in dollars)

- overall_satisfaction = rating on a 0-5 scale

- room_type = Entire home/apt, Private room, or Shared room

- neighborhood = neighborhood where unit is located (1 of 43)

(a) Identify the Level 1 units and the Level 2
units

Level 1 = Airbnb listing

Level 2 = neighborhood
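A two-level model along these lines could be sketched in `nlme` as follows. The data frame `airbnb` is a simulated stand-in, and the course output may instead have come from `lme4::lmer`:

```r
library(nlme)

# Simulated stand-in: 43 neighborhoods, 5 listings each
set.seed(2)
airbnb <- data.frame(
  neighborhood = factor(rep(1:43, each = 5)),
  overall_satisfaction = runif(215, 0, 5),
  room_type = factor(sample(c("Entire home/apt", "Private room", "Shared room"),
                            215, replace = TRUE)))
airbnb$price <- 100 + 25 * airbnb$overall_satisfaction +
  c(0, -80, -105)[as.integer(airbnb$room_type)] +
  rep(rnorm(43, sd = 20), each = 5) + rnorm(215, sd = 30)

# Level 1: listings; Level 2: neighborhoods (random intercepts)
model1 <- lme(price ~ overall_satisfaction + room_type,
              random = ~ 1 | neighborhood, data = airbnb)
```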

Consider the following output (indicator parameterization was used for room type):

```
Fixed effects:
                      Estimate Std. Error t value
(Intercept)             25.353     26.454   0.958
overall_satisfaction    24.919      5.508   4.524
room_typePrivateroom   -82.739      3.831 -21.598
room_typeSharedroom   -105.875     10.960  -9.660

> anova(model1)
Analysis of Variance Table
                     Df  Sum Sq Mean Sq  F value
overall_satisfaction  1   41558   41558   8.0542
room_type             2 2593431 1296715 251.3102
```

(b) Is the type of
room statistically significant? State
the null and alternative hypothesis in terms of regression parameters, and
clearly justify your answer.

We want to test H_{0}: β_{private} = β_{shared} = 0 (no room type effect) vs. H_{a}: at least one β ≠ 0, after adjusting for overall satisfaction (and neighborhood). To test these two coefficients simultaneously, we need a partial F-test. The corresponding F value in the output is 251.31. This is considered quite large (e.g., larger than 4) and would lead us to reject the null hypothesis. We conclude that after adjusting for overall satisfaction, there is a significant room type effect (averaging across the neighborhoods).

Keep in
mind: this model used indicator coding for room type. One approach is to look at the three slopes
of satisfaction, 33.36 for entire home/apt; 33.36 –
17.54 for private rooms, and 33.36 – 39.99 for shared rooms. So the “effect” of
satisfaction on price is largest for the entire home/apt rentals and lowest,
and in fact slightly negative, for the shared rooms. Price increases with satisfaction rating but
at a much lower rate for private rooms compared to entire homes (and even
negative for shared rooms).

**5)** Consider the following two models for
predicting language scores for 9 different schools. IQ_verb is the
student’s performance on a test of verbal IQ.

Which model demonstrates more school-to-school variability in language scores?

On average, the school coefficients are larger in magnitude for the model including IQ_verb. It’s counterintuitive, but we will see that after adjusting for IQ_verb, there is actually more school-to-school variation. The main cause is that the within-school and between-school relationships are not consistent: schools with lower language scores tended to have higher IQ_verb scores, so after adjusting for IQ_verb, the “additional contribution” needed to match the school means is larger.