Workshop Statistics: Discovery with Data, Second Edition

Topic 10: Least Squares Regression I

Activity 10-1: Airfares

(a) Answers will vary from student to student. One reasonable choice is the mean airfare, $166.92.
(b) Answers will vary from student to student, but some examples are the airline you choose to fly with, the distance to the destination city, and how far in advance you book.
(c) Based on the scatterplot, knowing the distance would be useful because there appears to be a fairly strong association between distance and airfare.
(e) Answers will vary from student to student, but $130 is a good estimate.
(f) Answers will vary from student to student, but $260 is a good estimate.
(g) Answers will vary from student to student, but using our answers from (e) and (f), slope = (260-130)/(1500-300)=13/120=.10833.
(h) Answers will vary from student to student, but using our answers from (e) and (g), intercept = 97.5.
(i) Answers will vary from student to student, but using our answers from (g) and (h), airfare = 97.5 + (.108) * distance.
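
The arithmetic in (g) through (i) can be checked with a short script. This is only a sketch: it assumes the two eyeballed points (300 miles, $130) and (1500 miles, $260) from (e) and (f), and the function name guess_airfare is made up for illustration.

```python
# Line through the two eyeballed points from (e) and (f).
x1, y1 = 300, 130     # distance in miles, estimated airfare in dollars
x2, y2 = 1500, 260

slope = (y2 - y1) / (x2 - x1)    # dollars per additional mile, as in (g)
intercept = y1 - slope * x1      # value of the line at distance 0, as in (h)


def guess_airfare(distance):
    """Predicted airfare in dollars from the eyeballed line in (i)."""
    return intercept + slope * distance


print(round(slope, 5), round(intercept, 2))
```

Because the line is built from the two points, guess_airfare(300) and guess_airfare(1500) return the estimates from (e) and (f).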
 

Activity 10-2: Airfares (cont.)

(a)
                mean     std. dev.
airfare (y)     166.9    59.5
distance (x)    713      403

r = .795
(b) b =.795(59.5/403) = .117;  a = 166.9-.117(713) = 83.479
(c) airfare = 83.479 + .117 * distance
(d) 83.479+.117(300) = $118.58
(e) $258.98
(f)
(g) Answers will vary from student to student, but a good estimate would be $190.
(h) $188.78
(i) $415.99;  This is probably not a reliable estimate, since a distance of 2,842 miles is well beyond the range of distances in our data set.
(j)
distance             900        901        902        903
predicted airfare    $188.78    $188.89    $189.01    $189.13
(k) Each additional mile adds about $0.11 to $0.12 to the predicted airfare, which matches the slope of our least squares regression line, .117.
(l) $11.70
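
The calculations in this activity can be reproduced from the summary statistics in (a). The sketch below assumes the slope and intercept are rounded to three decimal places, as in (b); the function name predict is an illustrative choice, not part of the text.

```python
# Summary statistics from part (a).
mean_y, sd_y = 166.9, 59.5    # airfare (dollars)
mean_x, sd_x = 713, 403       # distance (miles)
r = 0.795

# Least squares coefficients: b = r * (s_y / s_x), a = ybar - b * xbar.
b = round(r * sd_y / sd_x, 3)         # rounded to three decimals, as in (b)
a = round(mean_y - b * mean_x, 3)


def predict(distance):
    """Predicted airfare in dollars at the given distance in miles."""
    return a + b * distance


# Distances used in parts (d), (e), (h), and (i).
for d in (300, 1500, 900, 2842):
    print(d, round(predict(d), 2))

# Part (l): predicted change in airfare for 100 additional miles.
print(round(predict(1000) - predict(900), 2))
```

The last line illustrates the slope interpretation: any 100-mile increase changes the prediction by 100 times b, no matter where on the line it starts.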

Activity 10-3: Airfares (cont.)

(a) $150.87
(b) 178-150.87 = $27.13
(c) Atlanta: fitted value $150.87, residual $27.13;  Boston: residual $11.30
(d) St. Louis: distance 737 miles, airfare $98, error is an overestimate of $71.77
(e) greater
(f) below
(g) $111.08
(h) 4: Atlanta, Detroit, Pittsburgh, St. Louis
(i) Most cities have a smaller residual than their deviation from the mean.  This suggests that predictions from the regression line are generally better than predictions from the mean airfare, because least squares regression takes the explanatory variable into account.
(j) sum of squared residuals: 14,308.09;  sum of squared deviations from overall mean: 38,882.92  (both in squared dollars)
(k) .632
(l) .632;  This is the same as the proportion of variability in the response variable that is explained by the regression model.
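
The residual in (b) and the proportion in (k) and (l) can be verified with the figures reported above. This is a minimal sketch using only numbers from this activity and r = .795 from Activity 10-2; the helper name residual is an assumption made for illustration.

```python
def residual(observed, fitted):
    """Residual = observed value minus fitted (predicted) value."""
    return observed - fitted


# Part (b): Atlanta's observed airfare is $178 and its fitted value is $150.87.
atlanta_residual = residual(178, 150.87)

# Parts (j) and (k): proportion of variability explained by the regression.
sse = 14308.09    # sum of squared residuals
sst = 38882.92    # sum of squared deviations from the overall mean
proportion_explained = 1 - sse / sst

# Part (l): the square of the correlation coefficient.
r = 0.795
r_squared = r ** 2

print(round(atlanta_residual, 2))
print(round(proportion_explained, 3), round(r_squared, 3))
```

A positive residual, as for Atlanta, means the line underestimates the actual airfare; an overestimate, as for St. Louis in (d), gives a negative residual.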

Activity 10-4: College Tuitions (cont.)

(a)-(b) public: tuition = -13,138 + 9.59 * founded;  r^2 = .257
        private: tuition = 84,719 - 37.1 * founded;  r^2 = .255
        The two values for r^2 are very similar.
(c) The line on the private college scatterplot appears to do a better job of summarizing the relationship between tuition and founding year.  The points follow the linear relationship much more closely.
(d) public: $5,083;  private: $14,229;  Judging from the scatterplot, the private school prediction seems more reasonable because the points fall closer to the line in the area of 1900 on this scatterplot.
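
The predictions in (d) follow from plugging a founding year into the two fitted lines from (a)-(b). The sketch below assumes the founding year is 1900, which is inferred from the wording of (d) and reproduces both reported values; the function names are illustrative only.

```python
def public_tuition(founded):
    """Predicted tuition (dollars) for a public college founded in the given year."""
    return -13138 + 9.59 * founded


def private_tuition(founded):
    """Predicted tuition (dollars) for a private college founded in the given year."""
    return 84719 - 37.1 * founded


year = 1900    # assumed founding year, inferred from part (d)
print(round(public_tuition(year)), round(private_tuition(year)))
```

The opposite signs of the two slopes show up directly: later founding years raise the predicted public tuition and lower the predicted private tuition.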