This document serves as an addendum to the Workshop
Statistics Guide for Instructors, covering issues that are specific
to the Fathom edition of the text. We thank Sue Peters for her assistance
in reviewing these comments. The latest modifications were made on
August 21, 2001.
In case your class is small enough that you would prefer to combine your data with those from other classes, sample data collected on students may be found here.
Activity 2-1 introduces students to the use of Fathom. Data entry is accomplished using a Case table. The text's instructions guide students to insert a blank Case table via the menus (Insert>CaseTable), although many students are just as comfortable dragging the Case table icon (which looks like a small spreadsheet - but easy to confuse with the Summary table icon that has an "S") as an alternate method to produce a Case table.
When students first enter something into a blank Case table, Fathom generates a second object, known as a Collection. In the early stages, students can easily be confused about the difference between the Collection and the Case table. They should think of the Case table as a convenient window to view and manipulate the data in the Collection. They may delete the Case table and the data will still be present in the Collection; deleting the Collection, however, would also empty the Case table. The Collection may also contain other information such as descriptions of the attributes or measures defined for the entire collection that will be encountered in later topics. You may also have several Case tables opened for the same Collection, for example with different filters or sortings.
Once the data have been entered, students should save them. Note that Fathom saves the entire document - not just the data. So a retrieved file should take a student back to exactly the state they were in when the document was saved, with the same data, graphs, and other objects on the screen. If your students need special instructions on where to save data at your installation, now is the time to give them.
This activity also includes the first look at Fathom graphs. As with the Case table, students may insert a generic blank graph by choosing Insert>Graph from the menus or dragging the graph icon (looks like a scatterplot) to a blank area on the screen. They also get experience with the "drag & drop" nature of Fathom for specifying an attribute to plot. They need to pay attention to the cursor, which changes to an open hand when over a variable name in the Case table, then a closed hand when they click to "grab" that variable and drag it to the graph. When the (closed hand) cursor is properly positioned over the axis a black box appears around the axis and the attribute can be "dropped" into place. The dotplot is Fathom's default form for quantitative data. Items (e) and (f) of this activity point out the linkage between Fathom objects, so that clicking on an extreme point in the dotplot automatically highlights that point in the Case table and any other plots.
The final important Fathom object to be introduced in this activity is the Formula Editor. This is the primary tool for creating new attributes that are functions of existing attributes. In opening the editor, one should note that Fathom's menus are sensitive to what object is currently selected. For example, if the cursor is in the blank space below the variable name (as it would be to enter the first value), the Edit>Edit Formula menu item will not be active (since a formula has to apply to the entire attribute - not just a single cell). Students can easily overlook the instruction to "click on the name Ratio to highlight the column..." before choosing Edit>Edit Formula. Note that a right-click on the attribute (on Windows machines) will also bring up a menu with the editor option on it. In using the Formula Editor, students should find their own comfort level between the options of typing items directly into the editor and choosing them via point & click from the menus of items provided in the editor. The Formula Editor is used extensively in later activities so it is important for students to develop facility with its use.
In Activity 2-2 students get their first opportunity to use an existing dataset in a Fathom document (GenPhys.ftm). You will need to direct them to wherever the Fathom documents for your course are stored. This activity reinforces the use of the Formula Editor to calculate values for a new attribute and the procedure to create a dotplot while also introducing the sort procedure (Data>Sort Descending or a right-click on the attribute to bring up the pop-up menu). Note that sorting by one column automatically rearranges all of the attributes, leaving each case intact.
Activity 2-3 gives more practice with using the Formula Editor to create new attributes from existing attributes. This would also be a good point to suggest that students save their final document (under a different name and/or location) at the conclusion of an activity for later reference.
Activity 2-4 contains a more complicated use of the Formula Editor, using the "if" option to assign categorical values "Lots" and "Few" to cases (states) based on the percentage of students taking the SAT exam. Getting the desired information into the proper places in the template provided for the "if" command can be a bit tricky. Students then see how the default plot for a categorical attribute is a bar chart and finally experience how easily they can drag & drop in Fathom to generate side-by-side plots for one attribute (SAT average) when the data are grouped by a second attribute (Takers="Few" or "Lots").
Homework activities 2-5 through 2-12 let students practice their newly acquired Fathom skills. Most introduce students to additional existing datasets, while two (2-5 and 2-12) require them to create the dataset themselves from scratch with their own data.
Activity 3-4 introduces the stemplot. There is no facility in the current version of Fathom to produce a stemplot.
Activity 3-5 introduces the histogram. This is the students' first look at the pop-up list of alternate graphs that are available from clicking the upper right corner of a graph. Be prepared for inquisitive students who will naturally want to try the other types of graphs in the list. This activity also gives students a chance to manipulate a Fathom graph by dragging the edge of a rectangle in the histogram to change the bin size. This can be a tricky operation at first. Students should again be encouraged to pay attention to the form of the cursor. A "hand" icon in the middle of a rectangle selects the rectangle itself, showing the boundaries and frequency in the lower left corner of the Fathom screen. A "double-arrow" indicates a boundary value (shown in the lower left). Click and drag when it's a double-arrow to move the boundaries. Click when it's a hand to highlight all cases in that class. This is a good point to remind students about the "undo" feature to restore a histogram to its original condition after they have dragged the boundaries around. For a histogram with horizontal bars, drag the attribute to the vertical (rather than horizontal) axis. The default scale is the frequency count in each class, but choosing Graph>Scale can convert this to Relative Frequency or Density Frequency. The latter adjusts the scale so that the total area of the rectangles is equal to one. This is useful when you'd like to superimpose a density function (for example, by choosing Graph>Plot Function and entering a function like NormalDensity(x,mean(x),s(x)), where x is the attribute being plotted).
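For instructors who want to show why the Density scale gives a total area of one, here is a minimal Python sketch (it is not Fathom code). The function name density_heights, the bin settings, and the data values are all made up for illustration. Each bar's height is its count divided by (number of cases x bin width), so multiplying each height by the bin width and adding gives one.

```python
def density_heights(values, start, width, bins):
    """Return density-scale bar heights for equal-width bins.

    Bin k covers [start + k*width, start + (k+1)*width).
    Values that fall outside the bins are left out of the counts.
    """
    counts = [0] * bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < bins:
            counts[k] += 1
    n = sum(counts)
    return [c / (n * width) for c in counts]


# Made-up data, for illustration only.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7]
heights = density_heights(data, start=0, width=2, bins=4)

# Area of each bar is height times bin width; the areas add to one.
total_area = sum(h * 2 for h in heights)
```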
Homework Activities 3-8, 3-10, 3-16, and 3-18 ask students to create histograms for new data from Fathom files, while 3-12 and 3-19 allow them to use their own data. Be aware that 3-16(c) involves a Fathom-specific issue dealing with quantitative vs. categorical attributes. By default, dragging a quantitative variable to a graph produces a dotplot (and thus a histogram, boxplot, or other related graph). A categorical attribute produces a bar chart (which can be changed to a ribbon chart). In some cases, one might want to treat a quantitative variable (e.g. year in school) as a categorical object. This is accomplished in Fathom by holding down the Shift key while doing the dragging. The comparison between a bar chart and histogram for the same data in 3-16(c) can help students understand the distinction between the two types of graphs.
Activity 4-2 introduces the Summary table as the primary Fathom object to compute summary statistics. As with a graph, students may choose Insert>Summary Table from the menus or drag the appropriate icon from the icon bar to a blank space on the screen. Students tend to prefer the latter method as they become more experienced since it allows them greater control over exactly where the new table is placed. Variables are dragged from a Case table to the Summary table in a manner analogous to producing a graph. The default statistic is the mean, but 4-2(b) quickly shows students how to access other statistics such as the median. A right-click on the summary table provides a shortcut to the "Add Formula" menu item. Note that these functions should be entered as mean( ) and median( ) with no argument in the parentheses. This allows Fathom to apply the same function to multiple attributes in the same Summary table.
This activity also demonstrates, in 4-2(c), how students can obtain summary statistics for subgroups determined by a categorical variable. This process should remind students of a similar procedure to create side-by-side plots that they encountered in Activity 2-4.
Note that the Fathom documents for 4-2(f)-(i) are identified in the table below 4-2(i).
Activity 4-3 shows students how to show numeric values (in this case the mean and median) directly on a plot. The dynamic linkage between the Case table, graph and calculated statistics is exploited in this activity - first by allowing the students to delete cases and observe the effects on the graph, mean and median, then by encouraging them, in 4-3(h), to drag one data point in the dotplot to observe the effect on the mean and median. This provides a dynamic illustration of the concept of resistance that students find very useful.
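The effect students see when they drag a point can also be checked numerically. The following Python sketch is only an illustration with made-up values (it is not part of the text or of Fathom): moving the largest value far to the right shifts the mean a great deal but leaves the median where it was.

```python
from statistics import mean, median

# Made-up values, for illustration only.
original = [2, 4, 5, 7, 9]

# "Drag" the largest point far to the right.
dragged = original[:-1] + [90]

# How far each summary moved as a result of the drag.
mean_shift = mean(dragged) - mean(original)
median_shift = median(dragged) - median(original)
```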
Fathom is needed for Homework Activities 4-9, 4-11, and 4-16 and may be useful for 4-6, 4-7, 4-12, 4-13 and 4-15. A clever student might choose to exploit the "drag a point" feature of Fathom to help construct examples for Activities 4-12 and 4-13.
Activity 5-1 contains instructions for adding the formulas for Q1( ), Q3( ), and IQR( ) to a Summary table. Fathom uses a slightly different algorithm for computing the quartiles, so "hand" calculations will sometimes differ slightly from Fathom's results. For an odd number of data values, Fathom includes the unique median point in each "half" of the data when computing the quartiles, while the text's instructions would use only observations falling above or below the median. The lack of standardization for the computation of quartiles is apparent in that various technologies (e.g. Fathom, Minitab, and the TI-83 calculator) each use a slightly different scheme. This activity also has the first look at a boxplot in Fathom. Be aware that Fathom always draws a "modified" boxplot (with potential outliers shown as individual dots), although the students won't encounter this idea formally until Activity 6-2.
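The two quartile rules described above can be compared with a short Python sketch. This encodes only the rule as stated in this paragraph (median of each half, with or without the middle value for an odd number of cases); it is an assumption that nothing else differs, and the function name quartiles is ours, not Fathom's.

```python
from statistics import median


def quartiles(values, include_median):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    include_median only matters for an odd number of values: it decides
    whether the middle value is counted in both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = data[:half + 1], data[half:]
    elif n % 2 == 1:
        lower, upper = data[:half], data[half + 1:]
    else:
        lower, upper = data[:half], data[half:]
    return median(lower), median(upper)


# Middle value included in both halves (the rule attributed to Fathom above).
inclusive = quartiles([1, 2, 3, 4, 5], include_median=True)

# Middle value left out of both halves (the text's "hand" method).
exclusive = quartiles([1, 2, 3, 4, 5], include_median=False)
```

For the values 1 through 5 the first call gives quartiles of 2 and 4 and the second gives 1.5 and 4.5.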
Activity 5-2 asks students to use the s( ) function to verify their "hand" calculation of a standard deviation.
Activity 5-4 compares the effects of an outlier on the standard deviation, interquartile range, and range. Note that Fathom does not have a built-in "range" function, but one can easily be constructed with a max( ) - min( ) calculation. This activity provides another instance where Fathom's dynamic linking provides good illustration of the concept of resistance as an extreme point is dragged.
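The max( ) - min( ) construction is simple enough to mirror in a few lines of Python. This sketch uses made-up data and a helper we have named value_range; it shows only the range and standard deviation reacting to a single extreme value, and leaves the interquartile range to Fathom.

```python
from statistics import stdev


def value_range(values):
    """Range built the same way as a max( ) - min( ) formula."""
    return max(values) - min(values)


# Made-up data: the second list replaces the largest value with an outlier.
typical = [10, 12, 13, 15, 16]
with_outlier = [10, 12, 13, 15, 60]

range_change = value_range(with_outlier) - value_range(typical)
sd_change = stdev(with_outlier) - stdev(typical)
```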
Homework Activities 5-8, 5-10, 5-13, 5-16, 5-22, 5-24 and 5-26 all call for the use of Fathom, particularly whenever a standard deviation is to be calculated. Activity 5-25 asks for a five-number summary of the numbers 1, 2, 3, 4, and 5. Interestingly, Fathom's five-number summary is exactly those five numbers, while a "hand" calculation by the book's method would yield 1, 1.5, 3, 4.5, and 5.
Activity 6-2 introduces the procedure for constructing a modified boxplot (by hand) - checking for points that lie more than 1.5 IQR beyond the quartiles. This is the procedure used for any boxplot produced in Fathom. Although the activity only asks students to produce plots by hand, you might suggest that they check their results by producing the same boxplots in Fathom after doing Activity 6-4, since the data in Golfers.ftm are already in the form required for that procedure.
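The 1.5 IQR check can be written out as a short Python sketch for students who want to verify their hand work. The quartiles are passed in rather than computed, since the quartile convention differs between the text and Fathom; the function name and the scores are made up for illustration.

```python
def flag_outliers(values, q1, q3):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Made-up scores; q1 and q3 are supplied by whoever computed the quartiles.
scores = [66, 68, 69, 70, 70, 71, 72, 73, 85]
outliers = flag_outliers(scores, q1=69, q3=72)
```

Here the IQR is 3, so the fences sit at 64.5 and 76.5 and only the value 85 is flagged.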
Activity 6-4 contains the only new Fathom instructions in this topic. Students may easily split an existing plot of a quantitative variable or sample statistics in a summary table to show separate subgroups by dragging a categorical attribute identifying the groups to the other axis on either the plot or summary table. Watch for students who might drop the categorical attribute in the middle of the graph, rather than the vertical axis. If a plot shows individual points, dropping on the middle of the graph will show the points for the groups with different colors and symbols, but still in a single plot. This has little effect on a boxplot unless outliers are present. They should remember to wait for the axis to be highlighted to know they are dropping to the correct spot.
This is also a good point for a hint about setting up the data. Comparisons will work much more efficiently in Fathom (and many other statistics packages) if all the data for an attribute are in the same column and a second attribute identifies the groups (rather than having one attribute for the ages of Oscar winning films and the ages for non-Oscar winning films in a separate column).
Students should access existing Fathom documents with data for Homework Activities 6-5, 6-7, 6-10, and 6-18. Using Fathom is more optional for 6-8, 6-9, 6-20 and 6-21. Other homework activities either specifically request work to be done by hand (by necessity in instances with stemplots) or provide graphical displays in the text.
Sample data collected on students about the "Preliminaries" questions may be found as a Fathom document here.
Activity 7-5 contains the instructions for using Fathom to analyze gender and party affiliation in the 1999 U.S. Senate. Dragging one attribute to a blank Summary table produces a table of counts. Dragging the second attribute to the other dimension converts to a two-way table with marginal totals. 7-5(d) goes through similar steps with a graph. The first categorical attribute produces a bar chart (note that the bars are ordered alphabetically). Changing this to a ribbon chart produces a single horizontal "bar" that is divided to show the proportion in each group on the horizontal axis. When students drop the gender attribute to the middle of the graph the rectangles are further subdivided (vertically) to show the gender distribution within each party. Many instructors may be less familiar with the two-dimensional ribbon plot, but it presents essentially the same information as the segmented bar graphs shown in Activities 7-2 and 7-3. The only difference is that, in the ribbon chart, the widths of the bars are in proportion to the frequencies of the horizontal categories, while the segmented bar graph uses the same bar width for each group. Students should look back at their two-way table to help see the connections. Some students may drop the second variable on the vertical axis rather than into the middle of the graph. This produces a breakdown plot, showing the count in each cell of the table graphically with a dot representing each individual. Finally, 7-5(e) asks the students to reverse the roles of gender and party affiliation in the ribbon chart, thus showing the conditional distribution of party within gender (rather than gender within party). This distinction between which attribute is the "condition" is difficult for students to grasp, although the extreme gender split in the Senate helps them see the difference - particularly in the ribbon charts.
Homework Activity 7-15 includes a note about adding column
proportions to a two-way table in Fathom to show a conditional distribution
numerically, then compares the results to a ribbon plot. You may
want to point out the corresponding function to do row proportions
to show the conditional distribution in the other direction. Most of the
other homework activities provide just summary data for the two-way table,
thus should not require Fathom. Possible exceptions are 7-16, 7-17,
and 7-18 where students may already have the raw data in Fathom files.
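To make the two directions of conditioning concrete, here is a small Python sketch of the arithmetic behind the two kinds of proportions. The cases are made up (they are not the Senate data), and which direction corresponds to "column" or "row" proportions depends on how the table is laid out in Fathom.

```python
from collections import Counter

# Made-up cases for illustration: (party, gender) pairs.
cases = [("Dem", "F"), ("Dem", "M"), ("Dem", "M"), ("Rep", "F"),
         ("Rep", "M"), ("Rep", "M"), ("Rep", "M"), ("Dem", "F")]

cell_counts = Counter(cases)
party_totals = Counter(party for party, _ in cases)
gender_totals = Counter(gender for _, gender in cases)

# Gender within party: each cell divided by its party total.
gender_given_party = {
    (party, gender): count / party_totals[party]
    for (party, gender), count in cell_counts.items()
}

# Party within gender: each cell divided by its gender total.
party_given_gender = {
    (party, gender): count / gender_totals[gender]
    for (party, gender), count in cell_counts.items()
}
```

The same cell, ("Dem", "F"), comes out as one half in the first dictionary and two thirds in the second, which is exactly the distinction students find hard to keep straight.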
Activity 8-4 contains the instructions for using Fathom to produce a scatterplot. This is an easy process for students who should now be adept at dragging an attribute and dropping it on an axis. Part (d) of this activity introduces another new Fathom tool - the filter. Students should think of a filter as a mechanism to restrict the analysis to only those cases that satisfy its condition. It is important to note that a filter applies only to the object that was selected when it was created. Thus a filter applied to a graph would restrict the data points shown, but all of the cases would still be visible in a case table that wasn't filtered. This allows multiple graphs (or other Fathom objects) to be displayed with different filters for the same data. The one exception is that a filter applied to the collection box itself affects any object that is derived from that collection. Note also that the cases are not actually deleted from the collection, so they may be restored by deleting or modifying the filter.
Activity 8-5 contains the other new Fathom techniques in this topic - two methods for highlighting categorical groups to produce a labelled scatterplot. The first method, dragging a categorical attribute to the middle of the graph (not an axis), should be familiar to students who used a similar operation to produce the ribbon chart in the previous topic. The resulting plot uses both different symbols and colors to distinguish the points and includes a key. The second method for highlighting points in a scatterplot, creating a barchart and then clicking on a bar (or multiple bars) to highlight the points in the plot, is an effective alternative when many categories (or lots of points in the scatterplot) might make a labelled scatterplot too confusing to view well. The second method can also be generalized to exploit the linkage between Fathom objects to highlight the points selected in one object in all other objects where those cases appear.
Each of the Homework Activities has an associated Fathom document, although 8-7, 8-8, 8-10, 8-18, and 8-19 do not require the use of Fathom. Homework Activity 8-9 contains Fathom instructions for putting a "y=x" line on a scatterplot. This activity also contains a hint that directs students to click on an unusual point in the scatterplot and find the corresponding case in the case table. Students will encounter many of these datasets in the homework activities of the next topic (correlation). They will benefit from exposure to these data situations graphically before they move to the numerical summaries in Topic 9.
Activity 9-1 asks students to have Fathom compute a correlation as a formula within a Summary table. Students find the process of re-editing the correlation formula to produce the correlations required for 9-1(c) to be a bit tedious - having been spoiled by the drag & drop ease with which attributes are selected for most other objects in Fathom. It helps to remind them that the attributes are all listed in the "menu" of items that appears within the Formula Editor. In 9-1(j) students will again use a filter to select only the public four-year colleges (Type="pub4") and then the private colleges (Type="priv4") in 9-1(k). Although the instructions are explicit, you will need to remind some students that the filter needs to be added to the Collection Box in order to affect both the graph and summary table.
The other option is to enter the correlation formula as a measure for the collection. Double-click on the Collection Box and select the "Measures" tab from the Collection Inspector that appears. Replace the <new> with a name for the correlation (a simple r would work), then double-click in the space under "Formula" to bring up the Formula Editor and enter the same formula that would be used in the summary table.
Activity 9-2(d) makes good use of Fathom's linkage by allowing students to drag a single point around the scatterplot and see how the correlation is affected.
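The sensitivity of the correlation to a single point can also be shown with a few lines of arithmetic. This Python sketch uses the usual Pearson formula and made-up data; it is a stand-in for what students see in Fathom, not a description of Fathom's internals.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]
r_before = correlation(xs, ys)

# "Drag" the last point straight down to y = 0.
ys_dragged = ys[:-1] + [0]
r_after = correlation(xs, ys_dragged)
```

With these values, moving one point is enough to turn a strong positive correlation into a negative one.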
Activity 9-5 uses two pre-made Fathom documents. The first (CorrSlider.ftm) is very simple, but gives students their first look at a Fathom slider. They should click on the slider (at zero on startup) and drag it back and forth to change the correlation and note how the corresponding scatterplot changes. (Note: If students change the slider's scale, values beyond +/- 1 are treated as correlations of +/- 1.) This gets them started at judging the value of the correlation from the scatterplot, then they are ready to play the Correlation Guessing Game (CorrGuess.ftm). The game is also quite simple - they see a scatterplot and guess its correlation - but a bit tricky to implement in Fathom since we want to collect their guesses and the actual correlations to analyze when the "game" is over. There are instructions for playing the game that pop up as a text box when students click at the appropriate spot. They can click on the graph to hide those instructions. The first step of the instructions describes how to bring up the Collection Inspector. The inspector will appear at different locations on the screen, depending on individual display characteristics, so students will probably need to move it to a convenient location. The actual correlation will be displayed as a measure in this inspector. Naturally, students should make their guess before clicking on the "Measures" tab to reveal the answer. They need to take special care to click on the "Cases" tab in the inspector to hide the measures before requesting a new plot (Ctrl-Y) and making their next guess. Students have a lot of fun with this activity and frequently develop friendly competitions with their partners or neighboring groups.
A Java applet that implements this activity is available here.
Since we are assuming that students will use Fathom to compute correlations, the Homework Activities will almost all require Fathom. The exceptions are Activities 9-8, 9-15, and 9-16, which look only at the direction of association or examine issues of cause and effect.
Activity 10-1 walks students through a unique introduction to the concept of a least squares line. Fathom allows students to place a "moveable" line on a scatterplot and adjust the slope and position of the line by dragging it. After they settle on a good "eyeball" fit, they can obtain the sum of squared residuals, together with a display of squares drawn at each data point to represent the squared errors. They may then make further adjustments to their line while monitoring the effect on the sum of squared residuals and trying to minimize that quantity. The goal is to give them an intuitive feel for the idea behind least squares estimation. You might encourage a bit of competition among neighboring groups or the class as a whole to see who can obtain the smallest sum of squares value. Students are usually impressed with their own work when they finally ask Fathom to display the actual least squares line and it lines up so nicely with their own line. A word of warning when applying the "Show Squares" option to other data sets - keep the number of data points relatively small to avoid confusing clutter when the error boxes are added to the graph.
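The quantity students are trying to minimize can be sketched in Python as follows. The data and the "eyeball" line are made up, and the function names are ours; the least squares slope and intercept come from the standard textbook formulas rather than from anything specific to Fathom.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

# A plausible "eyeball" line: y = 1 + 1.0 x.
eyeball_ss = sum_squared_residuals(xs, ys, slope=1.0, intercept=1.0)

# The least squares line can do no worse than any eyeball line.
slope, intercept = least_squares_line(xs, ys)
best_ss = sum_squared_residuals(xs, ys, slope, intercept)
```

For these points the eyeball line gives a sum of squares of 3, while the least squares line (slope 0.8, intercept 1.8) brings it down to 2.4.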
Activity 10-3 presents two ways to get at the idea of using r^{2} to measure the proportion of variability in the response variable that is explained by the predictor. The first, 10-3(j), is primarily numeric. Students create a new attribute with the deviations from the mean of the response attribute (airfare) then define a measure to compute the sum of squared deviations. Note the typo in 10-3(j) on page 227 where the "2" at the end of "sum(Deviation2)" should be an exponent, i.e. sum(Deviation^{2}). Since they already know the sum of squared residuals from the regression line, they can easily see how much improvement is gained over using just the mean as a predictor.
Activity 10-3(k) presents a graphical look at this same phenomenon that takes advantage of Fathom's moveable line technology. Students are asked to rotate a moveable line to be horizontal (thus graphically showing no influence due to the predictor variable) and move it up and down to obtain the best fit, again using the sum of squared residuals as their criterion. You might want to switch the order of the last two bullets in 10-3(k) so that students discover that the point at which the horizontal line minimizes the sum of squares is the mean of the response variable. Having both the horizontal (mean) line and the least squares line displayed on the same plot, with the sum of squared deviations for both shown at the bottom of the screen, gives students a good start at understanding what r^{2} is measuring. Note also that the value of r^{2} is always displayed next to the equation of the line when a least squares line is requested for a Fathom scatterplot.
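The comparison between the horizontal (mean) line and the least squares line comes down to two sums of squares. Here is a hedged Python sketch with made-up data; the function name regression_summary is ours, and the last returned value is the proportion of the mean line's sum of squares that the least squares line removes.

```python
def regression_summary(xs, ys):
    """Return (SS about the mean, SS about the LS line, r squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Horizontal line at the mean of the response.
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    # Least squares line.
    ss_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return ss_mean, ss_line, 1 - ss_line / ss_mean


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]
ss_mean, ss_line, r_squared = regression_summary(xs, ys)

# A horizontal line at any other height does worse than the mean.
ss_other_height = sum((y - 4.0) ** 2 for y in ys)
```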
All of the Homework Activities for this topic (except 10-5, 10-12, and 10-17) assume that students will use Fathom and existing datasets to answer questions about least squares lines.
Activity 11-1(g) directs students to use Fathom's default residual plot. The plot appears conveniently right below the scatterplot (using the same horizontal scale) for easy comparison. Students will probably want to enlarge the entire plot in the vertical direction to view both graphs more clearly.
The last part of Activity 11-1(n) takes advantage, once again, of Fathom's graphical linkage to allow students to watch how the regression line reacts as a data point is dragged around a scatterplot. This provides a dynamic illustration for students to experience the concept of an influential point in regression. Be sure that they also try a point with a more typical predictor value to see that the influence is not nearly so strong when the "x"-coordinate is closer to the mean. You might mention that the potential to influence a least squares line is often referred to as the "leverage" of a data point, since the dynamic interplay of a single extreme point and the least squares line makes the analogy to a "teeter-totter" so compelling.
In Activity 11-3 students see how one can easily transform either the predictor or response attribute to try to improve the linearity of the fit. Although some software (and calculators) allow students to pick from a menu of automated transformed regression models, they need to explicitly create the transformed variables in Fathom. This makes the process slightly more cumbersome, but helps ensure that the students understand what they are doing to the data - rather than relying on a "black box" to spit out a model.
All of the Homework Activities for this topic, except 11-15, require the use of Fathom. You might adapt Activity 11-5, which looks for the best power function to relate distance from the sun to period of revolution for planets, to use a Fathom slider to control the exponent. To do this, replace the "2" in the exponent of the formula Distance^{2} with a letter "N" to give Distance^{N}. Then drag a Fathom slider to the screen, change the name of the slider from "V1" to "N" and adjust the scale to cover numbers from around 1 to about 2. Dragging the slider will automatically transform the predictor variable until a good linear fit is obtained (at a power near 3/2). You can obtain a similar effect by starting with a scatterplot of just Distance vs. Revolution and using "Plot Function" to place a Distance^{N} plot on the graph.
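The slider idea can be imitated outside Fathom by trying a handful of exponents and checking how linear the transformed relationship is. In this Python sketch the planetary values are approximate, widely published figures and are not taken from the course's Fathom document; using the correlation as the measure of "good linear fit" is also our choice.

```python
from math import sqrt

# Approximate published values, NOT the course dataset:
# (mean distance from the sun in AU, period of revolution in years)
planets = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000), (1.524, 1.881),
           (5.203, 11.86), (9.54, 29.46), (19.19, 84.01), (30.07, 164.8)]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_quality(power):
    """Correlation between Distance^power and period."""
    transformed = [d ** power for d, _ in planets]
    periods = [p for _, p in planets]
    return correlation(transformed, periods)


# Step the "slider" from 1 to 2 and keep the most linear setting.
by_power = {n: fit_quality(n) for n in (1.0, 1.25, 1.5, 1.75, 2.0)}
best_power = max(by_power, key=by_power.get)
```

Among the exponents tried, 1.5 gives the correlation closest to one, in line with the power near 3/2 mentioned above.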
Activity 12-5 describes the process for using Fathom to generate a random sample from an existing collection. This will probably be students' first exposure to a collection that is derived from another collection - a concept that is very important in later topics that deal with sampling distributions. With multiple collections in the same Fathom document, they usually need to be reminded to select the object they want to work with. For example, dragging in a new case table could show the contents of the original collection, the new sample collection, or a blank (empty) collection, depending on which collection was selected. Students also see in this activity how to use the Inspector for the sample collection to adjust the details of the method of sampling (with/without replacement, sample size) and to perform repeated samples. If you find the sound associated with repeated sampling gets a bit overwhelming, you can either click off the "animation On" option, or turn the sound off with Edit>Preferences, or turn the volume down on individual machines. On Windows machines, hitting Ctrl-Y is a convenient way to quickly generate new samples once the sample collection has been established - but be sure the sample collection has been selected.
Activity 12-6 takes the sampling procedure one step further by introducing a third collection to keep track of the results for repeated samples. Encourage your students to work through this activity very carefully since it's easy to confuse the roles of the three collections. The first is the original collection (i.e. the population in this case), the second contains a typical sample (that changes each time) from that population, and the third keeps track of some aspects of the samples (as defined by the measures on the sample collection). We find it helpful for students to organize their screens into thirds (as shown on page 276) and keep the information related to each of the collections together. The Inspector of the third (Measures from...) collection contains a "Collect Measures" tab that allows students to specify the number of samples to collect. For larger numbers of samples, you will probably want to turn the "Animation" off to avoid delays (depending on the speed of your computers). Although a bit complicated at first, a thorough understanding of this process will help students significantly when they use Fathom to explore characteristics of samples and sampling distributions in later topics.
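The roles of the three collections can be mirrored in a short Python sketch, which some instructors may find helpful when explaining the process. Everything here is an illustrative assumption: the population is a made-up list of 100 cases, the measure is the proportion of "F" cases, and the sample size and number of samples are arbitrary.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Collection 1: a made-up population of 100 cases, 9 of them "F".
population = ["F"] * 9 + ["M"] * 91


def draw_sample(size):
    """Collection 2: one random sample, drawn without replacement."""
    return random.sample(population, size)


def proportion_f(sample):
    """The measure defined on the sample collection."""
    return sample.count("F") / len(sample)


# Collection 3: the measure collected from 200 repeated samples.
measures = [proportion_f(draw_sample(10)) for _ in range(200)]
average_measure = sum(measures) / len(measures)
```

Each entry in the third list corresponds to one sample, just as each case in the "Measures from..." collection does.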
Homework Activity 12-18 is the only homework activity that requires Fathom. It asks students to repeat what they did in Activities 12-5 and 12-6 for samples of size 5 and 20 senators. If you'd like to give your students additional practice, you might suggest that they use Fathom to select samples from the Cal Poly football roster (HW Activity 12-16 uses the random number table). The data can be found in CPFootball.ftm.
Activity 14-4 has students use Fathom's randomBinomial(n,p)
function to simulate the number of girls in families with four and ten
children. Although the instructions have the attributes and formulas created
through the Collection Inspector, your students may be just as comfortable
working from a Case Table. Be sure that they observe the note to
hold down the shift key while dragging the attributes to a Summary Table
so that Fathom will treat them as categorical variables and give counts
(rather than a mean for a quantitative variable).
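For comparison, the same kind of simulation can be sketched in Python. The helper random_binomial below is our own stand-in that counts successes in n independent trials; it is not Fathom's randomBinomial function, and the number of simulated families and the probability of 0.5 are arbitrary choices for illustration.

```python
import random
from collections import Counter

random.seed(2)  # fixed seed so the sketch is repeatable


def random_binomial(n, p):
    """Count successes in n independent trials with success chance p."""
    return sum(random.random() < p for _ in range(n))


# Number of girls in each of 1000 simulated four-child families.
girls_of_four = [random_binomial(4, 0.5) for _ in range(1000)]

# Treat the results as categories and tally them, as the Shift-drag
# into a Summary table does.
counts = Counter(girls_of_four)
```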
Most of the Homework Activities do not require Fathom - with a couple of notable exceptions. HW Activity 14-12 uses random binomial values and is very similar to in-class Activity 14-4. HW Activity 14-13 is a much more ambitious simulation that uses several new Fathom functions, including randompick( ) to simulate individual coin flips by randomly generating "H" and "T" values, the runlength( ) function to keep track of the number of consecutive "flips" with the same outcome, and the max( ) function (as a measure) to find the longest run in a set of 10 simulated flips. This activity also uses Fathom to collect the measures from repeated trials into a new collection to study the distribution of maximum run lengths. While Activity 14-13 is quite involved, it can give motivated students enough experience with using Fathom to simulate a probabilistic situation that they might be able to use it on their own for simulating some other situation of interest.
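The logic of the longest-run simulation can be sketched in Python as follows. The function run_lengths keeps a running count of consecutive equal outcomes, which is how runlength( ) is described above; that it matches Fathom's function in every detail is an assumption, and all names here are ours.

```python
import random


def run_lengths(outcomes):
    """Running count of consecutive equal outcomes, one per position."""
    lengths = []
    for i, outcome in enumerate(outcomes):
        if i > 0 and outcome == outcomes[i - 1]:
            lengths.append(lengths[-1] + 1)
        else:
            lengths.append(1)
    return lengths


def longest_run(outcomes):
    """Longest run, found as the maximum of the running counts."""
    return max(run_lengths(outcomes))


# A fixed sequence, so the running counts can be checked by hand.
example = list("HHTHHHTTHT")
example_lengths = run_lengths(example)
example_longest = longest_run(example)

# One simulated set of 10 flips.
random.seed(3)
flips = [random.choice("HT") for _ in range(10)]
simulated_longest = longest_run(flips)
```

For the fixed sequence the running counts are 1, 2, 1, 1, 2, 3, 1, 2, 1, 1, so the longest run is 3.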
HW Activity 14-9 is a pre-made Fathom simulation (RaquetSpins.ftm) to show a long run probability settling down to a fixed value. Students can use a slider to generate repeated trials and adjust the probability value to obtain various graphs such as those shown in activity 14-3(f).
HW Activity 14-15 asks students to develop a method (using random numbers,
a die, or Fathom) to simulate the number of solitaire games one needs to
play to get a first win when the probability of a win is 1/6 for each game
- then repeat to count the number of tries for each of 25 wins. Here are
a couple of Fathom approaches that one might consider.
Method #1: Use randomBinomial(1,1/6)
function in an attribute called game to simulate each game with
the result being 1 for a "win" and 0 for a "loss". Include lots of cases
(at least 500), then add a second attribute (call it trials) with the
formula runlength(game).
Scroll through the case table looking for 1's (wins) and record the number
of the previous trials plus one as the number of games played to
get that win.
Method #2: For a winning probability of 1/6, create a collection with
a single "W" and five "L" values in its only attribute (call it result).
With the collection selected, choose Analyze>Sample Cases, double-click
on the new "Samples of..." collection to bring up its Inspector,
choose the "Sample" tab, and select the button next to "Until". Be
sure that "With replacement" is checked and double-click in the space under
"Until" to bring up the Formula Editor and enter the formula result="W".
Now whenever you "sample more cases" the number of cases will be the number
of "games" until the first "W" appears. To automate the process even further,
you could create a measure for the "Samples of..." collection with
the formula count(result)
to automatically count the number of games for each win, then use Analyze>Collect
Measures to keep track of those counts in a new collection.
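Either method amounts to counting games until the first win. A short Python sketch of that idea follows, for instructors who want to check typical results. The function name games_until_win is ours, and the code illustrates the process rather than either Fathom procedure.

```python
import random

def games_until_win(p_win, rng):
    """Play simulated games until the first win; return how many were played."""
    games = 0
    while True:
        games += 1
        if rng.random() < p_win:
            return games

rng = random.Random(1415)
tries = [games_until_win(1 / 6, rng) for _ in range(25)]  # one count per win, 25 wins
print(tries)
print(sum(tries) / len(tries))
```

Over many wins the average number of games should settle near 6, the reciprocal of the winning probability, although a set of only 25 wins can stray noticeably from that value.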
Activity 15-1(d) uses a pre-made Fathom document to help students see how changing the mean and standard deviation affects a normal curve. With sliders to control either parameter, students can very quickly investigate these effects.
Activity 15-2 introduces students to Fathom's normalCumulative(x,mean,std dev) function in part (e) for calculating a normal probability. Although part (e) puts the formula in a blank summary table, it could be just as easily entered as a measure for an existing collection. The "reverse" process, normalQuantile(p,mean,std dev), for finding a normal endpoint with a given probability is described in 15-2(k).
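If you want to check Fathom's answers independently, Python's standard library offers the same pair of calculations. In this sketch the wrapper names simply mirror the Fathom functions, and the mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def normal_cumulative(x, mean, sd):
    """Probability that a normal(mean, sd) value falls below x."""
    return NormalDist(mean, sd).cdf(x)

def normal_quantile(p, mean, sd):
    """The "reverse" process: the endpoint with probability p below it."""
    return NormalDist(mean, sd).inv_cdf(p)

# Hypothetical values: mean 100, standard deviation 15.
p = normal_cumulative(115, 100, 15)
x = normal_quantile(p, 100, 15)
print(round(p, 4), round(x, 2))
```

Feeding the probability back into the quantile function should return the original endpoint, which makes a quick classroom check that the two functions really are inverses.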
None of the Homework Activities specifically ask for students
to use Fathom, although they could do so for any that require normal calculations.
You may want to request that a particular technology (Fathom, normal table,
or calculator) be used for certain problems if you want to be sure that
students get some practice in that setting.
Activity 16-3 uses Fathom to simulate proportions from samples
of Reese's Pieces. Note the importance of first doing the simulation "by
hand" with real candies in Activity 16-2 before moving to technology to
simulate the same process many more times and much more efficiently. The
randomBinomial(25,0.45)
function computes the count for an entire sample - so each case
represents a new sample, then a new attribute that divides each count by
25 gives the sample proportion. Note that 16-3(f) uses a shift-drag
to display the sample proportions as categorical data in a summary table
to make it easier to count how many are in each group. The default summary
table size is too small to show the whole table, so you may need to remind
your students to resize it. The Fathom note (at the top of page 372) describes
how to automate this counting process as a measure on the collection. After
working with samples of size 25, 16-3(i)-(m) look at a larger sample size
(75). Finally, 16-3(n) introduces a pre-made Fathom document (SamplePhat.ftm)
that automates this process by attaching both the sample size and population
proportion to sliders. One thing to notice when playing with SamplePhat.ftm
is that the normality assumption visually breaks down as the proportions
become more extreme (near 0 or 1) - especially for small sample sizes.
This helps motivate the conditions for the CLT on page 375. Since the mean
and standard deviation of the sampling distribution are displayed, students
can also watch for the standard deviation to decrease as the proportion
moves away from 0.5.
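The same simulation can be sketched in Python to preview the results. The function name sample_proportions and the 1000 samples per run are our assumptions. The last value printed on each line is the usual theoretical standard deviation of a sample proportion, the square root of p(1-p)/n, for comparison with the simulated one.

```python
import random
import statistics
from math import sqrt

def sample_proportions(n, p, num_samples, rng):
    """Each entry is the proportion of successes in one simulated sample of size n."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(num_samples)]

rng = random.Random(163)
for n in (25, 75):
    p_hats = sample_proportions(n, 0.45, 1000, rng)
    theory_sd = sqrt(0.45 * 0.55 / n)
    print(n, round(statistics.mean(p_hats), 3),
          round(statistics.stdev(p_hats), 3), round(theory_sd, 3))
```

Replacing 0.45 with a value closer to 0 or 1 shows the standard deviation shrinking, in line with what students see when they move the proportion slider.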
We have also developed a Java applet available here for simulating the sampling of Reese's Pieces that students seem to find appealing.
Homework Activities 16-9, 16-11, 16-12, and 16-15 give students
more practice using Fathom to simulate sample proportions.
Activity 17-2 has students use Fathom to simulate sample
means drawn from a population of 1000 pennies. While the concept is similar
to the sample proportions in Activity 16-3 the process is more involved
since we need to select the individual samples - rather than rely on the
random binomial values to simulate the counts. As the introduction in 17-2(a)
indicates, students will eventually have three collections - the original
population, a sample from that population, and the sample means obtained
from repeated samples. You might remind them that they've already seen
this procedure back in Activity 12-6. Encourage them to work through the
directions carefully and keep track of which collection they are using
at each point (e.g. pay attention that the proper object is highlighted).
The latter parts of Activity 17-2 have students explore the effects of changing sample size and the nature of the underlying distribution. Parts 17-2(k) through 17-2(n) are a bit easier since the generation of a sample via Fathom's random number generators removes the necessity of sampling from an existing collection - thus students will need only two collections rather than three. Depending on your students' facility with using Fathom, you might automate this activity even more by giving them a pre-made Fathom document and allowing them to just adjust the parameters.
Homework Activity 17-6 is the only one that asks students to use Fathom to simulate a sampling distribution by methods similar to Activity 17-2. Students may also use Fathom for constructing visual displays in HW Activity 17-5.
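The two-collection version of this simulation, where each sample comes straight from a random number generator, might be sketched in Python as follows. The skewed (exponential) distribution, the sample sizes, and the function name are hypothetical choices for illustration and are not taken from the text.

```python
import random
import statistics

def simulated_sample_means(sample_size, num_samples, rng):
    """Generate each sample directly from a generator; no population collection is needed."""
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]  # skewed, mean 1
        means.append(statistics.mean(sample))
    return means

rng = random.Random(172)
for size in (5, 50):
    means = simulated_sample_means(size, 1000, rng)
    print(size, round(statistics.mean(means), 3), round(statistics.stdev(means), 3))
```

The spread of the sample means should drop noticeably for the larger sample size while their center stays near the mean of the underlying distribution.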
Activity 19-4 introduces students to using Fathom to calculate a confidence interval for a proportion within an Estimate Box. In contrast to other Fathom procedures students have encountered earlier, this may be done without first defining any collection or data. Most of Fathom's inference procedures can work with summary statistics entered directly into the Estimate Box. Any terms that appear in blue type can be replaced with user supplied values. Numeric values such as the confidence level, count, or sample size are actually Fathom formulas (so they could, for example, be linked to sliders). They require an extra step to specify since one must first click on the blue value, then click again when the "Change formula for value..." button appears. Note the graphical "f(x)" that appears next to the cursor when pointing at one of these values. Other values, such as the <Attribute Name> or <Category> can be changed simply by typing over the text in blue. You might also need to remind students to resize the box if all the results are not visible.
If the raw data do appear in a Fathom collection, students can use the usual drag operation to specify a variable to be analyzed in the Estimate Box (after specifying the type of parameter to estimate). Note that the variable(s) should be "dropped" at the top of the Estimate Box (where it says "Attribute(categorical)") - not at the blue <AttributeName> text. When dragging a variable from an existing collection, Fathom computes the proportion from the "first" category (alphabetically). Click on the (blue) category name to bring up a menu of other categories (values for the attribute).
Activity 19-5 uses an existing Fathom document, SimReeses.ftm, to begin the process of simulating many confidence intervals. The original collection in this document consists of a single attribute that is generated by a formula to simulate a sample drawn from a population with a fixed proportion of each of the colors orange (45%), brown (20%) and yellow (35%). Although students don't need to see the details of this formula, you can check to see that it uses Fathom's conditional "if" function to determine if a random number is less than a preset proportion of oranges (POrange) in which case the value is set to "Orange". Otherwise, a second "if" statement determines whether the color is set to "Brown" or "Yellow", based on the value of PBrown. The values for the parameters POrange and PBrown are specified as measures for the collection and the original number of cases is set to 75. This is a somewhat more explicit procedure for simulating a sample of candies than the students encountered in Activity 16-3 when we had them use the randomBinomial function to generate counts for several samples. In this case we want them to use the Estimate Box to find the confidence interval for the proportion of orange candies based on the raw data in a sample - not just the count and sample proportion. Getting the interval in this manner will allow Fathom to collect the intervals for repeated simulated samples automatically with the "Collect Measures" procedure. As part of the new "Measures from..." collection, Fathom records the sample count, sample proportion, lower and upper confidence limits, and a number of other attributes. When setting up the original confidence interval in an Estimate Box, students will need to use the "drag an attribute" method as described above and remember to change the category from "Brown" (the default alphabetically) to "Orange".
The goal of this simulation is to have students generate enough 95% confidence intervals (200) to see that about 5% will fail to cover the "true" proportion of 45%. The sorting procedure described in 19-5(c) helps them find and count the intervals that "fail" in the "Measures from..." collection. Be sure that they check both ends of the list for extreme intervals.
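To get a feel for how many "failures" to expect, the whole process can be sketched in Python. Two assumptions should be noted: random_color is only our guess at the form of the nested "if" formula (a single random number compared against cumulative cutoffs), and orange_interval uses the usual normal-approximation formula, which may differ in detail from Fathom's calculation.

```python
import random
from math import sqrt
from statistics import NormalDist

P_ORANGE, P_BROWN = 0.45, 0.20  # the remaining 0.35 is yellow

def random_color(rng):
    """Nested-if color assignment from one random number (an assumed form)."""
    u = rng.random()
    if u < P_ORANGE:
        return "Orange"
    if u < P_ORANGE + P_BROWN:
        return "Brown"
    return "Yellow"

def orange_interval(sample, level=0.95):
    """Normal-approximation interval for the proportion of orange candies."""
    n = len(sample)
    p_hat = sample.count("Orange") / n
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def count_misses(num_intervals, sample_size, rng):
    """Number of simulated intervals that fail to cover the true proportion."""
    misses = 0
    for _ in range(num_intervals):
        sample = [random_color(rng) for _ in range(sample_size)]
        low, high = orange_interval(sample)
        if not low <= P_ORANGE <= high:
            misses += 1
    return misses

rng = random.Random(195)
print(count_misses(200, 75, rng), "of 200 intervals miss 0.45")
```

With only 200 intervals the number of misses varies a good deal from run to run, so students should not expect exactly 10.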
The Homework Activities provide a number of opportunities for students to practice constructing and interpreting confidence intervals for proportions. Most contain summary data with sample size and count (or sample proportion) where students could use either the formula on page 420 or Fathom to calculate the interval. Particularly in HW Activities such as 19-14 and 19-15 that systematically vary a parameter such as the sample size or confidence level, we recommend using Fathom to perform the calculations.
Activity 20-1 contains the introduction to the t-distribution and starts off with a Fathom document, Tplot.ftm, that displays density plots for both the normal and t-distributions - allowing students to change the degrees of freedom via a slider and compare the distributions.
Activity 20-5 asks students to use Fathom to calculate a confidence interval for a mean. The process is very similar to what they encountered in Activity 19-4 for a proportion confidence interval. When working with a mean, they will more often drag an attribute to the Estimate Box, rather than enter summary statistics directly. An example of the latter can be found in HW Activity 20-10.
Although Fathom can be used to find critical values of the t-distribution, the text never asks students to do so. If you would like to add this, we recommend a formula of the form tquantile(0.975,df), where the first argument is the probability below the desired t* value. This might be done in conjunction with HW Activity 20-7 where students are asked to find lots of t* values from the t-table. You can also download a Fathom document, T-Star.ftm, that has sliders for both degrees of freedom and confidence level - showing a display of the t-density with t* values marked.
Homework Activities 20-7(e), 20-9(c), 20-10, and 20-18 all require Fathom and the software may also be used for 20-8, 20-12, 20-13, 20-15, 20-16, and 20-17 - depending on how you want to balance hand calculation vs. technology.
HW Activity 20-18 takes students through a simulation of many
confidence intervals for repeated samples from a population. This follows
a similar procedure to Activity 19-5 (for proportions), but requires three
collections - the original population (1000Pennies.ftm),
a sample of those pennies, and a collection of measures drawn from an Estimate
Box that has a confidence interval for the mean age in the sample collection.
Students may refer back to Activity 17-2 that used a similar set of three
collections to illustrate the sampling distribution of the mean for the
same penny population.
Activity 21-3 would use Fathom only in part (f), although you might choose to have students do that test and provide the sketch by hand.
Activity 21-4 uses Fathom to very quickly check tests for different sample sizes where the sample proportion stays fixed. If you or your students are feeling especially confident about your Fathom abilities, you might want to automate this activity by creating a slider for the sample size. A technical hint: since the slider can include non-integer values, use Fathom's round( ) function to produce formulas like round(n) and round(0.30*n) for the sample size and count in the Estimate Box.
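The effect of the slider can be previewed with a short Python sketch. The null value of 0.25, the one-sided "greater than" alternative, and the slider positions are hypothetical choices for illustration only. Note also that Python's round( ) sends exact halves to the nearest even integer, which may not match Fathom's rounding in every case.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(count, n, p0):
    """One-sided (greater-than) z-test for a proportion; returns (z, p_value)."""
    p_hat = count / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, 1 - NormalDist().cdf(z)

P0 = 0.25  # hypothetical null value, not taken from the text
for slider_value in (20.4, 50.7, 100.2, 400.9):  # a slider can land on non-integers
    n = round(slider_value)   # sample size must be a whole number
    count = round(0.30 * n)   # keeps the sample proportion near 0.30
    z, p_value = proportion_z_test(count, n, P0)
    print(n, count, round(z, 2), round(p_value, 4))
```

As the sample size grows with the sample proportion held near 0.30, the test statistic increases and the p-value shrinks.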
The Homework Activities all involve proportion tests from summary data and can be done by hand, with Fathom, or a mixture of the two.
Activity 22-2 has students complete the details of a test for a mean by hand and then confirms those calculations using a test via Fathom in 22-2(g). Although most of the proportion tests in the previous activity involved entering the summary statistics directly into a Fathom Test Box, the test for a mean in this section uses the "drag & drop" techniques to specify the attribute to be analyzed. Students may need to be warned to drag the Points attribute to the top of the window where it says "Attribute(continuous):<unassigned>" rather than the blue "Attribute name" that is just below this line - watch for the border to appear when at the right place to drop it. They should also be sure to edit both the direction and value of the hypotheses (shown in blue).
Activity 22-3 starts by asking students about their intuition for how significance depends on sample size and variability of some hypothetical data and then lets them use Fathom to perform several tests quickly to check their conjectures.
Activity 22-4 has students use Fathom to perform a paired difference in means test. Fathom does not have a procedure to do this automatically, rather the students must use the Formula Editor to explicitly create a new attribute with the differences and then perform a test for a single mean on those differences. This activity also has them use Fathom to show the distribution of the test statistic (under Ho) and compute a confidence interval for the average difference.
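The two-step logic (create the differences, then do a one-sample test on them) can be sketched in Python. The data values and the function name are hypothetical. The sketch stops at the t statistic and degrees of freedom, since Python's standard library has no t-distribution; the p-value would still come from Fathom or a t-table.

```python
import statistics
from math import sqrt

def paired_t_statistic(first, second, null_mean=0.0):
    """t statistic and degrees of freedom for a test on the paired differences."""
    diffs = [a - b for a, b in zip(first, second)]  # the new "differences" attribute
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / sqrt(n)
    t = (statistics.mean(diffs) - null_mean) / standard_error
    return t, n - 1

# Hypothetical paired measurements (not data from the text).
before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
after = [11.6, 11.5, 12.2, 12.1, 11.7, 12.0]
t, df = paired_t_statistic(before, after)
print(round(t, 3), df)
```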
Homework Activities 22-6, 22-7, 22-10, 22-11, 22-12 and 22-16 each require Fathom and the software might also be used for testing from summary statistics in 22-9, 22-13, 22-14, 22-15 and 22-17. HW Activity 22-6(d) shows students how to use Fathom as a "t-table" - to find exact p-values using the built-in tCumulative(x,df) function, while 22-6(e) uses a pre-made Fathom document (T-Table.ftm) that attaches sliders to both parameters of the tCumulative function. HW Activity 22-16 is a fairly involved simulation to show the distribution of p-values for repeated samples based on the "Baby Match" problem introduced in Activity 14-2. This uses the trinity of collections (Population, Sample of ..., Measures from...) that students have encountered in several previous simulations, so they might be ready to do this on their own as part of HW at this point. If not, you might want to add this as a regular "in-class" activity for this topic.
Sample data collected on students about the "Preliminaries" questions may be found as a Fathom document here.
Activities 23-1, 23-2, and 23-3 each have students compute confidence intervals and tests for a proportion from summary data with Fathom.
Activity 23-4 introduces students to the concept of power through a simulation with Fathom. They should produce a collection with two attributes containing 1000 cases and the formulas randomBinomial(30,0.250) and randomBinomial(30,0.333), respectively, to simulate the number of hits in 30 at bats for two different hitting abilities. In 23-4(b) they are asked to look at a histogram of results for the 0.250 hitter and find a point that marks the upper 5% of the distribution (thus serving as a critical value for an upper tail test). They want about 50 values in this upper tail, so moving the cursor over each bar in the tail of the histogram would allow them to read the count for each number of hits in the lower left corner of the Fathom screen (see Activity 3-5). As an alternate procedure, they might sort the values in the Case Table and count there. They should find about 12 hits (out of 30) works well. The next item, 23-4(c) asks them to produce a similar display for a 0.333 hitter using the same scale. There are a couple of ways to approach this in Fathom. One could produce two plots and adjust the scales (by dragging or double-clicking on the scale itself to set its bounds). A slightly more complicated method is to click to select the collection containing the simulated values and choose Analyze>Stack Attributes from the top menus. This creates a new collection with both of the original attributes appearing in a single column (named Value) and a second attribute identifying the Group each value came from. Students may then produce a histogram of the Values and drag the Group attribute to the vertical axis to provide separate histograms for the two groups using the same scale. This second approach takes a bit more time to set up, but the plots will update automatically when students edit the formulas of the original collection to simulate 100 at bats in 23-4(g).
Note that students should not feel compelled to give overly precise answers in (d) or (g) - we do mean just a rough estimate of the power in each case.
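A rough Python version of the power simulation in Activity 23-4 is sketched below for instructors who want to preview the numbers. The helper names are ours, and random_binomial is a stand-in for Fathom's function. The fraction of the 0.250 hitter's results at or above the cutoff approximates the significance level, and the same fraction for the 0.333 hitter approximates the power.

```python
import random

def random_binomial(n, p, rng):
    """Count successes in n independent trials, each with success probability p."""
    return sum(rng.random() < p for _ in range(n))

def tail_fraction(values, cutoff):
    """Fraction of simulated values at or above the cutoff."""
    return sum(v >= cutoff for v in values) / len(values)

rng = random.Random(234)
hits_250 = [random_binomial(30, 0.250, rng) for _ in range(1000)]
hits_333 = [random_binomial(30, 0.333, rng) for _ in range(1000)]

cutoff = 12  # the rough critical value mentioned for 30 at bats
print("approx. significance level:", tail_fraction(hits_250, cutoff))
print("approx. power:", tail_fraction(hits_333, cutoff))
```

Changing the 30 to 100 (and choosing a new cutoff from the 0.250 results) mirrors the follow-up in 23-4(g).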
Activity 23-6 uses Fathom to provide summary statistics, compute confidence intervals for means, and then produce visual displays to demonstrate that data with the same summary characteristics might have drastically different appearances.
Almost all of the Homework Activities involve constructing confidence intervals and/or performing tests for proportions or means. You may decide how to balance these between hand calculations and Fathom - but we especially recommend using Fathom in HW Activities such as 23-15, 23-20, 23-21, and 23-23 that require multiple inferences with varying parameters. HW Activities 23-25 and 23-26 definitely require Fathom to have students work out the details of simulations (similar to Activity 23-4) to verify the effects of moving the "true" proportion or changing the significance level on the power of a test. These are both concepts that they considered intuitively in 23-4(h) and 23-4(i).
Activity 24-3 introduces the two sample test for proportions with Fathom. Students, already experienced in filling in "blue" values with summary counts for the test of a single proportion, have little difficulty adapting to the two sample comparison.
All of the Homework Activities involve doing either a confidence interval or test for difference in two proportions (or both). HW Activity 24-6 is the first place students will encounter using Fathom to do a confidence interval for the difference in two proportions. As with earlier topics, you may determine whether you want students to use Fathom to help with the calculations. Data for summary counts are given in each problem. HW Activities 24-7, 24-17, 24-19, 24-22 and 24-23 require calculating several tests or intervals that would be most appropriate to do with Fathom.
Activity 25-2 introduces the use of Fathom to do both t-tests and confidence intervals to compare two sample means. Dragging & dropping pairs of attributes to the same test box makes the comparisons of 25-2(k) especially convenient.
Activity 25-3 asks students to conduct two tests from summary statistics rather than raw data. Although we intend for these to be done by hand, if time is short you could have students enter the summary statistics into a Fathom test box.
Many of the Homework Activities (for example 25-11 thru 25-21) contain raw data in Fathom files. Others have summary data that could be analyzed by hand or with Fathom.
Activity 26-5 has students do a simulation of the distribution of the chi-square statistic by randomizing one of the categorical attributes (attitude towards spending on the space program) while fixing a second attribute (political viewpoint). Note that the categories are ordered alphabetically in Fathom's display and may differ from the order shown in the text. Be sure that your students understand that the "scrambling" of one attribute produces a situation where the two attributes are independent. Once again, they will obtain three collections - the original sample, the scrambled version, and a collection of the chi-square test results from each new randomization. In this case the measures in the third collection are derived from the "Test of Independence" test box, rather than user-defined measures on the intermediate collection.
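The scrambling idea can be sketched in Python as follows. The category labels, group sizes, and randomly generated responses are hypothetical stand-ins for the text's data, and the function names are ours. One attribute stays fixed while the other is shuffled, and the chi-square statistic is recomputed for each shuffle.

```python
import random
from collections import Counter

def chi_square(rows, cols):
    """Chi-square statistic for the two-way table of paired categorical values."""
    n = len(rows)
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    observed = Counter(zip(rows, cols))
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

def scrambled_statistics(rows, cols, num_scrambles, rng):
    """Shuffle cols repeatedly (rows stay fixed) and record each chi-square value."""
    shuffled = list(cols)
    stats = []
    for _ in range(num_scrambles):
        rng.shuffle(shuffled)  # breaks any association with rows
        stats.append(chi_square(rows, shuffled))
    return stats

rng = random.Random(265)
# Hypothetical sample: viewpoint is fixed, spending attitude gets scrambled.
viewpoint = ["Liberal"] * 30 + ["Moderate"] * 40 + ["Conservative"] * 30
spending = [rng.choice(["Too little", "About right", "Too much"]) for _ in viewpoint]
observed_stat = chi_square(viewpoint, spending)
scrambled = scrambled_statistics(viewpoint, spending, 500, rng)
print(round(observed_stat, 2),
      sum(s >= observed_stat for s in scrambled) / len(scrambled))
```

The second printed value is the fraction of scrambled statistics at least as large as the observed one, which serves as an approximate p-value for the test of independence.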
Most of the Homework Activities contain two-way tables that could be analyzed by hand or with Fathom. HW Activity 26-8 is specifically designed with partial calculations given to facilitate hand calculations. HW Activities 26-11 and 26-13 involve generating new tables from existing tables and are best done with Fathom.
Activity 27-2 might be viewed as your Fathom final exam. Start with some basics by producing a scatterplot (27-2(b)), then find least squares lines and correlations (27-2(c) and 27-2(f)). One more simulation starts with 27-2(g) by choosing samples from a large bivariate population to see how the sample slopes behave for repeated samples. We break with Fathom tradition a bit by asking students to record the slope and intercept by hand (27-2(h)), rather than collecting measures from the sample into a third collection. A new Fathom function is introduced (in 27-2(l)) to compute the standard error of the slope - with probably the longest formula name they will encounter, LinRegrSESlope(X,Y) - a good time to remind students that they can click within the Formula Editor's menus to find hard-to-remember or long-to-type formula names. After guiding students through a hand calculation of the t-test for slope, they see the Fathom test (27-2(o)) and then the confidence interval for slope (27-2(s)). Finally, with a bit of effort, they can use Fathom to produce a neat plot showing each of their sample regression lines, giving a visual image of how sample regression lines can vary.
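For reference, here is a Python sketch of the least squares calculations behind this activity. The bivariate population is hypothetical, and slope_and_se uses the standard textbook formula for the standard error of the slope (residual standard deviation divided by the square root of the sum of squared x deviations). We assume, but have not verified, that LinRegrSESlope computes the same quantity.

```python
import random
from math import sqrt

def slope_and_se(xs, ys):
    """Least squares slope, intercept, and standard error of the slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = sqrt(sse / (n - 2)) / sqrt(sxx)
    return slope, intercept, se_slope

rng = random.Random(272)
# Hypothetical bivariate population: y = 3 + 0.5x plus noise (not the text's data).
pop_x = [rng.uniform(0, 20) for _ in range(1000)]
population = [(x, 3 + 0.5 * x + rng.gauss(0, 2)) for x in pop_x]

for _ in range(5):  # five repeated samples of 20 cases each
    xs, ys = zip(*rng.sample(population, 20))
    slope, intercept, se = slope_and_se(xs, ys)
    print(round(slope, 3), round(intercept, 3), round(se, 3))
```

The printed slopes vary from sample to sample by roughly the amount the standard error suggests, which is the point students record by hand in 27-2(h).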
All of the Homework Activities require Fathom to analyze existing
data files. HW Activity 27-9 includes one more simulation to see how the
correlation varies when one of the attributes in a pair is repeatedly scrambled.