Chapter 9: Linear Regression
The correlation coefficient and the regression line are useful tools for analyzing the relationship between two data sets. Remember that the data sets must be paired, meaning that they consist of two measurements tied to the same experimental subject: for example, a person's height and weight, or their years of education and current salary. In this project you will find paired data on the Web and apply regression methods.

A good example of paired data can be found in the field of medicine. The site maintained by the Journal of Statistics Education contains a data archive that stores an interesting variety of data sets. In particular, there is a data set consisting of the heart rates and body temperatures for a group of 100 people.
Proceed to http://www.amstat.org/publications/jse/archive.htm and find the link to the "Normal Body Temperature, Gender, and Heart Rate" data as described by Allen Shoemaker of Calvin College. A text file describes the arrangement of the data. This data set clearly represents paired data, as each set of two measurements is associated with one person.
In the exercises you will be asked to study the correlation between the two physical measurements contained in this data. Be sure you understand the arrangement of the data and what the individual columns represent.
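The two quantities the exercises ask about can be computed directly from their textbook definitions. The following is a minimal sketch in Python, not a prescribed method: the function names `pearson_r` and `regression_line` are made up for illustration, and the sample values are synthetic numbers chosen to lie exactly on the line y = 1 + 2x. They are not taken from the body temperature data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r for two paired lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only: points on the line y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
r = pearson_r(xs, ys)
slope, intercept = regression_line(xs, ys)
print(r, slope, intercept)
```

To use it on the real data, you would read one column into `xs` and the other into `ys`, following the column layout given in the accompanying text file.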

If you look at a newspaper sports page you see talk of "the spread" and in particular "beating the spread." Football analysts associated with the business of sports oddsmaking establish what is called the spread in the week prior to a football game. The pointspread or spread is an attempt to even out the bets. For example, if you bet on the team favored to win, your team must not only win, but it must win by at least a specified number of points.
Data have been collected which pair the specified spread before a game with the actual outcome of the game. Go to the data archives of the Journal of Statistics Education at http://www.amstat.org/publications/jse/archive.htm and locate the data sets that begin with "NFL."
The corresponding text file explains the format of the data set. In the exercises you will be using the data from 1996. In particular you will be concerned with the spread for each indicated game and the score of the game. For example, one row of the data looks like
09/15/96 Jacksonville 3 Oakland 17 - -6.5 40.0
So on September 15, 1996, the Jacksonville Jaguars lost to the Oakland Raiders by a score of 17 to 3. The predicted pointspread value was -6.5. The pointspread may be a positive or negative value. We define the actual score differential to be
Visiting team score (listed first) - Home team score (listed second) = 3 - 17 = -14
So for this game we have the data pair (-6.5, -14): the pregame pointspread and the actual score differential. The exercises will have you evaluate how well the oddsmakers do.
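Turning each row into a data pair is mechanical, so it can be scripted. The sketch below is one possible approach in Python and rests on an assumption: that every row has the same field order as the Jacksonville example above. The text file that accompanies the data set remains the authority on the format. The names `ROW` and `spread_and_differential` are hypothetical, and the fields called `flag` and `last` are parsed but not used, since their meaning is not explained here.

```python
import re

# Assumed field order, taken from the single example row:
# date, visiting team, visiting score, home team, home score,
# a one-token flag, the pointspread, and one final number.
ROW = re.compile(
    r"^(?P<date>\S+)\s+"
    r"(?P<visitor>.+?)\s+(?P<visitor_score>\d+)\s+"
    r"(?P<home>.+?)\s+(?P<home_score>\d+)\s+"
    r"(?P<flag>\S+)\s+"
    r"(?P<spread>-?\d+(?:\.\d+)?)\s+"
    r"(?P<last>-?\d+(?:\.\d+)?)$"
)


def spread_and_differential(line):
    """Return (pregame pointspread, actual score differential) for one row."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"row does not match the assumed format: {line!r}")
    # Differential = visiting team score - home team score
    differential = int(m["visitor_score"]) - int(m["home_score"])
    return float(m["spread"]), differential


pair = spread_and_differential("09/15/96 Jacksonville 3 Oakland 17 - -6.5 40.0")
print(pair)  # (-6.5, -14)
```

Applying the function to every 1996 row would give the list of pairs whose correlation and regression line you are asked to study.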

Time series data are quite simply data measured over a specific time period. Often such data are collected for the purpose of trend analysis and prediction. You can view any set of time series data as paired data where the independent variable is time and the dependent variable is whatever quantity of interest you are measuring.
A great assortment of time series data can be found at the Bureau of Labor Statistics Web site http://stats.bls.gov/data/. The site contains not only data related to the labor force but also interesting consumer pricing information. In particular, from the main page proceed to the Most Requested Series and there click on the link for Average Price Data.
At that point you will see a list of common items and utilities, basic staples that everyone spends money on. From fruit to electricity, the data are interesting, but our focus is on gasoline prices. Using the various check boxes on the Average Price Data page, request a table of unleaded gasoline prices for all years available (it's not that large). Review and understand the table returned in your browser, because it will be the basis of the exercises below.
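Treating a price series as paired data means turning each date into a number and then fitting a line of price against time. The sketch below shows one way to do that in Python. It assumes you have already copied the table into a list of (year, month, price) records; the layout of the table the site returns is not assumed. The names `decimal_time`, `fit_trend`, and `predict` are hypothetical, and the sample prices are synthetic values rising by exactly one cent per month. They are not Bureau of Labor Statistics figures.

```python
def decimal_time(year, month):
    """Map a calendar month to a decimal year, e.g. (1999, 7) -> 1999.5."""
    return year + (month - 1) / 12


def fit_trend(records):
    """Least-squares line price = intercept + slope * t.

    records is a list of (year, month, price) tuples; t is the decimal year,
    so the slope is the estimated price change per year.
    """
    ts = [decimal_time(year, month) for year, month, _ in records]
    prices = [price for _, _, price in records]
    n = len(ts)
    if n < 2:
        raise ValueError("need at least two records")
    mean_t = sum(ts) / n
    mean_p = sum(prices) / n
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(ts, prices))
    sxx = sum((t - mean_t) ** 2 for t in ts)
    slope = sxy / sxx
    intercept = mean_p - slope * mean_t
    return slope, intercept


def predict(slope, intercept, year, month):
    """Price predicted by the fitted line for a given month."""
    return intercept + slope * decimal_time(year, month)


# Synthetic illustration only: one cent per month, starting at 1.00
records = [(2000, month, 1.00 + 0.01 * (month - 1)) for month in range(1, 13)]
slope, intercept = fit_trend(records)
print(round(slope, 4))                               # about 0.12 per year
print(round(predict(slope, intercept, 2001, 1), 4))  # about 1.12
```

A straight line is only one possible model for a price series. Comparing the fitted values with the actual table is a quick check on whether a linear trend is reasonable.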
When you've completed each exercise, click "Submit for Grade" in order to submit your answers to your professor.
© 2000 by Addison Wesley Longman, a division of Pearson Education