Chapter 7

Hypothesis Testing


Introduction

You have learned about a very important class of statistical methods called hypothesis tests. The techniques of hypothesis testing provide a formal way to use sample data to test claims made about population parameters. In this project you will set up and conduct tests on Internet-based data. First you must collect the data and understand both the data and its source.


Batting Performance

In baseball, a player's batting average is computed by dividing the player's number of hits (H) by the player's "at bats" (AB). For example, a player who was at bat a total of 437 times during the season and who got 145 hits has a batting average of

$$\frac{\text{H}}{\text{AB}} = \frac{145}{437} \approx 0.332.$$

Is this number an estimate of the probability that a player will get a hit when he or she steps to the plate?

Not exactly. The number AB does not represent the number of times the player steps to the plate. If, for example, the player is awarded a base on balls (BB), this is not counted as an at bat. So if the hypothetical player above earned 94 bases on balls during the season, the probability of that player getting a hit at any plate appearance is

$$\frac{\text{H}}{\text{AB} + \text{BB}} = \frac{145}{437 + 94} = \frac{145}{531} \approx 0.273.$$

The reasoning behind omitting the bases on balls (walks) in the batting average computation is a good one. The batting average is intended to be a measure of a player's ability to hit the ball. A player who is frequently walked is exhibiting another skill. That player has a good eye for bad pitches and does not swing arbitrarily at the ball. Also, pitchers often intentionally walk really good hitters. Including BB in the batting average computation would necessarily lower the player's batting average, punishing them for being a good player! This is why the two numbers AB and BB are kept separate.

For the purpose of probability computations, however, we need to combine AB and BB for a total number of times at the plate.
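The two calculations can be sketched in a few lines of Python. This is a minimal illustration that follows the text's simplification of counting plate appearances as AB + BB; the function names are hypothetical and chosen only for this example.

```python
def batting_average(hits, at_bats):
    """Batting average: hits divided by at bats (walks excluded)."""
    return hits / at_bats


def hit_probability(hits, at_bats, walks):
    """Estimated chance of a hit per plate appearance.

    Assumption (from the text): plate appearances = AB + BB.
    """
    plate_appearances = at_bats + walks
    return hits / plate_appearances


# The hypothetical player from the text: 145 hits, 437 at bats, 94 walks.
avg = batting_average(145, 437)
p_hit = hit_probability(145, 437, 94)

print(f"Batting average:       {avg:.3f}")    # 0.332
print(f"P(hit per appearance): {p_hit:.3f}")  # 0.273
```

Keeping the two functions separate mirrors the reason AB and BB are recorded separately: the first measures hitting skill, the second estimates a probability.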

Remember the home run data for Mark McGwire from the first project? This data can be found at http://www.homerunchase.com/mcgwirehrs.html. The link for Career Statistics will give additional data from Mark McGwire's career. In the exercises you will be asked to compute probabilities based on these statistics.
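Once the home run and plate appearance counts are in hand, a claim about a probability can be checked with a one-proportion z test. The sketch below is one possible approach, assuming the normal approximation to the binomial is adequate. The counts shown are made-up placeholders, not Mark McGwire's statistics, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, trials, p0):
    """Two-sided z test of H0: p = p0 using the normal approximation."""
    p_hat = successes / trials
    # Standard error is computed under the null hypothesis.
    standard_error = math.sqrt(p0 * (1 - p0) / trials)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Placeholder counts for illustration only (NOT real season data).
home_runs = 45
plate_appearances = 600  # AB + BB, as described in the text

z, p_value = one_proportion_z_test(home_runs, plate_appearances, p0=0.10)
alpha = 0.01
reject = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

To use it with real data, replace the placeholder counts with the season totals taken from the Career Statistics page.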


Utility Prices

At the Bureau of Labor Statistics site http://www.bls.gov/data/, locate the table of average price data for electricity, measured per 500 KWH (kilowatt-hours), for the years 1990 through 2000.

In this table you will see prices collected each month over the course of a year. In the exercises you will be studying the variation within a year as well as across years.


Intelligence and the IQ Test

A person's IQ (Intelligence Quotient) is defined as follows:

$$\text{IQ} = \frac{\text{mental age}}{\text{chronological age}} \times 100.$$

Thus an IQ of 100 implies that one's mental age agrees with one's actual age in years, so such a person is of average intelligence. A 10-year-old with an IQ of 130 has the intelligence level of a typical 13-year-old.

The definition of IQ is not really in debate; rather, the controversy arises over the testing and the interpretation of results. Intelligence takes a great many forms, and it is difficult to design a test that measures all of them accurately.

Below is a set of scores compiled from an on-line IQ test, a number of which exist on the web.

101, 93, 122, 88, 99, 96, 100, 127, 99, 93, 116, 162, 78, 119, 83

The validity of these tests is unknown but the data can still be used in hypothesis tests, as you will do in the exercises.
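As a rough sketch of how such tests might be set up, the Python below computes a one-sample t statistic for a claimed mean and a chi-square statistic for a claimed standard deviation. Both tests assume the scores come from an approximately normal population. The function names are hypothetical, and the critical values are standard table values for 14 degrees of freedom at a 0.05 significance level, hard-coded here as an assumption tied to this sample size of 15.

```python
import math
import statistics

iq_scores = [101, 93, 122, 88, 99, 96, 100, 127, 99, 93, 116, 162, 78, 119, 83]


def t_statistic(sample, mu0):
    """One-sample t statistic for H0: population mean = mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


def chi_square_statistic(sample, sigma0):
    """Chi-square statistic for H0: population standard deviation = sigma0."""
    n = len(sample)
    return (n - 1) * statistics.variance(sample) / sigma0 ** 2


# Table values for df = 14, two-sided tests at alpha = 0.05 (assumed, from tables).
T_CRITICAL = 2.145
CHI2_LOWER, CHI2_UPPER = 5.629, 26.119

t = t_statistic(iq_scores, mu0=100)
chi2 = chi_square_statistic(iq_scores, sigma0=15)

reject_mean = abs(t) > T_CRITICAL
reject_sd = chi2 < CHI2_LOWER or chi2 > CHI2_UPPER

print(f"t = {t:.3f}, reject H0 (mean = 100)? {reject_mean}")
print(f"chi-square = {chi2:.3f}, reject H0 (sd = 15)? {reject_sd}")
```

For a different sample size or significance level, the critical values would need to be looked up again or computed with a statistics package.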


Testing a Physical Constant

You would not think a physical constant could be the subject of a hypothesis test. After all, it's a constant. You simply measure it and that is its value, right? Well, not exactly.

Even an act as simple as using a ruler to measure a length is subject to error. The ruler may be imprecise or perhaps the length falls between two markings on the ruler so it is up to the measurer to estimate. When one thinks of measuring quantities that are on a much grander scale, the prospect for instrument or human error becomes greater as well.

Further, the set of possible measurements, including errors, can usually be assumed to follow a normal distribution whose mean is the true value. The reason is that most measurements would be pretty close to the true value, either a little over or a little under, while fewer would be way off the mark.

Our subject here is the speed of light. In the last half of the 19th century, A.A. Michelson collected 100 experimental measurements of the speed of light through air. Modern-day methods estimate the value Michelson should have observed to be 299,734.5 km/s (as opposed to the speed of light in a vacuum, which is accepted as 299,792.5 km/s). Michelson's original measurements can be found by going to the Data and Story Library http://lib.stat.cmu.edu/DASL and searching for the Michelson data file. Note that 299,000 has been subtracted from each data point, so the reference value of 299,734.5 km/s corresponds to 734.5 in the units of the data file.
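Because the data file stores each measurement minus 299,000, the hypothesized value must be shifted the same way before testing. The sketch below illustrates that subtracting a constant from every reading and from the hypothesized mean leaves the t statistic unchanged. The five readings are invented placeholders, not Michelson's measurements, and the helper name is hypothetical.

```python
import math
import statistics

OFFSET = 299_000          # constant subtracted from each data point in the file
TRUE_VALUE = 299_734.5    # km/s, the value Michelson should have observed


def t_statistic(sample, mu0):
    """One-sample t statistic for H0: population mean = mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


# Invented coded readings for illustration only (NOT the real data).
coded = [820.0, 760.0, 910.0, 880.0, 790.0]
raw = [x + OFFSET for x in coded]  # back on the km/s scale

t_coded = t_statistic(coded, TRUE_VALUE - OFFSET)  # mu0 = 734.5
t_raw = t_statistic(raw, TRUE_VALUE)               # mu0 = 299,734.5

print(f"t from coded data: {t_coded:.4f}")
print(f"t from raw data:   {t_raw:.4f}")
```

The shift changes the sample mean and the hypothesized mean by the same amount and does not change the standard deviation, which is why the two statistics agree.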


Exercises

When you've finished reviewing this project, go on to the exercises below. When you've completed each exercise, click "Submit for Grade" in order to submit your answers to your professor.

1. Using a 0.01 level of significance, test the hypothesis that the probability of Mark McGwire’s hitting a home run when he steps to the plate is 10%. Base the test on 1998 data. Repeat the test using only the 1995 data. How do the conclusions compare? Discuss.

2. Using the IQ test data, test the hypothesis that the mean IQ is 100. Also test the hypothesis that the standard deviation in IQ is 15 points. Summarize your results using a 0.05 level of significance.

3. Test the hypothesis that the standard deviation of the price of 500 KWH of electricity is $1.00. Use the 1995 data.

4. Is the electrical rate in October typically greater than $47/500 KWH? Justify your answer.

5. Formulate and conduct a hypothesis test that determines whether Michelson’s speed of light measurements were in the ballpark of the true value. That is, test whether the set of 100 measurements is consistent with a true value of 299,734.5 km/s plus measurement error.


© 2000 by Addison Wesley Longman
A division of Pearson Education