Chapter 10

Contingency Tables


Introduction

Are voters biased? Are boys or girls more interested in education? Does age affect starting salary? Data used to investigate such questions are called qualitative data, as opposed to the quantitative data you have used so far. Quantitative data correspond to characteristics (quantities) that can be measured numerically, such as age, height, and IQ. Qualitative data are not arrived at through measurement, but rather through observation and survey. Hair color, gender, and political persuasion are examples of qualitative data categories.

The contingency table method allows us to draw conclusions regarding the relationships among qualitative characteristics. In this project you will collect data from two different sites and conduct a contingency table analysis to answer questions.
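As a rough illustration of what a contingency table analysis computes, the sketch below calculates the chi-square statistic for a test of independence. It is a minimal example under stated assumptions, not the procedure prescribed by this chapter: the function name `chi_square_independence` is hypothetical, and the counts are made up for illustration rather than taken from either data set.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom, expected_counts) for a table of counts.

    `table` is a list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count under independence: (row total * column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts for illustration only (not from either data set)
observed = [[30, 20],
            [20, 30]]
stat, dof, expected = chi_square_independence(observed)
print(stat, dof)
```

With 1 degree of freedom, the critical value at the 5% significance level is about 3.841, so a statistic larger than that would lead you to reject independence for a table of this shape.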

In the next few pages you will find descriptions of the data sets and the sites where you will find them.


How To Be Popular In School

A survey was conducted among students in grades 4 through 6 in several school districts in Michigan. Besides providing basic information (age, gender), the students were asked about their personal goals and asked to rank the traits they feel help make a person more popular in school. The results of this survey can be found in the Data and Story Library.

The Data and Story Library is an archive of data sets and related text for use in learning and experimenting with statistics. The library is maintained by the statistics department of Carnegie Mellon University and can be found at http://lib.stat.cmu.edu/DASL. Use the Power Search tool to locate a collection of data under the heading "Popular Kids". The page containing data begins with a description of the survey and describes the data. A typical line of data in this set looks like:

girl     5     10    White     Rural     Elm     Sports     4    3    2    1

This is typical of a lot of data you will find on the Web and elsewhere. The data is partitioned into columns (whose headers may not line up) in a spreadsheet or grid format. You may have to look closely at the data to extract the information you want.

The line of data above indicates that one of the students surveyed was a 10-year-old Caucasian girl who is in the fifth grade at Elm School in a rural school district in Michigan. Her own personal goal is to be good in sports, but she ranks being good in sports fairly low as a factor in being popular. Her ranking of important traits for popularity from most important to least is, in fact,

  1. having lots of money
  2. being handsome or pretty
  3. being good at sports
  4. making good grades

Make sure you understand how this data is organized and how to read a particular entry. Then bookmark the data page for use in the Exercises.
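The reading of the sample line above can be sketched in code. This is only an illustration under assumptions: the field names are hypothetical, and the order of the four ranking columns (grades, sports, looks, money) is inferred from the interpretation given above rather than from the data page itself. It also assumes no field contains internal spaces; if the actual file is tab-delimited or a school name has two words, split on tabs instead.

```python
# Hypothetical field names; the column order is inferred from the sample line.
FIELDS = ["gender", "grade", "age", "race", "area", "school", "goal",
          "rank_grades", "rank_sports", "rank_looks", "rank_money"]
INT_FIELDS = ["grade", "age",
              "rank_grades", "rank_sports", "rank_looks", "rank_money"]


def parse_line(line):
    """Split one whitespace-separated line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


record = parse_line(
    "girl     5     10    White     Rural     Elm     Sports     4    3    2    1"
)

# Order the traits from most important (rank 1) to least important (rank 4)
traits = ["grades", "sports", "looks", "money"]
ranking = sorted(traits, key=lambda trait: record["rank_" + trait])
print(ranking)
```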


The President on Trial

The year 1999 was a historic one in American politics. It marked only the second time in history that a US President, in this case Bill Clinton, stood trial before the Senate after being impeached by the House of Representatives. The Senate voted on two articles of impeachment, one charging the President with perjury and one with obstruction of justice. As you know, the President was acquitted on both articles, but it is interesting to examine the details of the Senate vote.

One site containing details of the Senate vote is found at http://www.amstat.org/publications/jse/archive.htm, a data archive maintained by the Journal of Statistics Education. Once at the archive page, you should find the data set impeach.dat and the corresponding file impeach.txt, which describes the organization of the data.

For example, one line reads:

lugar    IN    1    1    2    1    64    42    2000    0

and this line is interpreted as follows:

Senator Lugar (Richard Lugar) is a Republican Senator from the state of Indiana, a state in which 42% of the votes in the 1996 Presidential election were for Bill Clinton. He has served for more than one term and will be up for re-election in the year 2000. On a scale of 0-100, with 100 being the most conservative, Senator Lugar was assigned a conservatism rating of 64. He found President Clinton guilty of both perjury and obstruction of justice, for a total of 2 guilty votes at the trial.
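The same interpretation can be sketched in code. The field names below are hypothetical, and both the column order and the coding (1 for a guilty vote, 1 for Republican, 0 for not serving a first term) are inferred from the Lugar example; the file impeach.txt remains the authoritative description.

```python
# Hypothetical field names; order and coding are inferred from the sample line.
FIELDS = [
    "name", "state", "perjury_vote", "obstruction_vote", "guilty_votes",
    "party", "conservatism", "clinton_pct_1996", "next_election", "first_term",
]


def parse_senator(line):
    """Split one whitespace-separated line of the data into a dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    # Every field after name and state is numeric in the sample line
    for name in FIELDS[2:]:
        record[name] = int(record[name])
    return record


senator = parse_senator("lugar    IN    1    1    2    1    64    42    2000    0")

# Sanity check: the two individual votes should add up to the total
is_consistent = (
    senator["perjury_vote"] + senator["obstruction_vote"] == senator["guilty_votes"]
)
print(senator["name"], senator["state"], is_consistent)
```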

The exercises will have you analyze the relationships among the recorded characteristics of the senators. Be sure you understand how to read a line of data in the data set and then bookmark the appropriate page. For more information on a Senator, such as their full name, visit http://www.senate.gov.


Exercises

When you've completed each exercise, click "Submit for Grade" in order to submit your answers to your professor.

In all Exercises use contingency tables when applicable.
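Before any test can be run, the raw records have to be tallied into a table of counts. The sketch below shows one way to do that with the standard library. The function name `contingency_table`, the field names, and the five records are all hypothetical and serve only to show the mechanics; they are not drawn from either data set.

```python
from collections import Counter


def contingency_table(records, row_key, col_key):
    """Cross-tabulate two categorical fields of a list of dict records."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    # A Counter returns 0 for combinations that never occur
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


# Hypothetical records for illustration only
records = [
    {"area": "Rural", "goal": "Sports"},
    {"area": "Rural", "goal": "Grades"},
    {"area": "Urban", "goal": "Grades"},
    {"area": "Urban", "goal": "Grades"},
    {"area": "Urban", "goal": "Popular"},
]

rows, cols, table = contingency_table(records, "area", "goal")
print(cols)
for label, counts_row in zip(rows, table):
    print(label, counts_row)
```

Sorting the labels keeps the row and column order reproducible from run to run, which makes it easier to compare tables built from different pairs of columns.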

Exercises 1-4 use the "Popular Kids" data. Exercises 5-7 use the Senate impeachment vote data.

1. Are gender and age independent variables? Why would such a question be asked at the beginning of a study such as this?

2. Is a student's choice of Sports as a personal goal dependent on the school they attend? Justify your answer.

3. Are rural school children or urban school children more likely to believe Money is the most important popularity trait? Justify your answer.

4. Choose any two columns of data and examine the dependence of the corresponding variables. Discuss your conclusions.

5. Did the Senate vote along party lines? That is, did Republicans vote guilty while Democrats did not? This may seem like a no-brainer, but Republicans held a majority in the Senate at the time of the vote and neither article received a majority of guilty votes, so not all Republicans can have voted guilty.

6. Define a "liberal" as someone with a conservatism rating of 50 or less and a "conservative" as someone with a rating greater than 50. Was casting 2 guilty votes independent of being a "liberal" or a "conservative"?

7. Among senators serving their first term, was the vote on the perjury count dependent on whether the senator was up for re-election in the year 2000?


© 2000 by Addison Wesley Longman
A division of Pearson Education