Lung Cancer and Smoking Data

 

 

 

The Problem:

 

By examining data on levels of smoking and mortality from lung cancer for people in different professions, can a link be found between smoking and lung cancer death?

 

 

 

This is an example of importing data that is in an "HTML Table" format.  Excel has a "Web Query" feature to facilitate the data transfer.

 

For this example, we will use data from a web site produced by mathematics educators at the University of Virginia.  Numerous web sites now present data in "HTML Table" format.  An example involving traffic safety can be found at

http://www.pbs.org/teachersource/mathline/concepts/transportation/activity1.shtm

 

 


 

Harvesting the Data from Excel:

 

Step 1:            The data as well as a classroom activity description is available from the following web page:

http://curry.edschool.virginia.edu/teacherlink/content/math/activities/ex-smoking/

 

                        The data may be access directly at:

http://curry.edschool.virginia.edu/teacherlink/content/math/activities/ex-smoking/data.html

Copy URL (address of the web page).

 

 

 

Step 2:            Open the Excel spreadsheet.

 

From the menu item Data, use Import External Data, New Web Query

Paste the URL in the box titled Address and press Go.    Scroll down to find the data.  Click on the yellow arrow for the area that contains the data you wish to import (it changes to a check mark). 

 

                    

                     Click the Import button at the bottom. A dialog box will appear asking you where you want the data located within Excel.  Indicate cell A1 and press "ok."   It may take a few minutes for Excel to pull the data from the web page to the spreadsheet.  

 

 

 

 


 

Representing and Analyzing the Data in Excel:

 

Step 1:             Representing the data in a graph.

                        Highlight data in last two columns. 

           

                       

Click on Chart tool (it's a color bar below the word Help).

                        Click XY (Scatter) then click on Finish button at the bottom.

                        You should now have a scatter plot of the data on the screen similar to the window in the figure on the left below.

 

   

 

 

Step 2:            With the scatter plot chart selected, select the Chart menu item and pull down to Add Trendline· .

Select Linear and then click on Options tab at the top.

Click on the boxes in front of Display equation on chart and Display R-squared value on chart (checks will appear in each box).  Press OK.  You might need to move the equation and r-squared value so it can be seen.  You should have a screen similar to the window in the figure to the right above.


 

 

Questions to answer based on the analysis:

 

  1. If the Smoking Index for Students were 80, what would be the Mortality Index?

 

  1. R-squared represents the percentage of "variance" explained by the linear equation.  In this case, it's 51%.  What other factors besides smoking might contribute to mortality?

 

  1. Is there a strong link between smoking and mortality?  Use the data and the chart to explain your answer.