Assignment 3
Exploring Correlations & Regression

  

The objective for this week is to delve further into correlation analysis and begin multivariate analysis. After you finish reading and working through Chapters 17 & 19 in Babbie & Halley’s Adventures in Social Research, complete the following brief exercises. These questions assume you have already gone through the assigned chapters. Submit your work next week (November 5) at the end of the class session. You will need to use both the GSS.SAV data supplied by Babbie and the assigned GSS-year that you put into your P-drive. Please do not use the file in the K-drive folder.

  1. First, open your Babbie data file. Nearly all of you said that you were troubled by the distinction between the correlation coefficient (the number that describes the strength of the observed relationship between two variables) and the p-value (the probability-statistic that reveals the reliability of the correlation coefficient). This first exercise is designed to again walk you through the process of assessing a relationship between two variables. For this exercise, you will compute the correlation and then discuss the meaning of the findings. 
  2. Recall, when you set out to discuss a correlation, your objective is to do more than report the correlation coefficient. The coefficient is computed for you by SPSS, and your responsibility (as the human mind) is to make sense of it. First, you describe the strength of the observed relationship (e.g., weak, moderate, etc.) according to the size of the coefficient. THEN you interpret the coefficient (i.e., you discuss the relationship between the two variables). For example, you might examine the empirical relationship between education (EDUC) and attitude toward capital punishment (CAPPUN). If you (take a moment to) compute the correlation coefficient, you find it is .066; you also see that the assessed statistical significance is .013. Thus, the size of the coefficient shows us that a weak but significant (or reliable) relationship exists between education and attitude; people with more education are more likely to oppose capital punishment. Notice that my interpretation emphasizes the probability of the relationship.

    Now, compute and interpret the correlation between POLVIEWS and CLASS. [Pssst: You might want to compute a frequency distribution of each variable to see the way the data were coded and thus understand how to interpret a positive or negative correlation coefficient. And, you are computing a correlation using CORRELATE → BIVARIATE.]
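The difference between the size of a coefficient and its significance can be sketched numerically. The snippet below is only an illustration: it takes the coefficient .066 from the EDUC/CAPPUN example, pairs it with two hypothetical sample sizes (100 and 1500, neither taken from the GSS file), and uses a normal approximation for the p-value rather than whatever SPSS computes internally.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def approx_two_tailed_p(r, n):
    """Two-tailed p-value via a normal approximation (reasonable for large n)."""
    t = abs(t_statistic(r, n))
    return math.erfc(t / math.sqrt(2))

r = 0.066  # coefficient quoted in the EDUC/CAPPUN example

# Both sample sizes are hypothetical, chosen only for illustration.
p_small_sample = approx_two_tailed_p(r, 100)
p_large_sample = approx_two_tailed_p(r, 1500)

print(f"r = {r}, n = 100:  p is about {p_small_sample:.3f}")
print(f"r = {r}, n = 1500: p is about {p_large_sample:.3f}")
```

The coefficient (the strength) is the same in both runs; only the p-value (the reliability) changes with the number of cases. That is why a weak relationship can still be a significant one.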


  3. The primary objective for this next task is to describe the relationship between one variable and three others. You will compute a correlation matrix (i.e., in this case, a table involving four variables with the diagonal showing correlations of 1.00 [a variable correlated with itself]). You discuss only three correlation coefficients. Make sure you understand which three before you try to make sense of the table, just as you made sense of the single correlation coefficient in exercise 1 (above). 
  4. Compute and interpret the relationship of XMOVIE and SEX, AGE, and EDUC. 
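For readers who want to see what a correlation matrix holds, here is a minimal sketch in Python. The eight rows of values are made up for illustration and are NOT GSS data (the codes are arbitrary), and `pearson_r` is a hand-written helper, not an SPSS function. The point is only the layout: a four-by-four table with 1.00 on the diagonal, of which you discuss the three coefficients that involve the variable of interest.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up toy values, NOT GSS data; the codes are arbitrary.
data = {
    "XMOVIE": [1, 2, 2, 1, 2, 1, 2, 2],
    "SEX":    [1, 2, 2, 1, 1, 1, 2, 2],
    "AGE":    [23, 67, 45, 31, 58, 29, 72, 40],
    "EDUC":   [16, 12, 14, 12, 10, 18, 8, 13],
}
names = list(data)

# Full matrix: every variable correlated with every variable.
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

# Only the coefficients pairing the target with each other variable get discussed.
target = "XMOVIE"
to_discuss = {b: matrix[target][b] for b in names if b != target}

for name, r in to_discuss.items():
    print(f"{target} with {name}: r = {r:+.3f}")
```

The remaining off-diagonal cells (for example, AGE with EDUC) are in the table but are not part of the question being asked.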


     

  5. Change data, and open your GSS year from your P-drive. You are to repeat the multivariate analysis of CHATT, AGECAT, and SEX (found in Babbie, pp. 179-181) for your GSS-year, but use CHATT2, AGECAT, and SEX. (If you have not built the first two variables, you will need to recode ATTEND into a different variable, CHATT2, as noted in assignment 1, and recode AGE into AGECAT as instructed by Babbie, pp. 55-56.)
  6. Percentage who attend worship services about weekly

                   Under 21       21-39         40-64       65+
    men            ________       ________      ________    ________
    women          ________       ________      ________    ________
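The logic behind a table like this one can be sketched as follows. The records are invented for illustration (they are not GSS cases), the age brackets are taken from the column headings above, and the true/false "attends about weekly" flag is only a stand-in for whatever coding your CHATT2 variable uses.

```python
from collections import defaultdict

def age_category(age):
    """Collapse age in years into the four brackets used in the table."""
    if age < 21:
        return "Under 21"
    if age <= 39:
        return "21-39"
    if age <= 64:
        return "40-64"
    return "65+"

# Invented records: (sex, age, attends about weekly). NOT GSS data.
respondents = [
    ("men", 19, False), ("men", 25, False), ("men", 30, True),
    ("men", 50, True), ("men", 55, False), ("men", 70, True),
    ("women", 20, True), ("women", 35, True), ("women", 38, False),
    ("women", 45, True), ("women", 61, True), ("women", 80, True),
]

# (sex, age bracket) -> [number attending about weekly, total respondents]
counts = defaultdict(lambda: [0, 0])
for sex, age, weekly in respondents:
    cell = counts[(sex, age_category(age))]
    cell[0] += int(weekly)
    cell[1] += 1

# Percentage attending about weekly within each sex-by-age cell.
percent_weekly = {key: 100 * w / total for key, (w, total) in counts.items()}

brackets = ["Under 21", "21-39", "40-64", "65+"]
print("".ljust(8) + "".join(b.rjust(10) for b in brackets))
for sex in ("men", "women"):
    row = "".join(f"{percent_weekly[(sex, b)]:10.0f}" for b in brackets)
    print(sex.ljust(8) + row)
```

Each cell is a percentage within one sex-by-age group, so the cells in a row do not add up to 100.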
     

  7. In your GSS-year data set, build SEX2, as you did in Babbie, pp. 185-186. Then, compute a linear regression with SATHEALT (satisfaction with health and physical condition) as the dependent variable and RELITEN, AGE, CLASS, and SEX2 as the independent variables. Be sure to make "stepwise" visible in the "method window." Delete the first table in this output (variables entered/removed), then print the rest of this output. [Whoever is using GSS85, make HAPPY (taken all together, how would you say things are these days – very happy, pretty happy, or not too happy) your dependent variable, not SATHEALT.]

  8.  
      a. Which of the four variables proved to be the strongest predictor of people’s satisfaction with their health and physical condition? ________________________ 

      b. Which of the four variables proved to be the second strongest predictor of people’s satisfaction with their health and physical condition? ________________________ 

      c. Which of the four variables proved to be the weakest predictor of people’s satisfaction with their health and physical condition? ________________________ 

      d. What is the R (statistic) after the three variables are entered into the regression equation? (Find it in the R column of the table titled "model summary.") ________ 

      e. Recompute this linear regression, adding INCOME as a fifth independent variable. [Just add income to the independent variables you already have included.] Examining the "model summary" table in your new output, identify which variable proved to be the second strongest predictor and which proved to be the weakest predictor. 
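To make the ideas of "stronger predictor" and "R" a little more concrete, here is a small sketch for the simplest multivariate case: one dependent variable and two independent variables. It is not the stepwise procedure SPSS runs, and the numbers are made up (the names `predictor_a`, `predictor_b`, and `outcome` are hypothetical, not GSS variables). It uses the standard two-predictor formulas for standardized coefficients and for the multiple R.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def two_predictor_regression(y, x1, x2):
    """Standardized coefficients and multiple R for exactly two predictors."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    multiple_r = math.sqrt(beta1 * r_y1 + beta2 * r_y2)
    return beta1, beta2, multiple_r

# Made-up toy values, NOT GSS data.
predictor_a = [1, 2, 3, 4, 5, 6]
predictor_b = [2, 1, 4, 3, 6, 5]
outcome = [5.5, 6.5, 13.2, 14.8, 20.6, 23.4]

beta_a, beta_b, multiple_r = two_predictor_regression(outcome, predictor_a, predictor_b)

print(f"standardized coefficient, predictor_a: {beta_a:.3f}")
print(f"standardized coefficient, predictor_b: {beta_b:.3f}")
print(f"multiple R: {multiple_r:.3f}")
```

In this toy example the predictor with the larger standardized coefficient contributes more to predicting the outcome once the other predictor is taken into account, and R summarizes how well the two together predict it. With four or five predictors the arithmetic is more involved, which is why SPSS does it for you.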

     
  9. Now, visit a couple of web-sites.
    [If you are reading this assignment using email, you can toggle the hyperlinks below and off you go. If you are reading the assignment by having first downloaded it into Microsoft Word as a separate document, you can highlight a web-address, copy it, paste it into the bar-line on the Holy Cross frontpage that loads when you open Netscape, tap "enter," and you will automatically head to the web-site.]

    http://csa.berkeley.edu:7502/D3/GSS96/Doc/gss9h01.htm 
    This steers you toward a site at Berkeley that has regrouped the General Social Survey variables for 1972-1996 into categories. Explore it, finding the "personal concerns" that are classified as "religion." Toggle this link, and then scroll down the page until you find "pray." Print the page. (You can get up to 8 pages, so use a college printer with its paper.)

    Visit next  
    http://www.soc.qc.edu/QC_Software/GSS.html 
    Toggle the "Online search of the cumulative GSS codebook" and search for "religiosity". How many variables were linked to this concept? _____  
    Next, search "religious" and identify how many variables are associated with this keyword. _____ 
    (Did you need to change the maximum number of variables to display?) 

    Return to the frontpage for this site, and toggle the "online search of the GSS subject index to questions." Scroll down to "religion," toggle this link, and print the page that appears. 

    Again, return to the frontpage for the site, and toggle the "online search of the GSS annotated bibliography." Search the keyword "religiosity." How many articles, papers, etc. were identified? ______ 
    Ask the system to present the abstracts for the articles and papers. What is the fourth article listed? 
    ______________________________________________________________________________ 
    ______________________________________________________________________________