SPSS II

Course Instructor Guide

 

 

Outline:

 

  1. Review last class, SPSS I
  2. Simple Regression
  3. Create a chart
  4. Recode variables
  5. Cross tabulations
  6. Output interpretation
  7. ANOVA vs. T-Test
  8. Compare Means
  9. One-Way ANOVA
  10. Importing
  11. Exporting
  12. Open class for Q&A

 

 

In Detail:

 

  1. Review last class, SPSS I
    1. What is SPSS?
    2. Data Entry
    3. Frequency Distributions and Descriptive Statistics
    4. Editing a Chart
    5. One Sample t-Test
    6. Independent Sample T-Test
    7. Correlation
    8. Similar process
  2. Simple Regression
    1. Open SPSS
    2. In the SPSS II folder in the Course Folder on the desktop, open regression.sav
    3. Example

                                                               i.      A second-grade teacher at Lake Merced elementary school believes that the amount of time parents spend reading to or with their children is a fairly accurate predictor of overall school performance.

                                                             ii.      At the end of the school year, each child's letter grades for the entire year were used to calculate an overall GPA. During a PTA meeting, parents were given a questionnaire on which they were asked to indicate the average amount of time per week (in hours) that they spend reading to or with their children. The results are in this file.

                                                            iii.      In this example, we wish to (1) find the equation that best represents the linear relationship between variables X and Y and that allows us to best predict Y scores (GPA) from X scores (Reading Time); (2) determine the strength of this relationship; and (3) test the null hypothesis that, in the population from which this sample was drawn, the slope of the prediction line is zero (that is, X scores and Y scores are unrelated).

    4. Analysis

                                                               i.      Choose Analyze > Regression > Linear

1.      Send GPA to the dependent variable box

2.      Send Time to the independent variable box

3.      Click on Statistics

a.       Check Descriptives

b.      Click Continue

4.      Choose OK

                                                             ii.      Output

1.      Discuss table contents

a.       Good information but a chart or graph would be very helpful
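For instructors who want to show what the Linear Regression dialog computes, the following is a minimal sketch in Python. It finds the least-squares slope and intercept and the correlation r, which correspond to goals (1) and (2) of the example. The function name and the sample values are assumptions made for illustration; they are not the contents of regression.sav and not the output SPSS will display.

```python
from math import sqrt


def simple_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation between x and y
    return slope, intercept, r


# Hypothetical values for illustration only (NOT taken from regression.sav)
reading_hours = [1, 2, 3, 4, 5]
gpa = [2.0, 2.5, 3.0, 3.2, 3.8]

slope, intercept, r = simple_regression(reading_hours, gpa)
print(f"Predicted GPA = {intercept:.2f} + {slope:.2f} * Time, r = {r:.3f}")
```

A slope of zero would mean Time tells us nothing about GPA, which is the null hypothesis that the significance test in the SPSS output addresses.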

  3. Create a chart
    1. When doing a linear regression analysis, it is always a good idea to visually examine the scatterplot of the two variables
    2. In the Output View

                                                               i.      Choose Graphs > Scatter > Simple > Define

1.      Move GPA to the Y Axis

2.      Move Time to the X Axis

3.      Click OK

                                                             ii.      Right click on the Chart

1.      Choose SPSS Chart Objects > Open

a.       Choose the Chart Options button

                                                                                                                                       i.      Fit Line Total

1.      Fit Options > Linear Regression

2.      Choose Continue

                                                                                                                                     ii.      Click OK

                                                            iii.      Examine the chart and table information and discuss output

    3. Output

                                                               i.      The means, standard deviations, number of cases, and a correlation matrix are produced for the variables GPA and Time. The correlation between these two variables is .860, which indicates a strong, positive relationship; children whose parents spend more time reading to or with them tend to perform better in school.

                                                             ii.      On the scatterplot, each point represents a subject. Notice that in this particular example the points seem to fall in a curved pattern, which can be seen especially clearly with the regression line on the graph. The subjects with low Times and with high Times all fall below the regression line. This suggests the existence of a non-linear relationship between GPA and Time, contrary to the usual linearity assumption of regression.

  4. Recode variables
    1. In the SPSS II folder, open sfwomen.sav
    2. Do a frequency run on the variable, Age

                                                               i.      Too many ages

1.      Discuss even distribution when running an analysis

                                                             ii.      For a crosstabs we want no more than 4 or 5 groups

    3. Must recode Age

                                                               i.      Go back to the Data Editor

    4. Choose Transform > Recode > Into Different Variables

                                                               i.      Select Age as the input variable

                                                             ii.      Create an Output Variable name, agegroup

                                                            iii.      Create a Label, Age recoded into age groups

                                                           iv.      Choose Old and New Values

1.      Using the information from the frequency run

2.      Create 5 groups

a.       Group 1

                                                                                                                                       i.      Lowest through 19

b.      Group 2

                                                                                                                                       i.      20 through 24

c.       Group 3

                                                                                                                                       i.      25 through 30

d.      Group 4

                                                                                                                                       i.      31 through 41

e.       Group 5

                                                                                                                                       i.      42 through 98

f.        Value 99 is System-missing

3.      Click Continue

                                                             v.      Choose the Change button

                                                           vi.      Click OK

    5. Discuss the new variable
    6. Change the variable information in Variable View

                                                               i.      Decimals

                                                             ii.      Values

                                                            iii.      Missing
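The Old and New Values mapping above can be sketched in Python to make the group boundaries explicit for the class. This is an illustrative sketch only: the function name and the sample ages are assumptions, and treating ages outside the listed ranges as missing is also an assumption about how unlisted values should be handled.

```python
from collections import Counter


def recode_age(age):
    """Map a raw age to the agegroup codes 1-5 used in this exercise.

    Returns None (standing in for system-missing) for the code 99.
    Assumption: any value outside the listed ranges is also treated as missing.
    """
    if age is None or age == 99:
        return None
    if age <= 19:
        return 1  # Lowest through 19
    if age <= 24:
        return 2  # 20 through 24
    if age <= 30:
        return 3  # 25 through 30
    if age <= 41:
        return 4  # 31 through 41
    if age <= 98:
        return 5  # 42 through 98
    return None


# Hypothetical ages for illustration only (NOT taken from sfwomen.sav)
ages = [17, 19, 20, 24, 25, 30, 31, 41, 42, 98, 99]
agegroup = [recode_age(a) for a in ages]

# Frequency count of the new variable, similar in spirit to a frequency run
print(Counter(agegroup))
```

Checking the values on either side of each boundary (19 and 20, 24 and 25, and so on) is a quick way to confirm that the recode matches the intended groups.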

  5. Cross tabulations
    1. The most commonly used analytical method in the social sciences

                                                               i.      Enables us to test whether there is a relationship between variables, and to examine the strength and direction of that relationship

                                                             ii.      Used primarily with nominal and ordinal variables

    2. Choose Analyze > Descriptive Statistics > Crosstabs

                                                               i.      Move variables

1.      agegroup into the Row(s) box

2.      American or Foreign Born (usforn) into the Column(s) box

                                                             ii.      Choose Display Clustered Bar Charts

                                                            iii.      Choose Statistics

1.      Check Chi-square

a.       The size of the difference between the observed and the expected values, in conjunction with the sample size, determines whether or not there is a statistically significant relationship between the two variables

b.      Should always be run with a Crosstabs

2.      Check Lambda and Uncertainty coefficient

a.       Both indicate the ability to predict the value of one variable when we know the value of the other.

3.      Check Correlations

4.      Click Continue

                                                           iv.      Choose Cells

1.      Check

a.       Observed

b.      Expected

c.       All percentage boxes

2.      Click Continue

                                                             v.      Choose Format

1.      Ascending

                                                           vi.      Click OK

  6. Output interpretation
    1. Look at the numbers in the tables

                                                               i.      Provides us with raw frequency values

                                                             ii.      The clustered bar chart shows us these numbers

    2. It is generally easier to analyze the data if one looks at percentages

                                                               i.      Show that percentages add up to 100%

                                                             ii.      Break down the percentages for each group and demonstrate how these help. You should see the percentages rise or fall from one age group to the next

1.      We see that the older a woman is, the more likely she is to be foreign born; conversely, the younger she is, the more likely she is to have been born in the United States

2.      Given that this census was taken in 1900, this pattern makes sense

    3. Expected Count

                                                               i.      The values that would exist if there were no relationship between the two variables

    4. Chi-square

                                                               i.      In most cases, the relevant statistic is Pearson's Chi-square

                                                             ii.      Start with the Null Hypothesis: That there is no relationship between the two variables

1.      If the Asymp. Sig. is above .05, we cannot reject the null hypothesis, so there is no statistically significant relationship. If it is below .05, there is a statistically significant relationship

2.      If more than 20% of the cells have an expected frequency below five, the Chi-square measure is not valid, regardless of the significance level

a.       The Chi-square is sensitive to the number of cells in the table and the size of the sample

b.      The larger the number of cells in the table, the less likely you are to get a significant result

c.       The greater the number of cases in the sample, the more likely you are to get a significant result

                                                            iii.      How Strong is the relationship?

1.      The Chi-square tells you whether or not there is a relationship between two variables, but it does not reveal the strength of that relationship

a.       If the Chi-square indicates a relationship you must look at other measures to indicate the strength of that relationship

2.      When either of the variables you are looking at is a nominal variable, the appropriate measures are Lambda and the Uncertainty coefficient
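To make the Expected Count and Chi-square items concrete, here is a minimal Python sketch of how the expected values and Pearson's Chi-square follow from the observed counts, along with the 20% rule for small expected frequencies. The function names and the 2 x 2 table are assumptions made for illustration; the counts are not from sfwomen.sav.

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    expected = expected_counts(table)
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def share_of_low_expected_cells(table, threshold=5):
    """Fraction of cells whose expected count is below the threshold (the 20% rule)."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical observed counts for illustration only (rows: two age groups,
# columns: born in the U.S. / foreign born). NOT taken from sfwomen.sav.
observed = [[10, 20],
            [30, 40]]

expected = expected_counts(observed)
chi2, df = pearson_chi_square(observed)
low_share = share_of_low_expected_cells(observed)

print("Expected counts:", expected)
print(f"Chi-square = {chi2:.3f} with {df} degree(s) of freedom")
print(f"Share of cells with expected count below 5: {low_share:.0%}")
```

The sketch stops at the statistic itself; SPSS additionally converts it to the Asymp. Sig. value discussed above.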

  7. ANOVA vs. T-Test
    1. Both are used to compare the differences in means

                                                               i.      Between two groups

1.      Use a T-Test

                                                             ii.      More than two groups

1.      Use an ANOVA

a.       Stands for Analysis of Variance

    2. Go back to the Data Editor
  8. Compare Means
    1. Choose Analyze > Compare Means > Means

                                                               i.      Check Options

                                                             ii.      Click OK

    2. Output

                                                               i.      Shows the average age for women in each marital group

                                                             ii.      Doesn't show whether these differences are statistically significant

1.      To find significance, run an ANOVA

  9. One-Way ANOVA
    1. Choose Analyze > Compare Means > One-Way ANOVA

                                                               i.      Age is the dependent variable

                                                             ii.      Marital Status is the factor (independent variable)

                                                            iii.      Choose Options

1.      Choose Descriptives

                                                           iv.      Choose Post Hoc

1.      Choose LSD

2.      Significance Level .05

                                                             v.      Click OK

    2. Output

                                                               i.      Each row shows the relationship between the marital groups

1.      Examples

a.       Single women are indeed younger than the other groups

b.      There is no significant difference between married women and women who are divorced or married but not living with their husbands, or between widowed and divorced women

                                                             ii.      Challenge the class to add a Simple Error bar graph

                                                            iii.      Help those in need

                                                           iv.      Show how to do this

1.      Age is the Variable

2.      Status is the Category Axis

                                                             v.      If the error bars for two groups overlap, the difference between those groups is likely not significant
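For a class that asks where the F value in the ANOVA table comes from, the calculation can be sketched in Python as the ratio of between-group to within-group variance. The function name and the ages below are assumptions made for illustration; they are not from sfwomen.sav, and the sketch does not reproduce the LSD post hoc comparisons.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical ages by marital status, for illustration only
single = [20, 22, 24]
married = [30, 32, 34]
widowed = [50, 52, 54]

f_value, df_b, df_w = one_way_anova_f([single, married, widowed])
print(f"F({df_b}, {df_w}) = {f_value:.1f}")
```

A large F means the group means differ by much more than the spread within each group would suggest; SPSS reports the matching significance level alongside it.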

  10. Importing
    1. Close the sfwomen.sav file

                                                               i.      Do not save changes

    2. Open Microsoft Excel

                                                               i.      Choose File > Open

1.      Open survey.xls, which is in the SPSS II folder in the Course Folder on the desktop

                                                             ii.      Examine and discuss the contents

    3. Go to the SPSS Data Editor
    4. Choose File > Open > Data

                                                               i.      Change Files of type to .xls

                                                             ii.      Choose survey.xls

                                                            iii.      Select Read variable names

                                                           iv.      Click OK

    5. Discuss import and information
  11. Exporting
    1. With the new survey file open

                                                               i.      Choose File > Save As

1.      Choose file type .xls

                                                             ii.      Save as new_survey.xls on the desktop

                                                            iii.      Open the .xls file in Excel

1.      Discuss the process and contents

  12. Open class for Q&A