SPSS II
Course Instructor Guide
Outline:
In Detail:
i.
A second-grade teacher at
ii.
At the end of the school year, each child’s letter
grades for the entire year were used to calculate an overall GPA. During a PTA meeting, parents were given a
questionnaire on which they were asked to indicate the average amount of time
per week (in hours) that they spend reading to or with their children. The
results are in this file.
iii.
In this example, we wish to (1) find the equation that
best represents the linear relationship between variables X and Y and that
allows us to best predict Y scores (GPA) from X scores (Reading Time); (2)
determine the strength of this relationship; and (3) test the null hypothesis
that, in the population from which this sample was drawn, the slope of the
prediction line is zero (that is, X scores and Y scores are unrelated).
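For instructors who want to show what SPSS computes behind the menus, the three goals above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_linear_regression` and the sample `hours` and `gpa` values are made up for the sketch and are not the course data file.

```python
import math

def simple_linear_regression(x, y):
    """Return (slope, intercept, r, t) for predicting y from x.

    t is the test statistic for the null hypothesis that the slope is zero,
    with n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 cases")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # (1) the prediction equation
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)         # (2) strength of the relationship
    unexplained = 1 - r ** 2
    # (3) t statistic for H0: slope = 0 (infinite if the fit is perfect)
    t = math.inf if unexplained <= 0 else r * math.sqrt(n - 2) / math.sqrt(unexplained)
    return slope, intercept, r, t

# Hypothetical data: weekly reading hours (X) and GPA (Y) for five children
hours = [1, 2, 3, 4, 5]
gpa = [2.0, 2.5, 2.7, 3.4, 3.4]
slope, intercept, r, t = simple_linear_regression(hours, gpa)
print(f"GPA = {intercept:.2f} + {slope:.2f} * Time, r = {r:.3f}, t = {t:.2f}")
```

The slope and intercept correspond to the Coefficients table in the SPSS output, and r to the correlation matrix requested through Descriptives.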
i.
Choose Analyze > Regression > Linear
1. Send GPA to the dependent variable box
2. Send Time to the independent variable box
3. Click on Statistics
a. Check Descriptives
b. Click Continue
4. Choose OK
ii.
Output
1. Discuss table contents
a. Good information, but a chart or graph would be very helpful
i.
Choose Graphs > Scatter > Simple > Define
1. Move GPA to the Y Axis
2. Move Time to the X Axis
3. Click OK
ii.
Right click on the Chart
1. Choose SPSS Chart Objects > Open
a. Choose the Chart Options button
i.
Fit Line Total
1. Fit Options > Linear Regression
2. Choose Continue
ii.
Click OK
iii.
Examine the chart and table information and discuss output
i.
The means, standard deviations, number of cases, and a
correlation matrix are produced for the variables GPA and Time. The correlation between these two
variables is .860, which indicates a strong, positive relationship; children
whose parents spend more time reading to or with them tend to perform better in
school.
ii.
On the scatterplot, each point represents a subject. Notice that
in this particular example the points seem to fall in a curved pattern, which
can be seen especially clearly with the regression line on the graph. The subjects with low Times and with high Times
all fall below the regression line. This
suggests the existence of a non-linear relationship between GPA and Time, contrary to the usual linearity assumption of regression.
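The curved pattern can also be checked numerically by looking at the residuals (observed GPA minus predicted GPA). The sketch below uses invented values that rise and then level off; `residuals_from_line` is a hypothetical helper, not an SPSS function.

```python
def residuals_from_line(x, y):
    """Fit a least-squares line of y on x and return the residuals y - predicted."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical data: GPA climbs quickly at first, then flattens out
hours = [1, 2, 3, 4, 5, 6, 7]
gpa = [1.0, 2.0, 2.7, 3.2, 3.5, 3.7, 3.8]
residuals = residuals_from_line(hours, gpa)
for h, res in zip(hours, residuals):
    side = "below" if res < 0 else "above"
    print(f"Time = {h}: residual {res:+.2f} ({side} the line)")
```

Negative residuals at both ends of the Time range with positive residuals in the middle are the numeric version of what the class sees on the chart.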
i.
Too many ages
1. Discuss even distribution when running an analysis
ii.
For a crosstabs we want no more than 4 or 5 groups
i.
Go back to the Data Editor
i.
Select Age as the input variable
ii.
Create an Output Variable name, agegroup
iii.
Create a Label, Age recoded into age groups
iv.
Choose Old and New Values
1. Using the information from the frequency run
2. Create 5 groups
a. Group 1
i.
Lowest through 19
b. Group 2
i.
20 through 24
c. Group 3
i.
25 through 30
d. Group 4
i.
31 through 41
e. Group 5
i.
42 through 98
f.
Value 99 is System-missing
3. Click Continue
v.
Choose the Change button
vi.
Click OK
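The recode rules above amount to a simple lookup, which can help when explaining what the Old and New Values dialog does. A minimal sketch follows; `recode_age` is an illustrative name, and returning `None` for ages above 99 is an assumption about values the rules do not cover.

```python
def recode_age(age):
    """Map a raw age to age group 1-5, following the ranges in the outline.

    Returns None (standing in for system-missing) for the code 99 and for
    any value the rules do not cover.
    """
    if age is None or age == 99:
        return None          # 99 is the missing-value code
    if age <= 19:
        return 1             # Lowest through 19
    if age <= 24:
        return 2             # 20 through 24
    if age <= 30:
        return 3             # 25 through 30
    if age <= 41:
        return 4             # 31 through 41
    if age <= 98:
        return 5             # 42 through 98
    return None              # assumption: anything above 99 is left missing

# Hypothetical ages, including the missing-value code
ages = [17, 20, 30, 31, 42, 99]
print([recode_age(a) for a in ages])
```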
i.
Decimals
ii.
Values
iii.
Missing
i.
Enables us to test whether there is a relationship
between variables, and to examine the strength and direction of that
relationship
ii.
Used primarily with nominal and ordinal variables
i.
Move variables
1. agegroup into the Row(s) box
2. American or Foreign Born (usforn) into the Column(s) box
ii.
Choose Display Clustered Bar Charts
iii.
Choose Statistics
1. Check Chi-square
a. The size of the difference between the observed and the expected values, in conjunction with the sample size, determines whether or not there is a statistically significant relationship between the two variables
b. Should always be run with a Crosstabs
2. Check Lambda and Uncertainty coefficient
a. Both indicate the ability to predict the value of one variable when we know the value of the other.
3. Check Correlations
4. Click Continue
iv.
Choose Cells
1. Check
a. Observed
b. Expected
c. All percentage boxes
2. Click Continue
v.
Choose Format
1. Ascending
vi.
Click OK
i.
Provides us with raw frequency values
ii.
The clustered bar chart shows us these numbers
i.
Show that percentages add up to 100%
ii.
Break down the percentages of each group and
demonstrate how these help. You should
see that the percentages rise and fall from one age group to the
next
1. We see that the older the person is, the more likely they are to be foreign born; conversely, the younger the woman, the more likely she is to have been born in the
2. During the time this census was taken, 1900, this fact makes sense
i.
The values that would exist if there were no
relationship between the two variables
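To make the idea of expected values concrete, each expected count is the row total times the column total divided by the grand total. The sketch below uses an invented 2 x 2 table; `expected_counts` is an illustrative helper name, not part of SPSS.

```python
def expected_counts(observed):
    """Return the counts expected in each cell if rows and columns were unrelated."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # expected = row total * column total / grand total
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts: rows are two age groups, columns are U.S. born / foreign born
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)
print(expected)
```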
i.
In most cases, the relevant statistic is the Pearson’s
Chi-square
ii.
Start with the Null Hypothesis: That there is no
relationship between the two variables
1. If the Asymp. Sig. is above .05, we fail to reject the null hypothesis: there is no statistically significant relationship. If it is below .05, there is a significant relationship
2. If more than 20% of the cells have an expected frequency below five, the Chi-square measure is not valid, regardless of the significance level
a. The Chi-square is sensitive to the number of cells in the table and the size of the sample
b. The larger the number of cells in the table, the less likely you are to get a significant result
c. The greater the number of cases in the sample, the more likely you are to get a significant result
iii.
How Strong is the relationship?
1. The Chi-square tells you whether or not there is a relationship between two variables, but it does not reveal the strength of that relationship
a. If the Chi-square indicates a relationship you must look at other measures to indicate the strength of that relationship
2. When either of the variables you are looking at is a nominal variable, the appropriate measures are Lambda and the Uncertainty coefficient
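For discussion, the Pearson Chi-square statistic, the 20% rule for small expected frequencies, and Lambda can all be worked out by hand from a crosstab. The sketch below is an illustration under assumptions: the table is invented, the function names are hypothetical, and Lambda is computed in one direction only (predicting the column variable from the row variable). It does not compute the Asymp. Sig. value, which requires the Chi-square distribution.

```python
def pearson_chi_square(observed):
    """Return (chi2, df, share_small) for a table of observed counts.

    share_small is the fraction of cells with an expected frequency below 5.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    small_cells = 0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
            if exp < 5:
                small_cells += 1
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    share_small = small_cells / (len(row_totals) * len(col_totals))
    return chi2, df, share_small

def lambda_columns_from_rows(observed):
    """Goodman and Kruskal's Lambda for predicting the column variable from the row variable."""
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(col_totals)
    errors_without = n - max(col_totals)                      # guessing the modal column for everyone
    errors_with = sum(sum(row) - max(row) for row in observed)  # guessing the modal column within each row
    if errors_without == 0:
        raise ValueError("Lambda is undefined when every case is in one column")
    return (errors_without - errors_with) / errors_without

# Hypothetical counts: rows are two age groups, columns are U.S. born / foreign born
observed = [[30, 10],
            [20, 40]]
chi2, df, share_small = pearson_chi_square(observed)
lam = lambda_columns_from_rows(observed)
print(f"chi-square = {chi2:.2f}, df = {df}, cells with expected < 5: {share_small:.0%}")
print(f"lambda = {lam:.2f}")
```

A Lambda of 0.40 here would mean that knowing the age group reduces prediction errors for birthplace by 40% in this invented table.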
i.
Between two groups
1. Use a T-Test
ii.
More than two groups
1. Use an ANOVA
a. Stands for Analysis of Variance
i.
Check Options
ii.
Click OK
i.
Shows the average age for women in each marital group
ii.
Doesn’t show if these differences are statistically
significant or not
1. To find significance, run an ANOVA
i.
Age is the dependent variable
ii.
Marital Status is the factor (independent variable)
iii.
Choose Options
1. Choose Descriptives
iv.
Choose Post Hoc
1. Choose LSD
2. Significance Level .05
v.
Click OK
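The F statistic that the One-Way ANOVA reports compares the variation between the group means with the variation inside the groups. A minimal sketch, assuming three invented marital-status groups with three ages each; `one_way_anova_f` is an illustrative name, and the LSD post hoc comparisons are not reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ages by marital status (dependent variable: Age, factor: Marital Status)
single = [20, 22, 24]
married = [30, 32, 34]
widowed = [50, 52, 54]
f, df_between, df_within = one_way_anova_f([single, married, widowed])
print(f"F({df_between}, {df_within}) = {f:.1f}")
```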
i.
Each row shows the relationship between the marital
groups
1. Examples
a. Single women are indeed younger than the other groups
b. There is no significant difference between married women and women who are divorced or married but not living with their husbands, or between widowed and divorced women
ii.
Challenge the class to add a Simple Error bar graph
iii.
Help those in need
iv.
Show how to do this
1. Age is the Variable
2. Status is the Category Axis
v.
If the error bars overlap, there is no
significant difference between those groups
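The overlap rule can be demonstrated with numbers as well as with the chart. The sketch below builds an approximate 95% confidence interval for each group mean and checks whether two intervals overlap. It is a rough illustration: the ages are invented, the helper names are hypothetical, and it uses the normal multiplier 1.96 as an assumption instead of the t-based interval a statistics package would use for small samples.

```python
import math
import statistics

def mean_ci(values, z=1.96):
    """Approximate 95% confidence interval (low, high) for the mean of values."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * std_error, mean + z * std_error

def intervals_overlap(a, b):
    """True if intervals a and b, given as (low, high), share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical ages by marital status
single = [20, 22, 24]
married = [30, 32, 34]
divorced = [31, 33, 35]

single_vs_married = intervals_overlap(mean_ci(single), mean_ci(married))
married_vs_divorced = intervals_overlap(mean_ci(married), mean_ci(divorced))
print("single vs married overlap:", single_vs_married)
print("married vs divorced overlap:", married_vs_divorced)
```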
i.
Do not save changes
i.
Choose File > Open
1. Open survey.xls, which is in the SPSS II folder inside the Course Folder on the desktop
ii.
Examine and discuss the contents
i.
Change Files of type to .xls
ii.
Choose survey.xls
iii.
Select Read Variable Name
iv.
Click OK
i.
Choose File > Save As
1. Choose file type .xls
ii.
Save as new_survey.xls on the desktop
iii.
Open the .xls file in Excel
1. Discuss the process and contents