SPSS Assignment 5 Instructions

Part 1

From Blackboard, download the SPSS HW #5-1 data file. Dr. Z is interested in researching if there is a statistically significant difference in the mean lifespan in years between those who do not correctly wash their hands after using the bathroom and those who correctly wash their hands after using the bathroom.

Before we can begin independent sample t-test analysis, there are several assumptions we need to test for:

A. Measurement of variables (2 categorical/independent (not related) groups; DV must be interval/ratio)

B. No outliers (using the Outlier Labeling Rule)

C. Normal distribution of the dependent variable (tested by Shapiro-Wilk, calculation of skewness/kurtosis, and inspection)

D. Homogeneity of variance (tested by Levene’s test)

Since we know that we have 2 categorical groups that are not the same people (independent from each other), we have 2 independent groups: correct handwashers and non-correct handwashers. Also, since we know that the DV is interval/ratio (years of life lived), we have met assumption A above.

We will check for outliers (using the Outlier Labeling Rule).

Click on “Analyze,” “Descriptive Statistics,” and “Explore.”

Select the dependent variable…and move it…over to the “Dependent List” box

…and then click on “Statistics.”

When this pop-up pops up…select “Descriptives”…“Outliers”…“Percentiles”

…and then click “Continue”

…and then click on “Plots.”

When this pop-up pops up…select “Stem-and-leaf”

… and select “Histogram”…select “Normality plots with tests”

…click “Continue”

…click “OK.”

The first indication that you might have outliers will be through inspection of the histogram, stem and leaf, and box plot.

If you see values on the outskirts of the histogram…

… or dots outside your whiskers, you may have outliers.

To actually test for outliers, we are going to use the Outlier Labeling Rule. We will do this through the “Percentiles” table in your SPSS output via a hand calculation to find the upper/lower limits for finding outliers in our data.

Take the 75th percentile value (Q3) and subtract the 25th percentile value (Q1) from it.

Q3 – Q1 = #

So, in our example:

72 – 58 = 14

Then, take that difference and multiply it by 2.2.

(Q3 – Q1) x 2.2

So, in our example:

(14) x 2.2 = 30.8

Then, take that value and add it to the 75th percentile value (Q3) and subtract it from the 25th percentile value (Q1).

Q3 + ((Q3 – Q1) x 2.2) = upper limit to find outliers

Q1 – ((Q3 – Q1) x 2.2) = lower limit to find outliers

So, in our example:

72 + 30.8 = 102.8

58 – 30.8 = 27.2
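The hand calculation above can be scripted as a quick check. This is a minimal Python sketch using the Q1 and Q3 values from the worked example:

```python
def outlier_limits(q1, q3, multiplier=2.2):
    """Outlier Labeling Rule: cases outside these limits are outliers."""
    spread = (q3 - q1) * multiplier
    return q1 - spread, q3 + spread

# Worked example from the instructions: Q1 = 58, Q3 = 72
lower, upper = outlier_limits(58, 72)
print(lower, upper)  # 27.2 and 102.8 (within floating-point rounding)
```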

Then, right-click on the dependent variable…and click “Sort Descending”

…delete every case that is above the upper limit that you calculated.

Then, right-click on the dependent variable and click “Sort Ascending”

…then delete the cases that are below the lower limit that you calculated.

Now that we have located and deleted outliers, we have met the assumption of no outliers.

Now, we need to test the assumption of a normal distribution of the dependent variable (tested by Shapiro-Wilk, calculation of skewness/kurtosis, and inspection).

Click on “Analyze,” “Descriptive Statistics,” and “Explore.”

When this pop-up pops up…click on “Statistics.”

When this pop-up pops up…un-select “Outliers”…un-select “Percentiles”

…make sure that descriptives are selected

…and then click “Continue.”

Click on “Plots.”

When this pop-up pops up…make sure that “Histogram” and “Normality plots with tests” are selected

…click “Continue”

…and click “OK.”

On the line for “Skewness”…use a calculator and divide the “Statistic” by the “Std. Error”

…which in this example is .052 / .122…

…if the calculated value is within -1.96 to +1.96, then the variable is within the acceptable range for skewness.

Repeat this calculation for the statistic and std. error of the kurtosis. If the calculated value is within -1.96 to +1.96, then the variable is within the acceptable range for kurtosis.
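The same division can be scripted; this sketch uses the skewness statistic and standard error from the example above:

```python
def z_ratio(statistic, std_error):
    """Statistic divided by its standard error; within ±1.96 is acceptable."""
    return statistic / std_error

# Example values from the instructions: skewness statistic .052, Std. Error .122
z = z_ratio(0.052, 0.122)
print(round(z, 3), abs(z) <= 1.96)  # 0.426 True
```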

Look to see if the dependent variable has a “Shapiro-Wilk” “Sig.” value that is > .05. If so, then the dependent variable is normally distributed and it meets the assumption of normal distribution. Make a note in your output whether or not the assumption was met.

Inspect the histogram to see if the dependent variable looks approximately normally distributed.

Inspect the “Normal Q-Q plot” to see if the dots are close to the line. Make a note of this.

Inspect the box plot to see if the whiskers are symmetrical. Make note of your inspections.
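To double-check the Shapiro-Wilk result outside SPSS, SciPy provides the same test. This is a sketch assuming SciPy is available; the lifespan values below are invented for illustration, not the assignment data:

```python
from scipy.stats import shapiro

# Hypothetical lifespan values (not the assignment data)
lifespans = [62, 71, 68, 75, 66, 70, 73, 58, 64, 61, 67, 60, 63, 65]

w_stat, p_value = shapiro(lifespans)
# p > .05 supports the assumption of a normal distribution
print(f"SW = {w_stat:.3f}, p = {p_value:.3f}")
```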

The last assumption that we will test is the homogeneity of variance, tested by Levene’s test. We will look at the Levene’s test in the same steps used to run the independent sample t-test.

To do so, click “Analyze”…“Compare Means”…“Independent Samples T Test.”

When this pop-up pops up…

Select the dependent variable…and move it…over to the “Test Variable(s)” box

…select the categorical variable…and move it…over to the “Grouping Variable” box

… click on “Define Groups.”

When this pop-up pops up…

Type “1” in the “Group 1” box…type “2” in the “Group 2” box

…click “Continue.”

(NOTE: Only type in “1” and “2” if you made your values 1 and 2 when you defined your values.)

Click “OK.”

To test the assumption that we have homogeneity of variance (tested by Levene’s test), we want the “Sig.” value of the Levene’s test to be greater than .05. If it is, then we have met this assumption and we can continue with the independent sample t-test. If this value was less than .05, we would adjust for this by reading the bottom row of the output when interpreting the t-test.
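As a sanity check on the SPSS output, SciPy can run both Levene’s test and the independent samples t-test. This is a sketch assuming SciPy is available; the lifespan values below are made up for illustration:

```python
from scipy import stats

# Hypothetical lifespan values for each group (not the assignment data)
washers = [62, 71, 68, 75, 66, 70, 73]
non_washers = [58, 64, 61, 67, 60, 63, 65]

# Levene's test; center='mean' matches SPSS (SciPy's default is the median)
lev_stat, lev_p = stats.levene(washers, non_washers, center='mean')

# Independent samples t-test; equal_var=False mirrors reading the bottom
# row of the SPSS output when Levene's test is significant
t_stat, t_p = stats.ttest_ind(washers, non_washers, equal_var=(lev_p > 0.05))
print(f"Levene p = {lev_p:.3f}, t = {t_stat:.3f}, p = {t_p:.3f}")
```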

Now, we need to report that we analyzed the assumptions and the assumptions were met.

We conducted a preliminary analysis to ensure that the assumptions of an independent samples t-test were not violated, including normal distribution, the absence of outliers, and homogeneity of variance.

To test for normality, we conducted a Shapiro-Wilk’s test (SW = ___, df = ___, p >.05) in addition to an inspection of the normal Q-Q plot, box plot, and histogram, all of which indicated that the dependent variable was approximately normally distributed. In addition, the dependent variable had a skewness of ___ (SE = ___) and a kurtosis of ___ (SE = ___).

[type in the value of the Shapiro-Wilk’s test statistic and the df value; type in the statistic and standard error for skewness and kurtosis]

A Levene’s test was used to verify homogeneity of variance, F(___, ___) = ___, p > .05.

Since we have met all of the assumptions, we can now analyze and report the t-test using this template.

We performed an independent samples t-test in order to compare the difference in [DV] between [group 1] and [group 2].

Remember the following format for reporting results and use this format when reporting results in the research project later in the course.

There was a statistically significant [or not significant] difference of [DV] between [group 1] (M=___, SD=___) and [group 2] (M=___, SD=___) (t(__)=____, p < .05; d = ___).

To find Cohen’s d, you need to use an online calculator, such as http://www.polyu.edu.hk/mm/effectsizefaqs/calculator/calculator.html.

Enter these values from the “Group Statistics” table into the calculator:

Then, click “Compute”…and look at the Cohen’s d value that appears.

As a refresher, a Cohen’s d of 0.20 is considered a small effect, 0.50 is a medium effect, and 0.80 is a large effect.
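If you prefer not to rely on an online calculator, Cohen’s d with a pooled standard deviation can be computed directly. This is a sketch; the sample values below are made up for illustration:

```python
from statistics import mean, stdev

def cohens_d(sample1, sample2):
    """Cohen's d using the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * stdev(sample1) ** 2 +
                  (n2 - 1) * stdev(sample2) ** 2) / (n1 + n2 - 2)
    return (mean(sample1) - mean(sample2)) / pooled_var ** 0.5

# Hypothetical values: means 4 and 3, both SDs 2, so d = 0.5 (medium effect)
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```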

Finally, export your output to Word and submit it to Blackboard.

To export, click the export symbol

…click “Browse,” and then save your document with a name and location that you will remember.

Part 2

From Blackboard, download the SPSS HW #5-2 data file. Dr. C is interested in researching if there is a statistically significant difference in the score of a handwashing scale (based on the CDC’s recommended procedure for handwashing) between males and females. The scale consists of the following Likert-type items:

When you wash your hands, how often do you use soap?

All of the time

Most of the time

Half of the time

Some of the time

None of the time

When you wash your hands, how often do you lather the backs of your hands?

All of the time

Most of the time

Half of the time

Some of the time

None of the time

When you wash your hands, how often do you lather between your fingers?

All of the time

Most of the time

Half of the time

Some of the time

None of the time

When you wash your hands, how often do you lather under your nails?

All of the time

Most of the time

Half of the time

Some of the time

None of the time

When you wash your hands, how often do you scrub your hands for at least 20 seconds?

All of the time

Most of the time

Half of the time

Some of the time

None of the time

The scale ranges in potential score from 5 to 25. Those who answer “All of the time” for an item receive a score of 5 for that item. Likewise, those who answer “Most of the time” for an item receive a score of 4 for that item, “Half of the time” is a score of 3, “Some of the time” is a score of 2, and “None of the time” is a score of 1.
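The scoring rule above can be sketched in Python; the respondent’s answers below are hypothetical:

```python
# Map each answer to its item score, per the scoring rule above
SCORES = {"All of the time": 5, "Most of the time": 4, "Half of the time": 3,
          "Some of the time": 2, "None of the time": 1}

def scale_score(responses):
    """Sum the five item scores; possible totals range from 5 to 25."""
    return sum(SCORES[answer] for answer in responses)

# One hypothetical respondent's answers to the five items
respondent = ["All of the time", "Most of the time", "Half of the time",
              "Most of the time", "Some of the time"]
print(scale_score(respondent))  # 5 + 4 + 3 + 4 + 2 = 18
```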

Since we want to determine a difference in independent groups of ordinal data, we want to run a Mann-Whitney U test.

Before we can begin the Mann-Whitney U test analysis, there are several assumptions we need to test for:

A. Measurement of variables (2 categorical/independent [not related] groups; DV must be interval/ratio or ordinal); if the DV is interval/ratio, it must not be normally distributed; if the DV is ordinal, it can be normally distributed or not normally distributed

B. Homogeneity of variance (tested by a non-parametric version of Levene’s test)

Since we know that we have 2 categorical/independent (not related) groups (male, female) and that our DV is ordinal, we have met assumption A above.

Since SPSS does not let us perform the Levene’s test for non-parametric data in one step, we have to create 3 new columns in our data set and then run an ANOVA as a form of the Levene’s test.

To do this, we first have to create ranked data…click “Transform”…“Rank Cases”

Select your dependent variable…move it…to the “Variable(s)” box.

…then click “OK.”

Look at your data view…SPSS just created a new column labeled with an “R” in front of the name of your dependent variable. In our example, the new column is called “Rscale”…the R stands for “rank” because each person is given their own rank based upon their value of the DV. Lower values of the DV result in lower rank scores than those that have higher DV values.

Now, we have to create a column for the mean rank scores for each of our groups (in our example, the groups are males and females).

To do so…click “Data”…“Aggregate.”

Select the ranked column variable that you just created…move it…to the “Summaries of Variable(s)” box

…click on “Function.”

When this pop-up pops up…make sure that “Mean” is selected…click “Continue”

(NOTE: You might have to click “Median” and then click back on “Mean” in order to get the “Continue” button to appear.)

Then, select your categorical/group variable…move it…to the “Break Variable(s)” box

…and then click “OK.”

Look at your data view…SPSS just created a new variable column called “R[DV]_mean_1.” In our example, the new column is “Rscale_mean_1.” All people of the same group have been given the same value (the mean rank for the group).

Now, we need to create a third variable…each person’s spread away from the group mean rank. Because the Levene’s test is only performed on positive numbers, we have to make sure that this new column is in absolute values (all positive numbers). For each person, we will subtract his/her individual rank value from the group mean rank value.

To do so, click “Transform”…“Compute Variable.”

When this pop-up pops up…create a name for the calculated variable in the “Target Variable” box.

Let’s call it “case_minus_group” so that we know it is the individual cases minus the group rank mean.

Select the group mean rank “R[DV]_mean_1”…move it … to the “Numeric Expression” box

…then click the minus sign.

Select the individual case rank “Rscale”…move it…to the “Numeric Expression” box

…and then click “All”

…and then highlight the numeric expression that you have entered so far

…then double-click on “Abs,” which stands for “absolute values”… you will see that the numeric expression changes

…and then click “OK.”

Look at your data view and you will see that the new “case_minus_group” variable has been created.

Now we will perform an ANOVA as a form of doing a Levene’s test in SPSS.

To do so…click “Analyze”…“Compare Means”…“One-Way ANOVA.”

When this pop-up pops up…select “case_minus_group”…move it…to the “Dependent List” box

…and select your categorical group variable…move it…to the “Factor” box

… and click “OK.”

Look at the ANOVA table on your SPSS output. If the sig. value is > .05, then we have met the assumption.
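The whole rank-then-aggregate-then-ANOVA procedure can be reproduced outside SPSS as a sanity check. This sketch assumes SciPy is available; the scale scores and group codes below are invented for illustration:

```python
from scipy.stats import rankdata, f_oneway

# Hypothetical scale scores and group codes (1 = male, 2 = female)
scores = [18, 22, 15, 20, 25, 17, 19, 23, 14, 21]
groups = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

ranks = rankdata(scores)                 # the "Rank Cases" step
abs_dev = []                             # the "case_minus_group" columns
for g in (1, 2):
    group_ranks = [r for r, grp in zip(ranks, groups) if grp == g]
    group_mean = sum(group_ranks) / len(group_ranks)   # the "Aggregate" step
    abs_dev.append([abs(group_mean - r) for r in group_ranks])

# One-way ANOVA on the absolute deviations = non-parametric Levene's test
f_stat, p_value = f_oneway(*abs_dev)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")  # p > .05 meets the assumption
```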

Since we have met all of the assumptions, we can run the Mann-Whitney U test.

Select “Analyze”…“Nonparametric Tests”…“Legacy Dialogs”…“2 Independent Samples.”

When this pop-up pops up…

Select the dependent variable…and move it…over to the “Test Variable List” box.

Select the categorical group variable…move it…to the “Grouping Variable” box

…and then click on “Define Groups.”

When this pop-up pops up … type “1” in the “Group 1” box … type in “2” in the “Group 2” box.

(NOTE: Only type in “1” and “2” if you made your values 1 and 2 when you defined your values)

…click “Continue.”

Make sure “Mann-Whitney U” is selected…click “OK.”

If the Asymp. Sig. (2-tailed) value is less than .05, then there is a statistically significant difference in scale scores between the two groups.
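As a cross-check of the SPSS result, SciPy offers the same test. This sketch assumes SciPy is available; the scores below are invented for illustration:

```python
from scipy.stats import mannwhitneyu

# Hypothetical scale scores for each group (not the assignment data)
male_scores = [18, 22, 15, 20, 25]
female_scores = [17, 19, 23, 14, 21]

# Two-sided test mirrors the "Asymp. Sig. (2-tailed)" value in SPSS
u_stat, p_value = mannwhitneyu(male_scores, female_scores,
                               alternative='two-sided')
print(f"U = {u_stat}, p = {p_value:.3f}")  # p < .05 would indicate significance
```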

Finally, export your output to Word and submit it to Blackboard.

To export, click the export symbol

…click “Browse,” and then save your document with a name and location that you will remember.
