# CJUS 745 ECC Advantages of Applying Moderation and Mediation Discussion

One of the most advanced quantitative methods that can be applied to criminal justice data is mediation and moderation analysis.

## After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?

## How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?

## How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?

Reply:

After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?

**According to MacKinnon (2011), there are several reasons for adding mediating variables into studies, including:**

Moderation analysis can also identify groups in which the intervention has the greatest effect or no effect.

**How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?**

**Inadequate design is often caused by the researcher(s) failing to take into consideration the probable effect of bias in the study they are executing. Bias may cause the researchers to ignore the requirements needed to conduct the study fairly and instead interject their own inclinations into the study, thereby negating the results (Kline, 2015). Kline (2015) states that it is crucial that the researchers utilize the research design that is best suited so that the results of the analysis will be more accurate when utilizing mediation and moderating analysis.**

**An inadequate research design raises the likelihood that the relationships among the mediating or moderating variables and the predictor variables will distort the accuracy of the analysis (Baron & Kenny, 1986). Poor study design selection may lead to inaccurate results. It can also produce erroneous data, prompting other researchers to perform their own analyses and reach different results (Kline, 2015).**

**A flawed analysis strategy undermines mediation and moderation analysis by increasing bias and reducing the accuracy of the data, which in turn enlarges the margin of error (Baron & Kenny, 1986).**

**Lack of attention to assumptions can allow violations of error independence, linearity, and the absence of collinearity to go undetected (Baron & Kenny, 1986). These violations can produce incorrect and misleading results (Baron & Kenny, 1986; Kline, 2015).**

How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?

The researcher’s theoretical framework identifies crucial aspects that affect a phenomenon of concern and highlights the significance of examining how those important factors may vary and in what circumstances (Baron & Kenny, 1986). A poor theoretical framework suggests that the researcher cannot grasp ideas and information that are pertinent to the research topic (Kline, 2015). The researcher may also not understand the broader area of knowledge under investigation, thereby affecting mediation and moderation analysis (Baron & Kenny, 1986). The lack of a theoretical framework can also hamper the researcher from finding specific and significant variables and how they impact or are correlated in different circumstances (Kline, 2015).

“Whatever you do, work heartily, as for the Lord and not for men, knowing that from the Lord you will receive the inheritance as your reward. You are serving the Lord Christ” (Colossians 3:23-24, ESV).

*References*

Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. *Journal of Personality and Social Psychology*, *51*(6), 1173–1182. https://doi.org/10.1037//0022-3514.51.6.1173

Kline, R. B. (2015). The Mediation Myth. *Basic and Applied Social Psychology*, *37*(4), 202–213. https://doi.org/10.1080/01973533.2015.1049349

MacKinnon, D. P. (2011). Integrating Mediators and Moderators in Research Design. *Research on Social Work Practice*, *21*(6), 675–681. https://doi.org/10.1177/1049731511414148

Here is Jeff

## Mediation & Moderation Analysis

Both moderation and mediation allow researchers to address questions concerning contingencies and mechanisms that can better reveal the complexities of how a set of variables is interrelated. In recent years, applications of statistical mediation have become more prevalent in social science research for testing assumptions about why or how an independent variable is associated with an outcome of interest. However, mediation may not hold in all conditions or for all groups of people. In this paper, we reviewed and illustrated how moderated mediation analysis can be used to test whether an indirect effect is conditional on values of a proposed moderating variable. Despite its advantages for modeling complex relationships among variables, moderated mediation is under-utilized in the substantive literatures. Instead, researchers typically analyze interactions and mechanisms separately, or rely on other outdated methods for testing moderated mediation (Blair, 2022). Methods exist for testing mediation and moderation effects in a dataset, both together and separately. Investigations of this kind are especially valuable in prevention research to obtain information on the process by which a program achieves its effects and whether the program is effective for subgroups of individuals (Fairchild & MacKinnon, 2008).

After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?

According to Pais in 2017, advances in mediation analysis are used to examine the legacy effects of racial residential segregation in the United States on neighborhood attainments across two familial generations. The findings are supported by a comprehensive mediation analysis that provides a formal sensitivity analysis, deploys an instrumental variable, and assesses effect heterogeneity. Methodological advancements in mediation analysis are used on data from the Panel Study of Income Dynamics (PSID) and the U.S. Census to assess the relative explanatory power of these pathways for white and black families that have origins in the United States dating back at least to the height of racial residential segregation in the late 1960s. In the context of causal mediation analysis, confounders can be observed or unobserved variables that affect the mediator and the outcome and are affected by the treatment variable. Causal mediation analysis requires researchers to inspect the validity of the no “treatment-mediator” interaction assumption (Pais, 2017).

According to Pick and Teo (2016), a number of limitations should be considered when assessing the generalizability of the findings. To minimize the effect of common method bias, several tests were employed. One-way ANOVA testing was also applied to assess whether the merging of the surveys from two different time periods was appropriately performed. While the response rate was low, the sample size was sufficient to allow the findings of the model to have general relevance. In retrospect it would have been useful to collect data about the quality of information and trust in sources of information. Furthermore, as indicated in the final path model, external change initiatives were excluded from the analysis due to low discriminant validity. Future study should examine if this particular construct is valid and present in the Australian public sector context. Consistent with the call by Oreg, Vakola, and Armenakis (2011, 514), future work should use research design appropriate for studying the longitudinal effects of change at the individual level, collect data from multi-raters and use objective indicators to supplement self-report information (Pick & Teo, 2016).

How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?

The cost of the generalizability of the general model to test mediation and moderation effects is possible inflation of Type I error, lack of power, and difficulty with interpretation of model parameters if several effects are present. The model may be simplified, however, to represent more specialized cases of mediation and moderation joint effects such as baseline by treatment interactions by constraining paths in the model to be zero. An additional limitation of models with moderation and mediation is the extensive assumptions required for accurate assessment of relations among variables (Holland, 1988). The sensitivity of conclusions to violations of assumptions is not yet known and correct conclusions will likely require repeated applications in any substantive research area. In particular, often the X variable is the only variable that represents random assignment, making interpretation of causal relations between other variables in the model susceptible to omitted variable bias. In many applications, the model results may represent descriptive information about how variables are related rather than elucidating true causal relations among variables. Information on true causal relations will require programs of research to replicate and extend results as well as information from other sources such as qualitative information and replication studies in different substantive research areas (Fairchild & MacKinnon, 2008).

How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?

Relations between variables are often more complex than simple bivariate relations between a predictor and a criterion. Rather, these relations may be modified by, or informed by, the addition of a third variable in the research design. Examples of third variables include suppressors, confounders, covariates, mediators, and moderators (MacKinnon et al., 2000). Many of these third variable effects have been investigated in the research literature, and more recent research has examined the influences of more than one third variable effect in an analysis. The importance of investigating mediation and moderation effects together has been recognized for some time in prevention science, but statistical methods to conduct these analyses are only now being developed. Investigations of this kind are especially valuable in prevention research where data may present several mediation and moderation relations (MacKinnon et al., 2000). Mediation, confounding, and suppression are statistically similar: each is quantified by measuring the change in the relationship between an independent and a dependent variable after adding a third variable to the analysis. Mediation and confounding are identical statistically and can be distinguished only on conceptual grounds. Methods to determine the confidence intervals for confounding and suppression effects are proposed based on methods developed for mediated effects. Although the statistical estimation of effects and standard errors is the same, there are important conceptual differences among the three types of effects (MacKinnon et al., 2000).

CJUS 745 DISCUSSION ASSIGNMENT INSTRUCTIONS

Thread

You will take part in 3 Discussions in which you will post a thread presenting your scholarly response on the assigned topic, writing 750–850 words. For each thread, students must support their assertions with at least four (4) scholarly citations in APA format. The original thread must incorporate ideas and several scholarly citations from all of the Learn material for the assigned Module: Week.

Replies

Then, you will post replies of 250–300 words (supported with at least two cites) each to 3 or more classmates’ threads. Each reply must incorporate at least two (2) scholarly citation(s) in APA format. The reply posts can integrate ideas and citations from the Learn material throughout the course.

Any sources cited must have been published within the last five years. Integrate Biblical principles in your personal thread and in all replies to peers.

16 ♦ MEDIATION

16.1 ♦ Definition of Mediation

Chapter 10 examined research situations that involve three variables and described several possible forms of interrelationship. One of these is mediation; this involves a set of causal hypotheses. An initial causal variable X1 may influence an outcome variable Y through a mediating variable X2. (Some books and websites use different notations for the three variables; for example, on Kenny’s mediation Web page, http://www.davidakenny.net/cm/mediate.htm, the initial causal variable is denoted X, the outcome as Y, and the mediating variable as M.) Mediation occurs if the effect of X1 on Y is partly or entirely “transmitted” by X2. A mediated causal model involves a causal sequence; first, X1 causes or influences X2; then, X2 causes or influences Y. X1 may have additional direct effects on Y that are not transmitted by X2. A mediation hypothesis can be represented by a diagram of a causal model. Note that the term causal is used because the path diagram represents hypotheses about possible causal influence; however, when data come from nonexperimental designs, we can only test whether the data are consistent or inconsistent with a particular hypothesized causal model. That analysis falls short of proof that any specific causal model is correct.

16.1.1 ♦ Path Model Notation

Path model notation was introduced in Chapter 10 (see Table 10.2), and it is briefly reviewed here. We begin with two variables (X and Y). Arrows are used to correspond to paths that represent different types of relations between variables. The absence of an arrow between X and Y corresponds to an assumption that these variables are not related in any way; they are not correlated or confounded, and they are not directly causally connected. A unidirectional arrow corresponds to the hypothesis that one variable has a causal influence on the other; for example, X → Y corresponds to the hypothesis that X causes or influences Y, and Y → X corresponds to the hypothesis that Y causes or influences X. A bidirectional or double-headed arrow represents a noncausal association, such as correlation or confounding of variables that does not arise from any causal connection between them. In path diagrams, these double-headed arrows may be shown as curved lines.


If we consider only two variables, X and Y, there are four possible models: (1) X and Y are not related in any way (this is denoted in a path diagram by the absence of a path between X and Y), (2) X causes Y (X → Y), (3) Y causes X (Y → X), and (4) X and Y are correlated but not because of any causal influence (X ↔ Y). When a third variable is added, the number of possible relationships among the variables X1, X2, and Y increases substantially, as discussed in Chapter 10. One theoretical model corresponds to X1 and X2 as correlated causes of Y. For this model, the appropriate analysis is a regression to predict Y from both X1 and X2 (as discussed in Chapter 11). Another possible hypothesis is that X2 may be a moderator of the relationship between X1 and Y; this is also described as an interaction between X2 and X1 as predictors of Y. Statistical significance and nature of interaction can be assessed using the procedures described in Chapter 15. Chapter 10 outlined procedures for preliminary exploratory data analyses that can help a data analyst decide which of many possible patterns of relationship need to be examined in further analysis.
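As a reminder of how the moderation (interaction) hypothesis mentioned above differs from mediation, a common way to write it is a regression with a product term. The coefficient names b0 through b3 are illustrative notation assumed here, not symbols taken from this chapter, and the procedures of Chapter 15 are not reproduced in this excerpt.

```latex
% Moderation: the slope relating X1 to Y depends on the value of X2.
\begin{align}
  Y' &= b_0 + b_1 X_1 + b_2 X_2 + b_3 (X_1 \times X_2) \\
     &= (b_0 + b_2 X_2) + (b_1 + b_3 X_2)\, X_1
\end{align}
% If b_3 = 0, the slope for X1 is the same at every value of X2 (no moderation).
```

Under this sketch, a b3 that differs from zero means the X1 slope, b1 + b3 X2, changes across values of X2, whereas a mediation model instead places X2 on a causal path between X1 and Y.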

16.1.2 ♦ Circumstances When Mediation May Be a Reasonable Hypothesis

Because a mediated causal model includes the hypothesis that X1 causes or influences X2 and the hypothesis that X2 causes or influences Y, it does not make sense to consider mediation analysis in situations where one or both of these hypotheses would be nonsense. For X1 to be hypothesized as a cause of X2, X1 should occur before X2, and there should be a plausible mechanism through which X1 could influence X2. For example, suppose we are interested in a possible association between height and salary (a few studies suggest that taller people earn higher salaries). It is conceivable that height influences salary (perhaps employers have a bias that leads them to pay tall people more money). It is not conceivable that a person’s salary changes his or her height.

16.2 ♦ A Hypothetical Research Example Involving One Mediating Variable

The hypothetical data introduced in Chapter 10 as an illustration of a mediation hypothesis involved three variables: X1, age; X2, body weight; and Y, systolic blood pressure (BloodPressure or SBP). The data are in an SPSS file named ageweightbp.sav; the scores also appear in Table 10.3. The hypothetical dataset has N = 30 to make it easy to carry out the same analyses using the data in Table 10.3. Note that for research applications of mediation analysis, much larger sample sizes should be used.

For these variables, it is plausible to hypothesize the following causal connections. Blood pressure tends to increase as people age. As people age, body weight tends to increase (this could be due to lower metabolic rate, reduced activity level, or other factors). Other factors being equal, increased body weight makes the cardiovascular system work harder, and this can increase blood pressure. It is possible that at least part of the age-related increase in blood pressure might be mediated by age-related weight gain. Figure 16.1 is a path model that represents this mediation hypothesis for this set of three variables.

Figure 16.1 ♦ Hypothetical Mediation Example: Effects of Age on Systolic Blood Pressure (SBP)

[Path diagram. Top panel: X1 Age → Y SBP, path c. Bottom panel: X1 Age → X2 Weight, path a; X2 Weight → Y SBP, path b; X1 Age → Y SBP, path c′.]

NOTES: Top panel: The total effect of age on SBP is denoted by c. Bottom panel: The path coefficients (a, b, c′) that estimate the strength of hypothesized causal associations are estimated by unstandardized regression coefficients. The product a × b estimates the strength of the mediated or indirect effect of age on SBP, that is, how much of the increase in SBP that occurs as people age is due to weight gain. The c′ coefficient estimates the strength of the direct (also called partial) effect of age on SBP, that is, any effect of age on SBP that is not mediated by weight. The coefficients in this bottom panel decompose the total effect (c) into a direct effect (c′) and an indirect effect (a × b). When ordinary least squares regression is used to estimate unstandardized path coefficients, c = (a × b) + c′; the total relationship between age and SBP is the sum of the direct relationship between age and SBP and the indirect or mediated effect of age on SBP through weight.

To estimate the strength of association that corresponds to each path in Figure 16.1, a series of three ordinary least squares (OLS) linear regression analyses can be run. Note that a variable is dependent if it has one or more unidirectional arrows pointing toward it. We run a regression analysis for each dependent variable (such as Y), using all variables that have unidirectional arrows that point toward Y as predictors. For the model in Figure 16.1, the first regression predicts Y from X1 (blood pressure from age). The second regression predicts X2 from X1 (weight from age). The third regression predicts Y from both X1 and X2 (blood pressure predicted from both age and weight).
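The three regressions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (they are not the scores in ageweightbp.sav or Table 10.3), the generating values are arbitrary, and the helper names `slope_one` and `slopes_two` are hypothetical. The sketch estimates c, a, b, and c′ and checks the decomposition c = (a × b) + c′ that holds for unstandardized OLS coefficients.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Sample covariance with an n - 1 denominator."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)


def slope_one(x, y):
    """Unstandardized OLS slope for predicting y from x alone."""
    return cov(x, y) / cov(x, x)


def slopes_two(x1, x2, y):
    """Unstandardized OLS slopes for predicting y from x1 and x2 together."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det  # slope for x1, controlling for x2
    b2 = (s11 * s2y - s12 * s1y) / det  # slope for x2, controlling for x1
    return b1, b2


# Synthetic (assumed) data: N = 30 cases, arbitrary generating coefficients.
rng = random.Random(0)
age = [rng.uniform(25, 75) for _ in range(30)]
weight = [100 + 1.4 * x + rng.gauss(0, 15) for x in age]
sbp = [10 + 2.0 * x + 0.5 * w + rng.gauss(0, 20) for x, w in zip(age, weight)]

c = slope_one(age, sbp)                    # Regression 1: Y from X1 (total effect)
a = slope_one(age, weight)                 # Regression 2: X2 from X1
c_prime, b = slopes_two(age, weight, sbp)  # Regression 3: Y from X1 and X2

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"a*b + c' = {a * b + c_prime:.3f}")  # matches c up to rounding error
```

The pure-Python covariance approach is chosen only to keep the example self-contained; a statistics package would report the same coefficients along with standard errors and t tests.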

16.3 ♦ Limitations of Causal Models

Path models similar to the one in Figure 16.1 are called “causal” models because each unidirectional arrow represents a hypothesis about a possible causal connection between two variables. However, the data used to estimate the strength of relationship for the paths are almost always from nonexperimental studies, and nonexperimental data cannot prove causal hypotheses. If the path coefficient between two variables such as X2 and Y (this coefficient is denoted b in Figure 16.1) is statistically significant and large enough in magnitude to indicate a change in the outcome variable that is clinically or practically important, this result is consistent with the possibility that X2 might cause Y, but it is not proof of a causal connection. Numerous other situations could yield a large path coefficient between X2 and Y. For example, Y may cause X2; both Y and X2 may be caused by some third variable, X3; X2 and Y may actually be measures of the same variable; the relationship between X2 and Y may be mediated by other variables, X4, X5, and so on; or a large value for the b path coefficient may be due to sampling error.

16.3.1 ♦ Reasons Why Some Path Coefficients May Be Not Statistically Significant

If the path coefficient between two variables is not statistically significantly different from zero, there are also several possible reasons. If the b path coefficient in Figure 16.1 is close to zero, this could be because there is no causal or noncausal association between X2 and Y. However, a small path coefficient could also occur because of sampling error or because assumptions required for regression are severely violated.

16.3.2 ♦ Possible Interpretations for a Statistically Significant Path

A large and statistically significant b path coefficient is consistent with the hypothesis that X2 causes Y, but it is not proof of that causal hypothesis. Replication of results (such as values of a, b, and c′ path coefficients in Figure 16.1) across samples increases confidence that findings are not due to sampling error. For predictor variables and/or hypothesized mediating variables that can be experimentally manipulated, experimental studies can be done to provide stronger evidence whether associations between variables are causal (MacKinnon, 2008). By itself, a single mediation analysis only provides preliminary nonexperimental evidence to evaluate whether the proposed causal model is plausible (i.e., consistent with the data).

16.4 ♦ Questions in a Mediation Analysis

Researchers typically ask two questions in a mediation analysis. The first question is whether there is a statistically significant mediated path from X1 to Y via X2 (and whether the part of the Y outcome variable score that is predictable from this path is large enough to be of practical importance). Recall from the discussion of the tracing rule in Chapter 10 that when a path from X to Y includes more than one arrow, the strength of the relationship for this multiple-step path is obtained by multiplying the coefficients for each included path. Thus, the strength of the mediated relationship (the path from X1 to Y through X2 in Figure 16.1) is estimated by the product of the a × b (ab) coefficients. The null hypothesis of interest is H0: ab = 0. Note that the unstandardized regression coefficients are used for this significance test. Later sections in this chapter describe test statistics for this null hypothesis. If this mediated path is judged to be nonsignificant, the mediation hypothesis is not supported, and the data analyst would need to consider other explanations.
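One widely used test statistic for H0: ab = 0 is the Sobel (first-order) z test, which divides the ab product by an approximate standard error built from the standard errors of a and b. The later sections of the chapter are not part of this excerpt, so treating the Sobel test as the relevant statistic is an assumption here; the function names and the input values below are hypothetical and are not results from the age, weight, and blood pressure data.

```python
import math


def sobel_z(a, s_a, b, s_b):
    """Sobel z for the indirect effect: ab / sqrt(b^2 * s_a^2 + a^2 * s_b^2)."""
    se_ab = math.sqrt(b ** 2 * s_a ** 2 + a ** 2 * s_b ** 2)
    return (a * b) / se_ab


def two_sided_p(z):
    """Two-tailed p value for z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical unstandardized coefficients and standard errors.
z = sobel_z(a=1.0, s_a=0.25, b=0.5, s_b=0.125)
p = two_sided_p(z)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The Sobel test assumes the ab product is approximately normally distributed, which is questionable in small samples; resampling approaches such as bootstrapping are a common alternative.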

If there is a significant mediated path (i.e., the ab product differs significantly from zero), then the second question in the mediation analysis is whether there is also a significant direct path from X1 to Y; this path is denoted c′ in Figure 16.1. If c′ is not statistically significant (or too small to be of any practical importance), a possible inference is that the effect of X1 on Y is completely mediated by X2. If c′ is statistically significant and large enough to be of practical importance, a possible inference is that the influence of X1 on Y is only partially mediated by X2 and that X1 has some additional effect on Y that is not mediated by X2. In the hypothetical data used for the example in this chapter (in the SPSS file ageweightbp.sav), we will see that the effects of age on blood pressure are only partially mediated by body weight.

Of course, it is possible that there could be additional mediators of the effect of age on blood pressure; for example, age-related changes in the condition of arteries might also influence blood pressure. Models with multiple mediating variables are discussed briefly later in the chapter.

16.5 ♦ Issues in Designing a Mediation Analysis Study

A mediation analysis begins with a minimum of three variables. Every unidirectional arrow that appears in Figure 16.1 represents a hypothesized causal connection and must correspond to a plausible theoretical mechanism. A model such as age → body weight → blood pressure seems reasonable; processes that occur with advancing age, such as slowing metabolic rate, can lead to weight gain, and weight gain increases the demands on the cardiovascular system, which can cause an increase in blood pressure. However, it would be nonsense to propose a model of the following form: blood pressure → body weight → age, for example; there is no reasonable mechanism through which blood pressure could influence body weight, and weight cannot influence age in years.

16.5.1 ♦ Type and Measurement of Variables in Mediation Analysis

Usually all three variables (X1, X2, and Y) in a mediation analysis are quantitative. A dichotomous variable can be used as a predictor in regression (Chapter 12), and therefore it is acceptable to include an X1 variable that is dichotomous (e.g., treatment vs. control) as the initial causal variable in a mediation analysis; OLS regression methods can still be used in this situation. However, both X2 and Y are dependent variables in mediation analysis; if one or both of these variables are categorical, then logistic regression is needed to estimate regression coefficients, and this complicates the interpretation of outcomes (see MacKinnon, 2008, chap. 11).

It is helpful if scores on the variables can be measured in meaningful units because this makes it easier to evaluate whether the strength of influence indicated by path coefficients is large enough to be clinically or practically significant. For example, suppose that we want to predict annual salary in dollars (Y) from years of education (X1). An unstandardized regression coefficient is easy to interpret. A student who is told that each additional year of education predicts a $50 increase in annual salary will understand that the effect is too weak to be of any practical value, while a student who is told that each additional year of education predicts a $5,000 increase in annual salary will understand that this is enough money to be worth the effort. Often, however, measures are given in arbitrary units (e.g., happiness rated on a scale from 1 = not happy at all to 7 = extremely happy). In this kind of situation, it may be difficult to judge the practical significance of a half-point increase in happiness.

As in other applications of regression, measurements of variables are assumed to be reliable and valid. If they are not, regression results can be misleading.


16.5.2 ♦ Temporal Precedence or Sequence of Variables in Mediation Studies

Hypothesized causes must occur earlier in time than hypothesized outcomes (temporal precedence, as discussed in Chapter 1). It seems reasonable to hypothesize that “being abused as a child” might predict “becoming an abuser as an adult”; it would not make sense to suggest that being an abusive adult causes a person to have experiences of abuse in childhood. Sometimes measurements of the three variables X1, X2, and Y are all obtained at the same time (e.g., in a one-time survey). If X1 is a retrospective report of experiencing abuse as a child, and Y is a report of current abusive behaviors, then the requirement for temporal precedence (X1 happened before Y) may be satisfied. In some studies, measures are obtained at more than one point in time; in these situations, it would be preferable to measure X1 first, then X2, and then Y; this may help to establish temporal precedence. When all three variables are measured at the same point in time and there is no logical reason to believe one of them occurs earlier in time than the others, it may not be possible to establish temporal precedence.

16.5.3 ♦ Time Lags Between Variables

When measures are obtained at different points in time, it is important to consider the time lag between measures. If this time lag is too brief, the effects of X1 may not be apparent yet when Y is measured (e.g., if X1 is initiation of treatment with either placebo or Prozac, a drug that typically does not have full antidepressant effects until about 6 weeks, and Y is a measure of depression and is measured one day after X1, then the full effect of the drug will not be apparent). Conversely, if the time lag is too long, the effects of X1 may have worn off by the time Y is measured. Suppose that X1 is receiving positive feedback from a relationship partner and Y is relationship satisfaction, and Y is measured 2 months after X1. The effects of the positive feedback (X1) may have dissipated over this period of time. The optimal time lag will vary depending on the variables involved; some X1 interventions or measured variables may have immediate but not long-lasting effects, while others may require a substantial time before effects are apparent.

16.6 ♦ Assumptions in Mediation Analysis and Preliminary Data Screening

Unless the types of variables involved require different estimation methods (e.g., if a dependent variable is categorical, logistic regression methods are required), the coefficients (a, b, and c′) associated with the paths in Figure 16.1 can be estimated using OLS regression. All of the assumptions required for regression (see Chapters 9 and 11) are also required for mediation analysis. Because preliminary data screening has been presented in greater detail earlier, data screening procedures are reviewed here only briefly. For each variable, histograms or other graphic methods can be used to assess whether scores on all quantitative variables are reasonably normally distributed, without extreme outliers. If the X1 variable is dichotomous, both groups should have a reasonably large number of cases. Scatter plots can be used to evaluate whether relationships between each pair of variables appear to be linear (X1 with Y, X1 with X2, and X2 with Y) and to identify bivariate outliers. Decisions about handling any identified outliers should be made at an early stage in the analysis, as discussed in Chapter 4.
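As a rough numerical companion to the graphic screening described above, univariate outliers can be flagged by converting scores to z scores. This is a minimal sketch, not the procedure from Chapter 4: the cutoff of |z| > 3 is a common rule of thumb assumed here, the function name `flag_outliers` is hypothetical, and the scores are made up for illustration. It does not replace histograms or scatter plots, and it does not detect bivariate outliers.

```python
import statistics


def flag_outliers(scores, z_cut=3.0):
    """Return (index, score) pairs whose absolute z score exceeds z_cut."""
    m = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return [(i, x) for i, x in enumerate(scores) if abs((x - m) / sd) > z_cut]


# Made-up blood pressure scores: 29 typical values plus one extreme score.
scores = [118, 122, 125, 119, 121, 124, 120, 123, 117, 126] * 3
scores[-1] = 300

flagged = flag_outliers(scores)
print(flagged)
```

Because an extreme score inflates both the mean and the standard deviation, z-score screening can miss outliers in very small samples; graphic inspection remains the primary check.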


Baron and Kenny (1986) suggested that a mediation model should not be tested unless there is a significant relationship between X1 and Y. In more recent treatments of mediation, it has been pointed out that in situations where one of the path coefficients is negative, there can be significant mediated effects even when X1 and Y are not significantly correlated (A. F. Hayes, 2009). This can be understood as a form of suppression; see Section 10.12.5.3 for further discussion with examples. If none of the pairs of variables in the model are significantly related to each other in bivariate analyses, however, there is not much point in testing mediated models.
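A small worked example shows how a negative path can hide a mediated effect. The coefficient values below are hypothetical, chosen only so the arithmetic is easy to follow; they use the decomposition c = (a × b) + c′ from the notes to Figure 16.1.

```latex
% Hypothetical coefficients: a = 0.5, b = -0.6, c' = 0.3
\begin{align}
  ab &= (0.5)(-0.6) = -0.3 \\
  c  &= ab + c' = -0.3 + 0.3 = 0
\end{align}
% The total effect c is zero even though the indirect effect ab is not.
```

In this sketch the indirect effect (−0.3) and the direct effect (+0.3) cancel, so X1 and Y would appear unrelated in a bivariate analysis even though a nonzero mediated path exists.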

16.7 ♦ Path Coefficient Estimation

The most common way to obtain estimates of the path coefficients that appear in Figure 16.1 is to run the following series of regression analyses. These steps are similar to those recommended by Baron and Kenny (1986), except that, as suggested in recent treatments of mediation (MacKinnon, 2008), a statistically significant outcome on the first step is not considered a requirement before going on to subsequent steps.

Step 1. First, a regression is run to predict Y (blood pressure or SBP) from X1 (age). (SPSS procedures for this type of regression were provided in Chapter 9.) The raw or unstandardized regression coefficient from this regression corresponds to path c. This step is sometimes omitted; however, it provides information that can help evaluate how much controlling for the X2 mediating variable reduces the strength of association between X1 and Y. Figure 16.2 shows the regression coefficients part of the output. The unstandardized regression coefficient for the prediction of Y (BloodPressure; note that there is no blank within the SPSS variable name) from X1 (age) is c = 2.862; this is statistically significant, t(28) = 6.631, p < .001. (The N for this dataset is 30; therefore, the df for this t ratio is N – 2 = 28.) Thus, the overall effect of age on blood pressure is statistically significant.

Step 2. Next, a regression is performed to predict the mediating variable (X2, weight) from the causal variable (X1, age). The results of this regression provide the path coefficient for the path denoted a in Figure 16.1 and also the standard error of a (sa) and the t test for the statistical significance of the a path coefficient (ta). The coefficient table for this regression appears in Figure 16.3. For the hypothetical data, the unstandardized a path coefficient was 1.432, with t(28) = 3.605, p = .001.

Figure 16.2 ♦ Regression Coefficient to Predict Blood Pressure (Y) From Age (X1)

Coefficients(a)

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 10.398 | 26.222 | | .397 | .695 |
| Age | 2.862 | .432 | .782 | 6.631 | .000 |

a. Dependent Variable: BloodPressure

NOTE: The raw score slope in this equation, 2.862, corresponds to coefficient c in the path diagram in Figure 16.1.

Figure 16.3 ♦ Regression Coefficient to Predict Weight (Mediating Variable X2) From Age (X1)

Coefficients(a)

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 78.508 | 24.130 | | 3.254 | .003 |
| Age | 1.432 | .397 | .563 | 3.605 | .001 |

a. Dependent Variable: Weight

NOTE: The raw score slope from this equation, 1.432, corresponds to the path labeled a in Figure 16.1.

Step 3. Finally, a regression is performed to predict the outcome variable Y (blood pressure) from both X1 (age) and X2 (weight). (Detailed examples of regression with two predictor variables appeared in Chapter 11.) This regression provides estimates of the unstandardized coefficients for path b (and sb and tb) and also path c′ (the direct or remaining effect of X1 on Y when the mediating variable has been included in the analysis). See Figure 16.1 for the corresponding path diagram. From Figure 16.4, path b = .49, t(27) = 2.623, p = .014; path c′ = 2.161, t(27) = 4.551, p < .001. These unstandardized path coefficients are used to label the paths in a diagram of the causal model (top panel of Figure 16.5). These values are also used later to test the null hypothesis H0: ab = 0. In many research reports, particularly when the units in which the variables are measured are not meaningful or not easy to interpret, researchers report the standardized path coefficients (these are called beta coefficients on the SPSS output); the lower panel of Figure 16.5 shows the standardized path coefficients. Sometimes the estimate of the c coefficient appears in parentheses, next to or below the c′ coefficient, in these diagrams.

In addition to examining the path coefficients from these regressions, the data analyst should pay some attention to how well the X1 and X2 variables predict Y. From Figure 16.4, R² = .69, adjusted R² = .667, and this is statistically significant, F(2, 27) = 30.039, p < .001. These two variables do a good job of predicting variance in blood pressure.
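
The three regressions in Steps 1 through 3 can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the chapter: the data are simulated (the chapter's raw scores are not reproduced here), the function name `mediation_paths` is hypothetical, and closed-form OLS formulas for one and two predictors stand in for a regression routine.

```python
import random
from statistics import mean

def _cov(u, v):
    """Sample covariance (n - 1 denominator)."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def mediation_paths(x, m, y):
    """Return OLS estimates of paths a, b, c, and c' for X -> M -> Y."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    c = sxy / sxx                                # Step 1: Y on X (total effect)
    a = sxm / sxx                                # Step 2: M on X
    det = sxx * smm - sxm ** 2                   # Step 3: Y on X and M together
    b = (sxx * smy - sxm * sxy) / det            # slope for the mediator
    c_prime = (smm * sxy - sxm * smy) / det      # slope for X, controlling for M
    return {"a": a, "b": b, "c": c, "c_prime": c_prime}

# Simulated (hypothetical) data; the generating values are assumptions.
rng = random.Random(0)
age = [rng.uniform(20, 80) for _ in range(200)]
weight = [80 + 1.4 * xi + rng.gauss(0, 20) for xi in age]
sbp = [-30 + 2.2 * xi + 0.5 * wi + rng.gauss(0, 15)
       for xi, wi in zip(age, weight)]

paths = mediation_paths(age, weight, sbp)
for name, value in paths.items():
    print(f"{name}: {value:.3f}")
```

Because all three regressions use the same cases and OLS estimation, the printed c should equal c′ + ab up to floating-point error, which is the additive partition discussed in Section 16.8.2.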

16.8 ♦ Conceptual Issues: Assessment of Direct Versus Indirect Paths

When a path that leads from a predictor variable X to a dependent variable Y involves
other variables and multiple arrows, the overall strength of the path is estimated by multiplying the coefficients for each leg of the path (as discussed in the introduction to the
tracing rule in Section 11.10).

16.8.1 ♦ The Mediated or Indirect Path: ab

The strength of the indirect or mediated effect of age on blood pressure through weight is estimated by multiplying the a and b path coefficients. In many applications, one or more of the variables are measured in arbitrary units (e.g., happiness may be rated on a scale from 1 to 7). In such situations, the unstandardized regression coefficients may not be very informative, and research reports often focus on standardized coefficients.² The standardized (β) coefficients for the paths in the age/weight/blood pressure hypothetical data appear in the bottom panel of Figure 16.5. Throughout the remainder of this section, all path coefficients are given in standardized (β coefficient) form.

Figure 16.4 ♦ Regression Coefficient to Predict Blood Pressure (Y) From Age (X1) and Mediating Variable Weight (X2)

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .831(a) | .690 | .667 | 36.692 |

a. Predictors: (Constant), Weight, Age

ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 80882.132 | 2 | 40441.066 | 30.039 | .000(a) |
| Residual | 36349.735 | 27 | 1346.286 | | |
| Total | 117231.867 | 29 | | | |

a. Predictors: (Constant), Weight, Age

b. Dependent Variable: BloodPressure

Coefficients(a)

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | -28.046 | 27.985 | | -1.002 | .325 |
| Age | 2.161 | .475 | .590 | 4.551 | .000 |
| Weight | .490 | .187 | .340 | 2.623 | .014 |

a. Dependent Variable: BloodPressure

NOTE: The raw score slope for age in this equation, 2.161, corresponds to the path labeled c′ in Figure 16.1; the raw score slope for weight in this equation, .490, corresponds to the path labeled b.

Figure 16.5 ♦ Path Coefficients for the Age/Weight/Systolic Blood Pressure (SBP) Mediation Analysis

| Path | Unstandardized Path Coefficients | Standardized Path Coefficients |
|---|---|---|
| a (Age → Weight) | 1.432** | .563** |
| b (Weight → SBP) | .490* | .340* |
| c′ (Age → SBP, direct) | 2.161*** | .590*** |
| c (Age → SBP, total; shown in parentheses) | 2.862*** | .782 |

*p < .05, **p < .01, ***p < .001, all two-tailed.

Recall from Chapter 10 that, when the path from X to Y has multiple parts or arrows,
the overall strength of the association for the entire path is estimated by multiplying the
coefficients for each part of the path. Thus, the unit-free index of strength of the mediated effect (the effect of age on blood pressure, through the mediating variable weight) is
given by the product of the standardized estimates of the path coefficients, ab. For the
standardized coefficients, this product = (.563 × .340) = .191. The strength of the direct
or nonmediated path from age to blood pressure (SBP) corresponds to c′; the standardized coefficient for this path is .590. In other words, for a one–standard deviation increase
in zAge, we predict a .191 increase in zSBP through the mediating variable zWeight. In addition,
we predict a .590 increase in zSBP due to direct effects of zAge (effects that are not mediated
by zWeight); this corresponds to the c′ path. The total effect of zAge on zSBP corresponds to
path c, and the standardized coefficient for path c is .782 (the beta coefficient to predict
zSBP from zAge in Figure 16.5).

16.8.2 ♦ Mediated and Direct Path as Partition of Total Effect

The mediation analysis has partitioned the total effect of age on blood pressure (c =
.782) into a direct effect (c′ = .590) and a mediated effect (ab = .191). (Both of these are
given in terms of standardized/unit-free path coefficients.) It appears that mediation
through weight, while statistically significant, explains only a small part of the total effect
of age on blood pressure in this hypothetical example. Within rounding error, c = c′ + ab,
that is, the total effect is the sum of the direct and mediated effects. These terms are additive when OLS regression is used to obtain estimates of coefficients; when other estimation methods such as maximum likelihood are used (as in structural equation modeling
programs), these equalities may not hold. Also note that if there are missing data, each
regression must be performed on the same set of cases in order for this additive association to work.
Note that even if the researcher prefers to label and discuss paths using standardized
regression coefficients, information about the unstandardized coefficients is required to
carry out additional statistical significance tests (to find out whether the product ab differs significantly from zero, for example).
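
Written out with the standardized coefficients from Figure 16.5, the partition described in this section looks as follows. The final ratio (the share of the total effect that runs through weight) is an added illustration and is not a statistic reported in the chapter.

$$ab = .563 \times .340 = .191, \qquad c' + ab = .590 + .191 = .781 \approx c = .782, \qquad \frac{ab}{c} = \frac{.191}{.782} \approx .24$$

The difference between .781 and .782 is the rounding error mentioned above.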

16.8.3 ♦ Magnitude of Mediated Effect

When variables are measured in meaningful units, it is helpful to think through the
magnitude of the effects in real units, as discussed in this paragraph. (The discussion in
this paragraph is helpful primarily in research situations in which units of measurement
have some real-world practical interpretation.) All of the path coefficients in the rest of
this paragraph are unstandardized regression coefficients. From the first regression
analysis, the c coefficient for the total effect of age on blood pressure was c = 2.862. In
simple language, for each 1-year increase in age, we predict an increase in blood pressure
of 2.862 mm Hg. Based on the t test result in Figure 16.2, this is statistically significant.
Taking into account that people in wealthy countries often live to age 70 or older, this
implies substantial age-related increases in blood pressure; for example, for a 10-year
increase in age, we predict an increase of 28.62 mm Hg in blood pressure, and that is sufficiently large to be clinically important. This tells us that the total effect of age on systolic
blood pressure is reasonably large in terms of clinical or practical importance. From the
second regression, we find that the effect of age on weight is a = 1.432; this is also statistically significant, based on the t test in Figure 16.3. For a 1-year increase in age, we predict almost 1.5 pounds in weight gain. Again, over a period of 10 years, this implies a
sufficiently large increase in predicted body weight (about 14.32 pounds) to be of clinical
importance. The last regression (in Figure 16.4) provides information about two paths,
b and c′. The b coefficient that represents the effect of weight on blood pressure was
b = .49; this was statistically significant. For each 1-pound increase in body weight, we
predict almost a half-point increase in blood pressure. If we take into account that people
may gain 30 or 40 pounds over the course of a lifetime, this would imply weight-related
increases in blood pressure on the order of 15 or 20 mm Hg. This also seems large enough
to be of clinical interest. The indirect effect of age on blood pressure is found by multiplying a × b, in this case, 1.432 × .49 = .701. For each 1-year increase in age, a .7–mm Hg
increase in blood pressure is predicted through the effects of age on weight. Finally, the
direct effect of age on blood pressure when the mediating variable weight is statistically
controlled/taken into account is represented by c′ = 2.161. Over and above any weight-related increases in blood pressure, we predict about a 2.2-unit increase in blood pressure
for each additional year of age. Of the total effect of age on blood pressure (a predicted
2.862–mm Hg increase in SBP for each 1-year increase in age), a relatively small part is
mediated by weight (.701), and the remainder is not mediated by weight (2.161). (Because
these are hypothetical data, this outcome does not accurately describe the importance of
weight as a mediator in real-life situations.) The mediation analysis partitions the total
effect of age on blood pressure (c = 2.862) into a direct effect (c′ = 2.161) and a mediated
effect (ab = .701). Within rounding error, c = c′ + ab, that is, the total effect c is the sum
of the direct (c′) and mediated (ab) effects.
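
The real-unit arithmetic in this section can be checked with a short script. The coefficients below are the four-decimal unstandardized values printed by the Indirect procedure in Figure 16.10; the 10-year span is simply a convenient illustration and the variable names are hypothetical.

```python
# Unstandardized path coefficients as printed in Figure 16.10.
a = 1.4321        # age -> weight (pounds per year)
b = 0.4897        # weight -> SBP (mm Hg per pound), controlling for age
c = 2.8622        # total effect of age on SBP (mm Hg per year)
c_prime = 2.1610  # direct effect of age on SBP, controlling for weight

indirect = a * b                       # mediated effect per year of age
total_from_parts = c_prime + indirect  # should match c within rounding error

years = 10  # illustrative time span (an assumption, not a chapter result)
print(f"indirect effect per year:   {indirect:.4f} mm Hg")
print(f"direct + indirect per year: {total_from_parts:.4f} mm Hg (c = {c})")
print(f"{years}-year change in SBP, total:       {c * years:.2f} mm Hg")
print(f"{years}-year change in SBP, via weight:  {indirect * years:.2f} mm Hg")
print(f"{years}-year change in weight:           {a * years:.2f} lb")
```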

16.9 ♦ Evaluating Statistical Significance

Several methods to test statistical significance of mediated models have been proposed.
The four most widely used procedures are briefly discussed: Baron and Kenny’s (1986)
causal-steps approach, joint significance tests for the a and b path coefficients, the Sobel
test (Sobel, 1982) for H0: ab = 0, and the use of bootstrapping to obtain confidence intervals for the ab product that represents the mediated or indirect effect.

16.9.1 ♦ Causal-Steps Approach

Fritz and MacKinnon (2007) reviewed and evaluated numerous methods for testing
whether mediation is statistically significant. A subset of these methods is described here.
Their review of mediation studies conducted between 2000 and 2003 revealed that the
most frequently reported method was the causal-steps approach described by Baron
and Kenny (1986). In Baron and Kenny’s initial description of this approach, in order to
conclude that mediation may be present, several conditions were required: first, a significant total relationship between X1, the initial cause, and Y, the final outcome variable (i.e.,
a significant path c); significant a and b paths; and a significant ab product using the
Sobel (1982) test or a similar method, as described in Section 16.9.3. The decision
whether to call the outcome partial or complete mediation then depends on whether the
c′ path that represents the direct path from X1 to Y is statistically significant; if c′ is not
statistically significant, the result may be interpreted as complete mediation; if c′ is statistically significant, then only partial mediation may be occurring. Kenny has also noted
elsewhere (http://www.davidakenny.net/cm/mediate.htm) that other factors, such as the
sizes of coefficients and whether they are large enough to be of practical significance,
should also be considered and that, as with any other regression analysis, meaningful
results can only be obtained from a correctly specified model.
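
The decision logic of the causal-steps approach can be summarized as a small function. This is a sketch of the rule as described above, not code from Baron and Kenny or from any package; the function name and labels are hypothetical, and the rule looks only at statistical significance, ignoring the practical-significance considerations that Kenny also raises.

```python
def causal_steps_label(p_c, p_a, p_b, p_ab, p_c_prime, alpha=0.05):
    """Label a result using the causal-steps conditions described above."""
    # Conditions for concluding that mediation may be present:
    # significant c, a, and b paths, and a significant ab product.
    if any(p >= alpha for p in (p_c, p_a, p_b, p_ab)):
        return "mediation not supported"
    # The c' path then separates partial from complete mediation.
    if p_c_prime < alpha:
        return "partial mediation"
    return "complete mediation"

# p values for the chapter's hypothetical data. The c path is reported only
# as p < .001, so .0005 is a placeholder below that bound (an assumption).
label = causal_steps_label(p_c=0.0005, p_a=0.0012, p_b=0.0142,
                           p_ab=0.034, p_c_prime=0.0001)
print(label)
```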
This approach is widely recognized, but it is not the most highly recommended procedure at present for two reasons. First, there are (relatively rare) cases in which mediation
may occur even when the original X1, Y association is not significant. For example, if one
of the paths in the mediation model is negative, a form of suppression may occur such
that positive direct and negative indirect effects tend to cancel each other out to yield a
small and nonsignificant total effect. (If a is negative, while b and c′ are positive, then
when we combine a negative ab product with a positive c′ coefficient to reconstitute the
total effect c, the total effect c can be quite small even if the separate positive direct path
and negative indirect paths are quite large.) MacKinnon, Fairchild, and Fritz (2007) refer
to this as “inconsistent mediation”; the mediator acts as a suppressor variable. See
Section 10.12.5.3 for further discussion and an example of inconsistent mediation.
Second, among the methods compared by Fritz and MacKinnon (2007), this approach
had relatively low statistical power.
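
A numerical illustration of inconsistent mediation, using made-up standardized coefficients (these values are hypothetical and do not come from the chapter's data):

$$a = -.50,\quad b = .60,\quad c' = .35 \;\;\Rightarrow\;\; ab = (-.50)(.60) = -.30,\qquad c = c' + ab = .35 - .30 = .05$$

Here both the direct path and the indirect path are sizable, yet the total effect c is close to zero, so a requirement that c be significant would stop the analysis before the mediated effect was examined.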

16.9.2 ♦ Joint Significance Test

Fritz and MacKinnon (2007) also discussed a joint significance test approach to testing
the significance of mediation. The data analyst simply asks whether the a and b coefficients that constitute the mediated path are both statistically significant; the t tests from
the regression results are used. (On his mediation Web page at http://www.davidakenny.net/cm/mediate.htm, Kenny suggested that if this approach is used, and if an overall risk
of Type I error of .05 is desired, each test should use α = .025, two-tailed, as the criterion
for significance.) This approach is easy to implement and has moderately good statistical
power compared with the other test procedures reviewed by Fritz and MacKinnon (2007).
However, it is not the most frequently reported method; journal reviewers may prefer better known procedures.
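
The joint significance rule is simple enough to state as code. In this sketch the function name is hypothetical; the default per-test criterion of .025 follows the suggestion attributed to Kenny above, and the p values are those reported for the a and b paths in Figure 16.10.

```python
def joint_significance(p_a, p_b, alpha_per_test=0.025):
    """Return True when both the a and b paths meet the per-test criterion."""
    return p_a < alpha_per_test and p_b < alpha_per_test

# a path: p = .0012; b path: p = .0142 (hypothetical age/weight/SBP data).
mediation_supported = joint_significance(p_a=0.0012, p_b=0.0142)
print(mediation_supported)
```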

16.9.3 ♦ Sobel Test of H0: ab = 0

Another method to assess the significance of mediation is to examine the product of
the a, b coefficients for the mediated path. (This is done as part of the Baron & Kenny
[1986] causal-steps approach.) The null hypothesis, in this case, is H0: ab = 0. To set up a
z test statistic, an estimate of the standard error of this ab product (SEab) is needed. Sobel
(1982) provided the following approximate estimate for SEab.

$$SE_{ab} \approx \sqrt{b^2 s_a^2 + a^2 s_b^2}, \tag{16.1}$$

where

a and b are the raw (unstandardized) regression coefficients that represent the effect of X1 on X2 and the effect of X2 on Y, respectively;

sa is the standard error of the a regression coefficient;

sb is the standard error of the b regression coefficient.

Using the standard error from Equation 16.1 as the divisor, the following z ratio for the Sobel (1982) test can be set up to test the null hypothesis H0: ab = 0:

$$z = \frac{ab}{SE_{ab}}. \tag{16.2}$$

The ab product is judged to be statistically significant if z is greater than +1.96 or less
than –1.96. This test is appropriate only for large sample sizes. The Sobel (1982) test is
relatively conservative, and among the procedures reviewed by Fritz and MacKinnon
(2007), it had moderately good statistical power. It is sometimes used in the context of the
Baron and Kenny (1986) causal-steps procedure and sometimes reported without the
other causal steps. The Sobel test can be done by hand; Preacher and Hayes (2008) provide an online calculator at http://people.ku.edu/~preacher/sobel/sobel.htm to compute
this z test given either the unstandardized regression coefficients and their standard
errors or the t ratios for the a and b path coefficients. Their program also provides z tests
based on alternate methods of estimating the standard error of ab suggested by the
Aroian test (Aroian, 1947) and Goodman test (Goodman, 1960).
The Sobel (1982) test was carried out for the hypothetical data on age, weight, and
blood pressure. (Note again that the N in this demonstration dataset is too small for the
Sobel test to yield accurate results; these data are used only to illustrate the use of the
techniques.) For these hypothetical data, a = 1.432, b = .490, sa = .397, and sb = .187.
These values were entered into the appropriate lines of the calculator provided at the
Preacher Web page; the results appear in Figure 16.6. Because z = 2.119, with p = .034,
two-tailed, the ab product that represents the effect of age on blood pressure mediated by
weight can be judged statistically significant.
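
The calculator results can be reproduced by hand with Equations 16.1 and 16.2. The sketch below uses only Python's standard library; the function name is hypothetical. The Aroian and Goodman variants are included using their usual definitions (adding or subtracting the product of the two squared standard errors), which the chapter mentions but does not write out.

```python
import math

def product_tests(a, b, sa, sb):
    """z tests of H0: ab = 0 under three estimates of the variance of ab."""
    ab = a * b
    first_order = b**2 * sa**2 + a**2 * sb**2   # Sobel, Equation 16.1 squared
    variances = {
        "Sobel": first_order,
        "Aroian": first_order + sa**2 * sb**2,
        # The Goodman variance can be negative for small effects; it is
        # positive for the inputs used here.
        "Goodman": first_order - sa**2 * sb**2,
    }
    results = {}
    for name, variance in variances.items():
        se = math.sqrt(variance)
        z = ab / se                               # Equation 16.2
        p = math.erfc(abs(z) / math.sqrt(2))      # two-tailed normal p value
        results[name] = {"z": z, "se": se, "p": p}
    return results

# Coefficients and standard errors from the hypothetical data (N = 30).
results = product_tests(a=1.432, b=0.490, sa=0.397, sb=0.187)
for name, r in results.items():
    print(f"{name}: z = {r['z']:.3f}, SE = {r['se']:.3f}, p = {r['p']:.3f}")
```

Small differences in the third decimal place relative to Figure 16.6 can arise from rounding of the inputs.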
Note that the z tests for the significance of ab assume that values of this ab product are
normally distributed across samples from the same population; it has been demonstrated
empirically that this assumption is incorrect for many values of a and b. Because of this,
authorities on mediation analysis (MacKinnon, Preacher, and their colleagues) now recommend bootstrapping methods to obtain confidence intervals for estimates of ab.

Figure 16.6 ♦ Sobel Test Results for H0: ab = 0, Using Calculator Provided by Preacher and Leonardelli at http://www.people.ku.edu/~preacher/sobel/sobel.htm

Input: a = 1.432, b = .490, sa = .397, sb = .187

| Test | Test statistic | Std. Error | p-value |
|---|---|---|---|
| Sobel test | 2.119 | 0.330 | 0.034 |
| Aroian test | 2.068 | 0.339 | 0.038 |
| Goodman test | 2.175 | 0.322 | 0.029 |

Input: ta = 3.605, tb = 2.623

| Test | Test statistic | p-value |
|---|---|---|
| Sobel test | 2.120 | 0.033 |
| Aroian test | 2.069 | 0.038 |
| Goodman test | 2.176 | 0.029 |

NOTE: This test is only recommended for use with large N samples. The dataset used for this example has N = 30; this was used only as a demonstration.

16.9.4 ♦ Bootstrapped Confidence Interval for ab

Bootstrapping has become widely used in situations where the analytic formula for the standard error of a statistic is not known and/or there are violations of assumptions of normal distribution shape (Iacobucci, 2008). A sample is drawn with replacement from the observed cases (the original sample stands in for the population), and values of a, b, and ab are calculated for this sample. This process
is repeated many times (bootstrapping procedures typically allow users to request from
1,000 up to 5,000 different samples). The value of ab is tabulated across these samples;
this provides an empirical sampling distribution that can be used to derive a value for
the standard error of ab. Results of such bootstrapping indicate that the distribution of
ab values is often asymmetrical, and this asymmetry should be taken into account when
setting up confidence interval (CI) estimates of ab. This CI provides a basis for evaluation of the single estimate of ab obtained from analysis of the entire data set.
Bootstrapped CIs do not require that the ab statistic have a normal distribution across
samples. If this CI does not include zero, the analyst concludes that there is statistically
significant mediation. Some bootstrapping programs include additional refinements,
such as bias correction (see Fritz & MacKinnon, 2007). Most structural equation modeling (SEM) programs, such as Amos, can provide bootstrapped CIs (a detailed example is
presented in Section 16.13).
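
The resampling idea can be sketched with the standard library alone. Note the assumptions: the data are simulated, the function names are hypothetical, and the interval is a plain percentile interval, not the bias-corrected and accelerated interval that bootstrapping programs may offer.

```python
import random
from statistics import mean

def indirect_effect(x, m, y):
    """OLS estimate of ab: a from M on X, b from Y on X and M."""
    mx, mm, my = mean(x), mean(m), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def percentile_ci(x, m, y, n_boot=1000, level=0.95, seed=1):
    """Percentile bootstrap CI for ab, resampling whole cases."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # cases, with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_boot)]
    upper = estimates[round((1 - tail) * n_boot) - 1]
    return lower, upper

# Simulated (hypothetical) data with a true indirect effect of 1.4 * 0.5.
rng = random.Random(0)
age = [rng.uniform(20, 80) for _ in range(100)]
weight = [80 + 1.4 * xi + rng.gauss(0, 20) for xi in age]
sbp = [-30 + 2.2 * xi + 0.5 * wi + rng.gauss(0, 15)
       for xi, wi in zip(age, weight)]

lower, upper = percentile_ci(age, weight, sbp)
print(f"ab = {indirect_effect(age, weight, sbp):.3f}, "
      f"95% CI [{lower:.3f}, {upper:.3f}]")
```

Resampling whole cases (rather than each variable separately) keeps the relationships among X, the mediator, and Y intact within every bootstrap sample.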

For data analysts who do not have access to an SEM program, Preacher and Hayes (2008) provide online scripts and macros for SPSS and SAS that provide bootstrapped CIs for tests of mediation (go to Hayes’s Web page at http://www.afhayes.com/spss-sas-and-mplus-macros-and-code.html and look for the link to download the SPSS script, on the line that says “Script: Indirect.sbs”; download the indirect.sbs file to your computer). An SPSS script is a syntax file that generates a dialog window for the procedure that makes it easy for the user to enter variable names and select options.

To run the script, open your SPSS data file; from the top-level menu on the Data View
page, select the menu options → ; from the pull-down menu, select
Script as the type of file to open. See Figure 16.7 for an SPSS screen shot. Then locate the
file indirect.sbs downloaded from the Hayes website and open it. This will appear as
shown in Figure 16.8. Do not modify the script in any way. To run the script, on the menu
bar across the top of the indirect.sbs script window, click on the right arrow button (that
resembles the play button on an audio or video player). This opens a dialog window for
the Indirect procedure, as shown in Figure 16.9.
For the hypothetical data in this chapter, the dependent variable blood pressure is
moved into the window for dependent variable Y. The proposed mediator is weight. The
independent variable (X) is age. Note that this procedure allows entry of multiple mediators; this will be discussed in a later section of the chapter; it also allows one or more
covariates to be included in the analysis. Under the heading Bootstrap Samples, the number of samples can be selected from a menu (with values that range from 1,000 to 5,000).
The confidence level for the CI for ab is set at 95% as a default value, and this can be
changed by the user. In addition, there are different choices of estimation procedures for
the CI; the default is “Bias corrected and accelerated.” (Accelerated refers to a correction
for possible skewness in the sampling distribution of ab.)

Figure 16.7 ♦ SPSS Menu Selections to Open the SPSS Indirect Script File

NOTE: → , then select Script from the pull-down menu.

Figure 16.8 ♦ SPSS Script Indirect.sbs in Syntax Editor Window (Preacher & Hayes, 2008)

Figure 16.9 ♦ SPSS Dialog Window for Indirect Script

When these selections have been made, click OK; the output appears in Figure 16.10.
Many of the results duplicate those from the earlier regression results; for example, the
estimates of the unstandardized path coefficients for paths a, b, c, and c′ are the same as
those obtained using regression methods. From this printout, we can confirm that the
(unstandardized) path coefficients are a = 1.432, b = .4897, c′ = 2.161, and c = 2.8622
(these agree with the regression values reported earlier, except for some rounding error).
The value of ab = .7013. A normal theory test (i.e., a test that assumes that a z statistic
similar to the Sobel test is valid) in the output from the Indirect procedure provides
z = 2.1842; this is close to the Sobel test value reported in Figure 16.6.

Figure 16.10 ♦ Output From SPSS Indirect Script: One Mediating Variable

Run MATRIX procedure:

Dependent, Independent, and Proposed Mediator Variables: DV = BloodPre; IV = Age; MEDS = Weight

Sample size: 30

| Section of output | Variable | Coeff | se | t | p |
|---|---|---|---|---|---|
| IV to Mediators (a paths) | Weight | 1.4321 | .3972 | 3.6054 | .0012 |
| Direct Effects of Mediators on DV (b paths) | Weight | .4897 | .1867 | 2.6228 | .0142 |
| Total Effect of IV on DV (c path) | Age | 2.8622 | .4317 | 6.6308 | .0000 |
| Direct Effect of IV on DV (c-prime path) | Age | 2.1610 | .4749 | 4.5507 | .0001 |

Model Summary for DV Model

| R-sq | Adj R-sq | F | df1 | df2 | p |
|---|---|---|---|---|---|
| .6899 | .6670 | 30.0390 | 2.0000 | 27.0000 | .0000 |

NORMAL THEORY TESTS FOR INDIRECT EFFECTS

Indirect Effects of IV on DV through Proposed Mediators (ab paths)

| | Effect | se | Z | p |
|---|---|---|---|---|
| TOTAL | .7013 | .3211 | 2.1842 | .0289 |
| Weight | .7013 | .3211 | 2.1842 | .0289 |

BOOTSTRAP RESULTS FOR INDIRECT EFFECTS

Indirect Effects of IV on DV through Proposed Mediators (ab paths)

| | Data | boot | Bias | SE |
|---|---|---|---|---|
| TOTAL | .7013 | .7788 | .0775 | .5315 |
| Weight | .7013 | .7788 | .0775 | .5315 |

Bias Corrected and Accelerated Confidence Intervals

| | Lower | Upper |
|---|---|---|
| TOTAL | .0769 | 2.0792 |
| Weight | .0769 | 2.0792 |

Level of Confidence for Confidence Intervals: 95

Number of Bootstrap Resamples: 5000

------ END MATRIX -----

Based on bootstrapping, the Indirect procedure also provides a 95% CI for the value of
the indirect effect ab (again, this is in terms of unstandardized coefficients). The lower
limit of this CI is .0769; the upper limit is 2.0792. Because this CI does not include zero,
the null hypothesis that ab = 0 can be rejected.

16.10 ♦ Effect-Size Information

Effect-size information is usually given in unit-free form (Pearson’s r and r² can both be
interpreted as effect sizes). The raw or unstandardized path coefficients from mediation
analysis can be converted to standardized slopes; alternatively, we can examine the correlation between X1 and X2 to obtain effect-size information for the a path, as well as the
partial correlation between X2 and Y (controlling for X1) to obtain effect-size information
for the b path. There are potential problems with comparisons among standardized
regression or path coefficients. For example, if the same mediation analysis involving the
same set of three variables is conducted in two different samples (e.g., a sample of women
and a sample of men), these samples may have different standard deviations on variables
such as the predictor X1 and the outcome variable Y. Suppose that the male and female
samples yield b and c′ coefficients that are very similar, suggesting that the amount
of change in Y as a function of X1 is about the same across the two groups. When we convert raw score slopes to standardized slopes, this may involve multiplying and dividing by
different standard deviations for men and women, and different standard deviations
within these groups could make it appear that the groups have different relationships
between variables (different standardized slopes but similar unstandardized slopes).
Unfortunately, both raw score (b) and standardized (β) regression coefficients can be
influenced by numerous sources of artifact that may operate differently in different groups.
Chapter 7 reviewed numerous factors that can artifactually influence the size of r (such as
outliers, curvilinearity, different distribution shapes for X and Y, unreliability of measurement of X and Y, etc.). Chapter 11 demonstrated that β coefficients can be computed from
bivariate correlations and that b coefficients are rescaled versions of β. When Y is the outcome and X is the predictor, b = β × (SDY/SDX). Both b and β coefficients can be influenced
by many of the same problems as correlations. Therefore, if we try to compare regression
coefficients across groups or samples, differences in regression coefficients across samples
may be partly due to artifacts discussed in Chapter 7. Considerable caution is required
whether we want to compare standardized or unstandardized coefficients.
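
The rescaling b = β × (SDY/SDX) makes the comparison problem easy to demonstrate. In the sketch below the function name and both groups' values are hypothetical: the two groups share the same raw-score slope, but the predictor is more variable in the first group.

```python
def standardized_slope(b, sd_x, sd_y):
    """Convert a raw-score slope to a standardized slope.

    Rearranges b = beta * (SD_Y / SD_X) to beta = b * (SD_X / SD_Y).
    """
    return b * sd_x / sd_y

# Same raw-score slope in both groups; only the SD of X differs.
beta_group_1 = standardized_slope(b=2.0, sd_x=10.0, sd_y=40.0)
beta_group_2 = standardized_slope(b=2.0, sd_x=5.0, sd_y=40.0)
print(beta_group_1, beta_group_2)
```

Identical unstandardized slopes yield standardized slopes that differ by a factor of two here, purely because of the difference in predictor variability.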
Despite concerns about potential problems with standardized regression slopes (as
discussed by Greenland et al., 1991), data analysts often include standardized path coefficients in reports of mediation analysis, particularly when some or all of the variables are
not measured in meaningful units. In reporting results, authors should make it clear
whether standardized or unstandardized path coefficients are reported. Given the difficulties just discussed, it is a good idea to include both types of path coefficients.

16.11 ♦ Sample Size and Statistical Power

Assuming that the hypothesis of primary interest is H0: ab = 0, how large does sample size
need to be to have an adequate level of statistical power? Answers to questions about
sample size depend on several pieces of information: the alpha level, desired level of
power, the type of test procedure, and the population effect sizes for the strength of the
association between X1 and X2, as well as X2 and Y. Often, information from past studies
can help researchers make educated guesses about effect sizes for correlations between
variables. In the discussion that follows, α = .05 and desired power of .80 are assumed.
We can use the correlation between X1 and X2 as an estimate of the effect-size index for a
and the partial correlation between X2 and Y, controlling for X1, as an estimate of the effect
size for b. Based on recommendations about verbal labels for effect size given by Cohen
(1988), Fritz and MacKinnon (2007) designated a correlation of .14 as small, a correlation
of .39 as medium, and a correlation of .59 as large. They reported statistical power for
combinations of small (S), medium (M), and large (L) effect sizes for the a and b paths.
For example, if a researcher plans to use the Sobel (1982) test and expects that both the a
and b paths correspond to medium effects, the minimum recommended sample size from
Table 16.1 would be 90.
A few cautions are in order: Sample sizes from this table may not be adequate to guarantee significance, even if the researcher has not been overly optimistic about anticipated
effect size. Even when the power table suggests that fewer than 100 cases might be adequate
for statistical power for the test of H0: ab = 0, analysts should keep in mind that small
samples lead to more sampling error in estimates of path coefficients. For most studies that
test mediation models, minimum sample sizes of 150 to 200 would be advisable if possible.

Table 16.1 ♦ Empirical Estimates of Sample Size Needed for Power of .80 When Using α = .05 as the Criterion for Statistical Significance in Three Different Types of Mediation Analysis

| ab Effect Size(a) | Joint Significance(b) | Sobel(c) | Bootstrapped Confidence Interval(d) |
|---|---|---|---|
| SS | 530 | 667 | 558 |
| SM | 403 | 422 | 406 |
| SL | 403 | 412 | 398 |
| MS | 405 | 421 | 404 |
| MM | 74 | 90 | 78 |
| ML | 58 | 66 | 59 |
| LS | 405 | 410 | 401 |
| LM | 59 | 67 | 59 |
| LL | 36 | 42 | 36 |

SOURCE: Adapted from Fritz and MacKinnon (2007, Table 3, p. 237).

NOTE: These power estimates may be inaccurate when measures of variables are unreliable, assumptions of normality are violated, or categorical variables are used rather than quantitative variables.

a. SS indicates both a and b are small effects; SM indicates a is small and b is medium; SL indicates a is small and b is large.

b. Joint significance test: Requirement that the a and b coefficients each are statistically significant.

c. A z test for H0: ab = 0 using a method to estimate SEab proposed by Sobel (1982).

d. Without bias correction.

Fritz and MacKinnon (2007) have also made SAS and R programs available so that data
analysts can input other values for population effect sizes and desired statistical power; see
http://www.public.asu.edu/~davidpm/ripl/mediate.htm (scroll down to the line that says
“Programs for Estimating Empirical Power”).
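
The simulation logic behind a power table of this kind can be sketched briefly. The Python code below is not the Fritz and MacKinnon program; it is a minimal Monte Carlo illustration for the joint significance test only. It assumes standard normal scores on X and on the residuals, a direct effect c′ of zero, and a large-sample critical value of 1.96 in place of the exact t critical value. All function and variable names are hypothetical.

```python
import math
import random

def center(values):
    """Subtract the mean from every score."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def residualize(x, y):
    """Residuals of y after a simple OLS regression on x (with intercept)."""
    cx, cy = center(x), center(y)
    slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
    return [yi - slope * xi for xi, yi in zip(cx, cy)]

def slope_t(pred, out, df):
    """t ratio for the slope of out on pred; both inputs must have mean 0."""
    spp = sum(p * p for p in pred)
    slope = sum(p * o for p, o in zip(pred, out)) / spp
    sse = sum((o - slope * p) ** 2 for p, o in zip(pred, out))
    return slope / math.sqrt(sse / df / spp)

def joint_significance_power(a, b, n, reps=2000, crit=1.96, seed=1):
    """Proportion of simulated samples in which both a and b are significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, 1) for xi in x]
        y = [b * mi + rng.gauss(0, 1) for mi in m]  # assumes c' = 0
        t_a = slope_t(center(x), center(m), n - 2)
        # b path: Y on M controlling for X, obtained by residualizing both on X
        t_b = slope_t(residualize(x, m), residualize(x, y), n - 3)
        if abs(t_a) > crit and abs(t_b) > crit:
            hits += 1
    return hits / reps

power_mm = joint_significance_power(a=0.39, b=0.39, n=74)
print(f"Estimated power, medium a and b paths, N = 74: {power_mm:.2f}")
```

The estimate can be compared with the tabled N of 74 for the MM row under the joint significance test; because of simulation error and the simplifying assumptions listed above, it should not be expected to match the published value exactly.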
16.12 ♦ Additional Examples of Mediation Models
Several variations of the basic mediation model in Figure 16.1 are possible. For example,
the effect of X1 on Y could be mediated by multiple variables instead of just one (see
Figure 16.11). Mediation could involve a multiple-step causal sequence. Mediation and
moderation can both occur together. The following sections provide a brief introduction
to each of these research situations; for more extensive discussion, see MacKinnon (2008).
16.12.1 ♦ Tests of Multiple Mediating Variables
In many situations, the effect of a causal variable X1 on an outcome Y might be mediated by
more than one variable. Consider the effects of personality traits (such as extraversion
and neuroticism) on happiness. Extraversion is moderately positively correlated
with happiness. Tkach and Lyubomirsky (2006) suggested that the effects of trait extraversion
on happiness may be at least partially mediated by behaviors such as social activity.
For example, people who score high on extraversion tend to engage in more social
activities, and people who engage in more social activities tend to be happier. They
demonstrated that, in their sample, the effects of extraversion on happiness were partially
mediated by engaging in social activity, but there was still a significant direct
effect of extraversion on happiness. Their mediation analyses examined only one
behavior at a time as a potential mediator. However, they also noted that there are many
behaviors (other than social activity) that may influence happiness. What happens if
we consider multiple behaviors as possible mediators? The SPSS script Indirect.sps
(discussed in Section 16.9.4) can be used to conduct simultaneous tests for more than
one mediating variable. Figure 16.11 shows standardized path coefficients obtained
using the Indirect.sps script to test a multiple-mediation model (Warner & Vroman,
2011) that included three different behaviors as mediators between extraversion and
happiness. (Output similar to Figure 16.10 was obtained but is not included here.) Results
indicated that the effects of extraversion on happiness were only partially mediated by
behavior. Positive/prosocial behaviors and health behaviors were both significant mediators
of the effect of extraversion on happiness. Spiritual behaviors did not significantly
mediate the effects of extraversion on happiness (the path from spiritual behaviors to
happiness was not statistically significant).
Figure 16.11 ♦ Path Model for Multiple Mediating Variables Showing Standardized Path
Coefficients
(Path diagram: Extraversion → Happiness, direct effect .44***, total effect in parentheses
.60***. Mediators and the coefficients on their two paths: Positive/Proactive Behaviors,
.45** and .24**; Spiritual Behaviors, .13** and .08; Health Behaviors, .16** and .33***.)
SOURCE: Adapted from Warner and Vroman (2011).
NOTE: Coefficient estimates and statistical significance tests were obtained using the Indirect.sps script (output not shown).
The effect of extraversion on happiness was partially mediated by behaviors. Positive/proactive behaviors (a1 × b1) and health
behaviors (a3 × b3) were significant mediators; spiritual behaviors did not significantly mediate effects of extraversion on
happiness.
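
The arithmetic of a multiple-mediator model can be illustrated with ordinary least squares. The Python sketch below is not the Indirect.sps script and does not use the Warner and Vroman data; it simulates hypothetical scores for one predictor and three mediators, estimates each specific indirect effect as a product a_i × b_i, and confirms that the total effect equals the direct effect plus the sum of the specific indirect effects. The generating values and all names are assumptions chosen for illustration.

```python
import random

def ols(predictors, y):
    """OLS coefficients [intercept, b1, b2, ...] from the normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

rng = random.Random(11)
n = 400
true_a = [0.5, 0.1, 0.3]  # hypothetical X -> mediator paths
true_b = [0.4, 0.0, 0.3]  # hypothetical mediator -> Y paths
x = [rng.gauss(0, 1) for _ in range(n)]
mediators = [[a * xi + rng.gauss(0, 1) for xi in x] for a in true_a]
y = [0.4 * x[i] + sum(b * m[i] for b, m in zip(true_b, mediators))
     + rng.gauss(0, 1) for i in range(n)]

a_hat = [ols([x], m)[1] for m in mediators]   # one regression per mediator
full = ols([x] + mediators, y)                # Y on X and all mediators
c_prime, b_hat = full[1], full[2:]
specific_indirect = [a * b for a, b in zip(a_hat, b_hat)]
c_total = ols([x], y)[1]                      # Y on X alone

for i, effect in enumerate(specific_indirect, start=1):
    print(f"a{i} x b{i} = {effect:.3f}")
print(f"c = {c_total:.3f}; c' + sum of indirect effects = "
      f"{c_prime + sum(specific_indirect):.3f}")
```

This sketch reports point estimates only. Judging whether each specific indirect effect is statistically significant would still require a method such as the bootstrapped confidence intervals discussed elsewhere in this chapter.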
16.12.2 ♦ Multiple-Step Mediated Paths
It is possible to examine a mediation sequence that involves more than one intermediate step, as in the sequence X1 → X2 → X3 → Y. If only partial mediation occurs, additional paths would need to be included in this type of model; for further discussion, see
Taylor, MacKinnon, and Tein (2008).
16.12.3 ♦ Mediated Moderation and Moderated Mediation
It is possible for moderation (as described in Chapter 15) to co-occur with mediation
in two different ways. Mediated moderation occurs when two initial causal variables (let’s
call these variables A and B) have an interaction (A × B), and the effects of this interaction
involve a mediating variable. In this situation, A, B, and the A × B interaction are included
as initial causal variables, and the mediation analysis is conducted to assess the degree to
which a potential mediating variable explains the impact of the A × B interaction on the
outcome variable.
Moderated mediation occurs when you have two different groups (e.g., males and
females), and the strength or signs of the paths in a mediation model for the same set of
variables differ across these two groups. Many structural equation modeling programs,
such as Amos, make it possible to compare path models across groups and to test hypotheses about whether one, or several, path coefficients differ between groups (e.g., males vs.
females). Further discussion can be found in Edwards and Lambert (2007); Muller, Judd,
and Yzerbyt (2005); and Preacher, Rucker, and Hayes (2007). Comparison of models
across groups using the Amos structural equation modeling program is demonstrated by
Byrne (2009).
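
The idea of moderated mediation as a between-group difference in paths can be shown with a small simulation. The Python sketch below is only an illustration under stated assumptions, not the multiple-group SEM procedure available in Amos: it generates hypothetical data for two groups in which the a path differs, estimates a, b, and ab separately in each group by OLS, and reports the difference between the two indirect effects. Group labels, path values, and function names are all hypothetical.

```python
import random

def slope(x, y):
    """Simple-regression slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

def residualize(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def mediation_paths(x, m, y):
    """Return (a, b, ab) for a three-variable mediation model."""
    a = slope(x, m)                                  # M on X
    b = slope(residualize(x, m), residualize(x, y))  # Y on M, controlling X
    return a, b, a * b

def simulate_group(a, b, n, rng):
    """Hypothetical data with a direct effect of 0.3 and unit-variance errors."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    m = [a * xi + rng.gauss(0, 1) for xi in x]
    y = [0.3 * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y

rng = random.Random(7)
groups = {
    "group_1": simulate_group(a=0.2, b=0.5, n=300, rng=rng),
    "group_2": simulate_group(a=0.8, b=0.5, n=300, rng=rng),
}
results = {name: mediation_paths(*data) for name, data in groups.items()}
for name, (a, b, ab) in results.items():
    print(f"{name}: a = {a:.3f}, b = {b:.3f}, ab = {ab:.3f}")

ab_difference = results["group_2"][2] - results["group_1"][2]
print(f"Difference in indirect effects (group_2 - group_1): {ab_difference:.3f}")
```

A formal test of whether the paths differ between groups needs a standard error or confidence interval for the difference, which this sketch does not compute.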
16.13 ♦ Use of Structural Equation Modeling Programs to Test Mediation Models
SEM programs such as LISREL, EQS, MPLUS, and Amos make it possible to test models
that include multiple-step paths (e.g., mediation hypotheses) and to compare results
across groups (to test moderation hypotheses). In addition, SEM programs make it possible to include multiple indicator variables for some or all of the constructs; in theory,
this makes it possible to assess multiple indicator measurement reliability. Most SEM
programs now also provide bootstrapping; most analysts now view SEM programs as the
preferred method for assessment of mediated models. More extensive discussion of other
types of analyses that can be performed using structural equation modeling is beyond the
scope of this book; for further information, see Byrne (2009) or Kline (2010).
16.13.1 ♦ Comparison of Regression and SEM Tests of Mediation
As described in earlier sections of this chapter, simple mediated models can be tested
by using OLS linear regression in SPSS and then conducting the Sobel test to assess
whether the indirect path(s) are significant. In the following example, Amos Graphics will
be used to analyze the same empirical example. Amos is an add-on structural equation
modeling program for IBM SPSS that is licensed separately from IBM SPSS. Use of SEM
programs provides two advantages compared to the regression methods described earlier
in this chapter. First, they make it possible to test more complex path models involving a
larger number of variables. Second, most SEM programs provide bootstrapped confidence
intervals and associated statistical significance tests for ab indirect paths; bootstrapped
confidence intervals are now regarded as the best method for statistical significance testing
for indirect effects, particularly when assumptions of normality may be violated. In this
section, Amos is used only to perform one specific type of analysis, that is, to obtain CIs
and significance tests for the ab indirect effect for a simple three-variable mediated model.
16.13.2 ♦ Steps in Running Amos
Running analyses in Amos Graphics involves the following steps, each of which is discussed in more detail in subsequent sections.
1. Open the Amos Graphics program and use the drawing tools to draw a path diagram that represents the hypothesized mediated causal model.
2. Name the variables in this diagram (the variable names must correspond exactly to
the names of the variables in the SPSS data file).
3. Open the SPSS data file.
4. Edit the Analysis Properties to specify how the analysis will be performed and what
output you want to see.
5. Run the analysis and check to make sure that the analysis ran successfully; if it did
not, you may need to correct variable names and/or make changes in the path
model.
6. View and interpret the output.
16.13.3 ♦ Opening the Amos Graphics Program
From the Windows operating system, begin with the Start Menu (usually this is in the
lower left corner of the screen). When you click the Start button, use the popup menus
to select the Amos Graphics program, as shown in Figure 16.12. The initial view of the
Amos worksheet appears in Figure 16.13 (if there is already a path diagram in the
right-hand panel, use the top menu bar to start with a blank worksheet).
Figure 16.12 ♦ Initial Menu Selection From Start Menu to Start Amos 19 Amos Graphics
Program
NOTE: From the Start Menu, select the Amos Graphics program.
Figure 16.13 ♦ Initial Screen View in Amos Graphics
The Amos Graphics worksheet has several parts. Across the top, as in most Windows
applications, there is a menu bar. Down the left-hand side are icons that represent numerous tools for drawing and modifying models and doing other operations (shown in
greater detail in Figure 16.14). Just to the right of the tools is a set of small windows with
headings: Group Number 1, Default Model, and so forth. (In this example, only one group
is used; Amos can estimate and compare model parameters for multiple groups; for
example, it can compare mediated models for males and females.) These windows are
used later to select which part of the output you want to see. To the right, the largest window is a blank drawing sheet that provides space for you to draw a path model that represents your hypotheses about causal connections.
Figure 16.14 ♦ Amos Drawing Tools
16.13.4 ♦ Amos Tools
In this brief introduction, only a few of the drawing tools in Figure 16.14 are used
(Byrne, 2009, provides more extensive examples of tool use). Beginning in the upper
left-hand corner: The rectangle tool creates a rectangle that corresponds to an observed
(measured) variable. (An example at the end of Chapter 20 also includes latent variables;
these are represented by ovals.) The single-headed arrow tool is used to draw a causal
path (a detailed discussion of types of paths was presented in Chapter 10). (The
double-headed arrow tool, not used in this example, is used to indicate that predictor
variables are correlated.) The error term tool is used to create an error term for each
dependent variable in the path model (error terms must be explicitly included in SEM path
models). Three additional tools that are not used in this example are useful to know: The
moving truck tool can be used to move objects in the path model, the delete tool is used
to delete objects from the graph, and the clipboard is used to copy an Amos path model
into the Windows clipboard so that it can be pasted into other applications (such as Word
or PowerPoint).
16.13.5 ♦ First Steps Toward Drawing and Labeling an Amos Path Model
The path model for this example is the same as the one that appeared earlier in the
bottom panel of Figure 16.1. The goal of the analysis is to assess the degree to which the
effects of age on blood pressure may be mediated by weight. All of the steps are shown
below; you can see a similar analysis (using different variable names) as an animated
tutorial at this URL: http://amosdevelopment.com/video/indirect/flash/indirect.html
(you need the Adobe Flash player to view this animation).
To draw the path model, start with the observed variables. Left click on the rectangle
tool, move the cursor over to the blank worksheet on the right, then right click; a popup
menu appears; left click on the menu option to “draw observed variable” (see top panel of
Figure 16.15). The popup menu will then disappear. Left click (and continue to hold the
button on the mouse down) on the blank worksheet in the location where you want
the variable to appear and drag the mouse; a rectangle will appear. Drag the mouse until
the location and dimensions of the rectangle look the way you want and then release the
mouse button. Your worksheet should now contain a rectangle similar to the one that
appears in the bottom panel of Figure 16.15.
Figure 16.15 ♦ Drawing a Rectangle That Corresponds to an Observed/Measured Variable
To give this variable a name, point the cursor at the rectangle and right click. From the
popup menu that appears (as shown in Figure 16.16), click on Object Properties. This
opens the Object Properties dialog window; this appears in Figure 16.16 near the bottom
of the worksheet. In the space to the right of “Variable name,” type in the name of the first
variable (age). Font size and style can be modified. (Variable labels are not used in this
example. The name typed in the Variable name window must correspond exactly to the
name of the variable in the SPSS data file. If you want to have a different label appear in
the Amos diagram, enter this in the box for the Variable label.)
16.13.6 ♦ Adding Variables and Paths to the Amos Path Diagram
For this analysis, the path model needs to include the following additional elements.
Rectangles must be added for the other observed variables (weight, blood pressure). Note
that conventionally, causal sequences are diagrammed from left to right (or from top to
bottom). Age is the initial cause, and so it is placed on the left. Blood pressure is the final
outcome, so it is placed on the right. The hypothesized mediator, weight, is placed between
and above the other two variables, as shown in Figure 16.17.
To add paths to the model, left click on the unidirectional arrow tool, left click on
the initial causal variable (rectangle) in the path model and continue to hold the mouse
button down, and drag the mouse until the cursor points at the outcome or dependent
variable, then release the mouse button. An arrow will appear in the diagram. For this
model, you need three unidirectional arrows: from age to weight, from weight to blood
pressure, and from age to blood pressure, as shown in Figure 16.17.
Figure 16.16 ♦ The Object Properties Popup Menu
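Figure 16.17 ♦ Final Path Model for the Mediation Analysis
(Path diagram: Age → Weight, Weight → BloodPressure, and Age → BloodPressure. Error terms
e1 (for Weight) and e2 (for BloodPressure) each have a mean of 0 and a path coefficient
fixed at 1.)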
16.13.7 ♦ Adding Error Terms for Dependent Variables
Each dependent variable (a variable is dependent if it has a unidirectional arrow pointing
toward it) must have an explicit error term. To create the error terms shown in Figure 16.17,
left click on the error term tool, move the mouse to position the cursor over a dependent
variable such as weight, and left click again. An error term (a circle with an arrow that
points toward the observed variable) will appear in the path model. Note that this arrow
has a coefficient of 1 associated with it; this predetermined value for this path is required
so that Amos can scale the error term to be consistent with the variance of the observed
variable. In Figure 16.17, this 1 was edited to display as a larger font than initially
appeared in Amos Graphics; to do this, right click near the arrow that represents this path
(positioning is tricky for this) and click on the Object Properties window; within the
Object Properties window, the font size for this path coefficient can be changed. Each
error term also is preassigned a mean of 0; for this reason, a small 0 appears near each
circle that represents an error term. You must give each error term a name, and the names
for error terms must not correspond to the names of any SPSS observed variables. It is
conventional to give error terms brief names such as e1 and e2, as shown in Figure 16.17.
16.13.8 ♦ Correcting Mistakes and Printing the Path Model
During this process, if you make a mistake or want to redraw some element of the
model, you can use the delete tool to remove any variable or path from the model.
(Amos has other tools that can be used to make the elements of these path model diagrams
look nicer, such as the moving truck; see Byrne, 2009, for details.) If you want
to paste a copy of this diagram into a Word document or other application, left click on
the clipboard icon in the tool bar and then use the Paste command within Word.
When you have completed all these drawing steps, your path diagram should look similar
to the final path model that appears in Figure 16.17.
16.13.9 ♦ Opening a Data File From Amos
The next step is to open the SPSS data file that contains scores for the observed variables
in this model (age, weight, blood pressure). From the top-level menu, make the menu
selections shown in Figure 16.18. This opens the
dialog window for Data Files, as shown in Figure 16.19. Click on the File Name button to
open a browsing window (not shown here); in this window, you can navigate to the folder
that contains your SPSS data file. (The first time you open this window, the default
directory is Amos examples; you will need to navigate to one of your own data directories to
locate your data file.) When you have located the SPSS data file (for this example, it is the
file named ageweightbp.sav), highlight it, then click the Open button and then the OK
button. This will return you to the screen view in Figure 16.17.
Figure 16.18 ♦ Amos Menu Selections to Open the SPSS Data File
Figure 16.19 ♦ Amos Dialog Window: Data Files
16.13.10 ♦ Specification of Analysis Method and Request for Output
The next step is to tell Amos how to do the analysis and what output you want to see. To
do this, go to the top-level menu (as shown in Figure 16.17) and select Analysis Properties
from the pull-down menu (see Figure 16.20). This opens the Analysis
Properties dialog window, as shown in Figure 16.21. Note the series of tabs across the top
of this window from left to right; only a few of these are used in this example. Click on the
“Estimation” tab to specify the estimation method, select the radio button for “Maximum
likelihood,” and check the box for “Estimate means and intercepts.” The radio button for
“Fit the saturated and independence models” (see Note 3) is also selected in this example.
(Amos is not very forgiving about missing data. Some options are not available, and other
options must be selected, if the SPSS data file contains any missing values. These
limitations can be avoided by either removing cases with missing data from the SPSS data
file or using imputation methods to replace missing values; refer to Chapter 4 for further
discussion of missing values in SPSS Statistics.)
Next, still in the “Analysis Properties” dialog window, click on the “Output” tab; in the
checklist that appears (see Figure 16.22), check the boxes for “Minimization history,”
“Standardized estimates,” “Squared multiple correlations,” and “Indirect, direct and total
effects.”
Figure 16.20 ♦ Amos Pull-Down Menu: Analysis Properties
Figure 16.21 ♦ Estimation Tab in Analysis Properties Dialog Window
Figure 16.22 ♦ Output Tab in Analysis Properties Dialog Window
Continuing in the “Analysis Properties” dialog window, click on the “Bootstrap” tab.
Click the checkbox for “Perform bootstrap,” and in the window for “Number of bootstrap
samples,” type in a reasonably large number (usually between 1,000 and 5,000; in this
example, 2,000 bootstrap samples were requested). Also check the box for “Bias-corrected
confidence intervals.”
To finish work in the “Analysis Properties” window, click the X in the upper right-hand
corner of this window to close it. This returns you to the screen view that appears in
Figure 16.17.
16.13.11 ♦ Running the Amos Analysis and Examining Preliminary Results
The next step is to run the requested analysis. From the top-level menu (as it appears
in Figure 16.17), select the command that runs the analysis. After you do this, new
information appears in the center column of the worksheet that reports preliminary
information about results, as shown in Figure 16.24.
Numbers were added to this screen shot to highlight the things you will want to look at.
Number 1 points to a pair of icons that provides a way to
toggle between two views of the model in Amos. If you click on the left-hand icon, this
puts you in model specification mode; in this view, you can draw or modify the path
model. When you click on the right-hand icon, results of the most recent analysis are
displayed as path coefficients superimposed on the path model, as shown in Figure 16.24.
Figure 16.23 ♦ Bootstrap Tab in Analysis Properties Window
Figure 16.24 ♦ Project View After Analysis Has Run Successfully
NOTE: Initial view shows unstandardized (b) path coefficients.
The first thing you need to know after you have run an analysis is whether the analysis
ran successfully. Amos can fail to run an analysis for many reasons—for example, the
path model was not drawn correctly, or missing data in the SPSS file require different
specifications for the analysis. See number 2 in Figure 16.24; you may need to scroll up
and down in this window. If the analysis failed to run, there will be an error message in
this window. If the analysis ran successfully, then numerical results (such as the
chi-square for model fit; see Note 4) will appear in this window.
16.13.12 ♦ Unstandardized Path Coefficients on Path Diagram
Path coefficients now appear on the path model diagram, as indicated by number 3 in
Figure 16.24 (the initial view shows unstandardized path coefficients). The user can toggle
back and forth between viewing unstandardized versus standardized coefficient estimates
by highlighting the corresponding terms in the window indicated by number 4. In
Figure 16.24, the unstandardized path coefficients (these correspond to b coefficients in
regression) appear. When the user highlights the option for standardized coefficients
indicated by number 4, the path model is displayed with standardized coefficients (these
correspond to β coefficients in regression), as shown in Figure 16.25. Object properties
were modified to make the display fonts larger for these coefficients. Because this model
does not include latent variables, the values of path coefficients reported by Amos are the
same as those reported earlier from linear regression analyses (see Figure 16.5; also see
Note 5). The values adjacent to the rectangles that represent weight (.32) and blood
pressure (.69), in Figure 16.25, are the squared multiple correlations or R2 values for the
prediction of these dependent variables. (Sometimes the locations of the numbers on Amos
path model diagrams do not make it clear what parameter estimates they represent;
ambiguity about this can be resolved by looking at the text output, as described next.)
16.13.13 ♦ Examining Text Output From Amos
To view the text output, from the top-level menu, make the menu selections to view the
text output, as shown in Figure 16.26.
This opens up the Text Output window. The left-hand panel of this window provides a
list of the output that is available (this is similar to the list of output that appears on the
left-hand side of the SPSS Statistics output window). Only selected output will be examined and interpreted here. Use the cursor to highlight the Estimates portion of the output,
as shown on the left-hand side of Figure 16.28. The complete output for Amos estimates
(of path coefficients, multiple R2 values, indirect effects, and other results) appears in
Figure 16.29.
In Figure 16.29, the unstandardized path coefficients are reported where it is marked
with the letter a; these coefficients correspond to the b/unstandardized regression
coefficients reported earlier (in Figure 16.4, for example). The column headed C. R. (this
stands for “critical ratio,” and this is similar but not identical to a simple t ratio) reports
the ratio of each path coefficient estimate to its standard error; the computation of
standard error is different in SEM than in linear regression. The p value (shown as capital
P in Amos output) appears as *** by default when it is zero to more than three decimal
places. Double clicking on any element in the output (such as a specific p value) opens
up a text box that provides an explanation of each term. Although the C. R. values are
not identical to the t values obtained when the same analysis was performed using
linear regression earlier (results in Figure 16.4), the b coefficient estimates from Amos
and the judgments about their statistical significance are the same as for the linear
regression results. The output table marked with the letter b contains the corresponding
standardized path coefficients.
Figure 16.25 ♦ Standardized Path Coefficients and Squared Multiple Correlations
Figure 16.26 ♦ Amos Menu: View Text Output
Figure 16.27 ♦ List of Available Text Output (Left-Hand Side)
Figure 16.28 ♦ Screen View: Amos Estimates
Figure 16.29 ♦ Complete Amos Estimates
Moving to the bottom of Figure 16.29, unstandardized and standardized estimates of
the strength of the indirect effect (denoted ab in earlier sections of this chapter) are
reported where the letter c appears.
16.13.14 ♦ Locating and Interpreting Output for Bootstrapped CI for the ab Indirect Effect
To obtain information about statistical significance, we must examine the output for
bootstrapped CIs. To see this information, double click the left mouse button on the
Estimates in the list of available output (upper left-hand panel of Figure 16.30). This
opens up a list that includes Scalars and Matrices. Double click the left mouse button on
Matrices to examine the options within this Matrices list. Within this list, select Indirect
effects (move the cursor to highlight this list entry and then left click on it). Now you
will see the Estimates/Bootstrap menu in the window in the lower left-hand side of
Figure 16.30. Left click on “Bootstrap confidence.” (To see an animation that shows this
series of menu selections, view the video at this URL:
http://amosdevelopment.com/video/indirect/flash/indirect.html.) The right-hand panel in
Figure 16.30 shows the 95% CI
results for the estimate of the unstandardized ab indirect effect (in this example, the effect
of age on blood pressure, mediated by weight). The lower and upper limits of this 95% CI
are .122 and 2.338. The result of a statistical significance test for H0: ab = 0, using an error
term derived from bootstrapping, is p = .011. While there are differences in some numerical
values, the results of the Amos analysis of the mediation model presented earlier
(Figure 16.1) were generally similar to the results obtained using linear regression.
Figure 16.30 ♦ Output for Bootstrapped Confidence Interval for ab Indirect of Mediated Effect of
Age on Blood Pressure Through Weight
NOTE: For an animated demonstration of the series of selections in the text output list that are required to view this result, view
the video at this URL: http://amosdevelopment.com/video/indirect/flash/indirect.html.
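
The resampling logic behind a bootstrapped CI for ab can be sketched in Python. This is a minimal illustration under stated assumptions, not the Amos procedure: it computes a simple percentile interval rather than the bias-corrected interval requested in Amos, and it uses simulated scores on hypothetical age, weight, and blood pressure variables rather than the ageweightbp.sav data. The generating values and all function names are hypothetical.

```python
import random

def slope(x, y):
    """Simple-regression slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

def residualize(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def indirect_effect(x, m, y):
    """ab: a from M on X; b from Y on M, controlling for X."""
    a = slope(x, m)
    b = slope(residualize(x, m), residualize(x, y))
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile CI for ab from cases resampled with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper

# Hypothetical data: 100 simulated cases, not the chapter's data set.
data_rng = random.Random(42)
age = [data_rng.uniform(30, 80) for _ in range(100)]
weight = [100 + 1.4 * a + data_rng.gauss(0, 25) for a in age]
sbp = [10 + 2.2 * a + 0.5 * w + data_rng.gauss(0, 30)
       for a, w in zip(age, weight)]

ab_hat = indirect_effect(age, weight, sbp)
ci_low, ci_high = percentile_bootstrap_ci(age, weight, sbp)
print(f"ab = {ab_hat:.3f}, 95% percentile CI [{ci_low:.3f}, {ci_high:.3f}]")
```

As in the Amos output, the indirect effect is judged statistically significant when the interval does not include zero. A bias-corrected interval adjusts the percentile cutoffs and can give somewhat different limits than the simple percentile method used here.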
16.13.15 ♦ Why Use Amos/SEM Rather Than OLS Regression?
There are two reasons why it is worthwhile to learn how to use Amos (or other SEM
programs) to test mediated models. First, it is now generally agreed that bootstrapping is
the preferred method to test the statistical significance of indirect effects in mediated
models; bootstrapping may be more robust to violations of assumptions of normality.
Second, once a student has learned to use Amos (or other SEM programs) to test simple
mediation models similar to the example in this chapter, the program can be used to add
additional predictor and/or mediator variables, as shown in Figure 16.11. SEM programs
have other uses that are briefly discussed at the end of Chapter 20; for example, SEM
programs can be used to do confirmatory factor analysis, and SEM models can include
latent variables with multiple indicators.
16.14 ♦ Results Section
For the hypothetical data in this chapter, a Results section could read as follows. Results
presented here are based on the output from linear regression (Figures 16.2–16.4) and the
Sobel test result in Figure 16.6. (Results would include slightly different numerical values
if the Amos output is used.)
Results
A mediation analysis was performed using the Baron and Kenny (1986) causal-steps approach; in addition, a bootstrapped confidence interval for the ab indirect effect was obtained using procedures described by Preacher and Hayes
(2008). The initial causal variable was age, in years; the outcome variable was
systolic blood pressure (SBP), in mm Hg; and the proposed mediating variable
was body weight, measured in pounds. [Note to reader: The sample N, mean,
standard deviation, minimum and maximum scores for each variable, and correlations among all three variables would generally appear in earlier sections.]
Refer to Figure 16.1 for the path diagram that corresponds to this mediation
hypothesis. Preliminary data screening suggested that there were no serious
violations of assumptions of normality or linearity. All coefficients reported here
are unstandardized, unless otherwise noted; α = .05 two-tailed is the criterion
for statistical significance.
The total effect of age on SBP was significant, c = 2.862, t(28) = 6.631, p <
.001; each 1-year increase in age predicted approximately a 3-point increase in
SBP in mm Hg. Age was significantly predictive of the hypothesized mediating
variable, weight; a = 1.432, t(28) = 3.605, p = .001. When controlling for age,
weight was significantly predictive of SBP, b = .490, t(27) = 2.623, p = .014. The
estimated direct effect of age on SBP, controlling for weight, was c′ = 2.161,
t(27) = 4.551, p < .001.
SBP was predicted quite well from age and weight, with adjusted R2 = .667
and F(2, 27) = 30.039, p < .001.
The indirect effect, ab, was .701. This was judged to be statistically significant
using the Sobel (1982) test, z = 2.119, p = .034. [Note to reader: The Sobel test
should be used only with much larger sample sizes than the N of 30 for this
hypothetical dataset.] Using the SPSS script for the Indirect procedure (Preacher
& Hayes, 2008), bootstrapping was performed; 5,000 samples were requested;
a bias-corrected and accelerated confidence interval (CI) was created for ab. For
this 95% CI, the lower limit was .0769 and the upper limit was 2.0792.
Several criteria can be used to judge the significance of the indirect path. In
this case, both the a and b coefficients were statistically significant, the Sobel
test for the ab product was significant, and the bootstrapped CI for ab did not
include zero. By all these criteria, the indirect effect of age on SBP through
weight was statistically significant. The direct path from age to SBP (c′) was also
statistically significant; therefore, the effects of age on SBP were only partly
mediated by weight.
The upper diagram in Figure 16.5 shows the unstandardized path coefficients
for this mediation analysis; the lower diagram shows the corresponding
standardized path coefficients.
Comparison of the coefficients for the direct versus indirect paths (c′ = 2.161
vs. ab = .701) suggests that a relatively small part of the effect of age on SBP is
mediated by weight. There may be other mediating variables through which age
might influence SBP, such as other age-related disease processes.
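
The quantities in this Results section can be checked against one another with a few lines of arithmetic. The Python sketch below uses only the coefficients and t ratios reported above. The standard errors are not reported in the text, so recovering them as coefficient divided by t is an assumption that inherits rounding error from the printed values. The Sobel standard error is computed as the square root of b²·SEa² + a²·SEb², and the variable names are hypothetical.

```python
import math

# Values as reported in the Results section (unstandardized coefficients).
a, t_a = 1.432, 3.605          # age -> weight
b, t_b = 0.490, 2.623          # weight -> SBP, controlling for age
c_total, c_prime = 2.862, 2.161

# Assumption: standard errors recovered from the rounded coefficient / t values.
se_a = a / t_a
se_b = b / t_b

ab = a * b                                      # indirect effect
sobel_se = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
sobel_z = ab / sobel_se
p_two_tailed = math.erfc(abs(sobel_z) / math.sqrt(2))
proportion_mediated = ab / c_total

print(f"ab = {ab:.3f}")
print(f"c' + ab = {c_prime + ab:.3f} (reported total effect c = {c_total})")
print(f"Sobel z = {sobel_z:.3f}, two-tailed p = {p_two_tailed:.3f}")
print(f"Proportion of total effect that is mediated = {proportion_mediated:.3f}")
```

The recomputed Sobel z agrees with the reported value to about two decimal places; the small discrepancy reflects the rounded inputs. The ratio ab/c puts a number on the statement that a relatively small part of the effect of age on SBP is mediated by weight.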
16.15 ♦ Summary
This chapter demonstrates how to assess whether a proposed mediating variable (X2)
may partly or completely mediate the effect of an initial causal variable (X1) on an outcome variable (Y). The analysis partitions the total effect of X1 on Y into a direct effect, as
well as an indirect effect through the X2 mediating variable. The path model represents
causal hypotheses, but readers should remember that the analysis cannot prove causality
if the data are collected in the context of a nonexperimental design. If controlling for X2
completely accounts for the correlation between X1 and Y, this could happen for reasons
that have nothing to do with mediated causality; for example, this can occur when X1 and
X2 are highly correlated with each other because they measure the same construct. A
mediation analysis should be undertaken only when there are good reasons to believe that
X1 causes X2 and that X2 in turn causes Y. In addition, it is highly desirable to collect data
in a manner that ensures temporal precedence (i.e., X1 occurs first, X2 occurs second, and
Y occurs third).
These analyses can be done using OLS regression; however, use of SPSS scripts provided by Preacher and Hayes (2008) provides bootstrapped estimates of confidence
intervals, and most analysts now believe this provides better information than statistical
significance tests that assume normality. SEM programs provide even more flexibility for
assessment of more complex models.
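The idea behind a bootstrapped confidence interval for ab can be sketched in a few lines. This is a minimal percentile bootstrap written with the Python standard library on simulated data; it is not a reimplementation of the Preacher and Hayes (2008) scripts, and the function names, sample size, and generating coefficients are all assumptions made for illustration.

```python
import random

def _sp(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

def indirect_effect(x1, x2, y):
    """Estimate ab: a from X2 ~ X1, b from Y ~ X1 + X2 (closed-form OLS slopes)."""
    s11, s22, s12 = _sp(x1, x1), _sp(x2, x2), _sp(x1, x2)
    s1y, s2y = _sp(x1, y), _sp(x2, y)
    a = s12 / s11
    b = (s11 * s2y - s12 * s1y) / (s11 * s22 - s12 ** 2)
    return a * b

def bootstrap_ci(x1, x2, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for ab, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x1[i] for i in idx],
                                         [x2[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_boot)]
    upper = estimates[round((1 - tail) * n_boot) - 1]
    return lower, upper

# Simulated data (assumed generating values: a = 0.5, b = 0.4, c' = 0.3).
data_rng = random.Random(42)
n = 200
x1 = [data_rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * v + data_rng.gauss(0, 1) for v in x1]
y = [0.4 * m + 0.3 * v + data_rng.gauss(0, 1) for m, v in zip(x2, x1)]

estimate = indirect_effect(x1, x2, y)
lower, upper = bootstrap_ci(x1, x2, y)
print(f"ab = {estimate:.3f}, 95% percentile CI = [{lower:.3f}, {upper:.3f}]")
```

The interval is read the same way as in the chapter example: if it does not include zero, the indirect effect is judged statistically significant, without assuming that ab is normally distributed.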
If a mediation analysis suggests that partial or complete mediation may be present,
additional research is needed to establish whether this is replicable and real. If it is possible to manipulate or block the effect of the proposed mediating variable experimentally,
experimental work can provide stronger evidence of causality (MacKinnon, 2008).
Notes
1. It is also possible to hypothesize bidirectional causality, such that X causes Y and that Y
in return also influences X; this hypothesis of reciprocal causation would be denoted with two
unidirectional arrows, X ⇄ Y, not with a double-headed arrow. Information about additional
predictors of X and Y is needed to obtain separate estimates of the strengths of these two causal
paths; see Felson and Bohrnstedt (1979) and Smith (1982).
2. For discussion of potential problems with comparisons among standardized regression
coefficients, see Greenland et al. (1991). Despite the problems they and others have identified,
research reports still commonly report standardized regression or path coefficients, particularly in
situations where variables have arbitrary units of measurement.
3. Because each variable has a direct path to every other variable in this example, the chi-square for model fit is 0 (this means that the path coefficients can perfectly reconstruct the variances and covariances among the observed variables). In more advanced applications of SEM,
some possible paths are omitted, and then the model usually cannot exactly reproduce the
observed variances and covariances. In those situations, it becomes important to examine several
different indexes of model fit to evaluate the consistency between model and observed data. For
further discussion, see Kline (2010).
4. For reasons given in Note 3, in this example, chi-square equals 0.
5. SEM programs such as Amos typically use some form of maximum likelihood estimation,
while linear regression uses ordinary least squares estimation methods (see Glossary for definitions of these terms). For this reason, the estimates of path coefficients and other model parameters may differ, particularly for more complicated models.
Comprehension Questions
1. Suppose that a researcher first measures a Y outcome variable, then measures an
X1 predictor and an X2 hypothesized mediating variable. Why would this not be
a good way to collect data to test the hypothesis that the effects of X1 on Y may
be mediated by X2?
2. Suppose a researcher wants to test a mediation model that says that the effects of
math ability (X1) on science achievement (Y) are mediated by sex (X2). Is this a
reasonable mediation hypothesis? Why or why not?
3. A researcher believes that the prediction of Y (job achievement) from X1 (need
for power) is different for males versus females (X2). Would a mediation analysis
be appropriate? If not, what other analysis would be more appropriate in this
situation?
4. Refer to Figure 16.1. If a, b, and ab are all statistically significant (and large
enough to be of practical or clinical importance), and c′ is not statistically significant and/or not large enough to be judged practically or clinically important,
would you say that the effects of X1 on Y are partially or completely mediated
by X2?
5. What pattern of outcomes would you expect to see for coefficient estimates in
Figure 16.1—for example, which coefficients would need to be statistically significant and large enough to be of practical importance, for the interpretation
that X2 only partly mediates the effects of X1 on Y? Which coefficients (if any)
should be not statistically significant if the effect of X1 on Y is only partly mediated by X2?
6. In Figure 16.1, suppose that you initially find that path c (the total effect of X1 on
Y) is not statistically significant and too small to be of any practical or clinical
importance. Does it follow that there cannot possibly be any indirect effects of X1
on Y that are statistically significant? Why or why not?
7. Using Figure 16.1 again, consider this equation: c = (a × b) + c′. Which
coefficients represent direct, indirect, and total effects of X1 on Y in this
equation?
8. A researcher believes that the a path in a mediated model (see Figure 16.1) corresponds to a medium unit-free effect size and the b path in a mediated model
also corresponds to a medium unit-free effect size. If assumptions are met (e.g.,
scores on all variables are quantitative and normally distributed), and the
researcher wants to have power of about .80, what sample size would be needed
for the Sobel test (according to Table 16.1)?
9. Give an example of a three-variable study for which a mediation analysis would
make sense. Be sure to make it clear which variable is the proposed initial predictor, mediator, and outcome.
10. Briefly comment on the difference between the use of a bootstrapped CI (for the
unstandardized estimate of ab) versus the use of the Sobel test. What programs
can be used to obtain the estimates for each case? Which approach is less
dependent on assumptions of normality?