CJUS 745 ECC Advantages of Applying Moderation and Mediation Discussion
One of the most advanced quantitative methods that can be applied to criminal justice data is mediation and moderation analysis.
After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?
How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?
How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?
After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?
According to MacKinnon (2011), there are several reasons for adding mediating variables into studies, including:
Moderation analysis can also identify the groups for which the intervention has the greatest effect or no effect.
How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?
An inadequate design often results when researchers fail to account for the probable effect of bias in the study they are conducting. Bias may lead researchers to ignore the requirements for conducting the study fairly and to inject their own inclinations instead, thereby invalidating the results (Kline, 2015). Kline (2015) states that researchers must choose the research design best suited to the question so that the results of mediation and moderation analysis are more accurate.
With an inadequate research design, the mediating, moderating, and predictor variables are likely to compromise the accuracy of the analysis (Baron & Kenny, 1986). Poor study design selection may lead to inaccurate results. It can also produce erroneous data, prompting other researchers to perform their own analyses and reach their own results (Kline, 2015).
A flawed analysis strategy undermines mediation and moderation analysis by increasing bias and reducing the accuracy of the data, which inflates the margin of error (Baron & Kenny, 1986).
Lack of attention to assumptions can lead to violations of error independence and linearity and to problems with collinearity (Baron & Kenny, 1986). These violations can produce incorrect and misleading results (Baron & Kenny, 1986; Kline, 2015).
How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?
The researcher’s theoretical framework identifies the crucial aspects that affect a phenomenon of concern and highlights the significance of examining how those factors may vary and under what circumstances (Baron & Kenny, 1986). A poor theoretical framework suggests that the researcher has not grasped the ideas and information pertinent to the research topic (Kline, 2015). The researcher may also not understand the broader area of knowledge under investigation, which weakens the mediation and moderation analysis (Baron & Kenny, 1986). The lack of a theoretical framework can also keep the researcher from identifying specific, significant variables and from understanding how they affect one another or are correlated in different circumstances (Kline, 2015).
“Whatever you do, work heartily, as for the Lord and not for men, knowing that from the Lord you will receive the inheritance as your reward. You are serving the Lord Christ” (Colossians 3:23-24, ESV).
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182. https://doi.org/10.1037/0022-3514.51.6.1173
Kline, R. B. (2015). The mediation myth. Basic and Applied Social Psychology, 37(4), 202–213.
MacKinnon, D. P. (2011). Integrating mediators and moderators in research design. Research on Social Work Practice, 21(6), 675–681.
Here is Jeff
Mediation & Moderation Analysis
Both moderation and mediation allow researchers to address questions concerning contingencies and mechanisms that can better reveal the complexities of how a set of variables is interrelated. In recent years, applications of statistical mediation have become more prevalent in social science research for testing assumptions about why or how an independent variable is associated with an outcome of interest. However, mediation may not hold in all conditions or for all groups of people. Moderated mediation analysis can be used to test whether an indirect effect is conditional on values of a proposed moderating variable. Despite its advantages for modeling complex relationships among variables, moderated mediation is under-utilized in the substantive literature. Instead, researchers typically analyze interactions and mechanisms separately, or rely on outdated methods for testing moderated mediation (Blair, 2022). Methods are available for testing mediation and moderation effects in a dataset, both together and separately. Investigations of this kind are especially valuable in prevention research to obtain information on the process by which a program achieves its effects and whether the program is effective for subgroups of individuals (Fairchild & MacKinnon, 2008).
After reading the two articles by Pick and Teo (2017) and Pais (2017), as well as the articles by Hayes and others, what are the advantages of applying this analysis?
According to Pais (2017), advances in mediation analysis are used to examine the legacy effects of racial residential segregation in the United States on neighborhood attainments across two familial generations. The findings are supported by a comprehensive mediation analysis that provides a formal sensitivity analysis, deploys an instrumental variable, and assesses effect heterogeneity. Methodological advancements in mediation analysis are used on data from the Panel Study of Income Dynamics (PSID) and the U.S. Census to assess the relative explanatory power of these pathways for white and black families that have origins in the United States dating back at least to the height of racial residential segregation in the late 1960s. In the context of causal mediation analysis, confounders can be observed or unobserved variables that affect the mediator and the outcome and are affected by the treatment variable. Causal mediation analysis requires researchers to inspect the validity of the no “treatment-mediator” interaction assumption (Pais, 2017).
According to Pick and Teo (2016), a number of limitations should be considered when assessing the generalizability of the findings. To minimize the effect of common method bias, several tests were employed. One-way ANOVA testing was also applied to assess whether the merging of the surveys from two different time periods was appropriately performed. While the response rate was low, the sample size was sufficient to allow the findings of the model to have general relevance. In retrospect, it would have been useful to collect data about the quality of information and trust in sources of information. Furthermore, as indicated in the final path model, external change initiatives were excluded from the analysis due to low discriminant validity. Future studies should examine whether this particular construct is valid and present in the Australian public sector context. Consistent with the call by Oreg, Vakola, and Armenakis (2011, 514), future work should use research designs appropriate for studying the longitudinal effects of change at the individual level, collect data from multiple raters, and use objective indicators to supplement self-report information (Pick & Teo, 2016).
How do an inadequate design, a flawed analysis strategy, and lack of attention to assumptions affect the use of mediation and moderation analysis?
The cost of the generalizability of the general model to test mediation and moderation effects is possible inflation of Type I error, lack of power, and difficulty with interpretation of model parameters if several effects are present. The model may be simplified, however, to represent more specialized cases of mediation and moderation joint effects such as baseline by treatment interactions by constraining paths in the model to be zero. An additional limitation of models with moderation and mediation is the extensive assumptions required for accurate assessment of relations among variables (Holland, 1988). The sensitivity of conclusions to violations of assumptions is not yet known and correct conclusions will likely require repeated applications in any substantive research area. In particular, often the X variable is the only variable that represents random assignment, making interpretation of causal relations between other variables in the model susceptible to omitted variable bias. In many applications, the model results may represent descriptive information about how variables are related rather than elucidating true causal relations among variables. Information on true causal relations will require programs of research to replicate and extend results as well as information from other sources such as qualitative information and replication studies in different substantive research areas (Fairchild & MacKinnon, 2008).
How does the researcher’s lack of theoretical framework concerning variables affect the application of mediation and moderation analysis?
Relations between variables are often more complex than simple bivariate relations between a predictor and a criterion. Rather, these relations may be modified by, or informed by, the addition of a third variable in the research design. Examples of third variables include suppressors, confounders, covariates, mediators, and moderators (MacKinnon et al., 2000). Many of these third variable effects have been investigated in the research literature, and more recent research has examined the influences of more than one third variable effect in an analysis. The importance of investigating mediation and moderation effects together has been recognized for some time in prevention science, but statistical methods to conduct these analyses are only now being developed. Investigations of this kind are especially valuable in prevention research where data may present several mediation and moderation relations (MacKinnon et al., 2000). MacKinnon et al. (2000) also describe the statistical similarities among mediation, confounding, and suppression. Each is quantified by measuring the change in the relationship between an independent and a dependent variable after adding a third variable to the analysis. Mediation and confounding are identical statistically and can be distinguished only on conceptual grounds. Methods to determine the confidence intervals for confounding and suppression effects are proposed based on methods developed for mediated effects. Although the statistical estimation of effects and standard errors is the same, there are important conceptual differences among the three types of effects (MacKinnon et al., 2000).
CJUS 745 DISCUSSION ASSIGNMENT INSTRUCTIONS
You will take part in 3 Discussions in which you will post a thread presenting your scholarly
response on the assigned topic, writing 750–850 words. For each thread, students must support
their assertions with at least four (4) scholarly citations in APA format. The original thread must
incorporate ideas and several scholarly citations from all of the Learn material for the assigned
Then, you will post replies of 250–300 words (supported with at least two cites) each to 3 or
more classmates’ threads. Each reply must incorporate at least two (2) scholarly citation(s) in
APA format. The reply posts can integrate ideas and citations from the Learn material
throughout the course.
Any sources cited must have been published within the last five years. Integrate Biblical
principles in your personal thread and in all replies to peers.
16.1 ♦ Definition of Mediation
Chapter 10 examined research situations that involve three variables and described several possible forms of interrelationship. One of these is mediation; this involves a set of
causal hypotheses. An initial causal variable X1 may influence an outcome variable Y
through a mediating variable X2. (Some books and websites use different notations for the
three variables; for example, on Kenny’s mediation Web page,
http://www.davidakenny.net/cm/mediate.htm, the initial causal variable is denoted X, the
outcome as Y, and the mediating variable as M.) Mediation occurs if the effect of X1 on Y is partly or entirely
“transmitted” by X2. A mediated causal model involves a causal sequence; first, X1 causes
or influences X2; then, X2 causes or influences Y. X1 may have additional direct effects on
Y that are not transmitted by X2. A mediation hypothesis can be represented by a diagram
of a causal model. Note that the term causal is used because the path diagram represents
hypotheses about possible causal influence; however, when data come from nonexperimental designs, we can only test whether the data are consistent or
inconsistent with a particular causal model. That analysis falls short of proof that any
specific causal model is correct.
16.1.1 ♦ Path Model Notation
Path model notation was introduced in Chapter 10 (see Table 10.2), and it is briefly
reviewed here. We begin with two variables (X and Y). Arrows are used to correspond to
paths that represent different types of relations between variables. The absence of an
arrow between X and Y corresponds to an assumption that these variables are not related
in any way; they are not correlated or confounded, and they are not directly causally connected. A unidirectional arrow corresponds to the hypothesis that one variable has a
causal influence on the other—for example, X → Y corresponds to the hypothesis that X
causes or influences Y; Y → X corresponds to the hypothesis that Y causes or influences
X. A bidirectional or double-headed arrow represents a noncausal association, such as
correlation or confounding of variables that does not arise from any causal connection
between them. In path diagrams, these double-headed arrows may be shown as curved lines.
If we consider only two variables, X and Y, there are four possible models: (1) X and Y
are not related in any way (this is denoted in a path diagram by the absence of a path
between X and Y), (2) X causes Y (X → Y), (3) Y causes X (Y → X), and (4) X and Y are
correlated but not because of any causal influence (X ↔ Y). When a third variable is
added, the number of possible relationships among the variables X1, X2, and Y increases
substantially, as discussed in Chapter 10. One theoretical model corresponds to X1 and X2
as correlated causes of Y. For this model, the appropriate analysis is a regression to predict
Y from both X1 and X2 (as discussed in Chapter 11). Another possible hypothesis is that
X2 may be a moderator of the relationship between X1 and Y; this is also described as an
interaction between X2 and X1 as predictors of Y. Statistical significance and nature of
interaction can be assessed using the procedures described in Chapter 15. Chapter 10
outlined procedures for preliminary exploratory data analyses that can help a data analyst
decide which of many possible patterns of relationship need to be examined in further detail.
16.1.2 ♦ Circumstances When Mediation May Be a Reasonable Hypothesis
Because a mediated causal model includes the hypothesis that X1 causes or influences
X2 and the hypothesis that X2 causes or influences Y, it does not make sense to consider
mediation analysis in situations where one or both of these hypotheses would be nonsense. For X1 to be hypothesized as a cause of X2, X1 should occur before X2, and there
should be a plausible mechanism through which X1 could influence X2. For example, suppose we are interested in a possible association between height and salary (a few studies
suggest that taller people earn higher salaries). It is conceivable that height influences
salary (perhaps employers have a bias that leads them to pay tall people more money). It
is not conceivable that a person’s salary changes his or her height.
16.2 ♦ A Hypothetical Research Example Involving One Mediating Variable
The hypothetical data introduced in Chapter 10 as an illustration of a mediation hypothesis involved three variables: X1, age; X2, body weight, and Y, systolic blood pressure
(BloodPressure or SBP). The data are in an SPSS file named ageweightbp.sav; the scores
also appear in Table 10.3. The hypothetical dataset has N = 30 to make it easy to carry out
the same analyses using the data in Table 10.3. Note that for research applications of
mediation analysis, much larger sample sizes should be used.
For these variables, it is plausible to hypothesize the following causal connections. Blood
pressure tends to increase as people age. As people age, body weight tends to increase (this
could be due to lower metabolic rate, reduced activity level, or other factors). Other factors
being equal, increased body weight makes the cardiovascular system work harder, and this
can increase blood pressure. It is possible that at least part of the age-related increase in
blood pressure might be mediated by age-related weight gain. Figure 16.1 is a path model
that represents this mediation hypothesis for this set of three variables.
Figure 16.1 ♦ Hypothetical Mediation Example: Effects of Age on Systolic Blood Pressure (SBP)
NOTES: Top panel: The total effect of age on SBP is denoted by c. Bottom panel: The path coefficients (a, b, c′) that estimate the
strength of hypothesized causal associations are estimated by unstandardized regression coefficients. The product a × b
estimates the strength of the mediated or indirect effect of age on SBP, that is, how much of the increase in SBP that occurs as
people age is due to weight gain. The c′ coefficient estimates the strength of the direct (also called partial) effect of age on SBP,
that is, any effect of age on SBP that is not mediated by weight. The coefficients in this bottom panel decompose the total effect
(c) into a direct effect (c′) and an indirect effect (a × b). When ordinary least squares regression is used to estimate
unstandardized path coefficients, c = (a × b) + c′; the total relationship between age and SBP is the sum of the direct
relationship between age and SBP and the indirect or mediated effect of age on SBP through weight.
To estimate the strength of association that corresponds to each path in Figure 16.1, a
series of three ordinary least squares (OLS) linear regression analyses can be run. Note
that a variable is dependent if it has one or more unidirectional arrows pointing toward it.
We run a regression analysis for each dependent variable (such as Y), using all variables that
have unidirectional arrows that point toward Y as predictors. For the model in Figure 16.1,
the first regression predicts Y from X1 (blood pressure from age). The second regression
predicts X2 from X1 (weight from age). The third regression predicts Y from both X1 and
X2 (blood pressure predicted from both age and weight).
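The three regressions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (they are not the scores in ageweightbp.sav), the helper name ols is hypothetical, and NumPy's least-squares solver stands in for the SPSS regression procedure. The sketch estimates c, a, b, and c′ and confirms that the OLS estimates satisfy c = (a × b) + c′.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y regressed on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

# Synthetic illustration data (assumed values, NOT the textbook dataset).
rng = np.random.default_rng(0)
n = 30
age = rng.uniform(25, 75, n)                                # X1
weight = 80 + 1.4 * age + rng.normal(0, 25, n)              # X2 (mediator)
sbp = 10 + 2.0 * age + 0.5 * weight + rng.normal(0, 20, n)  # Y

c = ols(sbp, age)[1]                    # regression 1: Y from X1 (total effect c)
a = ols(weight, age)[1]                 # regression 2: X2 from X1 (path a)
_, c_prime, b = ols(sbp, age, weight)   # regression 3: Y from X1 and X2 (c' and b)

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"a*b + c' = {a * b + c_prime:.3f}")
```

Because all three regressions are fit to the same cases, the decomposition holds exactly (up to floating-point error) for any dataset, not only for this synthetic one.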
16.3 ♦ Limitations of Causal Models
Path models similar to the one in Figure 16.1 are called “causal” models because each
unidirectional arrow represents a hypothesis about a possible causal connection between
two variables. However, the data used to estimate the strength of relationship for the paths
are almost always from nonexperimental studies, and nonexperimental data cannot prove
causal hypotheses. If the path coefficient between two variables such as X2 and Y (this coefficient is denoted b in Figure 16.1) is statistically significant and large enough in magnitude to indicate a change in the outcome variable that is clinically or practically important,
this result is consistent with the possibility that X2 might cause Y, but it is not proof of a
causal connection. Numerous other situations could yield a large path coefficient between
X2 and Y. For example, Y may cause X2; both Y and X2 may be caused by some third variable, X3; X2 and Y may actually be measures of the same variable; the relationship between
X2 and Y may be mediated by other variables, X4, X5, and so on; or a large value for the b
path coefficient may be due to sampling error.
16.3.1 ♦ Reasons Why Some Path Coefficients May Not Be Statistically Significant
If the path coefficient between two variables is not statistically significantly different
from zero, there are also several possible reasons. If the b path coefficient in Figure 16.1 is
close to zero, this could be because there is no causal or noncausal association between X2
and Y. However, a small path coefficient could also occur because of sampling error or
because assumptions required for regression are severely violated.
16.3.2 ♦ Possible Interpretations for a Statistically Significant Path
A large and statistically significant b path coefficient is consistent with the hypothesis
that X2 causes Y, but it is not proof of that causal hypothesis. Replication of results (such
as values of a, b, and c′ path coefficients in Figure 16.1) across samples increases confidence that findings are not due to sampling error. For predictor variables and/or hypothesized mediating variables that can be experimentally manipulated, experimental studies
can be done to provide stronger evidence whether associations between variables are
causal (MacKinnon, 2008). By itself, a single mediation analysis only provides preliminary nonexperimental evidence to evaluate whether the proposed causal model is plausible (i.e., consistent with the data).
16.4 ♦ Questions in a Mediation Analysis
Researchers typically ask two questions in a mediation analysis. The first question is
whether there is a statistically significant mediated path from X1 to Y via X2 (and whether
the part of the Y outcome variable score that is predictable from this path is large enough to
be of practical importance). Recall from the discussion of the tracing rule in Chapter 10
that when a path from X to Y includes more than one arrow, the strength of the relationship
for this multiple-step path is obtained by multiplying the coefficients for each included
path. Thus, the strength of the mediated relationship (the path from X1 to Y through X2 in
Figure 16.1) is estimated by the product of the a × b (ab) coefficients. The null hypothesis
of interest is H0: ab = 0. Note that the unstandardized regression coefficients are used for
this significance test. Later sections in this chapter describe test statistics for this null
hypothesis. If this mediated path is judged to be nonsignificant, the mediation hypothesis
is not supported, and the data analyst would need to consider other explanations.
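As a rough illustration of how H0: ab = 0 might be tested, the sketch below computes the first-order Sobel z statistic, one commonly used test for the ab product. This excerpt does not show which statistics the later sections use, so treat the choice of the Sobel test as an assumption. The values for a and its standard error are the ones reported for the regression of weight on age in this chapter; the values for b and its standard error are hypothetical placeholders, because the coefficient table for the third regression is not included here.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z statistic for the indirect effect a*b.

    Uses the standard error approximation sqrt(b^2 * se_a^2 + a^2 * se_b^2).
    """
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    return (a * b) / se_ab

a, se_a = 1.432, 0.397   # path a and its standard error (weight predicted from age)
b, se_b = 0.5, 0.2       # HYPOTHETICAL values for path b, for illustration only

z = sobel_z(a, se_a, b, se_b)
print(f"ab = {a * b:.3f}, Sobel z = {z:.3f}")
```

The z value is compared with the standard normal distribution; with small samples such as N = 30, this approximation can be poor, which is one reason resampling approaches such as bootstrapping are often preferred.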
If there is a significant mediated path (i.e., the ab product differs significantly from
zero), then the second question in the mediation analysis is whether there is also a significant direct path from X1 to Y; this path is denoted c′ in Figure 16.1. If c′ is not statistically significant (or too small to be of any practical importance), a possible inference is
that the effect of X1 on Y is completely mediated by X2. If c′ is statistically significant and
large enough to be of practical importance, a possible inference is that the influence of X1
on Y is only partially mediated by X2 and that X1 has some additional effect on Y that is
not mediated by X2. In the hypothetical data used for the example in this chapter (in the
SPSS file ageweightbp.sav), we will see that the effects of age on blood pressure are only
partially mediated by body weight.
Of course, it is possible that there could be additional mediators of the effect of age on
blood pressure; for example, age-related changes in the condition of arteries might also
influence blood pressure. Models with multiple mediating variables are discussed briefly
later in the chapter.
16.5 ♦ Issues in Designing a Mediation Analysis Study
A mediation analysis begins with a minimum of three variables. Every unidirectional
arrow that appears in Figure 16.1 represents a hypothesized causal connection and must
correspond to a plausible theoretical mechanism. A model such as age → body weight →
blood pressure seems reasonable; processes that occur with advancing age, such as slowing metabolic rate, can lead to weight gain, and weight gain increases the demands on the
cardiovascular system, which can cause an increase in blood pressure. However, it would
be nonsense to propose a model of the following form: blood pressure → body weight →
age, for example; there is no reasonable mechanism through which blood pressure could
influence body weight, and weight cannot influence age in years.
16.5.1 ♦ Type and Measurement of Variables in Mediation Analysis
Usually all three variables (X1, X2, and Y) in a mediation analysis are quantitative. A
dichotomous variable can be used as a predictor in regression (Chapter 12), and therefore
it is acceptable to include an X1 variable that is dichotomous (e.g., treatment vs. control)
as the initial causal variable in a mediation analysis; OLS regression methods can still be
used in this situation. However, both X2 and Y are dependent variables in mediation
analysis; if one or both of these variables are categorical, then logistic regression is
needed to estimate regression coefficients, and this complicates the interpretation of outcomes (see MacKinnon, 2008, chap. 11).
It is helpful if scores on the variables can be measured in meaningful units because
this makes it easier to evaluate whether the strength of influence indicated by path coefficients is large enough to be clinically or practically significant. For example, suppose
that we want to predict annual salary in dollars (Y) from years of education (X1). An
unstandardized regression coefficient is easy to interpret. A student who is told that each
additional year of education predicts a $50 increase in annual salary will understand that
the effect is too weak to be of any practical value, while a student who is told that each
additional year of education predicts a $5,000 increase in annual salary will understand
that this is enough money to be worth the effort. Often, however, measures are given in
arbitrary units (e.g., happiness rated on a scale from 1 = not happy at all to 7 = extremely
happy). In this kind of situation, it may be difficult to judge the practical significance of a
half-point increase in happiness.
As in other applications of regression, measurements of variables are assumed to be
reliable and valid. If they are not, regression results can be misleading.
16.5.2 ♦ Temporal Precedence or Sequence of Variables in Mediation Studies
Hypothesized causes must occur earlier in time than hypothesized outcomes (temporal precedence, as discussed in Chapter 1). It seems reasonable to hypothesize that “being
abused as a child” might predict “becoming an abuser as an adult”; it would not make
sense to suggest that being an abusive adult causes a person to have experiences of abuse
in childhood. Sometimes measurements of the three variables X1, X2, and Y are all
obtained at the same time (e.g., in a one-time survey). If X1 is a retrospective report of
experiencing abuse as a child, and Y is a report of current abusive behaviors, then the
requirement for temporal precedence (X1 happened before Y) may be satisfied. In some
studies, measures are obtained at more than one point in time; in these situations, it
would be preferable to measure X1 first, then X2, and then Y; this may help to establish
temporal precedence. When all three variables are measured at the same point in time and
there is no logical reason to believe one of them occurs earlier in time than the others, it
may not be possible to establish temporal precedence.
16.5.3 ♦ Time Lags Between Variables
When measures are obtained at different points in time, it is important to consider the
time lag between measures. If this time lag is too brief, the effects of X1 may not be apparent yet when Y is measured (e.g., if X1 is initiation of treatment with either placebo or
Prozac, a drug that typically does not have full antidepressant effects until about 6 weeks,
and Y is a measure of depression and is measured one day after X1, then the full effect of
the drug will not be apparent). Conversely, if the time lag is too long, the effects of X1 may
have worn off by the time Y is measured. Suppose that X1 is receiving positive feedback
from a relationship partner and Y is relationship satisfaction, and Y is measured 2 months
after X1. The effects of the positive feedback (X1) may have dissipated over this period of
time. The optimal time lag will vary depending on the variables involved; some X1 interventions or measured variables may have immediate but not long-lasting effects, while
others may require a substantial time before effects are apparent.
16.6 ♦ Assumptions in Mediation Analysis and Preliminary Data Screening
Unless the types of variables involved require different estimation methods (e.g., if a
dependent variable is categorical, logistic regression methods are required), the coefficients (a, b, and c′) associated with the paths in Figure 16.1 can be estimated using OLS
regression. All of the assumptions required for regression (see Chapters 9 and 11) are also
required for mediation analysis. Because preliminary data screening has been presented
in greater detail earlier, data screening procedures are reviewed here only briefly. For each
variable, histograms or other graphic methods can be used to assess whether scores on
all quantitative variables are reasonably normally distributed, without extreme outliers. If
the X1 variable is dichotomous, both groups should have a reasonably large number of
cases. Scatter plots can be used to evaluate whether relationships between each pair of
variables appear to be linear (X1 with Y, X1 with X2, and X2 with Y) and to identify bivariate outliers. Decisions about handling any identified outliers should be made at an early
stage in the analysis, as discussed in Chapter 4.
Baron and Kenny (1986) suggested that a mediation model should not be tested unless
there is a significant relationship between X1 and Y. In more recent treatments of mediation, it has been pointed out that in situations where one of the path coefficients is negative, there can be significant mediated effects even when X1 and Y are not significantly
correlated (A. F. Hayes, 2009). This can be understood as a form of suppression; see
Section 10.12.5.3 for further discussion with examples. If none of the pairs of variables in
the model are significantly related to each other in bivariate analyses, however, there is not
much point in testing mediated models.
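A small worked example may help show how suppression can hide a mediated effect. The path values below are hypothetical and chosen only for illustration; the identity c = (a × b) + c′ is the OLS decomposition given in the notes to Figure 16.1.

```latex
% Hypothetical path coefficients, chosen only for illustration
\begin{align*}
a &= 0.5, \qquad b = 0.6, \qquad c' = -0.3 \\
ab &= (0.5)(0.6) = 0.30 \\
c &= ab + c' = 0.30 + (-0.30) = 0
\end{align*}
```

Here the indirect effect (ab = 0.30) is clearly nonzero, but the negative direct path cancels it, so the total effect c, and with it the bivariate association between X1 and Y, is zero.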
16.7 ♦ Path Coefficient Estimation
The most common way to obtain estimates of the path coefficients that appear in Figure 16.1
is to run the following series of regression analyses. These steps are similar to those recommended by Baron and Kenny (1986), except that, as suggested in recent treatments of
mediation (MacKinnon, 2008), a statistically significant outcome on the first step is not
considered a requirement before going on to subsequent steps.
Step 1. First, a regression is run to predict Y (blood pressure or SBP) from X1 (age). (SPSS
procedures for this type of regression were provided in Chapter 9.) The raw or unstandardized regression coefficient from this regression corresponds to path c. This step is sometimes omitted; however, it provides information that can help evaluate how much
controlling for the X2 mediating variable reduces the strength of association between X1
and Y. Figure 16.2 shows the regression coefficients part of the output. The unstandardized
regression coefficient for the prediction of Y (BloodPressure—note that there is no blank
within the SPSS variable name) from X1 (age) is c = 2.862; this is statistically significant,
t(28) = 6.631, p < .001. (The N for this dataset is 30; therefore, the df for this t ratio is N – 2 = 28.) Thus, the overall effect of age on blood pressure is statistically significant.
Step 2. Next, a regression is performed to predict the mediating variable (X2, weight) from the causal variable (X1, age). The results of this regression provide the path coefficient for the path denoted a in Figure 16.1 and also the standard error of a (sa) and the t test for the statistical significance of the a path coefficient (ta). The coefficient table for this regression appears in Figure 16.3. For the hypothetical data, the unstandardized a path coefficient was 1.432, with t(28) = 3.605, p = .001.

Figure 16.2 ♦ Regression Coefficient to Predict Blood Pressure (Y) From Age (X1)
Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B          Std. Error         Beta                        t        Sig.
(Constant)    10.398     26.222                                         .397     .695
Age           2.862      .432               .782                        6.631    .000
a. Dependent Variable: BloodPressure
NOTE: The raw score slope in this equation, 2.862, corresponds to coefficient c in the path diagram in Figure 16.1.

Figure 16.3 ♦ Regression Coefficient to Predict Weight (Mediating Variable X2) From Age (X1)
Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B          Std. Error         Beta                        t        Sig.
(Constant)    78.508     24.130                                         3.254    .003
Age           1.432      .397               .563                        3.605    .001
a. Dependent Variable: Weight
NOTE: The raw score slope from this equation, 1.432, corresponds to the path labeled a in Figure 16.1.

Step 3. Finally, a regression is performed to predict the outcome variable Y (blood pressure) from both X1 (age) and X2 (weight). (Detailed examples of regression with two predictor variables appeared in Chapter 11.) This regression provides estimates of the unstandardized coefficients for path b (and sb and tb) and also path c′ (the direct or remaining effect of X1 on Y when the mediating variable has been included in the analysis). See Figure 16.1 for the corresponding path diagram.
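The three regressions in Steps 1 through 3 can also be sketched outside SPSS. The following is a minimal illustration in plain Python, not the chapter's procedure: the data are simulated stand-ins (the chapter's dataset is not reproduced here), the generating values are invented, and the helper names `cov`, `simple_slope`, and `two_predictor_slopes` are hypothetical. It returns only the raw-score slopes; standard errors and t tests would still come from a regression program.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length lists (the variance when u is v)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)


def simple_slope(x, y):
    """Raw-score slope from regressing y on the single predictor x."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """Raw-score slopes for x1 and x2 when y is regressed on both predictors."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated stand-in data (NOT the chapter's dataset); all generating values are invented.
rng = random.Random(0)
age = [rng.uniform(25, 75) for _ in range(30)]
weight = [80 + 1.4 * x + rng.gauss(0, 25) for x in age]
bp = [-30 + 2.2 * x + 0.5 * m + rng.gauss(0, 35) for x, m in zip(age, weight)]

c = simple_slope(age, bp)                           # Step 1: Y on X1 (path c)
a = simple_slope(age, weight)                       # Step 2: X2 on X1 (path a)
c_prime, b = two_predictor_slopes(age, weight, bp)  # Step 3: Y on X1 and X2 (paths c', b)

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
```

Because all three regressions here use the same 30 simulated cases, the estimates can be compared directly with one another, which matters for the partition of effects discussed in Section 16.8.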
From Figure 16.4, path b = .49, t(27) = 2.623, p = .014; path c′ = 2.161, t(27) = 4.551, p < .001. These unstandardized path coefficients are used to label the paths in a diagram of the causal model (top panel of Figure 16.5). These values are also used later to test the null hypothesis H0: ab = 0. In many research reports, particularly when the units in which the variables are measured are not meaningful or not easy to interpret, researchers report the standardized path coefficients (these are called beta coefficients on the SPSS output); the lower panel of Figure 16.5 shows the standardized path coefficients. Sometimes the estimate of the c coefficient appears in parentheses, next to or below the c′ coefficient, in these diagrams.
In addition to examining the path coefficients from these regressions, the data analyst should pay some attention to how well the X1 and X2 variables predict Y. From Figure 16.4, R² = .69, adjusted R² is .667, and this is statistically significant, F(2, 27) = 30.039, p < .001. These two variables do a good job of predicting variance in blood pressure.
16.8 ♦ Conceptual Issues: Assessment of Direct Versus Indirect Paths
When a path that leads from a predictor variable X to a dependent variable Y involves other variables and multiple arrows, the overall strength of the path is estimated by multiplying the coefficients for each leg of the path (as discussed in the introduction to the tracing rule in Section 11.10).
16.8.1 ♦ The Mediated or Indirect Path: ab
The strength of the indirect or mediated effect of age on blood pressure through weight is estimated by multiplying the a and b path coefficients. In many applications, one or more of the variables are measured in arbitrary units (e.g., happiness may be rated on a scale from 1 to 7).

Figure 16.4 ♦ Regression Coefficient to Predict Blood Pressure (Y) From Age (X1) and Mediating Variable Weight (X2)
Model Summary
Model    R         R Square    Adjusted R Square    Std. Error of the Estimate
1        .831(a)   .690        .667                 36.692
a. Predictors: (Constant), Weight, Age
ANOVA(b)
Model 1       Sum of Squares    df    Mean Square    F         Sig.
Regression    80882.132         2     40441.066      30.039    .000(a)
Residual      36349.735         27    1346.286
Total         117231.867        29
a. Predictors: (Constant), Weight, Age
b. Dependent Variable: BloodPressure
Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B          Std. Error         Beta                        t         Sig.
(Constant)    -28.046    27.985                                         -1.002    .325
Age           2.161      .475               .590                        4.551     .000
Weight        .490       .187               .340                        2.623     .014
a. Dependent Variable: BloodPressure
NOTE: The raw score slope for age in this equation, 2.161, corresponds to the path labeled c′ in Figure 16.1; the raw score slope for weight in this equation, .490, corresponds to the path labeled b.

Figure 16.5 ♦ Path Coefficients for the Age/Weight/Systolic Blood Pressure (SBP) Mediation Analysis
Unstandardized path coefficients: a (Age to Weight) = 1.432**; b (Weight to SBP) = .490*; c′ (Age to SBP) = 2.161*; (c = 2.862***)
Standardized path coefficients: a (Age to Weight) = .563**; b (Weight to SBP) = .340*; c′ (Age to SBP) = .590***; (c = .782)
*p < .05, **p < .01, ***p < .001, all two-tailed.

In such situations, the unstandardized regression coefficients may not be very informative, and research reports often focus on standardized coefficients.² The standardized (β) coefficients for the paths in the age/weight/blood pressure hypothetical data appear in the bottom panel of Figure 16.5. Throughout the remainder of this section, all path coefficients are given in standardized (β coefficient) form. Recall from Chapter 10 that, when the path from X to Y has multiple parts or arrows, the overall strength of the association for the entire path is estimated by multiplying the coefficients for each part of the path.
Thus, the unit-free index of strength of the mediated effect (the effect of age on blood pressure, through the mediating variable weight) is given by the product of the standardized estimates of the path coefficients, ab. For the standardized coefficients, this product = (.563 × .340) = .191. The strength of the direct or nonmediated path from age to blood pressure (SBP) corresponds to c′; the standardized coefficient for this path is .590. In other words, for a one-standard-deviation increase in zAge, we predict a .191 increase in zSBP through the mediating variable zWeight. In addition, we predict a .590 increase in zSBP due to direct effects of zAge (effects that are not mediated by zWeight); this corresponds to the c′ path. The total effect of zAge on zSBP corresponds to path c, and the standardized coefficient for path c is .782 (the beta coefficient to predict zSBP from zAge in Figure 16.5).
16.8.2 ♦ Mediated and Direct Path as Partition of Total Effect
The mediation analysis has partitioned the total effect of age on blood pressure (c = .782) into a direct effect (c′ = .590) and a mediated effect (ab = .191). (Both of these are given in terms of standardized/unit-free path coefficients.) It appears that mediation through weight, while statistically significant, explains only a small part of the total effect of age on blood pressure in this hypothetical example. Within rounding error, c = c′ + ab, that is, the total effect is the sum of the direct and mediated effects. These terms are additive when OLS regression is used to obtain estimates of coefficients; when other estimation methods such as maximum likelihood are used (as in structural equation modeling programs), these equalities may not hold. Also note that if there are missing data, each regression must be performed on the same set of cases in order for this additive association to work.
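The partition c = c′ + ab can be checked by hand from the coefficients reported in Figures 16.2 through 16.5. The short Python sketch below does that arithmetic for both the standardized and the unstandardized values. The last line, the ratio ab/c, is an added illustration of how small the mediated share is; it is not a statistic reported in the text, and the variable names are hypothetical.

```python
# Standardized coefficients from the lower panel of Figure 16.5.
a_std, b_std, c_prime_std, c_std = 0.563, 0.340, 0.590, 0.782
ab_std = a_std * b_std              # mediated (indirect) effect
total_std = c_prime_std + ab_std    # should match c within rounding error

# Unstandardized coefficients from Figures 16.2, 16.3, and 16.4.
a, b, c_prime, c = 1.432, 0.490, 2.161, 2.862
ab = a * b
total = c_prime + ab

print(f"standardized:   ab = {ab_std:.3f}, c' + ab = {total_std:.3f}, c = {c_std:.3f}")
print(f"unstandardized: ab = {ab:.3f}, c' + ab = {total:.3f}, c = {c:.3f}")

# Added illustration (not reported in the text): share of the total effect that is mediated.
share_mediated = ab_std / c_std
print(f"mediated share of total effect = {share_mediated:.2f}")
```

The small gaps between c′ + ab and c in both versions reflect rounding of the published coefficients to three decimals.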
Note that even if the researcher prefers to label and discuss paths using standardized regression coefficients, information about the unstandardized coefficients is required to carry out additional statistical significance tests (to find out whether the product ab differs significantly from zero, for example).
16.8.3 ♦ Magnitude of Mediated Effect
When variables are measured in meaningful units, it is helpful to think through the magnitude of the effects in real units, as discussed in this paragraph. (The discussion in this paragraph is helpful primarily in research situations in which units of measurement have some real-world practical interpretation.) All of the path coefficients in the rest of this paragraph are unstandardized regression coefficients. From the first regression analysis, the c coefficient for the total effect of age on blood pressure was c = 2.862. In simple language, for each 1-year increase in age, we predict an increase in blood pressure of 2.862 mm Hg. Based on the t test result in Figure 16.2, this is statistically significant. Taking into account that people in wealthy countries often live to age 70 or older, this implies substantial age-related increases in blood pressure; for example, for a 10-year increase in age, we predict an increase of 28.62 mm Hg in blood pressure, and that is sufficiently large to be clinically important. This tells us that the total effect of age on systolic blood pressure is reasonably large in terms of clinical or practical importance. From the second regression, we find that the effect of age on weight is a = 1.432; this is also statistically significant, based on the t test in Figure 16.3. For a 1-year increase in age, we predict almost 1.5 pounds in weight gain. Again, over a period of 10 years, this implies a sufficiently large increase in predicted body weight (about 14.32 pounds) to be of clinical importance.
The last regression (in Figure 16.4) provides information about two paths, b and c′. The b coefficient that represents the effect of weight on blood pressure was b = .49; this was statistically significant. For each 1-pound increase in body weight, we predict almost a half-point increase in blood pressure. If we take into account that people may gain 30 or 40 pounds over the course of a lifetime, this would imply weight-related increases in blood pressure on the order of 15 or 20 mm Hg. This also seems large enough to be of clinical interest. The indirect effect of age on blood pressure is found by multiplying a × b, in this case, 1.432 × .49 = .701. For each 1-year increase in age, a .7 mm Hg increase in blood pressure is predicted through the effects of age on weight. Finally, the direct effect of age on blood pressure when the mediating variable weight is statistically controlled/taken into account is represented by c′ = 2.161. Over and above any weight-related increases in blood pressure, we predict about a 2.2-unit increase in blood pressure for each additional year of age. Of the total effect of age on blood pressure (a predicted 2.862 mm Hg increase in SBP for each 1-year increase in age), a relatively small part is mediated by weight (.701), and the remainder is not mediated by weight (2.161). (Because these are hypothetical data, this outcome does not accurately describe the importance of weight as a mediator in real-life situations.) The mediation analysis partitions the total effect of age on blood pressure (c = 2.862) into a direct effect (c′ = 2.161) and a mediated effect (ab = .701). Within rounding error, c = c′ + ab, that is, the total effect c is the sum of the direct (c′) and mediated (ab) effects.
16.9 ♦ Evaluating Statistical Significance
Several methods to test statistical significance of mediated models have been proposed.
The four most widely used procedures are briefly discussed: Baron and Kenny’s (1986) causal-steps approach, joint significance tests for the a and b path coefficients, the Sobel test (Sobel, 1982) for H0: ab = 0, and the use of bootstrapping to obtain confidence intervals for the ab product that represents the mediated or indirect effect.
16.9.1 ♦ Causal-Steps Approach
Fritz and MacKinnon (2007) reviewed and evaluated numerous methods for testing whether mediation is statistically significant. A subset of these methods is described here. Their review of mediation studies conducted between 2000 and 2003 revealed that the most frequently reported method was the causal-steps approach described by Baron and Kenny (1986). In Baron and Kenny’s initial description of this approach, in order to conclude that mediation may be present, several conditions were required: first, a significant total relationship between X1, the initial cause, and Y, the final outcome variable (i.e., a significant path c); significant a and b paths; and a significant ab product using the Sobel (1982) test or a similar method, as described in Section 16.9.3. The decision whether to call the outcome partial or complete mediation then depends on whether the c′ path that represents the direct path from X1 to Y is statistically significant; if c′ is not statistically significant, the result may be interpreted as complete mediation; if c′ is statistically significant, then only partial mediation may be occurring. Kenny has also noted elsewhere (http://www.davidakenny.net/cm/mediate.htm) that other factors, such as the sizes of coefficients and whether they are large enough to be of practical significance, should also be considered and that, as with any other regression analysis, meaningful results can only be obtained from a correctly specified model.
This approach is widely recognized, but it is not the most highly recommended procedure at present for two reasons.
First, there are (relatively rare) cases in which mediation may occur even when the original X1, Y association is not significant. For example, if one of the paths in the mediation model is negative, a form of suppression may occur such that positive direct and negative indirect effects tend to cancel each other out to yield a small and nonsignificant total effect. (If a is negative, while b and c′ are positive, then when we combine a negative ab product with a positive c′ coefficient to reconstitute the total effect c, the total effect c can be quite small even if the separate positive direct path and negative indirect paths are quite large.) MacKinnon, Fairchild, and Fritz (2007) refer to this as “inconsistent mediation”; the mediator acts as a suppressor variable. See Section 10.12.5.3 for further discussion and an example of inconsistent mediation. Second, among the methods compared by Fritz and MacKinnon (2007), this approach had relatively low statistical power.
16.9.2 ♦ Joint Significance Test
Fritz and MacKinnon (2007) also discussed a joint significance test approach to testing the significance of mediation. The data analyst simply asks whether the a and b coefficients that constitute the mediated path are both statistically significant; the t tests from the regression results are used. (On his mediation Web page at http://www.davidakenny.net/cm/mediate.htm, Kenny suggested that if this approach is used, and if an overall risk of Type I error of .05 is desired, each test should use α = .025, two-tailed, as the criterion for significance.) This approach is easy to implement and has moderately good statistical power compared with the other test procedures reviewed by Fritz and MacKinnon (2007). However, it is not the most frequently reported method; journal reviewers may prefer better known procedures.
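The joint significance decision rule is simple enough to write out. The sketch below applies it to the two-tailed p values reported for the a and b paths in Figures 16.3 and 16.4 (.001 and .014), first with the conventional criterion of .05 and then with the stricter per-test criterion of .025 described above. The function name `joint_significance` is hypothetical.

```python
def joint_significance(p_a, p_b, alpha=0.05):
    """Return True only when both the a path and the b path are significant at alpha."""
    return p_a < alpha and p_b < alpha


# Two-tailed p values for the a and b paths, from Figures 16.3 and 16.4.
p_a, p_b = 0.001, 0.014

passes_at_05 = joint_significance(p_a, p_b, alpha=0.05)
passes_at_025 = joint_significance(p_a, p_b, alpha=0.025)

print("Both paths significant at .05: ", passes_at_05)
print("Both paths significant at .025:", passes_at_025)
```

For the hypothetical data, both paths meet either criterion, so the joint significance test would support mediation through weight.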
16.9.3 ♦ Sobel Test of H0: ab = 0
Another method to assess the significance of mediation is to examine the product of the a, b coefficients for the mediated path. (This is done as part of the Baron & Kenny causal-steps approach.) The null hypothesis, in this case, is H0: ab = 0. To set up a z test statistic, an estimate of the standard error of this ab product (SEab) is needed. Sobel (1982) provided the following approximate estimate for SEab.
$$SE_{ab} \approx \sqrt{b^{2} s_{a}^{2} + a^{2} s_{b}^{2}}, \tag{16.1}$$
where a and b are the raw (unstandardized) regression coefficients that represent the effect of X1 on X2 and the effect of X2 on Y, respectively; sa is the standard error of the a regression coefficient; sb is the standard error of the b regression coefficient.
Using the standard error from Equation 16.1 as the divisor, the following z ratio for the Sobel (1982) test can be set up to test the null hypothesis H0: ab = 0:
$$z = \frac{ab}{SE_{ab}}. \tag{16.2}$$
The ab product is judged to be statistically significant if z is greater than +1.96 or less than –1.96. This test is appropriate only for large sample sizes. The Sobel (1982) test is relatively conservative, and among the procedures reviewed by Fritz and MacKinnon (2007), it had moderately good statistical power. It is sometimes used in the context of the Baron and Kenny (1986) causal-steps procedure and sometimes reported without the other causal steps. The Sobel test can be done by hand; Preacher and Hayes (2008) provide an online calculator at http://people.ku.edu/~preacher/sobel/sobel.htm to compute this z test given either the unstandardized regression coefficients and their standard errors or the t ratios for the a and b path coefficients. Their program also provides z tests based on alternate methods of estimating the standard error of ab suggested by the Aroian test (Aroian, 1947) and Goodman test (Goodman, 1960).
The Sobel (1982) test was carried out for the hypothetical data on age, weight, and blood pressure.
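Equations 16.1 and 16.2 are easy to evaluate directly. The following Python sketch does so for the hypothetical age/weight/blood pressure coefficients from Figures 16.3 and 16.4 (a = 1.432, sa = .397, b = .490, sb = .187). The text names the Aroian and Goodman variants without giving their formulas; the versions used here (adding or subtracting the term sa²sb² under the square root) are the commonly cited forms and should be read as an assumption rather than as the calculator's documented method. The function names are hypothetical.

```python
import math


def two_tailed_p(z):
    """Two-tailed p value for a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def product_tests(a, s_a, b, s_b):
    """Return {test name: (z, SE_ab, p)} for the Sobel, Aroian, and Goodman tests."""
    ab = a * b
    base = b ** 2 * s_a ** 2 + a ** 2 * s_b ** 2   # term under the root in Equation 16.1
    extra = s_a ** 2 * s_b ** 2                    # assumed Aroian/Goodman adjustment
    # Note: base - extra can be negative for very small coefficients; not handled here.
    standard_errors = {
        "Sobel": math.sqrt(base),
        "Aroian": math.sqrt(base + extra),
        "Goodman": math.sqrt(base - extra),
    }
    return {name: (ab / se, se, two_tailed_p(ab / se))
            for name, se in standard_errors.items()}


# Coefficients and standard errors from Figures 16.3 and 16.4.
results = product_tests(a=1.432, s_a=0.397, b=0.490, s_b=0.187)
for name, (z, se, p) in results.items():
    print(f"{name:8s} z = {z:.3f}  SE = {se:.3f}  p = {p:.3f}")
```

The values should land close to those from the online calculator; small differences in the third decimal can arise from rounding of the inputs.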
(Note again that the N in this demonstration dataset is too small for the Sobel test to yield accurate results; these data are used only to illustrate the use of the techniques.) For these hypothetical data, a = 1.432, b = .490, sa = .397, and sb = .187. These values were entered into the appropriate lines of the calculator provided at the Preacher Web page; the results appear in Figure 16.6. Because z = 2.119, with p = .034, two-tailed, the ab product that represents the effect of age on blood pressure mediated by weight can be judged statistically significant.
Note that the z tests for the significance of ab assume that values of this ab product are normally distributed across samples from the same population; it has been demonstrated empirically that this assumption is incorrect for many values of a and b. Because of this, authorities on mediation analysis (MacKinnon, Preacher, and their colleagues) now recommend bootstrapping methods to obtain confidence intervals for estimates of ab.

Figure 16.6 ♦ Sobel Test Results for H0: ab = 0, Using Calculator Provided by Preacher and Leonardelli at http://www.people.ku.edu/~preacher/sobel/sobel.htm
Input (coefficients and standard errors): a = 1.432, b = .490, sa = .397, sb = .187
                 Test statistic    Std. Error    p-value
Sobel test       2.119             0.330         0.034
Aroian test      2.068             0.339         0.038
Goodman test     2.175             0.322         0.029
Input (t ratios): ta = 3.605, tb = 2.623
                 Test statistic    p-value
Sobel test       2.120             0.033
Aroian test      2.069             0.038
Goodman test     2.176             0.029
NOTE: This test is only recommended for use with large N samples. The dataset used for this example has N = 30; this was used only as a demonstration.

16.9.4 ♦ Bootstrapped Confidence Interval for ab
Bootstrapping has become widely used in situations where the analytic formula for the standard error of a statistic is not known and/or there are violations of assumptions of normal distribution shape (Iacobucci, 2008).
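The resampling idea behind a bootstrapped confidence interval for ab can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Indirect script: the data are simulated with invented effect sizes, the interval is a simple percentile interval (the bias-corrected and accelerated refinement is not implemented), and the names `indirect_effect` and `percentile_ci` are hypothetical.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """ab: slope of m on x, times the slope of y on m controlling for x."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def percentile_ci(x, m, y, reps=1000, level=0.95, seed=1):
    """Percentile bootstrap CI for ab, resampling whole cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * reps)]
    upper = estimates[int((1 + level) / 2 * reps) - 1]
    return lower, upper


# Simulated data with invented path values (a = 0.6, b = 0.5, c' = 0.3); not the chapter's data.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(300)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.3 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

ab_hat = indirect_effect(x, m, y)
lo, hi = percentile_ci(x, m, y)
print(f"ab = {ab_hat:.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
```

Each bootstrap sample redraws complete cases, so the a and b estimates within a replicate always come from the same set of rows. If the resulting interval excludes zero, the indirect effect would be judged statistically significant.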
A sample of the same size is drawn with replacement from the observed data, and values of a, b, and ab are calculated for this sample. This process is repeated many times (bootstrapping procedures typically allow users to request from 1,000 up to 5,000 different samples). The value of ab is tabulated across these samples; this provides an empirical sampling distribution that can be used to derive a value for the standard error of ab. Results of such bootstrapping indicate that the distribution of ab values is often asymmetrical, and this asymmetry should be taken into account when setting up confidence interval (CI) estimates of ab. This CI provides a basis for evaluation of the single estimate of ab obtained from analysis of the entire data set. Bootstrapped CIs do not require that the ab statistic have a normal distribution across samples. If this CI does not include zero, the analyst concludes that there is statistically significant mediation. Some bootstrapping programs include additional refinements, such as bias correction (see Fritz & MacKinnon, 2007). Most structural equation modeling (SEM) programs, such as Amos, can provide bootstrapped CIs (a detailed example is presented in Section 16.13).
For data analysts who do not have access to an SEM program, Preacher and Hayes (2008) provide online scripts and macros for SPSS and SAS that provide bootstrapped CIs for tests of mediation (go to Hayes’s Web page at http://www.afhayes.com/spss-sas-and-mplus-macros-and-code.html and look for the link to download the SPSS script, on the line that says “Script: Indirect.sbs”; download the indirect.sbs file to your computer). An SPSS script is a syntax file that generates a dialog window for the procedure that makes it easy for the user to enter variable names and select options. To run the script, open your SPSS data file; from the top-level menu on the Data View page, select the menu options → ; from the pull-down menu, select Script as the type of file to open.
See Figure 16.7 for an SPSS screen shot. Then locate the file indirect.sbs downloaded from the Hayes website and open it. This will appear as shown in Figure 16.8. Do not modify the script in any way. To run the script, on the menu bar across the top of the indirect.sbs script window, click on the right arrow button (that resembles the play button on an audio or video player). This opens a dialog window for the Indirect procedure, as shown in Figure 16.9. For the hypothetical data in this chapter, the dependent variable blood pressure is moved into the window for dependent variable Y. The proposed mediator is weight. The independent variable (X) is age. Note that this procedure allows entry of multiple mediators; this will be discussed in a later section of the chapter; it also allows one or more covariates to be included in the analysis. Under the heading Bootstrap Samples, the number of samples can be selected from a menu (with values that range from 1,000 to 5,000). The confidence level for the CI for ab is set at 95% as a default value, and this can be changed by the user. In addition, there are different choices of estimation procedures for the CI; the default is “Bias corrected and accelerated.” (Accelerated refers to a correction for possible skewness in the sampling distribution of ab.)

Figure 16.7 ♦ SPSS Menu Selections to Open the SPSS Indirect Script File
NOTE: → , then select Script from the pull-down menu.
Figure 16.8 ♦ SPSS Script Indirect.sbs in Syntax Editor Window (Preacher & Hayes, 2008)
Figure 16.9 ♦ SPSS Dialog Window for Indirect Script

When these selections have been made, click OK; the output appears in Figure 16.10. Many of the results duplicate those from the earlier regression results; for example, the estimates of the unstandardized path coefficients for paths a, b, c, and c′ are the same as those obtained using regression methods.
From this printout, we can confirm that the (unstandardized) path coefficients are a = 1.432, b = .4897, c′ = 2.161, and c = 2.8622 (these agree with the regression values reported earlier, except for some rounding error). The value of ab = .7013. A normal theory test (i.e., a test that assumes that a z statistic similar to the Sobel test is valid) in the output from the Indirect procedure provides z = 2.1842; this is close to the Sobel test value reported in Figure 16.6.

Figure 16.10 ♦ Output From SPSS Indirect Script: One Mediating Variable
Run MATRIX procedure:
Dependent, Independent, and Proposed Mediator Variables:
DV =   BloodPre
IV =   Age
MEDS = Weight
Sample size
    30
IV to Mediators (a paths)
          Coeff     se        t         p
Weight    1.4321    .3972     3.6054    .0012
Direct Effects of Mediators on DV (b paths)
          Coeff     se        t         p
Weight    .4897     .1867     2.6228    .0142
Total Effect of IV on DV (c path)
          Coeff     se        t         p
Age       2.8622    .4317     6.6308    .0000
Direct Effect of IV on DV (c-prime path)
          Coeff     se        t         p
Age       2.1610    .4749     4.5507    .0001
Model Summary for DV Model
          R-sq      Adj R-sq  F         df1       df2       p
          .6899     .6670     30.0390   2.0000    27.0000   .0000
******************************************************************
NORMAL THEORY TESTS FOR INDIRECT EFFECTS
Indirect Effects of IV on DV through Proposed Mediators (ab paths)
          Effect    se        Z         p
TOTAL     .7013     .3211     2.1842    .0289
Weight    .7013     .3211     2.1842    .0289
*****************************************************************
BOOTSTRAP RESULTS FOR INDIRECT EFFECTS
Indirect Effects of IV on DV through Proposed Mediators (ab paths)
          Data      boot      Bias      SE
TOTAL     .7013     .7788     .0775     .5315
Weight    .7013     .7788     .0775     .5315
Bias Corrected and Accelerated Confidence Intervals
          Lower     Upper
TOTAL     .0769     2.0792
Weight    .0769     2.0792
*****************************************************************
Level of Confidence for Confidence Intervals:
    95
Number of Bootstrap Resamples:
    5000
------ END MATRIX -----

Based on bootstrapping, the Indirect procedure also provides a 95% CI for the value of the indirect effect ab (again, this is in terms of unstandardized coefficients). The lower limit of this CI is .0769; the upper limit is 2.0792. Because this CI does not include zero, the null hypothesis that ab = 0 can be rejected.
16.10 ♦ Effect-Size Information
Effect-size information is usually given in unit-free form (Pearson’s r and r² can both be interpreted as effect sizes). The raw or unstandardized path coefficients from mediation analysis can be converted to standardized slopes; alternatively, we can examine the correlation between X1 and X2 to obtain effect-size information for the a path, as well as the partial correlation between X2 and Y (controlling for X1) to obtain effect-size information for the b path.
There are potential problems with comparisons among standardized regression or path coefficients. For example, if the same mediation analysis involving the same set of three variables is conducted in two different samples (e.g., a sample of women and a sample of men), these samples may have different standard deviations on variables such as the predictor X1 and the outcome variable Y. Suppose that the male and female samples yield b and c′ coefficients that are very similar, suggesting that the amount of change in Y as a function of X1 is about the same across the two groups. When we convert raw score slopes to standardized slopes, this may involve multiplying and dividing by different standard deviations for men and women, and different standard deviations within these groups could make it appear that the groups have different relationships between variables (different standardized slopes but similar unstandardized slopes).
Unfortunately, both raw score (b) and standardized (β) regression coefficients can be influenced by numerous sources of artifact that may operate differently in different groups.
Chapter 7 reviewed numerous factors that can artifactually influence the size of r (such as outliers, curvilinearity, different distribution shapes for X and Y, unreliability of measurement of X and Y, etc.). Chapter 11 demonstrated that β coefficients can be computed from bivariate correlations and that b coefficients are rescaled versions of β. When Y is the outcome and X is the predictor, b = β × (SDY/SDX). Both b and β coefficients can be influenced by many of the same problems as correlations. Therefore, if we try to compare regression coefficients across groups or samples, differences in regression coefficients across samples may be partly due to artifacts discussed in Chapter 7. Considerable caution is required whether we want to compare standardized or unstandardized coefficients.
Despite concerns about potential problems with standardized regression slopes (as discussed by Greenland et al., 1991), data analysts often include standardized path coefficients in reports of mediation analysis, particularly when some or all of the variables are not measured in meaningful units. In reporting results, authors should make it clear whether standardized or unstandardized path coefficients are reported. Given the difficulties just discussed, it is a good idea to include both types of path coefficients.
16.11 ♦ Sample Size and Statistical Power
Assuming that the hypothesis of primary interest is H0: ab = 0, how large does sample size need to be to have an adequate level of statistical power? Answers to questions about sample size depend on several pieces of information: the alpha level, desired level of power, the type of test procedure, and the population effect sizes for the strength of the association between X1 and X2, as well as X2 and Y. Often, information from past studies can help researchers make educated guesses about effect sizes for correlations between variables. In the discussion that follows, α = .05 and desired power of .80 are assumed.
We can use the correlation between X1 and X2 as an estimate of the effect-size index for a and the partial correlation between X2 and Y, controlling for X1, as an estimate of the effect size for b. Based on recommendations about verbal labels for effect size given by Cohen (1988), Fritz and MacKinnon (2007) designated a correlation of .14 as small, a correlation of .39 as medium, and a correlation of .59 as large. They reported statistical power for combinations of small (S), medium (M), and large (L) effect sizes for the a and b paths. For example, if a researcher plans to use the Sobel (1982) test and expects that both the a and b paths correspond to medium effects, the minimum recommended sample size from Table 16.1 would be 90.
A few cautions are in order: Sample sizes from this table may not be adequate to guarantee significance, even if the researcher has not been overly optimistic about anticipated effect size. Even when the power table suggests that fewer than 100 cases might be adequate for statistical power for the test of H0: ab = 0, analysts should keep in mind that small samples lead to more sampling error in estimates of path coefficients. For most studies that test mediation models, minimum sample sizes of 150 to 200 would be advisable if possible.

Table 16.1 ♦ Empirical Estimates of Sample Size Needed for Power of .80 When Using α = .05 as the Criterion for Statistical Significance in Three Different Types of Mediation Analysis
ab Effect Size(a)    Joint Significance(b)    Sobel(c)    Bootstrapped Confidence Interval(d)
SS                   530                      667         558
SM                   403                      422         406
SL                   403                      412         398
MS                   405                      421         404
MM                   74                       90          78
ML                   58                       66          59
LS                   405                      410         401
LM                   59                       67          59
LL                   36                       42          36
SOURCE: Adapted from Fritz and MacKinnon (2007, Table 3, p. 237).
NOTE: These power estimates may be inaccurate when measures of variables are unreliable, assumptions of normality are violated, or categorical variables are used rather than quantitative variables.
a. SS indicates both a and b are small effects; SM indicates a is small and b is medium; SL indicates a is small and b is large.
b. Joint significance test: Requirement that the a and b coefficients each are statistically significant.
c. A z test for H0: ab = 0 using a method to estimate SEab proposed by Sobel (1982).
d. Without bias correction.

Fritz and MacKinnon (2007) have also made SAS and R programs available so that data analysts can input other values for population effect sizes and desired statistical power; see http://www.public.asu.edu/~davidpm/ripl/mediate.htm (scroll down to the line that says “Programs for Estimating Empirical Power”).
16.12 ♦ Additional Examples of Mediation Models
Several variations of the basic mediation model in Figure 16.1 are possible. For example, the effect of X1 on Y could be mediated by multiple variables instead of just one (see Figure 16.11). Mediation could involve a multiple-step causal sequence. Mediation and moderation can both occur together. The following sections provide a brief introduction to each of these research situations; for more extensive discussion, see MacKinnon (2008).
16.12.1 ♦ Tests of Multiple Mediating Variables
In many situations, the effect of a causal variable X1 on an outcome Y might be mediated by more than one variable. Consider the effects of personality traits (such as extraversion and neuroticism) on happiness. Extraversion is moderately positively correlated with happiness. Tkach and Lyubomirsky (2006) suggested that the effects of trait extraversion on happiness may be at least partially mediated by behaviors such as social activity.
For example, people who score high on extraversion tend to engage in more social activities, and people who engage in more social activities tend to be happier. They demonstrated that, in their sample, the effects of extraversion on happiness were partially mediated by engaging in social activity, but there was still a significant direct effect of extraversion on happiness. Their mediation analyses examined only one behavior at a time as a potential mediator. However, they also noted that there are many behaviors (other than social activity) that may influence happiness.
Figure 16.11 ♦ Path Model for Multiple Mediating Variables Showing Standardized Path Coefficients
[Path diagram: extraversion → happiness, .44*** (.60***). Coefficients printed beside each mediator: positive/proactive behaviors, .45** and .24**; spiritual behaviors, .13** and .08; health behaviors, .16** and .33***.]
SOURCE: Adapted from Warner and Vroman (2011).
NOTE: Coefficient estimates and statistical significance tests were obtained using the Indirect.sps script (output not shown). The effect of extraversion on happiness was partially mediated by behaviors. Positive/proactive behaviors (a1 × b1) and health behaviors (a3 × b3) were significant mediators; spiritual behaviors did not significantly mediate effects of extraversion on happiness.
What happens if we consider multiple behaviors as possible mediators? The SPSS script Indirect.sps (discussed in Section 16.9.4) can be used to conduct simultaneous tests for more than one mediating variable. Figure 16.11 shows standardized path coefficients obtained using the Indirect.sps script to test a multiple-mediation model (Warner & Vroman, 2011) that included three different behaviors as mediators between extraversion and happiness. (Output similar to Figure 16.10 was obtained but is not included here.) Results indicated that the effects of extraversion on happiness were only partially mediated by behavior.
Positive/prosocial behaviors and health behaviors were both significant mediators of the effect of extraversion on happiness. Spiritual behaviors did not significantly mediate the effects of extraversion on happiness (the path from spiritual behaviors to happiness was not statistically significant).
16.12.2 ♦ Multiple-Step Mediated Paths
It is possible to examine a mediation sequence that involves more than one intermediate step, as in the sequence X1 → X2 → X3 → Y. If only partial mediation occurs, additional paths would need to be included in this type of model; for further discussion, see Taylor, MacKinnon, and Tein (2008).
16.12.3 ♦ Mediated Moderation and Moderated Mediation
It is possible for moderation (as described in Chapter 15) to co-occur with mediation in two different ways. Mediated moderation occurs when two initial causal variables (let’s call these variables A and B) have an interaction (A × B), and the effects of this interaction involve a mediating variable. In this situation, A, B, and the A × B interaction are included as initial causal variables, and the mediation analysis is conducted to assess the degree to which a potential mediating variable explains the impact of the A × B interaction on the outcome variable.
Moderated mediation occurs when you have two different groups (e.g., males and females), and the strength or signs of the paths in a mediation model for the same set of variables differ across these two groups. Many structural equation modeling programs, such as Amos, make it possible to compare path models across groups and to test hypotheses about whether one, or several, path coefficients differ between groups (e.g., males vs. females). Further discussion can be found in Edwards and Lambert (2007); Muller, Judd, and Yzerbyt (2005); and Preacher, Rucker, and Hayes (2007). Comparison of models across groups using the Amos structural equation modeling program is demonstrated by Byrne (2009).
16.13 ♦ Use of Structural Equation Modeling Programs to Test Mediation Models
SEM programs such as LISREL, EQS, MPLUS, and Amos make it possible to test models that include multiple-step paths (e.g., mediation hypotheses) and to compare results across groups (to test moderation hypotheses). In addition, SEM programs make it possible to include multiple indicator variables for some or all of the constructs; in theory, this makes it possible to assess multiple indicator measurement reliability. Most SEM programs now also provide bootstrapping; most analysts now view SEM programs as the preferred method for assessment of mediated models. More extensive discussion of other types of analyses that can be performed using structural equation modeling is beyond the scope of this book; for further information, see Byrne (2009) or Kline (2010).
16.13.1 ♦ Comparison of Regression and SEM Tests of Mediation
As described in earlier sections of this chapter, simple mediated models can be tested by using OLS linear regression in SPSS and then conducting the Sobel test to assess whether the indirect path(s) are significant. In the following example, Amos Graphics will be used to analyze the same empirical example. Amos is an add-on structural equation modeling program for IBM SPSS that is licensed separately from IBM SPSS. Use of SEM programs provides two advantages compared to the regression methods described earlier in this chapter. First, they make it possible to test more complex path models involving a larger number of variables. Second, most SEM programs provide bootstrapped confidence intervals and associated statistical significance tests for ab indirect paths; bootstrapped confidence intervals are now regarded as the best method for statistical significance testing for indirect effects, particularly when assumptions of normality may be violated.
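For readers who want to see what a bootstrapped confidence interval for the ab indirect effect involves, the following is a minimal sketch in Python and is not the procedure Amos or the Indirect.sps script runs. It uses a plain percentile interval rather than the bias-corrected interval requested later in this chapter. The data are simulated, and the function names, sample size, and generating coefficients are all hypothetical choices made for illustration.

```python
import random
import statistics


def slope(x, y):
    """OLS slope for the simple regression of y on x (the a path)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def partial_slopes(x1, x2, y):
    """OLS slopes (b1, b2) for the regression of y on x1 and x2."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(u * u for u in d1)
    s22 = sum(v * v for v in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * w for u, w in zip(d1, dy))
    s2y = sum(v * w for v, w in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def indirect_effect(x, m, y):
    """Product of a (x -> m) and b (m -> y, controlling for x)."""
    a = slope(x, m)
    _, b = partial_slopes(x, m, y)
    return a * b


def percentile_ci(x, m, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI for ab: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(n_boot * (1 - level) / 2)]
    upper = estimates[int(n_boot * (1 + level) / 2) - 1]
    return lower, upper


# Simulated (hypothetical) scores: age -> weight -> blood pressure.
rng = random.Random(0)
age = [rng.uniform(20, 80) for _ in range(100)]
weight = [100 + 1.4 * x + rng.gauss(0, 25) for x in age]
sbp = [10 + 2.2 * x + 0.5 * w + rng.gauss(0, 20) for x, w in zip(age, weight)]

ab_hat = indirect_effect(age, weight, sbp)
ci_low, ci_high = percentile_ci(age, weight, sbp)
print(f"ab = {ab_hat:.3f}, 95% percentile CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

If the interval excludes zero, the indirect effect would be judged statistically significant by this criterion. The sketch makes no normality assumption about the sampling distribution of ab, which is the property the text attributes to bootstrapping.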
In this section, Amos is used only to perform one specific type of analysis, that is, to obtain CIs and significance tests for the ab indirect effect for a simple three-variable mediated model.
16.13.2 ♦ Steps in Running Amos
Running analyses in Amos Graphics involves the following steps, each of which is discussed in more detail in subsequent sections.
1. Open the Amos Graphics program and use the drawing tools to draw a path diagram that represents the hypothesized mediated causal model.
2. Name the variables in this diagram (the variable names must correspond exactly to the names of the variables in the SPSS data file).
3. Open the SPSS data file.
4. Edit the Analysis Properties to specify how the analysis will be performed and what output you want to see.
5. Run the analysis and check to make sure that the analysis ran successfully; if it did not, you may need to correct variable names and/or make changes in the path model.
6. View and interpret the output.
16.13.3 ♦ Opening the Amos Graphics Program
From the Windows operating system, begin with the Menu (usually this is in the lower left corner of the screen). When you click the button, make the following selections from the popup menus, as shown in Figure 16.12: → → . The initial view of the Amos worksheet appears in Figure 16.13 (if there is already a path diagram in the right-hand panel, click the → menu selections from the top menu bar to start with a blank worksheet).
Figure 16.12 ♦ Initial Menu Selection From Start Menu to Start Amos 19 Amos Graphics Program
NOTE: From the Menu, Select → → .
Figure 16.13 ♦ Initial Screen View in Amos Graphics
The Amos Graphics worksheet has several parts. Across the top, as in most Windows applications, there is a menu bar. Down the left-hand side are icons that represent numerous tools for drawing and modifying models and doing other operations (shown in greater detail in Figure 16.14).
Just to the right of the tools is a set of small windows with headings: Group Number 1, Default Model, and so forth. (In this example, only one group is used; Amos can estimate and compare model parameters for multiple groups; for example, it can compare mediated models for males and females.) These windows are used later to select which part of the output you want to see. To the right, the largest window is a blank drawing sheet that provides space for you to draw a path model that represents your hypotheses about causal connections.
Figure 16.14 ♦ Amos Drawing Tools
16.13.4 ♦ Amos Tools
In this brief introduction, only a few of the drawing tools in Figure 16.14 are used (Byrne, 2009, provides more extensive examples of tool use). Beginning in the upper left-hand corner: The rectangle tool creates a rectangle that corresponds to an observed (measured) variable. (An example at the end of Chapter 20 also includes latent variables; these are represented by ovals.) The single-headed arrow tool is used to draw a causal path (a detailed discussion of types of paths was presented in Chapter 10). (The double-headed arrow tool, not used in this example, is used to indicate that predictor variables are correlated.) The error term tool is used to create an error term for each dependent variable in the path model (error terms must be explicitly included in SEM path models). Three additional tools that are not used in this example are useful to know: The moving truck tool can be used to move objects in the path model, the delete tool is used to delete objects from the graph, and the clipboard is used to copy an Amos path model into the Windows clipboard so that it can be pasted into other applications (such as Word or PowerPoint).
16.13.5 ♦ First Steps Toward Drawing and Labeling an Amos Path Model
The path model for this example is the same as the one that appeared earlier in the bottom panel of Figure 16.1.
The goal of the analysis is to assess the degree to which the effects of age on blood pressure may be mediated by weight. All of the steps are shown below; you can see a similar analysis (using different variable names) as an animated tutorial at this URL: http://amosdevelopment.com/video/indirect/flash/indirect.html (you need the Adobe Flash player to view this animation).
To draw the path model, start with the observed variables. Left click on the rectangle tool, move the cursor over to the blank worksheet on the right, then right click; a popup menu appears; left click on the menu option to “draw observed variable” (see top panel of Figure 16.15). The popup menu will then disappear. Left click (and continue to hold the button on the mouse down) on the blank worksheet in the location where you want the variable to appear and drag the mouse; a rectangle will appear. Drag the mouse until the location and dimensions of the rectangle look the way you want and then release the mouse button. Your worksheet should now contain a rectangle similar to the one that appears in the bottom panel of Figure 16.15.
Figure 16.15 ♦ Drawing a Rectangle That Corresponds to an Observed/Measured Variable
To give this variable a name, point the cursor at the rectangle and right click. From the popup menu that appears (as shown in Figure 16.16), click on Object Properties. This opens the Object Properties dialog window; this appears in Figure 16.16 near the bottom of the worksheet. In the space to the right of “Variable name,” type in the name of the first variable (age). Font size and style can be modified. (Variable labels are not used in this example. The name typed in the Variable name window must correspond exactly to the name of the variable in the SPSS data file. If you want to have a different label appear in the Amos diagram, enter this in the box for the Variable label.)
16.13.6 ♦ Adding Variables and Paths to the Amos Path Diagram
For this analysis, the path model needs to include the following additional elements. Rectangles must be added for the other observed variables (weight, blood pressure). Note that conventionally, causal sequences are diagrammed from left to right (or from top to bottom). Age is the initial cause, and so it is placed on the left. Blood pressure is the final outcome, so it is placed on the right. The hypothesized mediator, weight, is placed between and above the other two variables, as shown in Figure 16.17. To add paths to the model, left click on the unidirectional arrow tool, left click on the initial causal variable (rectangle) in the path model and continue to hold the mouse button down, and drag the mouse until the cursor points at the outcome or dependent variable, then release the mouse button. An arrow will appear in the diagram. For this model, you need three unidirectional arrows: from age to weight, from weight to blood pressure, and from age to blood pressure, as shown in Figure 16.17.
Figure 16.16 ♦ The Object Properties Popup Menu
Figure 16.17 ♦ Final Path Model for the Mediation Analysis
[Path diagram: rectangles for Age, Weight, and BloodPressure; error terms e1 (attached to Weight) and e2 (attached to BloodPressure), each shown with a mean of 0 and a path coefficient of 1.]
16.13.7 ♦ Adding Error Terms for Dependent Variables
Each dependent variable (a variable is dependent if it has a unidirectional arrow pointing toward it) must have an explicit error term. To create the error terms shown in Figure 16.17, left click on the error term tool, move the mouse to position the cursor over a dependent variable such as weight, and left click again. An error term (a circle with an arrow that points toward the observed variable) will appear in the path model. Note that this arrow has a coefficient of 1 associated with it; this predetermined value for this path is required so that Amos can scale the error term to be consistent with the variance of the observed variable.
In Figure 16.17, this 1 was edited to display as a larger font than initially appeared in Amos Graphics; to do this, right click near the arrow that represents this path (positioning is tricky for this) and click on the Object Properties window; within the Object Properties window, the font size for this path coefficient can be changed. Each error term also is preassigned a mean of 0; for this reason, a small 0 appears near each circle that represents an error term. You must give each error term a name, and the names for error terms must not correspond to the names of any SPSS observed variables. It is conventional to give error terms brief names such as e1 and e2, as shown in Figure 16.17.
16.13.8 ♦ Correcting Mistakes and Printing the Path Model
During this process, if you make a mistake or want to redraw some element of the model, you can use the delete tool to remove any variable or path from the model. (Amos has other tools that can be used to make the elements of these path model diagrams look nicer, such as the moving truck; see Byrne, 2009, for details.) If you want to paste a copy of this diagram into a Word document or other application, left click on the clipboard icon in the tool bar and then use the Paste command within Word. When you have completed all these drawing steps, your path diagram should look similar to the final path model that appears in Figure 16.17.
16.13.9 ♦ Opening a Data File From Amos
The next step is to open the SPSS data file that contains scores for the observed variables in this model (age, weight, blood pressure). From the top-level menu, make the following selections: → , as shown in Figure 16.18. This opens the dialog window for Data Files, as shown in Figure 16.19. Click on the File Name button to open a browsing window (not shown here); in this window, you can navigate to the folder that contains your SPSS data file.
(The first time you open this window, the default directory is Amos examples; you will need to navigate to one of your own data directories to locate your data file.) When you have located the SPSS data file (for this example, it is the file named ageweightbp.sav), highlight it, then click the Open button and then the OK button. This will return you to the screen view in Figure 16.17.
Figure 16.18 ♦ Amos Menu Selections to Open the SPSS Data File
Figure 16.19 ♦ Amos Dialog Window: Data Files
16.13.10 ♦ Specification of Analysis Method and Request for Output
The next step is to tell Amos how to do the analysis and what output you want to see. To do this, go to the top-level menu (as shown in Figure 16.17) and make these menu selections (see Figure 16.20): → . This opens the Analysis Properties dialog window, as shown in Figure 16.21. Note the series of tabs across the top of this window from left to right; only a few of these are used in this example. Click on the “Estimation” tab to specify the estimation method, select the radio button for “Maximum likelihood,” and check the box for “Estimate means and intercepts.” The radio button for “Fit the saturated and independence models” (see Note 3) is also selected in this example. (Amos is not very forgiving about missing data. Some options are not available, and other options must be selected, if the SPSS data file contains any missing values. These limitations can be avoided by either removing cases with missing data from the SPSS data file or using imputation methods to replace missing values; refer to Chapter 4 for further discussion of missing values in SPSS Statistics.)
Next, still in the “Analysis Properties” dialog window, click on the “Output” tab; in the checklist that appears (see Figure 16.22), check the boxes for “Minimization history,” “Standardized estimates,” “Squared multiple correlations,” and “Indirect, direct and total effects.”
Figure 16.20 ♦ Amos Pull-Down Menu: Analysis Properties
Figure 16.21 ♦ Estimation Tab in Analysis Properties Dialog Window
Figure 16.22 ♦ Output Tab in Analysis Properties Dialog Window
Continuing in the “Analysis Properties” dialog window, click on the “Bootstrap” tab. Click the checkbox for “Perform bootstrap,” and in the window for “Number of bootstrap samples,” type in a reasonably large number (usually between 1,000 and 5,000; in this example, 2,000 bootstrap samples were requested). Also check the box for “Bias-corrected confidence intervals.” To finish work in the “Analysis Properties” window, click the X in the upper right-hand corner of this window to close it. This returns you to the screen view that appears in Figure 16.17.
16.13.11 ♦ Running the Amos Analysis and Examining Preliminary Results
The next step is to run the requested analysis. From the top-level menu (as it appears in Figure 16.17), make the following menu selections: → . After you do this, new information appears in the center column of the worksheet that reports preliminary information about results, as shown in Figure 16.24. Numbers were added to this screen shot to highlight the things you will want to look at. Number 1 points to a pair of icons on the screen. This pair of icons provides a way to toggle between two views of the model in Amos. If you click on the left-hand icon, this puts you in model specification mode; in this view, you can draw or modify the path model. When you click on the right-hand icon, results of the most recent analysis are displayed as path coefficients superimposed on the path model, as shown in Figure 16.24.
Figure 16.23 ♦ Bootstrap Tab in Analysis Properties Window
Figure 16.24 ♦ Project View After Analysis Has Run Successfully
NOTE: Initial view shows unstandardized (b) path coefficients.
The first thing you need to know after you have run an analysis is whether the analysis ran successfully. Amos can fail to run an analysis for many reasons: for example, the path model was not drawn correctly, or missing data in the SPSS file require different specifications for the analysis. See number 2 in Figure 16.24; you may need to scroll up and down in this window. If the analysis failed to run, there will be an error message in this window. If the analysis ran successfully, then numerical results (such as the chi-square for model fit; see Note 4) will appear in this window.
16.13.12 ♦ Unstandardized Path Coefficients on Path Diagram
Path coefficients now appear on the path model diagram, as indicated by number 3 in Figure 16.24 (the initial view shows unstandardized path coefficients). The user can toggle back and forth between viewing unstandardized versus standardized coefficient estimates by highlighting the corresponding terms in the window indicated by number 4. In Figure 16.24, the unstandardized path coefficients (these correspond to b coefficients in regression) appear. When the user highlights the option for standardized coefficients indicated by number 4, the path model is displayed with standardized coefficients (these correspond to β coefficients in regression), as shown in Figure 16.25. Object properties were modified to make the display fonts larger for these coefficients.
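As a reminder of how the two sets of coefficients relate, the usual regression identity is sketched below. This is a standard relation and not something taken from the Amos output; the symbols s_{X_j} and s_Y (sample standard deviations of predictor X_j and of the outcome Y) are notation introduced here for illustration.

```latex
\beta_j = b_j \, \frac{s_{X_j}}{s_Y}
```

Under this relation, a standardized coefficient expresses the predicted change in Y, in standard deviation units, for a one standard deviation increase in X_j, holding any other predictors in the equation constant.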
Because this model does not include latent variables, the values of path coefficients reported by Amos are the same as those reported earlier from linear regression analyses (see Figure 16.5; see also Note 5). The values adjacent to the rectangles that represent weight (.32) and blood pressure (.69), in Figure 16.25, are the squared multiple correlations or R2 values for the prediction of these dependent variables. (Sometimes the locations of the numbers on Amos path model diagrams do not make it clear what parameter estimates they represent; ambiguity about this can be resolved by looking at the text output, as described next.)
16.13.13 ♦ Examining Text Output From Amos
To view the text output, from the top-level menu, make the following menu selections: → , as shown in Figure 16.26. This opens up the Text Output window. The left-hand panel of this window provides a list of the output that is available (this is similar to the list of output that appears on the left-hand side of the SPSS Statistics output window). Only selected output will be examined and interpreted here. Use the cursor to highlight the Estimates portion of the output, as shown on the left-hand side of Figure 16.28. The complete output for Amos estimates (of path coefficients, multiple R2 values, indirect effects, and other results) appears in Figure 16.29. In Figure 16.29, the unstandardized path coefficients are reported where it is marked with the letter a; these coefficients correspond to the b/unstandardized regression coefficients reported earlier (in Figure 16.4, for example). The column headed C. R. (this stands for “critical ratio,” and this is similar but not identical to a simple t ratio) reports the ratio of each path coefficient estimate to its standard error; the computation of standard error is different in SEM than in linear regression.
The p value (shown as capital P in Amos output) appears as *** by default when it is zero to more than three decimal places. Double clicking on any element in the output (such as a specific p value) opens up a text box that provides an explanation of each term. Although the C. R. values are not identical to the t values obtained when the same analysis was performed using linear regression earlier (results in Figure 16.4), the b coefficient estimates from Amos and the judgments about their statistical significance are the same as for the linear regression results. The output table labeled “b” contains the corresponding standardized path coefficients. Moving to the bottom of Figure 16.28, unstandardized and standardized estimates of the strength of the indirect effect (denoted ab in earlier sections of this chapter) are reported where the letter c appears.
Figure 16.25 ♦ Standardized Path Coefficients and Squared Multiple Correlations
Figure 16.26 ♦ Amos Menu: View Text Output
Figure 16.27 ♦ List of Available Text Output (Left-Hand Side)
Figure 16.28 ♦ Screen View: Amos Estimates
Figure 16.29 ♦ Complete Amos Estimates
16.13.14 ♦ Locating and Interpreting Output for Bootstrapped CI for the ab Indirect Effect
To obtain information about statistical significance, we must examine the output for bootstrapped CIs. To see this information, double click the left mouse button on the Estimates in the list of available output (upper left-hand panel of Figure 16.30). This opens up a list that includes Scalars and Matrices. Double click the left mouse button on Matrices to examine the options within this Matrices list. Within this list, select Indirect effects (move the cursor to highlight this list entry and then left click on it). Now you will see the Estimates/Bootstrap menu in the window in the lower left-hand side of Figure 16.30.
Left click on “Bootstrap confidence.” (To see an animation that shows this series of menu selections, view the video at this URL: http://amosdevelopment.com/video/indirect/flash/indirect.html.) The right-hand panel in Figure 16.30 shows the 95% CI results for the estimate of the unstandardized ab indirect effect (in this example, the effect of age on blood pressure, mediated by weight). The lower and upper limits of this 95% CI are .122 and 2.338. The result of a statistical significance test for H0: ab = 0, using an error term derived from bootstrapping, is p = .011. While there are differences in some numerical values, the Amos analysis of the mediation model presented earlier (Figure 16.1) was generally similar to the results obtained using linear regression.
Figure 16.30 ♦ Output for Bootstrapped Confidence Interval for ab Indirect of Mediated Effect of Age on Blood Pressure Through Weight
NOTE: For an animated demonstration of the series of selections in the text output list that are required to view this result, view the video at this URL: http://amosdevelopment.com/video/indirect/flash/indirect.html.
16.13.15 ♦ Why Use Amos/SEM Rather Than OLS Regression?
There are two reasons why it is worthwhile to learn how to use Amos (or other SEM programs) to test mediated models. First, it is now generally agreed that bootstrapping is the preferred method to test the statistical significance of indirect effects in mediated models; bootstrapping may be more robust to violations of assumptions of normality. Second, once a student has learned to use Amos (or other SEM programs) to test simple mediation models similar to the example in this chapter, the program can be used to add additional predictor and/or mediator variables, as shown in Figure 16.11.
SEM programs have other uses that are briefly discussed at the end of Chapter 20; for example, SEM programs can be used to do confirmatory factor analysis, and SEM models can include latent variables with multiple indicators.
16.14 ♦ Results Section
For the hypothetical data in this chapter, a Results section could read as follows. Results presented here are based on the output from linear regression (Figures 16.2–16.4) and the Sobel test result in Figure 16.6. (Results would include slightly different numerical values if the Amos output is used.)
Results
A mediation analysis was performed using the Baron and Kenny (1986) causal-steps approach; in addition, a bootstrapped confidence interval for the ab indirect effect was obtained using procedures described by Preacher and Hayes (2008). The initial causal variable was age, in years; the outcome variable was systolic blood pressure (SBP), in mm Hg; and the proposed mediating variable was body weight, measured in pounds. [Note to reader: The sample N, mean, standard deviation, minimum and maximum scores for each variable, and correlations among all three variables would generally appear in earlier sections.] Refer to Figure 16.1 for the path diagram that corresponds to this mediation hypothesis. Preliminary data screening suggested that there were no serious violations of assumptions of normality or linearity. All coefficients reported here are unstandardized, unless otherwise noted; α = .05 two-tailed is the criterion for statistical significance.
The total effect of age on SBP was significant, c = 2.862, t(28) = 6.631, p < .001; each 1-year increase in age predicted approximately a 3-point increase in SBP in mm Hg. Age was significantly predictive of the hypothesized mediating variable, weight; a = 1.432, t(28) = 3.605, p = .001. When controlling for age, weight was significantly predictive of SBP, b = .490, t(27) = 2.623, p = .014.
The estimated direct effect of age on SBP, controlling for weight, was c′ = 2.161, t(27) = 4.551, p < .001. SBP was predicted quite well from age and weight, with adjusted R2 = .667 and F(2, 27) = 30.039, p < .001. The indirect effect, ab, was .701. This was judged to be statistically significant using the Sobel (1982) test, z = 2.119, p = .034. [Note to reader: The Sobel test should be used only with much larger sample sizes than the N of 30 for this hypothetical dataset.] Using the SPSS script for the Indirect procedure (Preacher & Hayes, 2008), bootstrapping was performed; 5,000 samples were requested; a bias-corrected and accelerated confidence interval (CI) was created for ab. For this 95% CI, the lower limit was .0769 and the upper limit was 2.0792.
Several criteria can be used to judge the significance of the indirect path. In this case, both the a and b coefficients were statistically significant, the Sobel test for the ab product was significant, and the bootstrapped CI for ab did not include zero. By all these criteria, the indirect effect of age on SBP through weight was statistically significant. The direct path from age to SBP (c′) was also statistically significant; therefore, the effects of age on SBP were only partly mediated by weight. The upper diagram in Figure 16.5 shows the unstandardized path coefficients for this mediation analysis; the lower diagram shows the corresponding standardized path coefficients. Comparison of the coefficients for the direct versus indirect paths (c′ = 2.161 vs. ab = .701) suggests that a relatively small part of the effect of age on SBP is mediated by weight. There may be other mediating variables through which age might influence SBP, such as other age-related disease processes.
16.15 ♦ Summary
This chapter demonstrates how to assess whether a proposed mediating variable (X2) may partly or completely mediate the effect of an initial causal variable (X1) on an outcome variable (Y).
The analysis partitions the total effect of X1 on Y into a direct effect, as well as an indirect effect through the X2 mediating variable. The path model represents causal hypotheses, but readers should remember that the analysis cannot prove causality if the data are collected in the context of a nonexperimental design. If controlling for X2 completely accounts for the correlation between X1 and Y, this could happen for reasons that have nothing to do with mediated causality; for example, this can occur when X1 and X2 are highly correlated with each other because they measure the same construct. A mediation analysis should be undertaken only when there are good reasons to believe that X1 causes X2 and that X2 in turn causes Y. In addition, it is highly desirable to collect data in a manner that ensures temporal precedence (i.e., X1 occurs first, X2 occurs second, and Y occurs third).
These analyses can be done using OLS regression; however, use of SPSS scripts provided by Preacher and Hayes (2008) provides bootstrapped estimates of confidence intervals, and most analysts now believe this provides better information than statistical significance tests that assume normality. SEM programs provide even more flexibility for assessment of more complex models.
If a mediation analysis suggests that partial or complete mediation may be present, additional research is needed to establish whether this is replicable and real. If it is possible to manipulate or block the effect of the proposed mediating variable experimentally, experimental work can provide stronger evidence of causality (MacKinnon, 2008).
Notes
1. It is also possible to hypothesize bidirectional causality, such that X causes Y and that Y in return also influences X; this hypothesis of reciprocal causation would be denoted with two unidirectional arrows (X → Y and X ← Y), not with a double-headed arrow. Information about additional predictors of X and Y is needed to obtain separate estimates of the strengths of these two causal paths; see Felson and Bohrnstedt (1979) and Smith (1982).
2. For discussion of potential problems with comparisons among standardized regression coefficients, see Greenland et al. (1991). Despite the problems they and others have identified, research reports still commonly report standardized regression or path coefficients, particularly in situations where variables have arbitrary units of measurement.
3. Because each variable has a direct path to every other variable in this example, the chi-square for model fit is 0 (this means that the path coefficients can perfectly reconstruct the variances and covariances among the observed variables). In more advanced applications of SEM, some possible paths are omitted, and then the model usually cannot exactly reproduce the observed variances and covariances. In those situations, it becomes important to examine several different indexes of model fit to evaluate the consistency between model and observed data. For further discussion, see Kline (2010).
4. For reasons given in Note 3, in this example, chi-square equals 0.
5. SEM programs such as Amos typically use some form of maximum likelihood estimation, while linear regression uses ordinary least squares estimation methods (see Glossary for definitions of these terms). For this reason, the estimates of path coefficients and other model parameters may differ, particularly for more complicated models.
Comprehension Questions
1. Suppose that a researcher first measures a Y outcome variable, then measures an X1 predictor and an X2 hypothesized mediating variable. Why would this not be a good way to collect data to test the hypothesis that the effects of X1 on Y may be mediated by X2?
2.
Suppose a researcher wants to test a mediation model that says that the effects of math ability (X1) on science achievement (Y) are mediated by sex (X2). Is this a reasonable mediation hypothesis? Why or why not? 3. A researcher believes that the prediction of Y (job achievement) from X1 (need for power) is different for males versus females (X2). Would a mediation analysis be appropriate? If not, what other analysis would be more appropriate in this situation? 4. Refer to Figure 16.1. If a, b, and ab are all statistically significant (and large enough to be of practical or clinical importance), and c′ is not statistically significant and/or not large enough to be judged practically or clinically important, would you say that the effects of X1 on Y are partially or completely mediated by X2? 5. What pattern of outcomes would you expect to see for coefficient estimates in Figure 16.1—for example, which coefficients would need to be statistically significant and large enough to be of practical importance, for the interpretation that X2 only partly mediates the effects of X1 on Y? Which coefficients (if any) should be not statistically significant if the effect of X1 on Y is only partly mediated by X2? 6. In Figure 16.1, suppose that you initially find that path c (the total effect of X1 on Y) is not statistically significant and too small to be of any practical or clinical importance. Does it follow that there cannot possibly be any indirect effects of X1 on Y that are statistically significant? Why or why not? Comprehension Questions 7. Using Figure 16.1 again, consider this equation: c = (a × b) + c′. Which coefficients represent direct, indirect, and total effects of X1 on Y in this equation? 8. A researcher believes that the a path in a mediated model (see Figure 16.1) corresponds to a medium unit-free effect size and the b path in a mediated model also corresponds to a medium unit-free effect size. 
If assumptions are met (e.g., scores on all variables are quantitative and normally distributed), and the researcher wants to have power of about .80, what sample size would be needed for the Sobel test (according to Table 16.1)? Mediation——687 9. Give an example of a three-variable study for which a mediation analysis would make sense. Be sure to make it clear which variable is the proposed initial predictor, mediator, and outcome. 10. Briefly comment on the difference between the use of a bootstrapped CI (for the unstandardized estimate of ab) versus the use of the Sobel test. What programs can be used to obtain the estimates for each case? Which approach is less dependent on assumptions of normality? Comprehension Questions
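For readers who want to experiment with the quantities named in this chapter, the following is a minimal Python sketch of the OLS path estimates (a, b, c, c′), the Sobel z statistic, and a bootstrapped confidence interval for ab. It is not the Preacher and Hayes (2008) SPSS script. The simulated data, the function names (`ols`, `mediation_paths`, `bootstrap_ab`), the sample size, the number of bootstrap samples, and the choice of a simple percentile interval are all assumptions made for illustration only.

```python
import math
import numpy as np


def ols(y, *predictors):
    """OLS fit; returns coefficients [intercept, slopes...] and their standard errors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


def mediation_paths(x1, x2, y):
    """Estimate the paths of the three-variable model X1 -> X2 -> Y."""
    beta_c, _ = ols(y, x1)          # Y on X1 alone: slope is c
    beta_a, se_a = ols(x2, x1)      # X2 on X1: slope is a
    beta_b, se_b = ols(y, x1, x2)   # Y on X1 and X2: slopes are c' and b
    return {
        "a": beta_a[1], "se_a": se_a[1],
        "b": beta_b[2], "se_b": se_b[2],
        "c": beta_c[1], "c_prime": beta_b[1],
    }


def bootstrap_ab(x1, x2, y, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for ab (resampling cases with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        p = mediation_paths(x1[idx], x2[idx], y[idx])
        estimates[i] = p["a"] * p["b"]
    low, high = np.percentile(estimates, [2.5, 97.5])
    return low, high


# Hypothetical simulated data (assumed population values, not from the text).
rng = np.random.default_rng(42)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.2 * x1 + 0.4 * x2 + rng.normal(size=n)

paths = mediation_paths(x1, x2, y)
ab = paths["a"] * paths["b"]

# Sobel test: z = ab / sqrt(b^2 * se_a^2 + a^2 * se_b^2); relies on normality of ab.
sobel_se = math.sqrt(paths["b"] ** 2 * paths["se_a"] ** 2
                     + paths["a"] ** 2 * paths["se_b"] ** 2)
sobel_z = ab / sobel_se
sobel_p = math.erfc(abs(sobel_z) / math.sqrt(2))  # two-tailed p value

ci_low, ci_high = bootstrap_ab(x1, x2, y)

print(f"a = {paths['a']:.3f}, b = {paths['b']:.3f}, ab = {ab:.3f}")
print(f"c = {paths['c']:.3f}, c' = {paths['c_prime']:.3f}, ab + c' = {ab + paths['c_prime']:.3f}")
print(f"Sobel z = {sobel_z:.2f}, p = {sobel_p:.4f}")
print(f"Bootstrap 95% CI for ab: [{ci_low:.3f}, {ci_high:.3f}]")
```

With OLS estimates computed on the same cases, c and ab + c′ agree up to floating-point error, so the printed values can be used to check the partition. The bootstrap interval is built from the resampled distribution of ab rather than from a normal approximation, whereas the Sobel z treats ab as normally distributed.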