Skip to contents

sim_ame() is a wrapper for sim_apply() that computes average marginal effects, the average effect of changing a single variable from one value to another (i.e., from one category to another for categorical variables or a tiny change for continuous variables).

Usage

sim_ame(
  sim,
  var,
  subset = NULL,
  by = NULL,
  contrast = NULL,
  outcome = NULL,
  type = NULL,
  eps = 1e-05,
  verbose = TRUE,
  cl = NULL
)

# S3 method for clarify_ame
print(x, digits = NULL, max.ests = 6, ...)

Arguments

sim

a clarify_sim object; the output of a call to sim() or misim().

var

either the name of a variable for which marginal effects are to be computed or a named list of length one containing the values the variable should take. If a list is supplied or the named variables is categorical (factor, character, or having two values), categorical calculations will be triggered. Otherwise, continuous calculations will be triggered. See Details.

subset

optional; a vector used to subset the data used to compute the marginal effects. This will be evaluated within the original dataset used to fit the model using subset(), so nonstandard evaluation is allowed.

by

a one-sided formula or character vector containing the names of variables for which to stratify the estimates. Each quantity will be computed within each level of the complete cross of the variables specified in by.

contrast

a string containing the name of a contrast between the average marginal means when the variable named in var is categorical and takes on two values. Allowed options include "diff" for the difference in means (also "rd"), "rr" for the risk ratio (also "irr"), "log(rr): for the log risk ratio (also "log(irr)"), "sr" for the survival ratio, "log(sr): for the log survival ratio, "srr" for the switch relative risk (also "grrr"), "or" for the odds ratio, "log(or)" for the log odds ratio, and "nnt" for the number needed to treat. These options are not case sensitive, but the parentheses must be included if present.

outcome

a string containing the name of the outcome or outcome level for multivariate (multiple outcomes) or multi-category outcomes. Ignored for univariate (single outcome) and binary outcomes.

type

a string containing the type of predicted values (e.g., the link or the response). Passed to marginaleffects::get_predict() and eventually to predict() in most cases. The default and allowable option depend on the type of model supplied, but almost always corresponds to the response scale (e.g., predicted probabilities for binomial models).

eps

when the variable named in var is continuous, the value by which to change the variable values to approximate the derivative. See Details.

verbose

logical; whether to display a text progress bar indicating progress and estimated time remaining for the procedure. Default is TRUE.

cl

a cluster object created by parallel::makeCluster(), or an integer to indicate the number of child-processes (integer values are ignored on Windows) for parallel evaluations. See pbapply::pblapply() for details. If NULL, no parallelization will take place.

x

a clarify_ame object.

digits

the minimum number of significant digits to be used; passed to print.data.frame().

max.ests

the maximum number of estimates to display.

...

optional arguments passed to FUN.

Value

A clarify_ame object, which inherits from clarify_est and is similar to the output of sim_apply(), with the additional attributes "var" containing the variable named in var and "by" containing the names of the variables specified in by (if any). The average adjusted predictions will be named E[Y({v})], where {v} is replaced with the values the focal variable (var) takes on. The average marginal effect for a continuous var will be named E[dY/d({x})] where {x} is replaced with var. When by is specified, the average adjusted predictions will be named E[Y({v})|{b}] and the average marginel effect E[dY/d({x})|{b}] where {b} is a comma-separated list of of values of the by variables at which the quantity is computed. See examples.

Details

sim_ame() operates differently depending on whether continuous or categorical calculations are triggered. To trigger categorical calculations, var should be a string naming a factor, character, or binary variable or a named list with specific values given (e.g., var = list(x1 = c(1, 2 ,3))). Otherwise, continuous calculations are triggered.

Categorical calculations involve computing average marginal means at each level of var. The average marginal mean is the average predicted outcome value after setting all units' value of var to one level. (This quantity has several names, including the average potential outcome, average adjusted prediction, and standardized mean). When var only takes on two levels (or it is supplied as a list and only two values are specified), a contrast between the average marginal means can be computed by supplying an argument to contrast. Contrasts can be manually computed using transform() afterward as well.

Continuous calculations involve computing the average of marginal effects of var across units. A marginal effect is the instantaneous rate of change corresponding to changing a unit's observed value of var by a tiny amount and considering to what degree the predicted outcome changes. The ratio of the change in the predicted outcome to the change in the value of var is the marginal effect; these are averaged across the sample to arrive at an average marginal effect. The "tiny amount" used is eps times the standard deviation of the focal variable.

If unit-level weights are included in the model fit (and discoverable using insight::get_weights()), all means will be computed as weighted means.

Effect measures

The effect measures specified in contrast are defined below. Typically only "diff" is appropriate for continuous outcomes and "diff" or "irr" are appropriate for count outcomes; the rest are appropriate for binary outcomes. For a focal variable with two levels, 0 and 1, and an outcome Y, the average marginal means will be denoted in the below formulas as E[Y(0)] and E[Y(1)], respectively.

contrastFormula
"diff"E[Y(1)] - E[Y(0)]
"rr"E[Y(1)] / E[Y(0)]
"sr"(1 - E[Y(1)]) / (1 - E[Y(0)])
"srr"1 - sr if E[Y(1)] > E[Y(0)]
rr - 1 if E[Y(1)] < E[Y(0)]
0 otherwise
"or"O[Y(1)] / O[Y(0)]
where O[Y(.)] = E[Y(.)] / (1 - E[Y(.)])
"nnt"1 / (E[Y(1)] - E[Y(0)])

The log(.) versions are defined by taking the log() (natural log) of the corresponding effect measure.

See also

sim_apply(), which provides a general interface to computing any quantities for simulation-based inference; plot.clarify_est() for plotting the output of a call to sim_ame(); summary.clarify_est() for computing p-values and confidence intervals for the estimated quantities.

marginaleffects::marginaleffects(), marginaleffects::comparisons(), and margins::margins() for delta method-based implementations of computing average marginal effects.

Examples

data("lalonde", package = "MatchIt")

# Fit the model
fit <- glm(I(re78 > 0) ~ treat + age + race +
             married + re74,
           data = lalonde, family = binomial)

# Simulate coefficients
set.seed(123)
s <- sim(fit, n = 100)

# Average marginal effect of `age`
est <- sim_ame(s, var = "age", verbose = FALSE)
summary(est)
#>              Estimate    2.5 %   97.5 %
#> E[dY/d(age)] -0.00695 -0.01013 -0.00442

# Contrast between average adjusted predictions
# for `treat`
est <- sim_ame(s, var = "treat", contrast = "rr",
               verbose = FALSE)
summary(est)
#>         Estimate 2.5 % 97.5 %
#> E[Y(0)]    0.743 0.701  0.781
#> E[Y(1)]    0.810 0.758  0.858
#> RR         1.090 1.000  1.185

# Average adjusted predictions for `race`; need to follow up
# with contrasts for specific levels
est <- sim_ame(s, var = "race", verbose = FALSE)

est <- transform(est,
                 `RR(h,b)` = `E[Y(hispan)]` / `E[Y(black)]`)

summary(est)
#>              Estimate 2.5 % 97.5 %
#> E[Y(black)]     0.706 0.646  0.758
#> E[Y(hispan)]    0.832 0.741  0.898
#> E[Y(white)]     0.800 0.751  0.847
#> RR(h,b)         1.178 1.029  1.331

# Average adjusted predictions for `treat` within levels of
# `married`, first using `subset` and then using `by`
est0 <- sim_ame(s, var = "treat", subset = married == 0,
                contrast = "rd", verbose = FALSE)
names(est0) <- paste0(names(est0), "|married=0")
est1 <- sim_ame(s, var = "treat", subset = married == 1,
                contrast = "rd", verbose = FALSE)
names(est1) <- paste0(names(est1), "|married=1")

summary(cbind(est0, est1))
#>                    Estimate     2.5 %    97.5 %
#> E[Y(0)]|married=0  7.24e-01  6.58e-01  7.79e-01
#> E[Y(1)]|married=0  7.95e-01  7.42e-01  8.43e-01
#> RD|married=0       7.11e-02 -2.65e-04  1.47e-01
#> E[Y(0)]|married=1  7.70e-01  7.21e-01  8.17e-01
#> E[Y(1)]|married=1  8.31e-01  7.55e-01  8.94e-01
#> RD|married=1       6.10e-02 -4.39e-05  1.16e-01

est <- sim_ame(s, var = "treat", by = ~married,
               contrast = "rd", verbose = FALSE)

est
#> A `clarify_est` object (from `sim_ame()`)
#>  - Average marginal effect of `treat`
#>    - within levels of `married`
#>  - 100 simulated values
#>  - 6 quantities estimated:                     
#>  E[Y(0)|0] 0.72405662
#>  E[Y(1)|0] 0.79518774
#>  RD[0]     0.07113112
#>  E[Y(0)|1] 0.76974375
#>  E[Y(1)|1] 0.83078255
#>  RD[1]     0.06103880
summary(est)
#>            Estimate     2.5 %    97.5 %
#> E[Y(0)|0]  7.24e-01  6.58e-01  7.79e-01
#> E[Y(1)|0]  7.95e-01  7.42e-01  8.43e-01
#> RD[0]      7.11e-02 -2.65e-04  1.47e-01
#> E[Y(0)|1]  7.70e-01  7.21e-01  8.17e-01
#> E[Y(1)|1]  8.31e-01  7.55e-01  8.94e-01
#> RD[1]      6.10e-02 -4.39e-05  1.16e-01

# Average marginal effect of `re74` within levels of
# married*race
est <- sim_ame(s, var = "age", by = ~married + race,
               verbose = FALSE)
est
#> A `clarify_est` object (from `sim_ame()`)
#>  - Average marginal effect of `age`
#>    - within levels of `married` and `race`
#>  - 100 simulated values
#>  - 6 quantities estimated:                                   
#>  E[dY/d(age)|0,black]  -0.007950175
#>  E[dY/d(age)|0,hispan] -0.005599656
#>  E[dY/d(age)|0,white]  -0.006626402
#>  E[dY/d(age)|1,black]  -0.008058486
#>  E[dY/d(age)|1,hispan] -0.005497813
#>  E[dY/d(age)|1,white]  -0.006303652
summary(est, null = 0)
#>                       Estimate    2.5 %   97.5 % P-value    
#> E[dY/d(age)|0,black]  -0.00795 -0.01171 -0.00530  <2e-16 ***
#> E[dY/d(age)|0,hispan] -0.00560 -0.00988 -0.00281  <2e-16 ***
#> E[dY/d(age)|0,white]  -0.00663 -0.00951 -0.00398  <2e-16 ***
#> E[dY/d(age)|1,black]  -0.00806 -0.01194 -0.00499  <2e-16 ***
#> E[dY/d(age)|1,hispan] -0.00550 -0.00938 -0.00290  <2e-16 ***
#> E[dY/d(age)|1,white]  -0.00630 -0.00942 -0.00380  <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Comparing AMEs between married and unmarried for
# each level of `race`
est_diff <- est[4:6] - est[1:3]
names(est_diff) <- paste0("AME_diff|", levels(lalonde$race))
summary(est_diff)
#>                  Estimate     2.5 %    97.5 %
#> AME_diff|black  -0.000108 -0.001290  0.001062
#> AME_diff|hispan  0.000102 -0.001386  0.001438
#> AME_diff|white   0.000323 -0.001384  0.001580