# Plotting and inference for `clarify_est` objects

Source: `R/plot.clarify_est.R`, `R/summary.clarify_est.R`

`summary.clarify_est.Rd`

`summary()` tabulates the estimates and confidence intervals and (optionally) p-values from a `clarify_est` object. `confint()` computes confidence intervals. `plot()` plots the "posterior" distribution of estimates.

## Usage

```
# S3 method for clarify_est
plot(
  x,
  parm,
  ci = TRUE,
  level = 0.95,
  method = "quantile",
  reference = FALSE,
  ncol = 3,
  ...
)

# S3 method for clarify_est
summary(object, parm, level = 0.95, method = "quantile", null = NA, ...)

# S3 method for clarify_est
confint(object, parm, level = 0.95, method = "quantile", ...)
```

## Arguments

- parm
a vector of the names or indices of the estimates to plot. If unspecified, all estimates will be displayed.

- ci
`logical`; whether to display confidence interval limits for the estimates. Default is `TRUE`.

- level
the confidence level desired. Default is .95 for 95% confidence intervals.

- method
the method used to compute p-values and confidence intervals. Can be `"wald"` to use a Normal approximation or `"quantile"` to use the simulated sampling distribution (default). See Details. Abbreviations allowed.

- reference
`logical`; whether to overlay a normal density reference distribution over the plots. Default is `FALSE`.

- ncol
the number of columns used when wrapping multiple plots; default is 3.

- ...
for `plot()`, further arguments passed to `ggplot2::geom_density()`.

- object, x
a `clarify_est` object; the output of a call to `sim_apply()` or its wrappers.

- null
the values of the parameters under the null hypothesis for the p-value calculations. Should have length equal to the number of quantities estimated, or one, in which case it will be recycled, or it can be a named vector with just the names of quantities for which null values are to be set. Set values to `NA` to omit p-values for those quantities. When all values are `NA`, the default, no p-values are produced.
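The three accepted forms of `null` can be illustrated with a small base R sketch. The helper `expand_null()` below is hypothetical and is not part of clarify; it only mirrors the rules described above (recycling a single value, matching a full-length vector by position, and filling a named vector by name with `NA` elsewhere).

```r
# Hypothetical helper (not the package's code): expand `null` to one value
# per estimated quantity, following the rules in the argument description.
expand_null <- function(null, est_names) {
  out <- setNames(rep(NA_real_, length(est_names)), est_names)
  if (!is.null(names(null))) {
    # Named vector: set only the named quantities, leave the rest NA
    stopifnot(all(names(null) %in% est_names))
    out[names(null)] <- null
  } else if (length(null) == 1L) {
    # Single value: recycle across all quantities
    out[] <- null
  } else {
    # Otherwise: one value per quantity, matched by position
    stopifnot(length(null) == length(est_names))
    out[] <- null
  }
  out
}

est_names <- c("E[Y(0)]", "E[Y(1)]", "RD", "RR")

by_name     <- expand_null(c(RD = 0, RR = 1), est_names)
by_position <- expand_null(c(NA, NA, 0, 1), est_names)
recycled    <- expand_null(0, est_names)
all_missing <- expand_null(NA, est_names)
```

Under these assumptions, `by_name` and `by_position` describe the same set of tests, which matches the two equivalent `summary()` calls shown in the Examples.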

## Value

For `summary()`, a `summary.clarify_est` object, which is a matrix containing the coefficient estimates, standard errors, test statistics, p-values, and confidence intervals. Not all columns will be present depending on the arguments supplied to `summary()`.

For `confint()`, a matrix containing the confidence intervals for the requested quantities.

For `plot()`, a `ggplot` object.

## Details

`summary()` uses the estimates computed from the original model as its estimates and uses the simulated parameters for inference only, in line with the recommendations of Rainey (2023).
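To see why the point estimate is taken from the original model rather than from the simulations, consider a nonlinear quantity of interest. The following base R sketch uses made-up values (`beta_hat`, `se_beta`, and the transformation `exp()` are all assumptions for illustration) and does not call clarify; it only contrasts the plug-in estimate with the average of the simulated quantities.

```r
set.seed(42)

# Hypothetical coefficient estimate and standard error (illustration only)
beta_hat <- 0.5
se_beta  <- 0.4

# Simulated coefficients, standing in for the draws produced by a simulation step
draws <- rnorm(10000, mean = beta_hat, sd = se_beta)

# A nonlinear quantity of interest
qoi <- function(b) exp(b)

# Estimate computed from the original coefficient
plug_in <- qoi(beta_hat)

# Average of the quantity across simulations; for a convex transformation
# like exp() this tends to sit above the plug-in estimate
sim_mean <- mean(qoi(draws))

# The simulations are still used for inference, e.g. a 95% quantile interval
sim_ci <- unname(quantile(qoi(draws), probs = c(0.025, 0.975)))

c(plug_in = plug_in, sim_mean = sim_mean)
```

In this sketch the simulated values supply the interval while `plug_in` is reported as the estimate, which is the division of labor described above.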

When `method = "wald"`, the standard deviation of the simulation estimates is used as the standard error, which is used in the z-statistics and the confidence intervals. The p-values and confidence intervals are valid only when the sampling distribution of the resulting statistic is normal (which can be assessed using `plot()`). When `method = "quantile"`, the confidence interval is calculated using the quantiles of the simulation estimates corresponding to `level`, and the p-value is calculated as twice the proportion of simulation estimates less than or greater than `null`, whichever is smaller. This is equivalent to inverting the confidence interval, but it is only truly valid when the true sampling distribution is a location shift of the sampling distribution under the null hypothesis, so the p-value should be interpreted with caution. Using `method = "quantile"` (the default) is recommended because the confidence intervals will be valid even if the sampling distribution is not Normally distributed. The precision of the p-values and confidence intervals depends on the number of simulations requested (the value of `n` supplied to `sim()`).
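The two methods can be sketched in base R as follows. This is an illustration of the calculations described above, not the package's implementation; `estimate`, `sims`, and `null` are hypothetical stand-ins for the original-model estimate, the simulated estimates, and the null value.

```r
set.seed(123)

# Hypothetical inputs (illustration only)
estimate <- 0.07                                  # estimate from the original model
sims     <- rnorm(1000, mean = 0.07, sd = 0.045)  # simulated estimates
level    <- 0.95
null     <- 0
alpha    <- 1 - level

# method = "wald": the SD of the simulated estimates serves as the standard error
se      <- sd(sims)
z       <- (estimate - null) / se
wald_ci <- estimate + c(-1, 1) * qnorm(1 - alpha / 2) * se
wald_p  <- 2 * pnorm(-abs(z))

# method = "quantile": interval from the quantiles of the simulated estimates;
# p-value is twice the smaller of the two tail proportions around the null
quantile_ci <- unname(quantile(sims, probs = c(alpha / 2, 1 - alpha / 2)))
quantile_p  <- 2 * min(mean(sims < null), mean(sims > null))

rbind(
  wald     = c(lower = wald_ci[1], upper = wald_ci[2], p = wald_p),
  quantile = c(lower = quantile_ci[1], upper = quantile_ci[2], p = quantile_p)
)
```

Because the quantile p-value is a proportion of simulated values, its resolution in this sketch is limited by the number of draws, which is one way to see why precision depends on `n`.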

The plots are produced using `ggplot2::geom_density()` and can be customized with ggplot2 functions. When `reference = TRUE`, a reference Normal distribution is produced using the empirical mean and standard deviation of the simulated values. A blue reference line is plotted at the median of the simulated values. For Wald-based inference to be valid, the reference distribution should overlap with the empirical distribution, in which case the quantile-based and Wald-based intervals should be similar. For quantile-based inference to be valid, the median of the estimates should overlap with the estimated value; this is a necessary but not sufficient condition, though.
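The visual checks described above can also be approximated numerically. The sketch below uses base R only and hypothetical, deliberately skewed simulated values; it is not how `plot()` is implemented. It builds the reference Normal from the empirical mean and standard deviation, compares it with the empirical density, and looks at the gap between the median of the simulations and the estimate.

```r
set.seed(1)

# Hypothetical skewed simulated estimates (e.g., a ratio) and point estimate
sims     <- exp(rnorm(2000, mean = 0.1, sd = 0.3))
estimate <- exp(0.1)

# Reference Normal from the empirical mean and SD of the simulated values
ref_mean <- mean(sims)
ref_sd   <- sd(sims)

# Empirical density and reference density evaluated on the same grid
dens     <- density(sims)
ref_dens <- dnorm(dens$x, mean = ref_mean, sd = ref_sd)
max_gap  <- max(abs(dens$y - ref_dens))  # larger values suggest non-Normality

# If the two curves overlap, these two 95% intervals should be similar
quantile_ci <- unname(quantile(sims, probs = c(0.025, 0.975)))
wald_ci     <- estimate + c(-1, 1) * qnorm(0.975) * ref_sd

# Necessary (not sufficient) check for quantile-based inference
median_gap <- median(sims) - estimate

# Optional base-graphics version of the picture, drawn only interactively
if (interactive()) {
  plot(dens, main = "Simulated estimates vs. reference Normal")
  lines(dens$x, ref_dens, lty = 2)
  abline(v = median(sims), col = "blue")
  abline(v = estimate, lty = 3)
}
```

No threshold for `max_gap` or `median_gap` is implied here; they are only numeric companions to inspecting the plot.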

## References

Rainey, C. (2023). A careful consideration of CLARIFY: Simulation-induced bias in point estimates of quantities of interest. *Political Science Research and Methods*, 1–10. doi:10.1017/psrm.2023.8

## See also

`sim_apply()` for applying a function to each set of simulated coefficients

## Examples

```
library(clarify)

data("lalonde", package = "MatchIt")
fit <- glm(I(re78 > 0) ~ treat + age + race + nodegree + re74,
           data = lalonde)
s <- sim(fit, n = 100)
# Compute average marginal means for `treat`
est <- sim_ame(s, var = "treat", verbose = FALSE)
coef(est)
#>   E[Y(0)]   E[Y(1)]
#> 0.7453346 0.8175754
# Compute average marginal effects on risk difference
# (RD) and risk ratio (RR) scale
est <- transform(est,
                 RD = `E[Y(1)]` - `E[Y(0)]`,
                 RR = `E[Y(1)]` / `E[Y(0)]`)
# Compute confidence intervals and p-values,
# using given null values for computing p-values
summary(est, null = c(`RD` = 0, `RR` = 1))
#>         Estimate   2.5 %  97.5 % P-value
#> E[Y(0)]   0.7453  0.7027  0.7956       .
#> E[Y(1)]   0.8176  0.7584  0.8936       .
#> RD        0.0722 -0.0102  0.1675    0.14
#> RR        1.0969  0.9870  1.2392    0.14
# Same tests using normal approximation and alternate
# syntax for `null`
summary(est, null = c(NA, NA, 0, 1),
        method = "wald")
# Plot the RD and RR with a reference distribution
plot(est, parm = c("RD", "RR"), reference = TRUE,
     ci = FALSE)
# Plot the RD and RR with quantile confidence bounds
plot(est, parm = c("RD", "RR"), ci = TRUE)
```