Simulate model coefficients after multiple imputation

misim() simulates model parameters from multivariate normal or t distributions after multiple imputation. The simulated parameters are then used by sim_apply() to calculate quantities of interest.

Usage

misim(fitlist, n = 1000, vcov = NULL, coefs = NULL, dist = NULL)

Arguments

fitlist: a list of model fits, one for each imputed dataset, or a mira object (the output of a call to with() applied to a mids object in mice).
n: the number of simulations to run for each imputed dataset; default is 1000. More is always better, but the resulting calculations will take longer.
vcov: a square matrix of coefficient covariance estimates, a function used to extract one from each fit, or a list of either with one element for each imputed dataset. By default, uses stats::vcov(), or insight::get_varcov() if that doesn't work.
coefs: a vector of coefficient estimates, a function used to extract one from each fit, or a list of either with one element for each imputed dataset. By default, uses stats::coef(), or insight::get_parameters() if that doesn't work.
dist: a character vector containing the name of the multivariate distribution(s) to use to draw simulated coefficients. Each entry should be either "normal" (multivariate normal distribution) or "t_{#}" (multivariate t distribution), where {#} corresponds to the desired degrees of freedom (e.g., "t_100"). If NULL, the distributions are chosen using heuristics; see sim() for details.
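As a rough illustration of the vcov and dist arguments described above, the sketch below passes a covariance-extracting function and an explicit t distribution. It assumes the mice and sandwich packages are installed; the nhanes data, the model formula, and the choice of 20 degrees of freedom are illustrative only and not a recommendation.

```r
library(clarify)
library(mice)

# Impute the small nhanes example data shipped with mice
imp <- mice(nhanes, m = 5, printFlag = FALSE, seed = 1)

# Fit the same model in each imputed dataset; this returns a mira object
fits <- with(imp, lm(bmi ~ age + chl))

# Supply a function for vcov (here a heteroskedasticity-consistent
# estimator from sandwich) and request a multivariate t with 20 df
s <- misim(fits, n = 200, vcov = sandwich::vcovHC, dist = "t_20")
s
```

Because vcov is given as a single function, it is applied to every fit; a list with one element per imputed dataset could be supplied instead.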

Value

A clarify_misim object, which inherits from clarify_sim and has the following components:

sim.coefs: a matrix containing the simulated coefficients, with a column for each coefficient and a row for each simulation for each imputation.
coefs: a matrix containing the original coefficients extracted from fitlist or supplied to coefs, with a row per imputation.
fit: the list of model fits supplied to fitlist.
imp: an identifier of which imputed dataset each set of simulated coefficients corresponds to.

The "dist" attribute contains "normal" if the coefficients were sampled from a multivariate normal distribution and "t({df})" if sampled from a multivariate t distribution. The "clarify_hash" attribute contains a unique hash generated by rlang::hash().
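The components listed above can be inspected directly. The following is a minimal sketch, assuming the mice package is installed; the data, formula, and the small values of m and n are chosen only to keep the example quick.

```r
library(clarify)
library(mice)

imp <- mice(nhanes, m = 3, printFlag = FALSE, seed = 42)
fits <- with(imp, lm(bmi ~ age + chl))

s <- misim(fits, n = 100)

dim(s$sim.coefs)  # rows: imputations x n; columns: coefficients
dim(s$coefs)      # one row of original estimates per imputation
table(s$imp)      # number of simulated draws per imputed dataset
attr(s, "dist")   # distribution(s) the draws were sampled from
```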

Details

misim() essentially combines multiple sim() calls, one for each model fit to an imputed dataset, into a single pool of simulated coefficients. When simulation-based inference is to be used with multiply imputed data, many imputations are required; see Zhou and Reiter (2010).
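The pooling idea can be sketched in base R with MASS. This is a conceptual illustration under stated assumptions, not the package's implementation: it always draws from a multivariate normal (misim() may choose a t distribution instead), and the "completed" datasets are hypothetical stand-ins made by perturbing mtcars rather than the output of a real imputation procedure.

```r
library(MASS)  # provides mvrnorm()

set.seed(123)

# Hypothetical stand-ins for completed (imputed) datasets
completed <- lapply(1:3, function(i) {
  d <- mtcars
  d$wt <- d$wt + rnorm(nrow(d), sd = 0.05)
  d
})

# One model fit per dataset
fits <- lapply(completed, function(d) lm(mpg ~ wt + hp, data = d))

n <- 1000  # simulations per imputed dataset

# Draw n coefficient vectors around each fit's estimates,
# then stack the draws into a single pool
draws <- lapply(fits, function(f) {
  MASS::mvrnorm(n, mu = coef(f), Sigma = vcov(f))
})
sim_coefs <- do.call(rbind, draws)

# Record which dataset each row of the pool came from
imp <- rep(seq_along(fits), each = n)

dim(sim_coefs)
table(imp)
```

Stacking the draws rather than averaging the estimates means the pooled sample reflects both within-imputation and between-imputation variability.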

References

Zhou, X., & Reiter, J. P. (2010). A Note on Bayesian Inference After Multiple Imputation. The American Statistician, 64(2), 159–163. doi:10.1198/tast.2010.09109

Examples

data("africa", package = "Amelia")

# Multiple imputation using Amelia
a.out <- Amelia::amelia(x = africa, m = 10,
                        cs = "country",
                        ts = "year", logs = "gdp_pc",
                        p2s = 0)

fits <- with(a.out, lm(gdp_pc ~ infl * trade))

# Simulate coefficients
s <- misim(fits)
s
#> A `clarify_misim` object
#>  - 4 coefficients, 10 imputations with 1000 simulated values each
#>  - sampled distributions: multivariate t(116)