13 Weighting
Next, we’ll use weighting to target the ATE of RHC on death.¹ We’ll use the WeightIt package, which provides an interface to many different weighting methods and has utilities for assessing the quality of the weights. For more details on this procedure, including effect estimation, see the WeightIt documentation and vignettes.
First we’ll perform the most common weighting method, inverse probability weighting using a logistic regression propensity score.
w1 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE")
w1
A weightit object
- method: "glm" (propensity score weighting with GLM)
- number of obs.: 5735
- sampling weights: none
- treatment: 2-category
- estimand: ATE
- covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex
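To make the weighting step concrete, here is a minimal base-R sketch of how ATE inverse probability weights can be built from a logistic regression propensity score. It uses simulated data, not the rhc dataset, and the names `x`, `treat`, `ps`, and `ipw` are hypothetical; it is meant only as an illustration of the idea, not as a description of what weightit() does internally.

```r
# Simulated data (an assumption for illustration; not the rhc data)
set.seed(123)
n <- 2000
x <- rnorm(n)                              # a single confounder
treat <- rbinom(n, 1, plogis(0.8 * x))     # treatment depends on x

# Propensity score from a logistic regression
ps_fit <- glm(treat ~ x, family = binomial)
ps <- fitted(ps_fit)

# ATE weights: 1 / ps for treated units, 1 / (1 - ps) for control units
ipw <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))

# Difference in covariate means before and after weighting
diff_raw <- mean(x[treat == 1]) - mean(x[treat == 0])
diff_ipw <- weighted.mean(x[treat == 1], ipw[treat == 1]) -
  weighted.mean(x[treat == 0], ipw[treat == 0])

c(unweighted = diff_raw, weighted = diff_ipw)
```

The weighted difference in means should be much closer to zero than the unweighted one, which is the kind of improvement the balance table is designed to reveal.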
We’ll use bal.tab() again to assess balance.
bal.tab(w1, stats = c("m", "ks"), binary = "std")
Balance Measures
               Type Diff.Adj KS.Adj
prop.score Distance  -0.0243 0.0281
aps1        Contin.  -0.0256 0.0222
meanbp1     Contin.   0.0168 0.0331
pafi1       Contin.   0.0020 0.0253
crea1       Contin.  -0.0119 0.0646
hema1       Contin.  -0.0265 0.0575
paco21      Contin.   0.0024 0.0218
surv2md1    Contin.   0.0141 0.0344
resp1       Contin.   0.0339 0.0475
card_Yes     Binary  -0.0065 0.0031
edu         Contin.   0.0105 0.0206
age         Contin.   0.0109 0.0565
race_white   Binary   0.0058 0.0024
race_black   Binary  -0.0031 0.0011
race_other   Binary  -0.0053 0.0013
sex_Male     Binary  -0.0021 0.0011

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.
Adjusted   2657.85 1509.52
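The two kinds of quantities in this output, weighted standardized mean differences and effective sample sizes, can be sketched in a few lines of base R. The helper names `weighted_smd()` and `ess()` are hypothetical and not part of cobalt. The sketch assumes the usual Kish formula for the effective sample size and, for the ATE, a standardization factor equal to the square root of the average of the two unweighted group variances; treat both as assumptions and check the cobalt documentation for the exact definitions it uses.

```r
# Weighted standardized mean difference for one covariate.
# Assumption: the denominator is the pooled SD computed from the
# unweighted group variances.
weighted_smd <- function(x, treat, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[treat == 1], w[treat == 1])
  m0 <- weighted.mean(x[treat == 0], w[treat == 0])
  s  <- sqrt((var(x[treat == 1]) + var(x[treat == 0])) / 2)
  (m1 - m0) / s
}

# Kish effective sample size: (sum of weights)^2 / sum of squared weights
ess <- function(w) sum(w)^2 / sum(w^2)

# Toy inputs (hypothetical values chosen so the arithmetic is easy to check)
x_toy     <- c(1, 2, 3, 4, 5, 6)
treat_toy <- c(0, 0, 0, 1, 1, 1)

weighted_smd(x_toy, treat_toy)  # group means 5 and 2, both variances 1
ess(c(1, 1, 1, 1))              # equal weights: no loss of information
ess(c(1, 0, 0, 0))              # all weight on one unit
```

Equal weights leave the effective sample size at the nominal sample size, while highly variable weights shrink it, which is why the adjusted rows are smaller than the unadjusted ones.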
Balance looks excellent using standard inverse probability weighting, and normally we might stop here. However, we’ll carry on in search of even better balance. We’ll use entropy balancing, which guarantees exact balance on the means of included covariates (but may not balance the rest of the covariate distributions).
w2 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE", method = "ebal")
w2
A weightit object
- method: "ebal" (entropy balancing)
- number of obs.: 5735
- sampling weights: none
- treatment: 2-category
- estimand: ATE
- covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex
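To illustrate why entropy balancing yields exact balance on covariate means, here is a rough base-R sketch on simulated data. It assumes that, for the ATE, each treatment group is reweighted so that its covariate means equal the full-sample means, and it solves the standard dual problem with optim(). The function `ebal_group()` and all variable names are hypothetical, and this is not the implementation used by WeightIt.

```r
# Simulated data (an assumption for illustration; not the rhc data)
set.seed(42)
n <- 500
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
treat <- rbinom(n, 1, plogis(0.5 * X[, "x1"] - 0.5 * X[, "x2"]))
target <- colMeans(X)   # full-sample means: the ATE target

# Entropy balancing weights for one group via the dual problem:
# minimize log(sum(exp(Xc %*% lambda))), with Xc centered at the target.
# At the minimum, the weighted mean of Xc is zero.
ebal_group <- function(Xg, target) {
  Xc <- sweep(Xg, 2, target)
  loss <- function(lambda) log(sum(exp(Xc %*% lambda)))
  grad <- function(lambda) {
    w <- exp(Xc %*% lambda)
    as.vector(crossprod(Xc, w) / sum(w))
  }
  opt <- optim(rep(0, ncol(Xc)), loss, grad, method = "BFGS",
               control = list(reltol = 1e-12, maxit = 500))
  w <- as.vector(exp(Xc %*% opt$par))
  w / mean(w)   # rescale so the weights average 1 within the group
}

w <- numeric(n)
w[treat == 1] <- ebal_group(X[treat == 1, , drop = FALSE], target)
w[treat == 0] <- ebal_group(X[treat == 0, , drop = FALSE], target)

# Weighted covariate means in each group, to compare against the target
m1 <- colSums(X[treat == 1, ] * w[treat == 1]) / sum(w[treat == 1])
m0 <- colSums(X[treat == 0, ] * w[treat == 0]) / sum(w[treat == 0])
rbind(target = target, treated = m1, control = m0)
```

Only the means are constrained here, so higher-order features of the covariate distributions (variances, powers, interactions) are not guaranteed to be balanced.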
bal.tab(w2, binary = "std", int = TRUE,
        poly = 4, thresholds = c(m = .05),
        disp.bal.tab = FALSE)
Balance tally for mean differences
                    count
Balanced, <0.05       164
Not Balanced, >0.05    12

Variable with the greatest mean difference
 Variable Diff.Adj         M.Threshold
   crea1³  -0.1005 Not Balanced, >0.05

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.
Adjusted   3160.91 1542.53
Here we included interactions and up to fourth powers of the covariates to assess balance more fully on the covariate distributions. Although entropy balancing balanced the covariate means, 12 of the 176 terms exceeded the .05 threshold (the largest being crea1³, with a standardized mean difference of -0.1005), so it might still be possible to improve on it. We could request that entropy balancing additionally balance specific powers and interactions of the covariates using the moments and int arguments. Instead, we’ll try energy balancing, which tends to have excellent performance at balancing the entire covariate distribution and doesn’t require manually specifying components to balance (but it can be a bit slow on larger datasets).
w3 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE", method = "energy")
w3
A weightit object
- method: "energy" (energy balancing)
- number of obs.: 5735
- sampling weights: none
- treatment: 2-category
- estimand: ATE
- covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex
bal.tab(w3, binary = "std", int = TRUE,
        poly = 4, thresholds = c(m = .05),
        disp.bal.tab = FALSE)
Balance tally for mean differences
                    count
Balanced, <0.05       176
Not Balanced, >0.05     0

Variable with the greatest mean difference
 Variable Diff.Adj     M.Threshold
     age⁴  -0.0211 Balanced, <0.05

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.
Adjusted   2518.29 1212.94
We find that energy balancing was successful at balancing the full covariate distribution. We’ll carry on with our energy balancing results.
To estimate the treatment effect, we will again use g-computation, aided by the marginaleffects package.
First, we need to fit the outcome model. Remember, this model is not to be interpreted. We can use glm_weightit(), which automatically estimates asymptotically correct SEs when available and robust SEs otherwise. For energy balancing, the latter are used, which are generally appropriate for weighting for the ATE (though they may be conservative). Other methods of computing SEs can be requested using the vcov argument.
# Fit the outcome model
fit <- glm_weightit(death ~ RHC * (aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                                     paco21 + surv2md1 + resp1 + card + edu +
                                     age + race + sex),
                    data = rhc,
                    weightit = w3,
                    family = binomial)
Next we’ll compute the marginal predictions and their ratio.
# Marginal predictions
avg_predictions(fit,
                variables = "RHC")
RHC Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
0 0.635 0.00874 72.6 <0.001 Inf 0.618 0.652
1 0.668 0.01264 52.8 <0.001 Inf 0.643 0.693
Type: probs
Columns: RHC, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
# Risk ratio
avg_comparisons(fit,
                variables = "RHC",
                comparison = "lnratioavg",
                transform = "exp")
Estimate Pr(>|z|) S 2.5 % 97.5 %
1.05 0.0295 5.1 1.01 1.1
Term: RHC
Type: probs
Comparison: ln(mean(1) / mean(0))
Columns: term, contrast, estimate, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
Here we find evidence of a risk ratio greater than 1, indicating that, on average, receiving RHC increases the risk of death by about 5% in relative terms.²
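The g-computation steps used above (fit a weighted outcome model, predict each unit’s outcome under treatment and under control, average the predictions, and take their ratio) can be sketched by hand in base R. This is a simplified illustration on simulated data with hypothetical variable names. It assumes simple inverse probability weights rather than energy balancing weights, averages the predictions over the full sample as the ATE target population, and produces point estimates only; it does not reproduce the standard errors that glm_weightit() and marginaleffects compute.

```r
# Simulated data (an assumption for illustration; not the rhc data)
set.seed(2024)
n <- 5000
x <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.5 * x))
y <- rbinom(n, 1, plogis(-0.5 + 0.4 * treat + 0.7 * x))
d <- data.frame(y, treat, x)

# ATE weights from a logistic regression propensity score
ps <- fitted(glm(treat ~ x, family = binomial, data = d))
d$w <- ifelse(d$treat == 1, 1 / ps, 1 / (1 - ps))

# Weighted outcome model with treatment-by-covariate interaction.
# quasibinomial gives the same coefficients as binomial but avoids
# warnings about non-integer weights.
fit <- glm(y ~ treat * x, data = d, weights = w, family = quasibinomial)

# Predict for everyone under treatment and under control, then average
p1 <- mean(predict(fit, newdata = transform(d, treat = 1), type = "response"))
p0 <- mean(predict(fit, newdata = transform(d, treat = 0), type = "response"))

# Marginal risks and the marginal risk ratio
c(risk_control = p0, risk_treated = p1, risk_ratio = p1 / p0)
```

The ratio of the two averaged predictions is a marginal risk ratio; it is not the exponentiated treatment coefficient from the outcome model, which is one reason the outcome model itself is not interpreted.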
Again, it is useful to report balance to demonstrate the performance of the weights. Here, we could say that the largest SMD for the covariates was .004 and the largest KS statistic was .029, and that the largest SMD among all powers of the covariates up to 4 and all two-way interactions was .021. The specific values for each covariate would not be required because this summary indicates that all covariates were balanced more than adequately.
¹ Note that the ATE can be targeted by matching (not 1:1 matching, but other methods) and the ATT and other estimands can be targeted by weighting; don’t think matching is for the ATT and weighting is for the ATE. Use whichever method yields the best performance and would be best understood by your audience.
² Note that, in this case, the conclusions would have been the same regardless of which weighting method we moved forward with.