13  Weighting

Next, we’ll use weighting to target the ATE of RHC on death.[1] We’ll use the WeightIt package, which provides an interface to many different weighting methods and has utilities for assessing the quality of the weights. For more details on this procedure, including effect estimation, see the WeightIt documentation and vignettes.

First, we’ll use the most common weighting method: inverse probability weighting with a propensity score estimated by logistic regression.

w1 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE")
w1
A weightit object
 - method: "glm" (propensity score weighting with GLM)
 - number of obs.: 5735
 - sampling weights: none
 - treatment: 2-category
 - estimand: ATE
 - covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex
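
For intuition, the weights estimated by this kind of call can be reproduced by hand. The following is a minimal base-R sketch on simulated data (the variables x and a are made up for illustration and are not from rhc), assuming a logistic regression propensity score: treated units get weight 1/ps and controls get 1/(1 - ps).

```r
# Simulated data (hypothetical; not the rhc dataset)
set.seed(1)
n <- 5000
x <- rnorm(n)                          # a single confounder
a <- rbinom(n, 1, plogis(0.8 * x))     # treatment probability depends on x

# Propensity score from a logistic regression
ps <- fitted(glm(a ~ x, family = binomial))

# ATE weights: 1/ps for treated units, 1/(1 - ps) for controls
w <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

# Difference in means of x before and after weighting
diff_unadj <- mean(x[a == 1]) - mean(x[a == 0])
diff_adj   <- weighted.mean(x[a == 1], w[a == 1]) -
              weighted.mean(x[a == 0], w[a == 0])
c(unadjusted = diff_unadj, adjusted = diff_adj)
```

The weighted difference in means should be much closer to zero than the unweighted one, which is the sense in which the weights balance the covariate.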

We’ll use bal.tab() again to assess balance.

bal.tab(w1, stats = c("m", "ks"), binary = "std")
Balance Measures
               Type Diff.Adj KS.Adj
prop.score Distance  -0.0243 0.0281
aps1        Contin.  -0.0256 0.0222
meanbp1     Contin.   0.0168 0.0331
pafi1       Contin.   0.0020 0.0253
crea1       Contin.  -0.0119 0.0646
hema1       Contin.  -0.0265 0.0575
paco21      Contin.   0.0024 0.0218
surv2md1    Contin.   0.0141 0.0344
resp1       Contin.   0.0339 0.0475
card_Yes     Binary  -0.0065 0.0031
edu         Contin.   0.0105 0.0206
age         Contin.   0.0109 0.0565
race_white   Binary   0.0058 0.0024
race_black   Binary  -0.0031 0.0011
race_other   Binary  -0.0053 0.0013
sex_Male     Binary  -0.0021 0.0011

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.  
Adjusted   2657.85 1509.52
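
The Diff.Adj and effective sample size entries can be approximated by hand. The sketch below rests on two assumptions: that the standardized mean difference for the ATE divides by the pooled SD of the unweighted groups, and that the effective sample size is the squared sum of the weights divided by the sum of the squared weights. The helper names smd_ate() and ess() are hypothetical, and the data are simulated.

```r
# Weighted standardized mean difference (hypothetical helper)
smd_ate <- function(x, a, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[a == 1], w[a == 1])
  m0 <- weighted.mean(x[a == 0], w[a == 0])
  # Standardize by the pooled SD computed in the unweighted sample
  s <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
  (m1 - m0) / s
}

# Effective sample size of a set of weights (hypothetical helper)
ess <- function(w) sum(w)^2 / sum(w^2)

# Simulated data with IPW weights
set.seed(2)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(x))
ps <- fitted(glm(a ~ x, family = binomial))
w <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

c(smd_unadjusted = smd_ate(x, a), smd_adjusted = smd_ate(x, a, w))
c(ess_control = ess(w[a == 0]), ess_treated = ess(w[a == 1]))
```

With equal weights the effective sample size equals the group size; the more variable the weights, the further it falls below that.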

Balance looks excellent using standard inverse probability weighting (all absolute SMDs are below .04 and all KS statistics are below .07), and normally we might stop here. However, we’ll carry on in search of even better balance. We’ll use entropy balancing, which guarantees exact balance on the means of included covariates (but may not balance the rest of the covariate distributions).

w2 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE", method = "ebal")
w2
A weightit object
 - method: "ebal" (entropy balancing)
 - number of obs.: 5735
 - sampling weights: none
 - treatment: 2-category
 - estimand: ATE
 - covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex

bal.tab(w2, binary = "std", int = TRUE,
        poly = 4, thresholds = c(m = .05),
        disp.bal.tab = FALSE)
Balance tally for mean differences
                    count
Balanced, <0.05       164
Not Balanced, >0.05    12

Variable with the greatest mean difference
 Variable Diff.Adj         M.Threshold
   crea1³  -0.1005 Not Balanced, >0.05

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.  
Adjusted   3160.91 1542.53
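
To see why entropy balancing balances covariate means exactly without necessarily balancing higher powers, it can help to solve a small version of the problem by hand. This is a rough base-R sketch on simulated data, assuming each treatment group is weighted so that its covariate means equal the full-sample means; ebal_group() is a hypothetical helper that minimizes the usual dual objective with optim(), and it is not how WeightIt implements the method.

```r
set.seed(3)
n <- 2000
x <- rexp(n)                                   # a skewed covariate
a <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
X <- cbind(x)                                  # terms to balance (means only)
target <- colMeans(X)                          # full-sample means

ebal_group <- function(Xg, target) {
  Xc <- sweep(Xg, 2, target)                   # center at the target means
  # Dual objective; its gradient is the weighted mean of the centered
  # covariates, so it is zero exactly when the weighted means hit the target
  obj <- function(l) log(mean(exp(Xc %*% l)))
  grad <- function(l) {
    w <- as.vector(exp(Xc %*% l))
    colSums(Xc * w) / sum(w)
  }
  l <- optim(rep(0, ncol(Xc)), obj, grad, method = "BFGS",
             control = list(reltol = 1e-12, maxit = 500))$par
  w <- as.vector(exp(Xc %*% l))
  w / mean(w)                                  # normalize to mean 1
}

w <- numeric(n)
w[a == 1] <- ebal_group(X[a == 1, , drop = FALSE], target)
w[a == 0] <- ebal_group(X[a == 0, , drop = FALSE], target)

# Means are balanced to the target; third powers need not be
mean_gap  <- weighted.mean(x[a == 1], w[a == 1]) -
             weighted.mean(x[a == 0], w[a == 0])
cubic_gap <- weighted.mean(x[a == 1]^3, w[a == 1]) -
             weighted.mean(x[a == 0]^3, w[a == 0])
c(mean_gap = mean_gap, cubic_gap = cubic_gap)
```

Adding columns such as x^2 or x^3 to X would extend the exact balance to those terms, at the cost of more variable weights.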

In the call to bal.tab() above, we included interactions and up to fourth powers of the covariates to assess balance more fully on the covariate distributions. Although entropy balancing balances the covariate means exactly, 12 of the 176 terms exceeded the .05 threshold (the largest, crea1³, had an SMD of -0.10), so it might still be possible to improve on it. We could request that entropy balancing additionally balance specific powers and interactions of the covariates using the moments and int arguments. Instead, we’ll try energy balancing, which tends to perform very well at balancing the entire covariate distribution and doesn’t require manually specifying the terms to balance (but it can be a bit slow on larger datasets).

w3 <- weightit(RHC ~ aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                 paco21 + surv2md1 + resp1 + card + edu +
                 age + race + sex, data = rhc,
               estimand = "ATE", method = "energy")
w3
A weightit object
 - method: "energy" (energy balancing)
 - number of obs.: 5735
 - sampling weights: none
 - treatment: 2-category
 - estimand: ATE
 - covariates: aps1, meanbp1, pafi1, crea1, hema1, paco21, surv2md1, resp1, card, edu, age, race, sex

bal.tab(w3, binary = "std", int = TRUE,
        poly = 4, thresholds = c(m = .05),
        disp.bal.tab = FALSE)
Balance tally for mean differences
                    count
Balanced, <0.05       176
Not Balanced, >0.05     0

Variable with the greatest mean difference
 Variable Diff.Adj     M.Threshold
     age⁴  -0.0211 Balanced, <0.05

Effective sample sizes
           Control Treated
Unadjusted 3551.   2184.  
Adjusted   2518.29 1212.94

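We find that energy balancing was successful at balancing the full covariate distribution: all 176 terms fall within the .05 threshold, and the largest SMD is -0.021. This comes at the cost of smaller effective sample sizes than either of the other two methods. We’ll carry on with our energy balancing results.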
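
To give a sense of what energy balancing is optimizing, the sketch below computes the energy distance between one weighted treatment group and the full sample. It is written under assumptions: covariates are standardized and compared with Euclidean distance, and the quantity being reduced is the distance between each weighted group and the full sample. The helper energy_dist() is hypothetical, the data are simulated, and plain IPW weights are used in place of optimized weights because finding the minimizing weights requires a quadratic programming solver.

```r
set.seed(4)
n <- 400
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
a <- rbinom(n, 1, plogis(X[, 1] - 0.5 * X[, 2]))
D <- as.matrix(dist(scale(X)))      # pairwise distances on standardized covariates

# Energy distance between a weighted group and the full (unweighted) sample
energy_dist <- function(D, in_group, w) {
  w <- w / sum(w)                                             # weights sum to 1
  cross  <- sum(w * rowMeans(D[in_group, , drop = FALSE]))    # group vs. full sample
  within <- sum(outer(w, w) * D[in_group, in_group])          # group vs. itself
  full   <- mean(D)                                           # full sample vs. itself
  2 * cross - within - full
}

ps  <- fitted(glm(a ~ X, family = binomial))
trt <- a == 1
e_unweighted <- energy_dist(D, trt, rep(1, sum(trt)))
e_ipw        <- energy_dist(D, trt, 1 / ps[trt])
c(unweighted = e_unweighted, ipw = e_ipw)
```

Because the distance compares whole distributions rather than selected moments, driving it toward zero tends to balance means, higher powers, and interactions together.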

To estimate the treatment effect, we will again use g-computation, aided by the marginaleffects package.

First, we need to fit the outcome model. Remember that the coefficients of this model are not to be interpreted; the model only serves to generate predictions. We can use glm_weightit(), which automatically estimates asymptotically correct SEs when available and robust SEs otherwise. For energy balancing, robust SEs are used; these are generally appropriate when weighting for the ATE (though they may be conservative). Other methods of computing SEs can be requested using the vcov argument.

# Fit the outcome model
fit <- glm_weightit(death ~ RHC * (aps1 + meanbp1 + pafi1 + crea1 + hema1 +
                                     paco21 + surv2md1 + resp1 + card + edu +
                                     age + race + sex),
                    data = rhc,
                    weightit = w3,
                    family = binomial)

Next we’ll compute the marginal predictions and their ratio.

# Marginal predictions
avg_predictions(fit,
                variables = "RHC")

 RHC Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
   0    0.635    0.00874 72.6   <0.001 Inf 0.618  0.652
   1    0.668    0.01264 52.8   <0.001 Inf 0.643  0.693

Type:  probs 
Columns: RHC, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 

# Risk ratio
avg_comparisons(fit,
                variables = "RHC",
                comparison = "lnratioavg",
                transform = "exp")

 Estimate Pr(>|z|)   S 2.5 % 97.5 %
     1.05   0.0295 5.1  1.01    1.1

Term: RHC
Type:  probs 
Comparison: ln(mean(1) / mean(0))
Columns: term, contrast, estimate, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted 

Here we find evidence of a risk ratio greater than 1, indicating that, on average, receiving RHC increases the risk of death by about 5% in relative terms (risk ratio 1.05, 95% CI 1.01 to 1.10).[2]
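
The point estimates above come from g-computation: predict each unit’s risk under treatment and under control from the weighted outcome model, average the predictions, and take their ratio. The following base-R sketch shows roughly that averaging step on simulated data. It makes several assumptions: plain IPW weights stand in for the energy balancing weights, predictions are averaged over the full sample without weights, and standard errors are omitted, so it is not a substitute for glm_weightit() and marginaleffects.

```r
set.seed(5)
n <- 5000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))
y <- rbinom(n, 1, plogis(-0.5 + 0.7 * a + 0.8 * x))
d <- data.frame(y, a, x)

# Balancing weights (plain IPW here, for illustration only)
ps  <- fitted(glm(a ~ x, family = binomial, data = d))
d$w <- ifelse(d$a == 1, 1 / ps, 1 / (1 - ps))

# Weighted outcome model with a treatment-by-covariate interaction;
# quasibinomial avoids warnings about non-integer weights
fit <- glm(y ~ a * x, family = quasibinomial, data = d, weights = w)

# Predict everyone's risk under treatment and under control, then average
p1 <- mean(predict(fit, newdata = transform(d, a = 1), type = "response"))
p0 <- mean(predict(fit, newdata = transform(d, a = 0), type = "response"))
c(risk_control = p0, risk_treated = p1, risk_ratio = p1 / p0)
```

The ratio p1 / p0 corresponds to exponentiating the log of the ratio of average predictions, which is the form that comparison = "lnratioavg" with transform = "exp" reports.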

Again, it is useful to report balance to demonstrate the performance of the weights. Here, we could say that the largest SMD for the covariates was .004 and the largest KS statistic was .029, and that the SMDs for all two-way interactions and all powers of the covariates up to the fourth were at most .021 in absolute value. The specific values for each covariate would not be required because this summary indicates that all covariates were balanced more than adequately.
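
A summary like this boils down to taking the largest value of each balance statistic across covariates. As an illustration, the sketch below computes a weighted KS statistic by hand and reports its maximum over two simulated covariates; ks_w() is a hypothetical helper, defined as the largest gap between the weighted empirical CDFs of the two groups, and the data and IPW weights are made up for the example.

```r
# Weighted KS statistic: largest gap between the two weighted ECDFs
# (ks_w() is a hypothetical helper, not a cobalt function)
ks_w <- function(x, a, w = rep(1, length(x))) {
  grid <- sort(unique(x))
  ecdf_w <- function(g) {
    sapply(grid, function(v) sum(w[a == g & x <= v])) / sum(w[a == g])
  }
  max(abs(ecdf_w(1) - ecdf_w(0)))
}

set.seed(6)
n <- 1000
X <- data.frame(x1 = rnorm(n), x2 = rexp(n))
a <- rbinom(n, 1, plogis(0.8 * X$x1 + 0.5 * X$x2 - 0.5))
ps <- fitted(glm(a ~ x1 + x2, family = binomial, data = X))
w <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

# One KS statistic per covariate, then the worst case for reporting
ks_unadj <- sapply(X, ks_w, a = a)
ks_adj   <- sapply(X, ks_w, a = a, w = w)
c(max_ks_unadjusted = max(ks_unadj), max_ks_adjusted = max(ks_adj))
```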


  1. Note that matching can target the ATE (though not with 1:1 matching) and weighting can target the ATT and other estimands; matching is not only for the ATT, and weighting is not only for the ATE. Use whichever method yields the best performance and would be best understood by your audience.

  2. Note that, in this case, the conclusions would have been the same regardless of which weighting method we had carried forward.