Ensuring Model Feasibility

Using microsynth is straightforward when the models used to calculate weights are feasible. As more variables are used for matching, however, and especially when data are scarce or variables are sparse, the risk of an infeasible model increases. This section is a quick guide to troubleshooting model feasibility issues.

Causes of model infeasibility

Model infeasibility becomes increasingly likely when:

  • There are few control observations
  • Many variables are used for matching
  • Matching variables are sparse (e.g., mostly zero)
  • Treatment units have extreme values for matching variables
  • Permutation weights are calculated in addition to main weights
  • Jackknife weights are calculated in addition to main weights

Responses to an infeasible model

As there are multiple causes of model infeasibility, there is an equally broad range of responses.

Specification of matching variables

If a model is found to be infeasible, the problem may trace back to matching variable specification. We recommend the following diagnostic steps:

  • Review the frequency of matching variables (e.g., with hist() or table()) to check for sparseness. Sparse variables are difficult to match on without large sample sizes.
  • Compare the distribution of each matching variable in the treatment units with its distribution in the untreated units.
  • Attempt to reduce the number of matching variables, move variables from exact matches (match.out/match.covar) to best-possible matches (match.out.min/match.covar.min), or aggregate time-variant variables before matching.
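
The first two diagnostic steps can be sketched in base R. The data frame below, including the column names treated and rare_events, is entirely hypothetical and stands in for the user's own data; it is a minimal illustration, not part of the package.

```r
# Hypothetical data: 200 untreated units, 5 treated units, one sparse count variable.
dat <- data.frame(
  treated     = rep(c(0, 1), times = c(200, 5)),
  rare_events = c(rep(c(0, 1), times = c(190, 10)), c(0, 1, 3, 0, 2))
)

# Step 1: frequency table. A variable that is mostly zero is a warning sign.
print(table(dat$rare_events))
share_zero <- mean(dat$rare_events == 0)

# Step 2: compare the treated and untreated distributions side by side.
print(tapply(dat$rare_events, dat$treated, summary))

# Flag treated values that fall outside the range observed among untreated units.
ctrl_range   <- range(dat$rare_events[dat$treated == 0])
treated_vals <- dat$rare_events[dat$treated == 1]
outside      <- treated_vals < ctrl_range[1] | treated_vals > ctrl_range[2]
print(treated_vals[outside])
```

Treated values with no counterpart among the untreated units are natural candidates for the third step: moving the variable to match.out.min or match.covar.min, or aggregating it.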

When attempts to match on a sparse variable cause model infeasibility, there are several solutions:

  • Do not attempt an exact match. If the variable is time-invariant, move it from match.covar to match.covar.min; if the variable is time-variant, move it from match.out to match.out.min.
  • If the variable is time-variant, aggregate it over multiple time periods before matching. If only one or a few variables are sparse, or take values in the treatment units that are rare among the untreated units, the user can specify separate aggregation periods for each of those variables. (The periods do not have to be regular intervals, which helps when sparseness occurs only at certain points in the pre-intervention data.) Exercise 4 provides an example of this. To aggregate all time-variant variables over the same regular time periods, it is simpler to pass match.out or match.out.min a vector of variable names and specify the aggregation period using period.
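
As a rough sketch of per-variable aggregation, match.out can be given a named list in which each element is a vector of block widths. The example assumes the seattledmi data set that ships with the package, its column names, and a 12-period pre-intervention window; the particular block widths are illustrative only. Our understanding is that blocks are counted backward from the last pre-intervention period, but ?microsynth should be consulted to confirm.

```r
# Block widths per outcome; each vector covers the assumed 12 pre-intervention periods.
match.out.agg <- list(
  i_felony   = rep(2, 6),          # six regular blocks of two periods
  i_misdemea = rep(2, 6),
  i_drugs    = c(1, 1, 2, 4, 4),   # irregular blocks for a sparser variable
  any_crime  = rep(1, 12)          # not sparse: match on every period
)

# Sanity check: every variable's blocks add up to the pre-intervention window.
block_totals <- vapply(match.out.agg, sum, numeric(1))
stopifnot(all(block_totals == 12))

# Run only if the package is installed; argument values are assumptions about seattledmi.
if (requireNamespace("microsynth", quietly = TRUE)) {
  data("seattledmi", package = "microsynth")
  fit <- microsynth::microsynth(
    seattledmi,
    idvar = "ID", timevar = "time", intvar = "Intervention",
    start.pre = 1, end.pre = 12, end.post = 16,
    match.out   = match.out.agg,
    match.covar = c("TotalPop", "BLACK", "HISPANIC"),
    result.var  = names(match.out.agg),
    perm = 0, jack = 0               # skip permutation and jackknife weights for speed
  )
}
```

For the regular-interval case, the list would be replaced by a character vector of variable names together with a single period value.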

Parameters for calculating weights

If varying the specification of matching variables is not satisfactory, the user can adjust the parameters that microsynth() uses to calculate weights.

  • max.mse may be raised. This relaxes the constraint governing matches for variables passed to match.out and match.covar.
  • Advanced users may wish to alter maxit, cal.epsilon, calfun, and bounds, which correspond to parameters of survey::calibrate() and govern the calculation of weights.
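
The sketch below shows one way to collect these settings and pass them in a single call. The numeric values are illustrative, not recommendations, and the data set, column names, and time window are assumptions based on the seattledmi example data.

```r
# Illustrative weight-calculation settings (values chosen for demonstration only).
weight_opts <- list(
  max.mse     = 0.05,       # raised to relax the match.out / match.covar constraint
  maxit       = 500,        # more iterations for the calibration routine
  cal.epsilon = 1e-04,      # convergence tolerance
  calfun      = "linear",   # calibration function
  bounds      = c(0, Inf)   # bounds on the weights
)

# Run only if the package is installed.
if (requireNamespace("microsynth", quietly = TRUE)) {
  data("seattledmi", package = "microsynth")
  base_args <- list(
    data = seattledmi,
    idvar = "ID", timevar = "time", intvar = "Intervention",
    start.pre = 1, end.pre = 12, end.post = 16,
    match.out   = c("i_felony", "i_misdemea", "i_drugs", "any_crime"),
    match.covar = c("TotalPop", "BLACK", "HISPANIC"),
    result.var  = c("i_felony", "any_crime"),
    perm = 0, jack = 0
  )
  # Combine the base specification with the weight settings.
  fit <- do.call(microsynth::microsynth, c(base_args, weight_opts))
}
```

Keeping the settings in a separate list makes it easy to rerun the same specification while changing one parameter at a time.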

Calling on (computationally-intensive) back-up models

By default, microsynth() attempts to calculate weights using simple methods. Because these are not always sufficient to produce a feasible model, two arguments, check.feas and use.backup, specify how microsynth() should find and use less restrictive backup models. The two arguments do not interact and can be set independently.

check.feas = TRUE will search for a single model that satisfies the constraints for all purposes: estimating main weights, permutation weights, and jackknife residuals. The same model will then be used for all purposes.

use.backup = TRUE will calculate the main weights without checking for feasibility, but if weights appear to be poor (i.e., they do not satisfy the max.mse condition), then weights will be re-calculated using another model. This way, different backup models may be used for different purposes (i.e., for estimating main weights, permutation weights, and jackknife residuals).
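
A minimal sketch of switching on the backup models follows. The data set, column names, and time window are again assumptions based on the seattledmi example data, and the small perm and jack values are chosen only to keep the run short.

```r
# The two arguments are independent; here only check.feas is switched on.
backup_opts <- list(check.feas = TRUE, use.backup = FALSE)

# Run only if the package is installed.
if (requireNamespace("microsynth", quietly = TRUE)) {
  data("seattledmi", package = "microsynth")
  fit <- microsynth::microsynth(
    seattledmi,
    idvar = "ID", timevar = "time", intvar = "Intervention",
    start.pre = 1, end.pre = 12, end.post = 16,
    match.out   = c("i_felony", "i_misdemea", "i_drugs", "any_crime"),
    match.covar = c("TotalPop", "BLACK", "HISPANIC"),
    result.var  = c("i_felony", "any_crime"),
    perm = 10, jack = 10,                 # small values, for illustration only
    check.feas = backup_opts$check.feas,  # one model for main, permutation, and jackknife weights
    use.backup = backup_opts$use.backup   # set TRUE to allow per-purpose backup models instead
  )
}
```

Because the feasibility search and the extra permutation and jackknife weights add computation, it may be worth trying the cheaper remedies described earlier first.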