Synthetic Differences in Differences

The estimate_effects() function can be used to conduct DID style analyses where counter-factual observations are constructed using Sparse Synthetic Controls.

import SparseSC

# Fit the model:
fitted_estimates = SparseSC.estimate_effects(outcomes,unit_treatment_periods,covariates=X,fast=True,...)

# Print summary of the model including effect size estimates, 
# p-values, and confidendence intervals:
print(fitted_estimates)

# Extract model attributes:
fitted_estimates.pl_res_post.avg_joint_effect.p_value
fitted_estimates.pl_res_post.avg_joint_effect.CI

# access the fitted Synthetic Controls model:
fitted_model = fitted_estimates.fit

The returned object is of class SparseSCEstResults.

Feature and Target Data

When estimating synthetic controls, units of observation are divided into control and treated units. Data collected on these units may include observations of the outcome of interest, as well as other characteristics of the units (termed "covariates", herein). Outcomes may be observed both before and after an intervention on the treated units.

To maintain independence of the fitted synthetic controls and the post-intervention outcomes of interest of treated units, the post-intervention outcomes from treated units are not used in the fitting process. There are two cuts from the remaining data that may be used to fit synthetic controls, and each has it's advantages and disadvantages.

In the call to estimate_effects(), outcomes should be numeric matrices containing data on the target variables collected prior to (after) the treatment / intervention ( respectively), and the optional parameter covariates may be a matrix of additional features. All matrices should have one row per unit and one column per observation.

In addition, the rows in covariates and outcomes which contain units that were affected by the intervention ("treated units") should be indicated using the treated_units parameter, which may be a vector of booleans or integers indicating the rows which belong to treat units.

Statistical parameters

The confidence level may be specified with the level parameter, and the maximum number of simulations used to produce the placebo distribution may be set with the max_n_pl parameter.

Additional parameters

Additional keyword arguments are passed on to the call to fit(), which is responsible for fitting the Synthetic Controls used to create the counterfactuals.