REGHDFE | Linear Models With Many Levels of Fixed Effects
reghdfe is a Stata package that runs linear and instrumental-variable regressions with many levels of fixed effects, by implementing the estimator of Correia (2015).
This estimator augments the fixed-point iteration of Guimarães & Portugal (2010) and Gaure (2013) by adding three features:
- Replace the von Neumann-Halperin alternating projection transforms with symmetric alternatives. This allows us to use Conjugate Gradient acceleration, which provides much better convergence guarantees.
- Iteratively drop singleton groups and—more generally—reduce the linear system into its 2-core graph.
- Apply the algorithms of Spielman and Teng (2004) and Kelner et al. (2013) to solve the Dual Randomized Kaczmarz representation of the problem, in order to attain a nearly-linear time estimator (Stata code in development).
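The fixed-point iteration that the estimator builds on can be illustrated with a small sketch. The Python code below is not reghdfe's Mata implementation; it is a minimal, unaccelerated version of alternating projections, assuming two fixed-effect dimensions and made-up data. The names `demean_by`, `partial_out`, `worker`, `firm`, and `wage` are hypothetical. It subtracts group means one dimension at a time and repeats until the variable stops changing.

```python
from collections import defaultdict


def demean_by(values, groups):
    """Subtract each observation's group mean (one projection step)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def partial_out(values, factors, tol=1e-10, max_iter=10_000):
    """Alternate group demeaning over every factor until the values settle."""
    current = list(values)
    for _ in range(max_iter):
        previous = current
        for groups in factors:
            current = demean_by(current, groups)
        # Stop once a full sweep over all factors changes nothing noticeable.
        change = max(abs(a - b) for a, b in zip(current, previous))
        if change < tol:
            return current
    raise RuntimeError("alternating projections did not converge")


# Hypothetical unbalanced panel: six observations of workers at firms.
worker = [1, 1, 2, 2, 3, 3]
firm = ["a", "b", "a", "b", "b", "b"]
wage = [10.0, 12.0, 9.0, 13.0, 11.0, 14.0]

# Wages with both worker and firm fixed effects partialled out.
residual = partial_out(wage, [worker, firm])
```

Applying the same transformation to the outcome and to each regressor, then running OLS on the transformed variables, reproduces the slope coefficients of the regression with explicit dummies (the Frisch-Waugh-Lovell result). The symmetric transforms and Conjugate Gradient acceleration listed above are not part of this sketch.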
Within Stata, it can be viewed as a generalization of areg/xtreg, with several additional features:
- Supports two or more levels of fixed effects.
- Supports fixed slopes (different slopes per individual).
- It can estimate not only OLS regressions but two-stage least squares, instrumental-variable regressions, and linear GMM (via the ivreg2 and ivregress commands).
- Two-way and multi-way clustering.
- Advanced options for computing standard errors, thanks to the avar command.
- Careful estimation of degrees of freedom, taking into account nesting of fixed effects within clusters, as well as many possible sources of collinearity within the fixed effects.
- Iterated elimination of singleton groups.
- Even with only one level of fixed effects, it is faster than areg/xtreg.
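Iterated elimination of singleton groups can be sketched in a few lines. The Python code below is an illustration under assumed data, not reghdfe's own code; `drop_singletons`, `worker`, and `firm` are hypothetical names. The point it shows is that one pass is not enough: dropping a singleton in one dimension can create a new singleton in another, so the passes repeat until nothing changes.

```python
from collections import Counter


def drop_singletons(factors):
    """Return indices of observations that survive iterated singleton removal.

    `factors` is a list of equal-length sequences, one per fixed-effect dimension.
    """
    keep = list(range(len(factors[0])))
    while True:
        before = len(keep)
        for groups in factors:
            # Count group sizes among the observations still in the sample.
            counts = Counter(groups[i] for i in keep)
            keep = [i for i in keep if counts[groups[i]] > 1]
        if len(keep) == before:
            return keep


# Hypothetical data: seven observations with worker and firm identifiers.
worker = [1, 1, 2, 2, 3, 4, 4]
firm = ["a", "b", "a", "b", "c", "c", "a"]

kept = drop_singletons([worker, firm])
print(kept)  # [0, 1, 2, 3]
```

Only observation 4 (worker 3) is a singleton at the start. Removing it leaves firm "c" with a single observation, and removing that one leaves worker 4 with a single observation, so three observations are dropped in total.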
In addition, it is easy to use and supports most Stata conventions:
- Time series and factor variable notation, even within the absorbing variables and cluster variables.
- Multicore support through optimized Mata functions.
- Frequency weights, analytic weights, and probability weights are allowed.
- It can cache results in order to run many regressions with the same data, as well as run regressions over several categories.
- It supports most post-estimation commands, such as test, estat summarize, and predict.