REGHDFE | Linear Models With Many Levels of Fixed Effects

reghdfe is a Stata package that runs linear and instrumental-variable regressions with many levels of fixed effects, by implementing the estimator of Correia (2015).

This estimator augments the fixed-point iteration of Guimarães & Portugal (2010) and Gaure (2013) with three features:

  • Replace the von Neumann-Halperin alternating projection transforms with symmetric alternatives. Because the resulting operator is symmetric, Conjugate Gradient acceleration can be applied, which provides much better convergence guarantees.
  • Iteratively drop singleton groups and, more generally, reduce the linear system to its 2-core graph.
  • Apply the algorithms of Spielman and Teng (2004) and Kelner et al. (2013) to solve the Dual Randomized Kaczmarz representation of the problem, which yields a nearly-linear time estimator (Stata code in development).
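
As a rough illustration of the baseline that these features build on, the sketch below implements a plain alternating-projection (fixed-point) iteration for absorbing two sets of fixed effects. It is written in Python only for illustration and is not reghdfe's code: it uses neither the symmetric transforms nor Conjugate Gradient acceleration, and the names demean_by and absorb_two_way, the toy data, and the tolerance are all assumptions made for this example.

```python
from collections import defaultdict


def demean_by(values, groups):
    """Subtract each group's mean from its members (one projection step)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def absorb_two_way(values, fe1, fe2, tol=1e-10, max_iter=10_000):
    """Alternate the two demeaning projections until the update is below tol."""
    current = list(values)
    for _ in range(max_iter):
        updated = demean_by(demean_by(current, fe1), fe2)
        change = max(abs(a - b) for a, b in zip(updated, current))
        current = updated
        if change < tol:
            return current
    raise RuntimeError("alternating projections did not converge")


# Hypothetical unbalanced panel: worker ids, firm ids, and an outcome.
worker = [1, 1, 2, 2, 3, 3, 3]
firm = ["a", "b", "a", "b", "b", "c", "c"]
y = [1.0, 2.5, 0.5, 3.0, 2.0, 4.0, 4.5]

# Outcome with both sets of fixed effects partialled out.
residual = absorb_two_way(y, worker, firm)
```

After convergence, the residual has (numerically) zero mean within every worker and every firm. Regressing residualized outcomes on residualized regressors then reproduces the slope coefficients of the full dummy-variable regression.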

Within Stata, it can be viewed as a generalization of areg/xtreg, with several additional features:

  • Supports two or more levels of fixed effects.
  • Supports fixed slopes (different slopes per individual).
  • Estimates not only OLS regressions but also two-stage least squares, instrumental-variable regressions, and linear GMM (via the ivreg2 and ivregress commands).
  • Two-way and multi-way clustering.
  • Advanced options for computing standard errors, thanks to the avar command.
  • Careful estimation of degrees of freedom, accounting for fixed effects nested within clusters as well as the many possible sources of collinearity among the fixed effects.
  • Iterated elimination of singleton groups.
  • Even with only one level of fixed effects, it is faster than areg/xtreg.
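
To make the iterated elimination of singleton groups concrete, here is a minimal Python sketch of the idea, not reghdfe's implementation. An observation that is alone in any fixed-effect group is dropped, and the pass is repeated because each removal can create new singletons in another dimension. The function name drop_singletons and the toy data are hypothetical.

```python
from collections import Counter


def drop_singletons(rows):
    """Repeatedly drop observations that are alone in any fixed-effect group.

    `rows` holds one tuple per observation, with one group id per
    fixed-effect dimension. Returns the sorted indices that survive.
    """
    keep = set(range(len(rows)))
    n_dims = len(rows[0]) if rows else 0
    changed = True
    while changed:
        changed = False
        for d in range(n_dims):
            # Count group sizes among the observations still kept.
            counts = Counter(rows[i][d] for i in keep)
            singles = {i for i in keep if counts[rows[i][d]] == 1}
            if singles:
                keep -= singles
                changed = True
    return sorted(keep)


# Hypothetical (worker, firm) pairs, one per observation.
obs = [
    (1, "a"),
    (1, "b"),
    (2, "a"),
    (2, "b"),
    (3, "b"),
    (3, "c"),
    (4, "c"),  # worker 4 appears only once
]

kept = drop_singletons(obs)
```

In this toy data, removing worker 4's only observation leaves firm c with a single observation. Removing that one in turn leaves worker 3 with a single observation, so only the first four observations survive. A single, non-iterated pass would miss this cascade.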

In addition, it is easy to use and supports most Stata conventions:

  • Time series and factor variable notation, even within the absorbing variables and cluster variables.
  • Multicore support through optimized Mata functions.
  • Frequency weights, analytic weights, and probability weights are allowed.
  • Can cache results in order to run many regressions on the same data, as well as run regressions over several categories.
  • Supports most post-estimation commands, such as test, estat summarize, and predict.