hdfe -- Partial-out variables with respect to multiple levels of fixed effects
Replace current dataset:
hdfe varlist [weight], absorb(absvars) clear [keepvars(varlist) keepids] [clustervars(varlist) options]
Keep current dataset and add new variables:
hdfe varlist [weight], absorb(absvars) generate(stubname) [sample(newvarname)] [clustervars(varlist) options]
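The two calling forms above can be sketched as follows. This is a minimal, untested illustration that assumes hdfe is installed and uses the auto dataset shipped with Stata; the stub R_ and the variable name insample are arbitrary names chosen for the example.

```stata
sysuse auto, clear

* Second form: keep the dataset and add demeaned copies with the R_ prefix;
* insample (an arbitrary name) marks the observations that were used
hdfe price weight length, absorb(turn trunk) generate(R_) sample(insample)
describe R_* insample

* First form: replace the dataset with the transformed variables,
* keeping the fixed-effect identifiers; preserve/restore protects the original data
preserve
hdfe price weight length, absorb(turn trunk) clear keepids
describe
restore
```

The generate() form is usually the safer choice inside another program, since the caller's data stay in memory; the clear form needs a preserve/restore pair if the original data are still needed.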
Options | Description
HDFE-Specific
clear | will overwrite the dataset, leaving the transformed variables as well as some ancillary ones (such as the fixed effects, weights, cluster variables, etc.). If you use hdfe with factor variables, you may have trouble relating the old names (e.g. i.turn) to the new names. The solution lies in this line: mata: asarray(varlist_cache, "i.turn")
keepvars(varlist) | keep additional variables
keepids | keep the temporary variables for the fixed effects (useful if you set them up like id#year)
generate(stubname) | will not overwrite the variables, instead creating new demeaned variables with the stubname prefix
sample(newvarname) | will save the equivalent of e(sample) in this variable; useful when dropping singletons. Used with the generate option.
clustervars(varlist) | list of variables containing cluster categories. This is used to give a more accurate count of the degrees of freedom lost due to the fixed effects, as reported in e(df_a).
Diagnostic
verbose(#) | amount of debugging information to show (0=None, 1=Some, 2=More, 3=Parsing/convergence details, 4=Every iteration)
timeit | show elapsed times by stage of computation
Optimization
+ tolerance(#) | criterion for convergence (default=1e-8)
maxiterations(#) | maximum number of iterations (default=10,000); if set to missing (.) it will run for as long as it takes
poolsize(#) | apply the within algorithm in groups of # variables (default 10). A large poolsize is usually faster but uses more memory
acceleration(str) | acceleration method; options are conjugate_gradient (cg), steep_descent (sd), aitken (a), and none (no)
transform(str) | transform operation that defines the type of alternating projection; options are Kaczmarz (kac), Cimmino (cim), Symmetric Kaczmarz (sym)
Degrees-of-Freedom Adjustments
dofadjustments(list) | allows selecting the desired adjustments for degrees of freedom; rarely used
groupvar(newvar) | unique identifier for the first mobility group
Reporting
version | reports the version number and date of hdfe, and saves it in e(version). Standalone option
Undocumented
keepsingletons | do not drop singleton groups
* absorb(absvars) is required.
+ indicates a recommended or important option.
All variables may contain time-series operators and factor variables; see tsvarlist and fvvarlist.
fweights, aweights and pweights are allowed; see weight.
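As a rough illustration of how the options in the table combine, the sketch below sets the optimization and diagnostic options explicitly and then uses clustervars() to adjust the reported degrees of freedom. It is untested and assumes hdfe is installed; the stubs T_ and C_, the specific tolerance and iteration values, and the choice of turn as the cluster variable are arbitrary picks for the example. The /// line continuations only work in a do-file.

```stata
sysuse auto, clear

* Tighter convergence criterion, explicit acceleration and transform,
* plus some debugging output and stage timings
hdfe price weight length, absorb(turn trunk) generate(T_) ///
    tolerance(1e-10) maxiterations(1000)                  ///
    acceleration(cg) transform(sym)                       ///
    verbose(1) timeit

* clustervars() does not change the demeaned variables; it only informs
* the degrees-of-freedom count stored in e(df_a)
hdfe price weight length, absorb(turn trunk) generate(C_) clustervars(turn)
display e(df_a)
```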
absvar | Description
i.varname | categorical variable to be absorbed (the i. prefix is tacit)
i.var1#i.var2 | absorb the interactions of multiple categorical variables
i.var1#c.var2 | absorb heterogeneous slopes, where var2 has a different slope coefficient depending on the category of var1
var1##c.var2 | equivalent to "i.var1 i.var1#c.var2", but much faster
var1##c.(var2 var3) | multiple heterogeneous slopes are allowed together. Alternative syntax: var1##(c.var2 c.var3)
v1#v2#v3##c.(v4 v5) | factor operators can be combined
Using categorical interactions (e.g. x#z) is faster than running egen group(...) beforehand.
Singleton observations are dropped iteratively until no more singletons are found (see the ancillary article for details).
Slope-only absvars ("state#c.time") have poor numerical stability and slow convergence. If you need those, either i) increase tolerance or ii) use slope-and-intercept absvars ("state##c.time"), even if the intercept is redundant. For instance, if absvar is "i.zipcode i.state##c.time" then i.state is redundant given i.zipcode, but convergence will still be much faster.
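The absvar forms above could be used as in the following sketch. It is untested, assumes hdfe is installed, and the stubs A_, B_ and C_ are arbitrary; the variables come from the auto dataset shipped with Stata.

```stata
sysuse auto, clear

* Interaction of two categorical variables, absorbed directly
* (no egen group() step beforehand)
hdfe price weight, absorb(turn#foreign) generate(A_)

* Slope-and-intercept absvar: one intercept and one length slope
* per category of foreign
hdfe price weight, absorb(foreign##c.length) generate(B_)

* Several heterogeneous slopes together, combined with a second fixed effect
hdfe price weight, absorb(foreign##c.(length mpg) trunk) generate(C_)
```

The ## form is preferred here over the slope-only foreign#c.length for the stability reasons given in the note above.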
hdfe computes the residuals of a set of variables with respect to multiple levels of fixed effects. It is a generalization of the within transformation done by areg and xtreg,fe for more than one fixed effect, also allowing for multiple heterogeneous slopes.
hdfe is a programmers' routine that serves as a building block to other regression packages so they can support multiple fixed effects (see for instance binscatter, regife and poi2hdfe). It contains the same code underlying reghdfe and exposes most of its functionality and options.
It also computes the degrees-of-freedom absorbed by the fixed effects and stores them in e(df_a).
It works well with other building-block packages such as avar (from SSC).
Suppose you want to replicate reghdfe. Then, you would do:
sysuse auto, clear
* Benchmark
reghdfe price weight length, a(turn trunk)
* Demean variables
hdfe price weight length, a(turn trunk) gen(RESID_)
local df_a = e(df_a)
* Run regression
quietly regress RESID_*, nocons
* Fix degrees-of-freedom
local df_r = e(df_r) - `df_a'
matrix b = e(b)
matrix V = e(V) * e(df_r) / `df_r'
ereturn post b V, dep(price) obs(`c(N)') dof(`df_r')
ereturn display
hdfe stores the following in e():
Scalars
e(df_a) | degrees of freedom lost due to the fixed effects (taking into account the cluster structure and whether the FEs are nested within the clusters)
e(N_hdfe) | number of sets of fixed effects
Macros
e(absvars) | canonical expansion of the fixed effects
e(extended_absvars) | expansion of the fixed effects separating heterogeneous slopes (e.g. y##c.z is expanded to y y#c.z)
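A short sketch of how these stored results might be inspected after a call. It is untested, assumes hdfe is installed, and the stub S_ is an arbitrary name; the exact contents of the macros depend on the absvars supplied.

```stata
sysuse auto, clear
hdfe price length, absorb(turn##c.weight trunk) generate(S_)

* Scalars
display "absorbed degrees of freedom: " e(df_a)
display "sets of fixed effects:       " e(N_hdfe)

* Macros
display "absvars:          `e(absvars)'"
display "extended absvars: `e(extended_absvars)'"

* Or list everything hdfe left behind in e()
ereturn list
```

Because any later estimation command overwrites e(), copy the values you need into locals right after calling hdfe, as the replication example does with e(df_a).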
Sergio Correia
Fuqua School of Business, Duke University
Email: sergio.correia@duke.edu
A copy of this help file, as well as a more in-depth user guide, is in development and will be available at "http://scorreia.com/reghdfe".
hdfe is updated frequently, and upgrades or minor bug fixes may not be immediately available in SSC. To check or contribute to the latest version of hdfe, explore the GitHub repository. Bugs or missing features can be discussed through email or at the GitHub issue tracker.
To see your current version and installed dependencies, type reghdfe, version
This package wouldn't have existed without the invaluable feedback and contributions of Paulo Guimaraes, Amine Ouazad, Mark Schaffer and Kit Baum. Also invaluable are the great bug-spotting abilities of many users.