REGHDFE | Frequently Asked Questions

What does “fixed effect nested within cluster” mean?
Can I use reghdfe with multi-way clustering but without fixed effects?
Can I absorb the fixed effects formed by the combination of two variables?
I ran out of memory. What can I do?
I want to repeat a regression with different outcome variables. Can I compute the transformations only once?
Why are there four R2s? Which one should I use?
How can I combine reghdfe with esttab or estout?
In my model the number of FEs varies by observation. What can I do?
I want to report a bug or contribute!

What does “fixed effect nested within cluster” mean?

A fixed effect is “nested within cluster” if the same variable is used in vce(cluster ...) and in absorb(...) or, more generally, if a cluster variable is coarser than a fixed effect variable. Examples include state–clustered standard errors with county–level fixed effects, or clustering by year with monthly fixed effects.

Whenever this happens, reghdfe will avoid applying a double penalty to the standard errors: it will continue to cluster by e.g. state but will not use the number of counties when computing the absorbed degrees of freedom.

If this option is triggered, you will see the message “fixed effect nested within cluster; treated as redundant for DoF computation” in the second output table of reghdfe:

. sysuse auto
(1978 Automobile Data)

. reghdfe price weight length, absorb(turn trunk) vce(cluster turn)

  (output omitted)

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
-------------+-------------------------------------------------|
        turn |            0              13             13 *   | 
       trunk |           12              13              1     | 
---------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation

To disable the adjustment, see the dof(...) section of the help file.
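
One way to see the adjustment at work is to compare the absorbed degrees of freedom with and without clustering. The sketch below assumes that reghdfe stores the absorbed degrees of freedom in e(df_a); if your version uses a different name, ereturn list will show what is available.

```stata
* Sketch: compare absorbed degrees of freedom with and without nesting
sysuse auto, clear

* No clustering: both sets of fixed effects count toward the absorbed DoF
reghdfe price weight length, absorb(turn trunk)
display "absorbed DoF, no clustering:      " e(df_a)

* Clustering by turn: the turn fixed effects are nested within the cluster
* (e(df_a) is an assumed result name; check -ereturn list- to confirm)
reghdfe price weight length, absorb(turn trunk) vce(cluster turn)
display "absorbed DoF, clustering by turn: " e(df_a)
```

If the adjustment is active, the second number should be smaller than the first, because the turn categories are no longer counted.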

Details

When reghdfe computes the VCE matrix, it multiplies the asymptotic VCE matrix by a small sample adjustment (see the formula for “q” here). This adjustment depends on the absorbed degrees of freedom of the model (the degrees of freedom lost by controlling for the fixed effects).

In general, we reduce the degrees of freedom by the number of fixed effects that are estimated by reghdfe. However, as mentioned above, if a fixed effect variable is also a cluster variable (or is nested within a cluster variable), we do not consider it when computing the absorbed degrees of freedom.

This is done to avoid a double penalty for the fixed effects, as we have already penalized for them by using cluster–robust standard errors at the same level. Additionally, this approach is in line with packages such as xtreg and xtivreg2, although not with areg, which uses a different set of assumptions.
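
As a rough sketch of why this matters, assume that q has the same form as the finite-sample factor that Stata documents for one-way cluster-robust variances in regress. The linked formula is the authoritative one; the symbols below are notation introduced here for illustration only.

```latex
q = \frac{N-1}{N-K}\cdot\frac{M}{M-1},
\qquad
K = K_{\text{regressors}} + K_{\text{absorbed}}
```

Here N is the number of observations, M the number of clusters, and K_absorbed the absorbed degrees of freedom. Under this form, leaving the nested fixed effects out of K_absorbed makes K smaller, so q (and hence the reported standard errors) is smaller than it would be if every category were counted.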

For further discussions, see:

Statalist post on “super–observations” and the penalties of cluster–robust standard errors
“… if the FE are nested, then the clustering adjustment already accounts for the degrees of freedom problem.” Lecture notes by Sam Hanson and Adi Sunderam
Todd Gormley’s lecture notes

Can I use `reghdfe` with multi-way clustering but without fixed effects?

Yes. Just run reghdfe without the absorb() option:

. sysuse auto
(1978 Automobile Data)

. reghdfe price weight length, vce(cluster turn trunk)
(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =         74
Absorbing 1 HDFE group                            F(   2,     17) =      34.60
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.3476
                                                  Adj R-squared   =     0.3292
Number of clusters (turn)    =         18         Within R-sq.    =     0.3476
Number of clusters (trunk)   =         18         Root MSE        =  2415.7351

                            (Std. Err. adjusted for 18 clusters in turn trunk)
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   4.699065   2.650909     1.77   0.094    -.8938651    10.29199
      length |  -97.96031    91.6812    -1.07   0.300    -291.3907     95.4701
------------------------------------------------------------------------------

Can I absorb the fixed effects formed by the combination of two variables?

Often you want to absorb fixed effects at the exporter–importer level (in trade) or industry–year level (in finance). The usual (longer) approach is to run:

. egen cou_year = group(country year)
. reghdfe y x1 x2 , absorb(cou_year)

However, if you are lazy like me, just run:

. reghdfe y x1 x2 , absorb(country#year)

The same applies to cluster variables:

. reghdfe y x1 x2 , absorb(country#year) vce(cluster country#year)
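
To check that the two approaches agree, here is a small sketch on the auto dataset. The pairing of turn and foreign is only an illustrative choice, and turn_foreign is a name made up for this example.

```stata
* Sketch: the explicit group variable and the # interaction absorb the same cells
sysuse auto, clear

* Longer approach: build the combined identifier by hand
egen turn_foreign = group(turn foreign)
reghdfe price weight length, absorb(turn_foreign)

* Shorter approach: let reghdfe form the interaction
reghdfe price weight length, absorb(turn#foreign)
```

Both regressions should report the same coefficients and the same number of absorbed categories.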

I ran out of memory. What can I do?

With Stata 13 or older, Stata will stop when out of memory. With Stata 14 or newer, it will not stop, but it will become very slow as it will use the hard drive instead of the computer memory. There are two ways to fix this:

First, try the standard tricks of the trade: compress long integers into short ones, recast double reals into floats (with care), and drop unused observations and variables.
If all else fails, use the pool(#) option of reghdfe with a smaller value (default is 10). By default, reghdfe applies the generalized within transformation to 10 variables at a time. By choosing a smaller number of variables, it will create smaller temporary matrices, which might just be enough to avoid out–of–memory errors.
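
The two steps above might look like the sketch below. The variable names (y, x1, x2, country, year) are placeholders, and pool(5) is just one example of a value below the default.

```stata
* Sketch: reduce memory use before and during estimation
* (y, x1, x2, country, year are hypothetical variable names)

* Keep only what the regression needs
keep y x1 x2 country year
drop if missing(y, x1, x2)

* Store each variable in the smallest type that holds it without loss
compress

* Doubles to floats loses precision: only do this if that is acceptable
recast float x1 x2, force

* Transform 5 variables at a time instead of the default 10
reghdfe y x1 x2, absorb(country year) pool(5)
```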

I want to repeat a regression with different outcome variables. Can I compute the transformations only once?

Yes. Check out the cache(...) option. For instance:

. sysuse auto
(1978 Automobile Data)

. reghdfe price weight length, absorb(turn trunk) cache(save)
(dropped 9 singleton observations)
(converged in 12 iterations)

. reghdfe price length, absorb(turn trunk) cache(use)

HDFE Linear regression                            Number of obs   =         65
Absorbing 2 HDFE groups                           F(   1,     39) =       4.54
                                                  Prob > F        =     0.0394
                                                  R-squared       =     0.4598
                                                  Adj R-squared   =     0.1136
                                                  Within R-sq.    =     0.1043
                                                  Root MSE        =  2727.2009

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      length |   114.9727   53.95093     2.13   0.039     5.846653    224.0988
-------------+----------------------------------------------------------------
    Absorbed |         F(24, 39) =          .       .             (Joint test)
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
-------------+-------------------------------------------------|
        turn |           13              13              0     | 
       trunk |           12              13              1     | 
---------------------------------------------------------------+

. reghdfe price weight, absorb(turn trunk) cache(use)

HDFE Linear regression                            Number of obs   =         65
Absorbing 2 HDFE groups                           F(   1,     39) =      25.39
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.6347
                                                  Adj R-squared   =     0.4005
                                                  Within R-sq.    =     0.3943
                                                  Root MSE        =  2242.7172

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   5.055336   1.003362     5.04   0.000     3.025846    7.084826
-------------+----------------------------------------------------------------
    Absorbed |         F(24, 39) =          .       .             (Joint test)
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
-------------+-------------------------------------------------|
        turn |           13              13              0     | 
       trunk |           12              13              1     | 
---------------------------------------------------------------+

. reghdfe, cache(clear)
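
For several outcome variables, the same pattern can be wrapped in a loop. This is only a sketch that follows the example above: it assumes a reghdfe version that supports cache(), and it lists every outcome (here price and mpg, chosen for illustration) in the cache(save) step so that all of them are transformed once.

```stata
* Sketch: transform all variables once, then reuse them for each outcome
sysuse auto, clear

* Save step: include every variable that any later regression will use
reghdfe price mpg weight length, absorb(turn trunk) cache(save)

* Use step: one regression per outcome, reusing the cached transformations
foreach y in price mpg {
    reghdfe `y' weight length, absorb(turn trunk) cache(use)
}

* Release the cached data when done
reghdfe, cache(clear)
```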

Why are there four R2s? Which one should I use?

The problem with the standard R2 under many fixed effects is that its value will mostly be driven by the fixed effects and not by the regressors of interest. As a solution, the within-R2 computes the R2 of the regression where every variable has already been demeaned with respect to all the fixed effects.

Therefore, on top of the usual R2 (e(r2)) and the adjusted R2 (e(r2_a)), we have the within-R2 (e(r2_within)) and the adjusted within-R2 (e(r2_a_within)).

If what you want is an acid R2 that relates only to the variables of interest and is not driven by the fixed effects, I would suggest the adjusted within-R2.
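
A quick way to see all four side by side is to display the stored results named above after a regression. The model below is only an illustration.

```stata
* Sketch: print the four R2 measures stored by reghdfe
sysuse auto, clear
reghdfe price weight length, absorb(turn trunk)

display "R2                 = " %6.4f e(r2)
display "Adjusted R2        = " %6.4f e(r2_a)
display "Within R2          = " %6.4f e(r2_within)
display "Adjusted within R2 = " %6.4f e(r2_a_within)
```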

Note: see this Statalist thread for more information.

How can I combine `reghdfe` with `esttab` or `estout`?

Reghdfe comes with a “secret” command (estfe) that allows for neat output tables:

* Setup
sysuse auto

* Run and store regressions
reghdfe price weight length, a(turn) keepsing
estimates store model1
reghdfe price weight length, a(turn trunk) keepsing
estimates store model2
reghdfe price weight length, a(turn foreign) keepsing
estimates store model2

* Prepare estimates for -estout-
estfe . model*, labels(turn "Turn FE" turn#trunk "Turn-Trunk FE")
return list

* Run estout/esttab
esttab . model* , indicate("Length Controls=length" `r(indicate_fe)')

* Return stored estimates to their previous state
estfe . model*, restore

And the output is:

------------------------------------------------------------
                      (1)             (2)             (3)   
                    price           price           price   
------------------------------------------------------------
weight              5.741***        5.703***        5.741***
                   (5.44)          (5.17)          (5.44)   

Length Con~s          Yes             Yes             Yes   

Turn FE               Yes             Yes             Yes   

foreign               Yes              No             Yes   
------------------------------------------------------------
N                      74              74              74   
------------------------------------------------------------
t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001

In my model the number of FEs varies by observation. What can I do?

Suppose you have a firm–level panel and want to control for board member fixed effects. This has two problems:

The number of board members varies between firms and even within a firm (through time). Thus, you cannot write absorb(board1 board2 ...).
Each individual might be on several boards at the same time.

Older versions of reghdfe could not solve this problem directly. An imperfect alternative is to reshape the dataset (so instead of one observation per firm-year, you have one per firm-year-board member), and then run reghdfe while clustering by the original observations.

A better solution would be to use the alternating projection method (that underlies reghdfe) to deal with this problem directly. When this answer was first written, that had not been implemented. If you have suggestions or comments about this setup, please contact me.

Update: this is now solved in reghdfe (as of 2021) with the group() and individual() options. See also the paper describing how it is done and why it’s useful.
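
For reference, the reshape workaround mentioned above can be sketched as follows. Everything about the data layout is an assumption made for illustration: a wide firm-year dataset with outcome y, regressor x, and board member identifiers stored in board1, board2, board3 (missing when a seat is empty). The names obs_id and seat are also made up here.

```stata
* Sketch of the reshape workaround (all variable names are hypothetical)

* Identify the original firm-year observations
egen obs_id = group(firm year)

* One row per firm-year-board member
reshape long board, i(obs_id) j(seat)
drop if missing(board)

* Absorb board member effects; cluster by the original observation
reghdfe y x, absorb(board) vce(cluster obs_id)
```

Because each firm-year now appears once per board member, clustering by obs_id is what keeps the duplicated rows from being treated as independent observations.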

I want to report a bug or contribute!

Contributors and pull requests are more than welcome. There are a number of extension possibilities, such as estimating standard errors for the fixed effects using bootstrapping, exact computation of degrees-of-freedom for more than two HDFEs, and further improvements in the underlying algorithm. Please see the GitHub repository for more details.

If you think you have found a bug, please i) test if it is still present in the latest development version, and then ii) report it on GitHub. Thanks!
