This help file describes how to use reghdfe from within other programs, either in Stata or Mata. It discusses three types of tools that might be useful for developers:
1. Ancillary commands from ftools that are used by reghdfe, such as ms_get_version.
2. Undocumented options of reghdfe.
3. The Mata class behind reghdfe (type . viewsource reghdfe.mata to read its source), which can be used to build efficient Mata estimation programs.
These tools are listed in increasing order of integration with reghdfe. Someone writing a command independent of reghdfe might still benefit from #1. Someone writing a command that calls reghdfe a few times, such as ivreghdfe, sumhdfe, or did_imputation, might benefit from #2. And someone writing a command that calls reghdfe multiple times, such as ppmlhdfe, might also be interested in #3, due to the increase in efficiency and Mata integration.
It's possible your command will depend on other user-written commands, in the same way as reghdfe depends on ftools. To ensure compatibility and reproducibility, you can use the ms_get_version command to check that users are not running versions of these programs that are too old. For instance, reghdfe version 6.12.0 requires ftools of at least version 2.46.0.
The syntax of ms_get_version is:
    ms_get_version command [, min_version(str) min_date(str)]
Options             Description
min_version(str)    minimum version. Only supports semantic versioning of the form x.y.z. Note that versions are defined in the first line of an ado-file, and can be verified by typing which <command>.
min_date(str)       (less used) minimum date of the program. Supports dates of the form 1jan2018 or 01Jan2018.
ms_get_version also stores the following local variables:
`version_number'     version of the requested program
`version_date'       date of the requested program
`package_version'    concatenation of `version_number' and `version_string'
A sample usage of ms_get_version, currently used by reghdfe, is:
ms_get_version ftools, min_version("2.46.0")
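As a brief illustration of the stored locals, a caller could report the detected version right after the check. This is only a sketch: it assumes ftools is installed and that the locals listed above are available to the caller as soon as the command returns.

```stata
* Check that ftools is recent enough, then report what was found
ms_get_version ftools, min_version("2.46.0")
display as text "ftools version: `version_number'"
display as text "ftools date:    `version_date'"
```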
Sometimes you might not want to run the entire reghdfe command, but stop at some point and only compute certain objects. There are several options that allow this.
A) Compute HDFE Mata object but stop before partialling out variables
    reghdfe ... , nopartialout [options]
This step will parse all inputs and initialize the HDFE object of the FixedEffects class. Note that although the regression variables (depvar and indepvars) are not processed, if they have missing values the sample will reflect that.
For instance, the sample ado-file below is enough to create a program that reports the number of singletons in a regression, without actually having to run the regression:
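show_singletons.ado
    program show_singletons
        // Parse inputs and build the HDFE object, but stop before partialling out
        qui reghdfe `0' nopartial
        noi ereturn list
        // Copy the singleton count from the Mata object into a Stata local
        mata: st_local("n", strofreal(HDFE.num_singletons))
        di as text "there are `n' singletons"
    end
    // Load the reghdfe Mata functions and classes
    qui include "reghdfe.mata", adopath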
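A possible call is sketched here, assuming show_singletons.ado is saved somewhere on the adopath and reghdfe is installed. The dataset and variables are illustrative choices, not part of the original example.

```stata
* Illustrative call; the auto dataset ships with Stata
sysuse auto, clear
show_singletons price weight length, absorb(turn)
```

Because the program appends its option directly after `0', the call must already contain a comma (here supplied by absorb()); otherwise the appended word would be parsed as part of the varlist.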
B) Compute HDFE Mata object, partial out the variables, but stop before regressing
    reghdfe ... , noregress [options]
This step is the same as A), but will also partial out the variables with respect to the fixed effects and save the resulting information in the HDFE.solution object. For instance, HDFE.solution.data will contain the partialled-out data, and HDFE.solution.depvar will contain the name of the dependent variable.
This option can be used to (amongst other things) partial out all the variables only once, and then run regressions on the same sample and same regressors but with multiple left-hand-side variables (useful with very large datasets).
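A minimal sketch of inspecting the partialled-out results follows. It assumes the object remains in Mata memory under the name HDFE after the command returns (as in the show_singletons example), and uses an illustrative dataset and variables.

```stata
* Partial out the variables but skip the regression itself
sysuse auto, clear
reghdfe price weight length, absorb(turn) noregress

* Inspect what was stored in the solution object
mata: HDFE.solution.depvar
mata: rows(HDFE.solution.data), cols(HDFE.solution.data)
```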
C) Run regression but keep the HDFE Mata object
    reghdfe ... , keepmata [options]
Saving the HDFE object allows further manipulation of the fixed effects data, although the data corresponding to the partialled-out variables is not preserved.
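For example, after a full regression the retained object could be queried and then dropped. This is a sketch under the assumption that the object is kept under the name HDFE; the dataset and variables are illustrative.

```stata
* Run the full regression and keep the Mata object afterwards
sysuse auto, clear
reghdfe price weight length, absorb(turn) keepmata

* Query a property of the retained object, then free the memory
mata: HDFE.num_singletons
mata: mata drop HDFE
```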
In order to use reghdfe's Mata functions within your own ado-file, you need to add the following at the end of your file:
include "reghdfe.mata", adopath
This dynamically loads all the reghdfe Mata functions and classes, so they are accessible to the ado-file. This alternative is preferred to sharing precompiled Mata objects, which would require compilation for multiple versions of Stata/Mata (or for the lowest possible version of Stata/Mata).
To construct the object, you can do:
    class FixedEffects HDFE // Optional declaration
    HDFE = FixedEffects() // Note that you can replace "HDFE" with whatever name you choose
    HDFE.absvars = "firm_id year"
    ...
    HDFE.init()
    ...
For more information, see the code of the Estimate function of reghdfe.ado.
TODO: update this list
properties (factors)               Description
Integer N                          number of obs
Integer M                          Sum of all possible FE coefs
Factors factors
Vector sample
Varlist absvars
Varlist ivars
Varlist cvars
Boolean has_intercept
RowVector intercepts
RowVector num_slopes
Integer num_singletons
Boolean save_any_fe
Boolean save_all_fe
Varlist targets
RowVector save_fe

properties (optimization options)  Description
Real tolerance
Integer maxiter
String transform                   Kaczmarz Cimmino Symmetric_kaczmarz (k c s)
String acceleration                Acceleration method. None/No/Empty is none
Integer accel_start                Iteration where we start to accelerate (set it at 6? 2? 3?)
String slope_method
Boolean prune                      Whether to recursively prune degree-1 edges
Boolean abort                      Raise error if convergence failed?
Integer accel_freq                 Specific to Aitken's acceleration
Boolean storing_alphas             1 if we should compute the alphas/fes
Real conlim                        specific to LSMR
Real btol                          specific to LSMR

properties (optimization objects)  Description
BipartiteGraph bg                  Used when pruning 1-core vertices
Vector pruned_weight               temp. weight for the factors that were pruned
Integer prune_g1                   Factor 1/2 in the bipartite subgraph that gets pruned
Integer prune_g2                   Factor 2/2 in the bipartite subgraph that gets pruned
Integer num_pruned                 Number of vertices (levels) that were pruned

properties (misc)                  Description
Integer verbose
Boolean timeit
Boolean store_sample
Real finite_condition
Real compute_rre                   Relative residual error: ||e_k - e|| / ||e||
Real rre_depvar_norm
Vector rre_varname
Vector rre_true_residual

properties (weight-specific)       Description
Boolean has_weights
Variable weight                    unsorted weight
String weight_var                  Weighting variable
String weight_type                 Weight type (pw, fw, etc)

properties (absorbed degrees-of-freedom computations)  Description
Integer G_extended                 Number of intercepts plus slopes
Integer df_a_redundant             e(mobility)
Integer df_a_initial
Integer df_a                       df_a_initial - df_a_redundant
Vector doflist_M
Vector doflist_K
Vector doflist_M_is_exact
Vector doflist_M_is_nested
Vector is_slope
Integer df_a_nested                Redundant due to being nested; used for: r2_a r2_a_within rmse

properties (VCE and cluster variables)  Description
String vcetype
Integer num_clusters
Varlist clustervars
Varlist base_clustervars
String vceextra

properties (regression-specific)   Description
String varlist                     y x1 x2 x3 x4 z1 z2 z3
String depvar                      y
String indepvars                   x1 x2
Boolean drop_singletons
String absorb                      contents of absorb()
String select_if                   If condition
String select_in                   In condition
String model                       ols, iv
String summarize_stats
Boolean summarize_quietly
StringRowVector dofadjustments     firstpair pairwise cluster continuous
Varname groupvar
String residuals
RowVector kept                     1 if the regressors are not deemed as omitted (by partial_out+cholsolve+invsym)
String diopts

properties (output)                Description
String cmdline
String subcmd
String title
Boolean converged
Integer iteration_count            e(ic)
Varlist extended_absvars
String notes
Integer df_r
Integer df_m
Integer N_clust
Integer N_clust_list
Real rss
Real rmse
Real F
Real tss
Real tss_within
Real sumweights
Real r2
Real r2_within
Real r2_a
Real r2_a_within
Real ll
Real ll_0

methods                            Description
Void update_sorted_weights()
Matrix partial_out()
Void _partial_out()                in-place alternative to partial_out()
Variables project_one_fe()
Void prune_1core()
Void _expand_1core()
Void estimate_dof()
Void estimate_cond()
Void save_touse()
Void store_alphas()
Void save_variable()
Void post_footnote()
Void post()
Void reload(copy=0)

methods (LSMR-specific)            Description
Real lsmr_norm()
Vector lsmr_A_mult()
Vector lsmr_At_mult()

Several useful Mata functions are included. For instance,
    void reghdfe_solve_ols(HDFE, X, ...)
TODO: Update this example