This help file describes how to use reghdfe from within other programs, either in Stata or Mata. It discusses three types of tools that might be useful for developers:
1. Ancillary commands from ftools that are used by reghdfe, such as ms_get_version.
2. Undocumented options of reghdfe.
3. The Mata class behind reghdfe (see viewsource reghdfe.mata), which can be used to build efficient Mata estimation programs.
These tools are listed in increasing order of integration with reghdfe. Someone writing a command independent of reghdfe might still benefit from #1. Someone writing a command that calls reghdfe a few times, such as ivreghdfe, sumhdfe, or did_imputation, might benefit from #2. And someone writing a command that calls reghdfe many times, such as ppmlhdfe, might also be interested in #3, due to the increase in efficiency and the Mata integration.
ms_get_version
It's possible your command will depend on other user-written commands, in the same way that reghdfe depends on ftools. To ensure compatibility and reproducibility, you can use the ms_get_version command to check that users are not running versions of these programs that are too old. For instance, reghdfe version 6.12.0 requires ftools version 2.46.0 or later.
The syntax of ms_get_version is:
ms_get_version command [, min_version(str) min_date(str)]
Options | Description
min_version(str) | minimum version. Only supports semantic versioning of the form x.y.z. Note that versions are defined in the first line of an ado-file, and can be verified by typing which <command>.
min_date(str) | (less used) minimum date of the program. Supports dates of the form 1jan2018 or 01Jan2018.
ms_get_version also stores the following local variables:
`version_number' | version of the requested program
`version_date' | date of the requested program
`package_version' | concatenation of `version_number' and `version_string'
A sample usage of ms_get_version, currently used by reghdfe, is:
ms_get_version ftools, min_version("2.46.0")
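The same check can be sketched inside a user-written command. In the example below, mycommand is a hypothetical program name used only for illustration, and the sketch assumes that both ftools and reghdfe are installed. It checks the installed reghdfe version first, then displays two of the locals described above.

```stata
* mycommand.ado (hypothetical example; the program name is an assumption)
program define mycommand
    * Check the installed reghdfe version against the minimum we require
    ms_get_version reghdfe, min_version("6.12.0")

    * Locals stored by ms_get_version, as described in the table above
    display as text "reghdfe version: `version_number'"
    display as text "reghdfe date:    `version_date'"
end
```

Placing the check at the top of the program means an outdated dependency is reported before any estimation work starts.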
Sometimes you might not want to run the entire reghdfe command, but instead stop at some point and only compute certain objects. There are several options that allow this.
A) Compute HDFE Mata object but stop before partialling out variables
reghdfe ... , nopartialout [options]
This step will parse all inputs and initialize the HDFE object of the FixedEffects class. Note that although the regression variables (depvar and indepvar) are not processed, if they have missing values the sample will reflect that.
For instance, the sample ado-file below is enough to create a program that reports the number of singletons in a regression, without having to actually compute the regression:
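show_singletons.ado
program show_singletons
    * Parse inputs and build the HDFE object, stopping before partialling out
    qui reghdfe `0' nopartial
    noi ereturn list
    * Copy the singleton count from the Mata object into a Stata local
    mata: st_local("n", strofreal(HDFE.num_singletons))
    di as text "there are `n' singletons"
end
* Load the reghdfe Mata functions and classes
qui include "reghdfe.mata", adopath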
B) Compute HDFE Mata object, partial out the variables, but stop before regressing
reghdfe ... , noregress [options]
This step is the same as A), but will also partial out the variables with respect to the fixed effects and save the resulting information in the HDFE.solution object. For instance, HDFE.solution.data will contain the partialled-out data, and HDFE.solution.depvar will contain the name of the dependent variable.
This option can be used to (amongst other things) partial out all the variables only once, and then run regressions on the same sample and same regressors but with multiple left-hand-side variables (useful with very large datasets).
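A minimal sketch of this step, assuming the auto dataset shipped with Stata and using turn as an arbitrary absorbed variable (both choices are illustrative, not taken from reghdfe's documentation). After the call, the two HDFE.solution fields named above can be inspected from Mata:

```stata
sysuse auto, clear

* Partial out price, weight and length with respect to the turn fixed
* effects, but stop before running the regression
reghdfe price weight length, absorb(turn) noregress

* Inspect the fields described above
mata: HDFE.solution.depvar        // name of the dependent variable
mata: rows(HDFE.solution.data)    // rows of the partialled-out data
mata: cols(HDFE.solution.data)    // columns of the partialled-out data
```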
C) Run regression but keep the HDFE Mata object
reghdfe ... , keepmata [options]
Saving the HDFE object allows further manipulation of the fixed-effects data, although the data corresponding to the partialled-out variables is not preserved.
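As a rough sketch, again assuming the auto dataset and an arbitrary choice of absorbed variable, the object kept by keepmata can be queried after estimation. The properties used here (num_singletons and N) are among those listed in the property tables of this help file.

```stata
sysuse auto, clear

* Run the full regression and keep the HDFE object in Mata afterwards
reghdfe price weight length, absorb(turn) keepmata

* Query the object that was kept
mata: HDFE.num_singletons    // number of singleton observations
mata: HDFE.N                 // number of observations

* Drop the object once it is no longer needed
mata: mata drop HDFE
```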
In order to use reghdfe's Mata functions within your own ado-file, you need to add the following at the end of your file:
include "reghdfe.mata", adopath
This dynamically loads all the reghdfe Mata functions and classes, so they are accessible to the ado-file. This alternative is preferred to sharing precompiled Mata objects, which would require compilation for multiple versions of Stata/Mata (or for the lowest possible version of Stata/Mata).
To construct the object, you can do:
class FixedEffects HDFE // Optional declaration
HDFE = FixedEffects() // Note that you can replace "HDFE" with whatever name you choose
HDFE.absvars = "firm_id year"
...
HDFE.init()
...
For more information, see the code of the Estimate function in reghdfe.ado.
TODO: update this list
properties (factors) | Description
Integer N | number of obs
Integer M | Sum of all possible FE coefs
Factors factors |
Vector sample |
Varlist absvars |
Varlist ivars |
Varlist cvars |
Boolean has_intercept |
RowVector intercepts |
RowVector num_slopes |
Integer num_singletons |
Boolean save_any_fe |
Boolean save_all_fe |
Varlist targets |
RowVector save_fe |
properties (optimization options) | Description
Real tolerance |
Integer maxiter |
String transform | Kaczmarz Cimmino Symmetric_kaczmarz (k c s)
String acceleration | Acceleration method. None/No/Empty is none
Integer accel_start | Iteration where we start to accelerate (set it at 6? 2? 3?)
String slope_method |
Boolean prune | Whether to recursively prune degree-1 edges
Boolean abort | Raise error if convergence failed?
Integer accel_freq | Specific to Aitken's acceleration
Boolean storing_alphas | 1 if we should compute the alphas/fes
Real conlim | specific to LSMR
Real btol | specific to LSMR
properties (optimization objects) | Description
BipartiteGraph bg | Used when pruning 1-core vertices
Vector pruned_weight | temp. weight for the factors that were pruned
Integer prune_g1 | Factor 1/2 in the bipartite subgraph that gets pruned
Integer prune_g2 | Factor 2/2 in the bipartite subgraph that gets pruned
Integer num_pruned | Number of vertices (levels) that were pruned
properties (misc) | Description
Integer verbose |
Boolean timeit |
Boolean store_sample |
Real finite_condition |
Real compute_rre | Relative residual error: || e_k - e || / || e ||
Real rre_depvar_norm |
Vector rre_varname |
Vector rre_true_residual |
properties (weight-specific) | Description
Boolean has_weights |
Variable weight | unsorted weight
String weight_var | Weighting variable
String weight_type | Weight type (pw, fw, etc)
properties (absorbed degrees-of-freedom computations) | Description
Integer G_extended | Number of intercepts plus slopes
Integer df_a_redundant | e(mobility)
Integer df_a_initial |
Integer df_a | df_a_initial - df_a_redundant
Vector doflist_M |
Vector doflist_K |
Vector doflist_M_is_exact |
Vector doflist_M_is_nested |
Vector is_slope |
Integer df_a_nested | Redundant due to being nested; used for: r2_a r2_a_within rmse
properties (VCE and cluster variables) | Description
String vcetype |
Integer num_clusters |
Varlist clustervars |
Varlist base_clustervars |
String vceextra |
properties (regression-specific) | Description
String varlist | y x1 x2 x3 x4 z1 z2 z3
String depvar | y
String indepvars | x1 x2
Boolean drop_singletons |
String absorb | contents of absorb()
String select_if | If condition
String select_in | In condition
String model | ols, iv
String summarize_stats |
Boolean summarize_quietly |
StringRowVector dofadjustments | firstpair pairwise cluster continuous
Varname groupvar |
String residuals |
RowVector kept | 1 if the regressors are not deemed as omitted (by partial_out+cholsolve+invsym)
String diopts |
properties (output) | Description
String cmdline |
String subcmd |
String title |
Boolean converged |
Integer iteration_count | e(ic)
Varlist extended_absvars |
String notes |
Integer df_r |
Integer df_m |
Integer N_clust |
Integer N_clust_list |
Real rss |
Real rmse |
Real F |
Real tss |
Real tss_within |
Real sumweights |
Real r2 |
Real r2_within |
Real r2_a |
Real r2_a_within |
Real ll |
Real ll_0 |
methods | Description
Void update_sorted_weights() |
Matrix partial_out() |
Void _partial_out() | in-place alternative to partial_out()
Variables project_one_fe() |
Void prune_1core() |
Void _expand_1core() |
Void estimate_dof() |
Void estimate_cond() |
Void save_touse() |
Void store_alphas() |
Void save_variable() |
Void post_footnote() |
Void post() |
Void reload(copy=0) |
methods (LSMR-specific) | Description
Real lsmr_norm() |
Vector lsmr_A_mult() |
Vector lsmr_At_mult() |
Several useful Mata functions are included. For instance,
void reghdfe_solve_ols(HDFE, X, ...)
TODO: Update this example