[BAYES] bayesmh — | Bayesian regression using Metropolis-Hastings algorithm |
Univariate linear models
bayesmh
depvar [indepvars] [if] [in] [weight],
likelihood
(
modelspec)
prior
(
priorspec)
[reffects(varname)
options]
Multivariate linear models
Multivariate normal linear regression with common regressors
bayesmh
depvars =
[indepvars] [if] [in] [weight],
likelihood
(mvnormal(
...))
prior
(
priorspec)
[options]
Multivariate normal regression with outcome-specific regressors
bayesmh
(
[eqname1:
]depvar1 [indepvars1])
(
[eqname2:
]depvar2 [indepvars2])
[...] [if] [in] [weight],
likelihood
(mvnormal(
...))
prior
(
priorspec)
[options]
Multiple-equation linear models
bayesmh
(
eqspec)
[(
eqspec)
] [...] [if] [in] [weight],
Nonlinear models
Univariate nonlinear regression
bayesmh
depvar =
(
subexpr)
[if] [in] [weight],
likelihood
(
modelspec)
prior
(
priorspec)
[options]
Multivariate normal nonlinear regression
bayesmh
(
depvar1 =
(
subexpr1))
(
depvar2 =
(
subexpr2))
[...] [if] [in] [weight],
likelihood
(mvnormal(
...))
prior
(
priorspec)
[options]
The syntax of eqspec is
varspec [if] [in] [weight],
likelihood
(
modelspec)
[noconstant
]
The syntax of varspec is one of the following:
for single outcome for multiple outcomes with common regressors for multiple outcomes with outcome-specific regressors(
[eqname1:
]depvar1 [indepvars1])
(
[eqname2:
]depvar2 [indepvars2])
[...]
subexpr, subexpr1, subexpr2, and so on are substitutable expressions; see Substitutable expressions for details.
The syntax of modelspec is
model | Description | |
Continuous | ||
normal(var) |
normal regression with variance var | |
lognormal(var) |
lognormal regression with variance var | |
lnormal(var) |
synonym for lognormal()
|
|
exponential |
exponential regression | |
mvnormal(Sigma) |
multivariate normal regression with covariance matrix Sigma | |
Discrete | ||
probit |
probit regression | |
logit |
logistic regression | |
logistic |
logistic regression; synonym for logit
|
|
binlogit(n) |
binomial regression with logit link | |
oprobit |
ordered probit regression | |
ologit |
ordered logistic regression | |
poisson |
Poisson regression | |
Generic | ||
llf(subexpr) |
substitutable expression for observation-level log-likelihood function | |
exposure(varname_e) |
include ln(varname_e) in model with coefficient constrained to 1; allowed only with poisson
|
|
noglmtransform |
do not apply GLM-type transformation to the linear predictor or nonlinear specification; allowed only with exponential , poisson , or binlogit()
|
|
A distribution argument is a number for scalar arguments such as var; a variable name, varname (except for matrix arguments); a matrix for matrix arguments such as Sigma; a model parameter paramspec; an expression expr; or a substitutable expression, subexpr. See BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions. {synoptset 22}{nobreak None} {marker modelopts}{nobreak None} {synopthdr None:modelopts} {synoptline None} {synopt None}offset(varname_o) include varname_o in model with coefficient constrained to 1; not allowed with normal() and mvnormal()
|
The syntax of priorspec is
where the simplest specification of paramref is
paramspec [paramspec] [...]]
Also see Referring to model parameters for other specifications.
The syntax of paramspec is
{
[eqname:
]param[, matrix
]}
where the parameter label eqname and parameter name param are valid Stata names. Model parameters are either scalars such as {
var
}
, {
mean
}
, {
scale:beta
}
, or matrices such as {
Sigma, matrix
}
and {
Scale:V, matrix
}
. For scalar parameters, you can use param=
# to specify an initial value. For example, you can specify, {
var=1
}
, {mean=1.267}
, or {shape:alpha=3}
.
priordist | Description | |
Univariate continuous | ||
normal(mu,var) |
normal with mean mu and variance var | |
lognormal(mu,var) |
lognormal with mean mu and variance var | |
lnormal(mu,var) |
synonym for lognormal()
|
|
uniform(a,b) |
uniform on (a,b) | |
gamma(alpha,beta) |
gamma with shape alpha and scale beta | |
igamma(alpha,beta) |
inverse gamma with shape alpha and scale beta | |
exponential(beta) |
exponential with scale beta | |
beta(a,b) |
beta with shape parameters a and b | |
chi2(df) |
central chi-squared with degrees of freedom df | |
jeffreys |
Jeffreys prior for variance of a normal distribution | |
Multivariate continuous | ||
mvnormal(d,mean,Sigma) |
multivariate normal of dimension d with mean vector mean and covariance matrix Sigma; mean can be a matrix name or a list of d means separated by comma: mu1, mu2, ..., mud
|
|
mvnormal0(d,Sigma) |
multivariate normal of dimension d with zero mean vector and covariance matrix Sigma | |
mvn0(d,Sigma) |
synonym for mvnormal0()
|
|
zellnersg( d, g, mean,{ var})
|
Zellner's g-prior of dimension d with g degrees of freedom, mean vector mean, and variance parameter { var} ; mean can be a matrix name or a list of d means separated by comma: mu1, mu2, ..., mud
|
|
zellnersg0( d, g,{ var})
|
Zellner's g-prior of dimension d with g degrees of freedom, zero mean vector, and variance parameter { var}
|
|
wishart(d,df,V) |
Wishart of dimension d with degrees of freedom df and scale matrix V | |
iwishart(d,df,V) |
inverse Wishart of dimension d with degrees of freedom df and scale matrix V | |
jeffreys(d) |
Jeffreys prior for covariance of a multivariate normal distribution of dimension d | |
Discrete | ||
bernoulli(p) |
Bernoulli with success probability p | |
index(p1,...,pk) |
discrete indices 1, 2, ..., k with probabilities p1, p2, ..., pk | |
poisson(mu) |
Poisson with mean mu |
Generic | ||
flat |
flat prior; equivalent to density(1) or logdensity(0)
|
|
density ( f)
|
generic density f | |
logdensity ( logf)
|
generic logdensity logf | |
Dimension d is a positive #. | ||
A distribution argument is a number for scalar arguments such as var, alpha, beta; a Stata matrix for matrix arguments such as Sigma and V; a model parameter, paramspec; an expression, expr; or a substitutable expression subexpr. | ||
{marker generic_f} f is a nonnegative number, #; an expression expr; or a substitutable expression, subexpr. | ||
{marker generic_logf} logf is a number, #; an expression, expr; or a substitutable expression, subexpr. | ||
When mvnormal() or mvnormal0() of dimension d is applied to paramref with n parameters (n!=d), paramref is reshaped into a matrix with d columns, and its rows are treated as independent samples from the specified mvnormal() distribution. If such reshaping is not possible, an error is issued. See example 25 for application of this feature. |
Options | Description | |
Model | ||
noconstant |
suppress constant term; not allowed with ordered and nonlinear models | |
* | likelihood ( modelspec)
|
distribution for the likelihood model |
* | prior ( priorspec)
|
prior for model parameters; this option may be repeated |
dryrun |
show model summary without estimation | |
Model 2 | ||
redefine ( label:i. varname)
|
specify a random-effects linear form | |
xbdefine ( label: varlist)
|
specify a linear form | |
block ( paramref[, blockopts])
|
specify a block of model parameters; this option may be repeated | |
initial ( initspec)
|
initial values for model parameters | |
nomleinitial |
suppress the use of maximum likelihood estimates as starting values | |
Simulation | ||
mcmcsize(#) |
MCMC sample size; default is mcmcsize(10000)
|
|
burnin(#) |
burn-in period; default is burnin(2500)
|
|
thinning(#) |
thinning interval; default is thinning(1)
|
|
rseed(#) |
random-number seed | |
exclude ( paramref)
|
specify model parameters to be excluded from the simulation results | |
Adaption | ||
adaptation ( adaptopts)
|
control the adaptive MCMC procedure | |
scale(#) |
initial multiplier for scale factor; default is scale(2.38)
|
|
covariance(cov) |
initial proposal covariance; default is the identity matrix | |
Reporting | ||
clevel(#) |
set credible interval level; default is clevel(95)
|
|
hpd |
display HPD credible intervals instead of the default equal-tailed credible intervals | |
batch(#) |
specify length of block for batch-mean calculations; default is batch(0)
|
|
nomodelsummary |
suppress model summary | |
noexpression |
suppress output of expressions from model summary | |
blocksummary |
display block summary | |
dot |
display dots every 100 iterations and iteration numbers every 1,000 iterations | |
dots( #[, every(#) ])
|
display dots as simulation is performed | |
noshow(paramref) |
specify model parameters to be excluded from the output | |
showreffects(paramref) |
specify random-effects parameters to be included in the output | |
notable |
suppress estimation table | |
noheader |
suppress output header | |
title(string) |
display string as title above the table of parameter estimates | |
saving( filename[, replace])
|
save simulation results to filename.dta
|
|
display_options | control spacing, line width, and base and empty cells | |
Advanced | ||
search ( search_options)
|
control the search for feasible initial values | |
corrlag(#) |
specify maximum autocorrelation lag; default varies | |
corrtol(#) |
specify autocorrelation tolerance; default is corrtol(0.01)
|
|
* Options likelihood() and prior() are required. prior() must be specified for all model parameters. | ||
Options prior() and block() can be repeated. | ||
indepvars and paramref may contain factor variables; see fvvarlist. | ||
With multiple-equations specifications, a local if specified within an equation is applied together with the global if specified with the command. | ||
Only fweight s are allowed; see weight. | ||
With multiple-equations specifications, local weights or (weights specified within an equation) override global weights (weights specified with the command). | ||
See [BAYES] bayesmh postestimation for features available after estimation. |
blockopts | Description | |
gibbs |
requests Gibbs sampling; available for selected models only and not allowed with scale() , covariance() , or adaptation()
|
|
split |
requests that all parameters in a block be treated as separate blocks | |
reffects |
requests that all parameters in a block be treated as random-effects parameters | |
scale(#) |
initial multiplier for scale factor for current block; default is scale(2.38) ; not allowed with gibbs
|
|
covariance(#) |
initial proposal covariance for the current block; default is the identity matrix; not allowed with gibbs
|
|
adaptation(adaptopts) |
control the adaptive MCMC procedure of the current block; not allowed with gibbs
|
|
Only tarate() and tolerance() may be specified in the adaptation() option. |
adaptopts | Description | |
every(#) |
adaptation interval; default is every(100)
|
|
maxiter(#) |
maximum number of adaptation loops; default is maxiter(25) or max{25,floor(burnin()/every()) } whenever default values of these options are modified |
|
miniter(#) |
minimum number of adaptation loops; default is miniter(5)
|
|
alpha(#) |
parameter controlling acceptance rate (AR); default is alpha(0.75)
|
|
beta(#) |
parameter controlling proposal covariance; default is beta(0.8)
|
|
gamma(#) |
parameter controlling adaptation rate; default is gamma(0)
|
|
* | tarate(#)
|
target acceptance rate (TAR); default is parameter specific |
* | tolerance(#)
|
tolerance for AR; default is tolerance(0.01)
|
* Only starred options may be specified in the adaptation() option specified within block() . |
Statistics > Bayesian analysis > Estimation
bayesmh
fits a variety of Bayesian models using Metropolis-Hastings (MH) algorithm. It provides various likelihood models and prior distributions for you to choose from. Likelihood models include univariate normal linear and nonlinear regressions, multivariate normal linear and nonlinear regressions, generalized linear models such as logit and Poisson regressions, and multiple-equations linear models. Prior distributions include continuous distributions such as uniform, Jeffreys, normal, gamma, multivariate normal, and Wishart and discrete distributions such as Bernoulli and Poisson. You can also program your own Bayesian models; see [BAYES] bayesmh evaluators.
noconstant
suppresses the constant term (intercept) from the regression model. By default, bayesmh
automatically includes a model parameter {
depname:_cons}
in all regression models except ordered and nonlinear models. Excluding the constant term may be desirable when there is a factor variable, the base level of which absorbs the constant term in the linear combination.
{marker dist_spec} likelihood
(
modelspec)
specifies the distribution of the data in the regression model. This option specifies the likelihood portion of the Bayesian model. This option is required. modelspec specifies one of the supported likelihood distributions. A location parameter of these distributions is automatically parameterized as a linear combination of the specified independent variables and needs not be specified. Other parameters may be specified as arguments to the distribution separated by commas. Each argument may be a real number (#), a variable name (except for matrix parameters), a predefined matrix, a model parameter specified in { }
, a Stata expression, or a substitutable expression containing model parameters; see BAYES bayesmhRemarksandexamplesDeclaringmodelparametersDeclaring model parameters and BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.
For some likelihood models, option likelihood()
provides suboptions subopts in likelihood(..., subopts)
. subopts is offset()
, exposure()
, and baseoutcome()
.
offset(varname_o)
specifies that varname_o be included in the regression model with the coefficient constrained to be 1. This option is available with probit
, logit
, binlogit()
, oprobit
, ologit
, and poisson
.
exposure(varname_e)
specifies a variable that reflects the amount of exposure over which the depvar events were observed for each observation; ln(varname_e) with coefficient constrained to be 1 is entered into the log-link function. This option is available with poisson
.
noglmtransform
requests that the GLM-type transformation not be applied to linear and nonlinear predictors of Poisson, exponential, and binomial logistic regression models. If this option is not specified, Poisson, exponential, and binomial logistic regressions are assumed, which apply the corresponding GLM transformation to the linear predictor to model the mean function. The GLM transformation is mu = E(Y|X) = h(X'beta), where h() is an inverse link function. For example, for Poisson and exponential regression models, h(x) = exp(x). For generalized nonlinear models, mu = h{f(X,beta)}, where f() is a nonlinear function of predictors. If this option is specified, the linear and nonlinear specifications are assumed to directly model the mean mu of the Poisson distribution, the scale parameter b=1/mu of the exponential distribution, and the probability of success mu/n of the binomial distribution. This option is useful with constant-only models for modeling outcome distributions directly; see BAYES bayesmhRemarksandexamplesBetabinomialmodelBeta-binomial model under Remarks and examples in [BAYES] bayesmh.
prior(priorspec)
specifies a prior distribution for model parameters. This option is required and may be repeated. A prior must be specified for each model parameter. Model parameters may be scalars or matrices but both types may not be combined in one prior statement. If multiple scalar parameters are assigned a single univariate prior, they are considered independent, and the specified prior is used for each parameter. You may assign a multivariate prior of dimension d to d scalar parameters. Also see Referring to model parameters below and BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.
All likelihood()
and prior()
combinations are allowed, but they are not guaranteed to correspond to proper posterior distributions. You need to think carefully about the model you are building and evaluate its convergence thoroughly.
dryrun
specifies to show the summary of the model that would be fit without actually fitting the model. This option is recommended for checking specifications of the model before fitting the model.
reffects(varname)
specifies a random-effects variable, a variable identifying the group structure for the random effects, with univariate linear models. This option is useful for fitting two-level random-intercept models. A random-effects variable is treated as a factor variable with no base level. As such, you can refer to random-effects parameters or, simply, random effects associated with varname using a conventional factor-variable notation. For example, you can use {
depvar:i.
varname}
to refer to all random-effects parameters of varname. These parameters must be included in a single prior statement, usually a normal distribution with variance specified by an additional parameter. The random-effects parameters are assumed to be conditionally independent across levels of varname given all other model parameters. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block()
option. See example 23.
redefine(
label:i.
varname)
specifies a random-effects linear form that can be used in substitutable expressions. You can use {
label:}
to refer to the linear form in substitutable expressions. You can specify {
label:i.
varname}
to refer to all random-effects parameters associated with varname. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block()
option. This option is useful for fitting multilevel models. See example 29.
xbdefine(
label:
varlist)
specifies a linear form of the variables in varlist that can be used in substitutable expressions. You can use the specification {
label:}
to refer to the linear form in substitutable expressions. For any varname in varlist, you can use {
label:
varname}
to refer to the corresponding parameter. This option is useful with nonlinear specifications when the linear form contains many variables and provides more efficient computation in such cases.
block(
paramref[,
blockopts])
specifies a group of model parameters for the blocked MH algorithm. By default, all parameters except matrices are treated as one block, and each matrix parameter is viewed as a separate block. You can use the block()
option to separate scalar parameters in multiple blocks. Technically, you can also use block()
to combine matrix parameters in one block, but this is not recommended. The block()
option may be repeated to define multiple blocks. Different types of model parameters, such as scalars and matrices, may not be specified in one block()
. Parameters within one block are updated simultaneously, and each block of parameters is updated in the order it is specified; the first specified block is updated first, the second is updated second, and so on. See BAYES bayesmhRemarksandexamplesImprovingefficiencyoftheMHalgorithm---blockingofparametersImproving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.
blockopts include gibbs
, split
, reffects
, scale()
, covariance()
, and adaptation()
.
gibbs
option specifies to use Gibbs sampling to update parameters in the block. This option is allowed only for specific combinations of likelihood models and prior distributions; see BAYES bayesmhMethodsandformulasGibbssamplingforsomelikelihood-priorandprior-hyperpriorconfigurationsGibbs sampling for some likelihood-prior and prior-hyperprior configurations in [BAYES] bayesmh. For more information, see BAYES bayesmhRemarksandexamplesGibbsandhybridMHsamplingGibbs and hybrid MH sampling in [BAYES] bayesmh. gibbs
may not be combined with reffects
, scale()
, covariance()
, or adaptation()
.
split
specifies that all parameters in a block are treated as separate blocks. This may be useful for levels of factor variables.
reffects
specifies that the parameters associated with the levels of a factor variable included in the likelihood specification be treated as random-effects parameters. Random-effects parameters must be included in one prior statement and are assumed to be conditionally independent across levels of a grouping variable given all other model parameters. reffects
requires that parameters be specified as {
depvar:i.
varname}
, where i.
varname is the corresponding factor variable in the likelihood specification, and may not be combined with block()
's suboptions gibbs
and split
. This option is useful for fitting hierarchical or multilevel models. See example 25 for details.
scale(#)
specifies an initial multiplier for the scale factor corresponding to the specified block. The initial scale factor is computed as #/sqrt{n_p None} for continuous parameters and as #/n_p for discrete parameters, where n_p is the number of parameters in the block. By default, # is equal to 2.38 (that is, scale(2.38)
) is the default. If specified, this option overrides the respective setting from the scale()
option specified with the bayesmh
command. scale()
may not be combined with gibbs
.
covariance(matname)
specifies a scale matrix matname to be used to compute an initial proposal covariance matrix corresponding to the specified block. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma = matname. By default, Sigma is the identity matrix. If specified, this option overrides the respective setting from the covariance()
option specified with the bayesmh
command. covariance()
may not be combined with gibbs
.
adaptation(tarate())
and adaptation(tolerance())
specify block-specific TAR and acceptance tolerance. If specified, they override the respective settings from the adaptation()
option specified with the bayesmh
command. adaptation()
may not be combined with gibbs
.
initial(initspec)
specifies initial values for the model parameters to be used in the simulation. You can specify a parameter name, its initial value, another parameter name, its initial value, and so on. For example, to initialize a scalar parameter alpha
to 0.5 and a 2x2 matrix Sigma
to the identity matrix I(2)
, you can type
bayesmh
..., initial({
alpha
}0.5
{
Sigma,m}
I(2))
...
You can also specify a list of parameters using any of the specifications described in Referring to model parameters. For example, to initialize all regression coefficients from equations y1
and y2
to zero, you can type
bayesmh
...,
initial({
y1:
}
{y2:}
0)
...
The general specification of initspec is
paramref # [paramref # [...]]
Curly braces may be omitted for scalar parameters but must be specified for matrix parameters. Initial values declared using this option override the default initial values or any initial values declared during parameter specification in the likelihood()
option. See BAYES bayesmhRemarksandexamplesSpecifyinginitialvaluesSpecifying initial values in [BAYES] bayesmh for details.
nomleinitial
suppresses using maximum likelihood estimates (MLEs) starting values for regression coefficients. By default, when no initial values are specified, MLE values (when available) are used as initial values. If nomleinitial
is specified and no initial values are provided, bayesmh
uses ones for positive scalar parameters, zeros for other scalar parameters, and identity matrices for matrix parameters. nomleinitial
may be useful for providing an alternative starting state when checking convergence of MCMC.
mcmcsize(#)
specifies the target MCMC sample size. The default MCMC sample size is mcmcsize(10000)
. The total number of iterations for the MH algorithm equals the sum of the burn-in iterations and the MCMC sample size in the absence of thinning. If thinning is present, the total number of MCMC iterations is computed as burnin()
+ (mcmcsize()
- 1) x thinning()
+ 1. Computation time of the MH algorithm is proportional to the total number of iterations. The MCMC sample size determines the precision of posterior summaries, which may be different for different model parameters and will depend on the efficiency of the Markov chain. Also see BAYES bayesmhRemarksandexamplesBurn-inperiodandMCMCsamplesizeBurn-in period and MCMC sample size in [BAYES] bayesmh.
burnin(#)
specifies the number of iterations for the burn-in period of MCMC. The values of parameters simulated during burn-in are used for adaptation purposes only and are not used for estimation. The default is burnin(2500)
. Typically, burn-in is chosen to be as long as or longer than the adaptation period. Also see BAYES bayesmhRemarksandexamplesBurn-inperiodandMCMCsamplesizeBurn-in period and MCMC sample size and BAYES bayesmhRemarksandexamplesConvergenceofMCMCConvergence of MCMC in [BAYES] bayesmh.
thinning(#)
specifies the thinning interval. Only simulated values from every (1+k x #
)th iteration for k = 0, 1, 2, ... are saved in the final MCMC sample; all other simulated values are discarded. The default is thinning(1)
; that is, all simulation values are saved. Thinning greater than one is typically used for decreasing the autocorrelation of the simulated MCMC sample.
rseed(#)
sets the random-number seed. This option can be used to reproduce results. rseed(#)
is equivalent to typing set
seed
# prior to calling bayesmh
; see [R] set seed and BAYES bayesmhRemarksandexamplesReproducingresultsReproducing results in [BAYES] bayesmh.
exclude(paramref)
specifies which model parameters should be excluded from the final MCMC sample. These model parameters will not appear in the estimation table, and postestimation features will not be available for these parameters. This option is useful for suppressing nuisance model parameters. For example, if you have a factor predictor variable with many levels but you are only interested in the variability of the coefficients associated with its levels, not their actual values, then you may wish to exclude this factor variable from the simulation results. If you simply want to omit some model parameters from the output, see the noshow()
option.
adaptation(adaptopts)
controls adaptation of the MCMC procedure. Adaptation takes place every prespecified number of MCMC iterations and consists of tuning the proposal scale factor and proposal covariance for each block of model parameters. Adaptation is used to improve sampling efficiency. Provided defaults are based on theoretical results and may not be sufficient for all applications. See BAYES bayesmhRemarksandexamplesAdaptationoftheMHalgorithmAdaptation of the MH algorithm in [BAYES] bayesmh for details about adaptation and its parameters.
adaptopts are any of the following options:
every(#)
specifies that adaptation be attempted every #th iteration. The default is every(100)
. To determine the adaptation interval, you need to consider the maximum block size specified in your model. The update of a block with k model parameters requires the estimation of k x k covariance matrix. If the adaptation interval is not sufficient for estimating the k(k+1)/2 elements of this matrix, the adaptation may be insufficient.
maxiter(#)
specifies the maximum number of adaptive iterations. Adaptation includes tuning of the proposal covariance and of the scale factor for each block of model parameters. Once the TAR is achieved within the specified tolerance, the adaptation stops. However, no more than # adaptation steps will be performed. The default is variable and is computed as max{25,floor(burnin()/adaptation(every()))
}.
maxiter()
is usually chosen to be no greater than (mcmcsize()
+burnin()
)/adaptation(every())
.
miniter(#)
specifies the minimum number of adaptive iterations to be performed regardless of whether the TAR has been achieved. The default is miniter(5)
. If the specified miniter()
is greater than maxiter()
, then miniter()
is reset to maxiter()
. Thus, if you set maxiter(0)
, then no adaptation will be performed.
alpha(#)
specifies a parameter controlling the adaptation of the AR. alpha()
should be in [0,1]. The default is alpha(0.75)
.
beta(#)
specifies a parameter controlling the adaptation of the proposal covariance matrix. beta()
must be in [0,1]. The closer beta()
is to zero, the less adaptive the proposal covariance. When beta()
is zero, the same proposal covariance will be used in all MCMC iterations. The default is beta(0.8)
.
gamma(#)
specifies a parameter controlling the adaptation rate of the proposal covariance matrix. gamma()
must be in [0,1]. The larger the value of gamma()
, the less adaptive the proposal covariance. The default is gamma(0)
.
tarate(#)
specifies the TAR for all blocks of model parameters; this is rarely used. tarate()
must be in (0,1). The default AR is 0.234 for blocks containing continuous multiple parameters, 0.44 for blocks with one continuous parameter, and 1/n_maxlev for blocks with discrete parameters, where n_maxlev is the maximum number of levels for a discrete parameter in the block.
tolerance(#)
specifies the tolerance criterion for adaptation based on the TAR. tolerance()
should be in (0,1). Adaptation stops whenever the absolute difference between the current and TARs is less than tolerance()
. The default is tolerance(0.01)
.
scale(#)
specifies an initial multiplier for the scale factor for all blocks. The initial scale factor is computed as #/sqrt{n_p None} for continuous parameters and #/n_p for discrete parameters, where n_p is the number of parameters in the block. By default, # is equal to 2.38; that is, scale(2.38)
is the default.
covariance(cov)
specifies a scale matrix cov to be used to compute an initial proposal covariance matrix. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma=matname. By default, Sigma is the identity matrix. Partial specification of Sigma is also allowed. The rows and columns of cov should be named after some or all model parameters. According to some theoretical results, the optimal proposal covariance is the posterior covariance matrix of model parameters, which is usually unknown.
clevel(#)
specifies the credible level, as a percentage, for equal-tailed and HPD credible intervals. The default is clevel(95)
or as set by [BAYES] set clevel.
hpd
specifies the display of HPD credible intervals instead of the default equal-tailed credible intervals.
batch(#)
specifies the length of the block for calculating batch means, batch standard deviation, and MCSE using batch means. The default is batch(0)
, which means no batch calculations. When batch()
is not specified, MCSE is computed using effective sample sizes instead of batch means. Option batch()
may not be combined with corrlag()
or corrtol()
.
nomodelsummary
suppresses the detailed summary of the specified model. Model summary is reported by default.
noexpression
suppresses the output of expressions from the model summary. Expressions (when specified) are reported by default.
blocksummary
displays the summary of the specified blocks. This option is useful when block()
is specified and may not be combined with dryrun
.
dots
and dots(#)
specify to display dots as simulation is performed. dots(#)
displays a dot every # iterations. During the adaptation period, a symbol a
is displayed instead of a dot. If dots(
...,
every(#)
)
is specified, then an iteration number is displayed every #th iteration instead of a dot or a
. dots(,
every(#)
)
is equivalent to dots(1,
every(#)
)
. dots
displays dots every 100 iterations and iteration numbers every 1,000 iterations; it is a synonym for dots(100),
every(1000)
. By default, no dots are displayed (dots(0)
).
noshow(paramref)
specifies a list of model parameters to be excluded from the output. Do not confuse this option with exclude()
, which excludes the specified parameters from the MCMC sample.
showreffects(paramref)
is used with option reffects()
and specifies a list of random-effects parameters to be included in the output. By default, all random-effects parameters introduced by reffects()
are excluded from the output as if you have specified the noshow()
option.
notable
suppresses the estimation table from the output. By default, a summary table is displayed containing all model parameters except those listed in the exclude()
and noshow()
options. Regression model parameters are grouped by equation names. The table includes six columns and reports the following statistics using the MCMC simulation results: posterior mean, posterior standard deviation, MCMC standard error or MCSE, posterior median, and credible intervals.
noheader
suppresses the output header either at estimation or upon replay.
title(string)
specifies an optional title for the command that is displayed above the table of the parameter estimates. The default title is specific to the specified likelihood model.
saving(
filename[, replace
])
saves simulation results in filename.dta
. The replace
option specifies to overwrite filename.dta
if it exists. If the saving()
option is not specified, bayesmh
saves simulation results in a temporary file for later access by postestimation commands. This temporary file will be overridden every time bayesmh
is run and will also be erased if the current estimation results are cleared. saving()
may be specified during estimation or on replay.
The saved dataset has the following structure. Variance _index
records iteration numbers. bayesmh
saves only states (sets of parameter values) that are different from one iteration to another and the frequency of each state in variable _frequency
. (Some states may be repeated for discrete parameters.) As such, _index
may not necessarily contain consecutive integers. Remember to use _frequency
as a frequency weight if you need to obtain any summaries of this dataset. Values for each parameter are saved in a separate variable in the dataset. Variables containing values of parameters without equation names are named as eq0_p
#, following the order in which parameters are declared in bayesmh
. Variables containing values of parameters with equation names are named as eq
#_p
#, again following the order in which parameters are defined. Parameters with the same equation names will have the same variable prefix eq
#. For example,
. bayesmh y x1, likelihood(normal({var})) saving(mcmc)
...
will create a dataset mcmc.dta
with variable names eq1_p1
for {y:x1}
, eq1_p2
for {y:_cons}
, and eq0_p1
for {var}
. Also see macros e(parnames)
and e(varnames)
for the correspondence between parameter names and variable names.
In addition, bayesmh
saves variable _loglikelihood
to contain values of the log likelihood from each iteration and variable _logposterior
to contain values of log posterior from each iteration.
display_options: vsquish
, noemptycells
, baselevels
, allbaselevels
, nofvlabel
, fvwrap(#)
, fvwrapon(style)
, and nolstretch
; see [R] estimation options.
search(search_options)
searches for feasible initial values. search_options
are on
, repeat(#)
, and off
.
search(on)
is equivalent to search(repeat(500))
. This is the default.
search(repeat(
k))
, k > 0, specifies the number of random attempts to be made to find a feasible initial-value vector, or initial state. The default is repeat(500)
. An initial-value vector is feasible if it corresponds to a state with positive posterior probability. If feasible initial values are not found after k attempts, an error will be issued. repeat(0)
(rarely used) specifies that no random attempts be made to find a feasible starting point. In this case, if the specified initial vector does not correspond to a feasible state, an error will be issued.
search(off)
prevents bayesmh
from searching for feasible initial values. We do not recommend specifying this option.
corrlag(#)
specifies the maximum autocorrelation lag used for calculating effective sample sizes. The default is min{500,mcmcsize()
/2}. The total autocorrelation is computed as the sum of all lag-k autocorrelation values for k from 0 to either corrlag()
or the index at which the autocorrelation becomes less than corrtol()
if the latter is less than corrlag()
. Options corrlag()
and batch()
may not be combined.
corrtol(#)
specifies the autocorrelation tolerance used for calculating effective sample sizes. The default is corrtol(0.01)
. For a given model parameter, if the absolute value of the lag-k autocorrelation is less than corrtol()
, then all autocorrelation lags beyond the kth lag are discarded. Options corrtol()
and batch()
may not be combined.
Remarks are presented under the following headings:
Using bayesmh Declaring model parameters Referring to model parameters Substitutable expressionsThe bayesmh
command for Bayesian analysis includes three functional components: setting up a posterior model, performing MCMC simulation, and summarizing and reporting results. The first component, the model-building step, requires some experience in the practice of Bayesian statistics and, as any modeling task, is probably the most demanding. You should specify a posterior model that is statistically correct and that represents the observed data. Another important aspect is the computational feasibility of the model in the context of the MH MCMC procedure implemented in bayesmh
. The provided MH algorithm is adaptive and, to a degree, can accommodate various statistical models and data structures. However, careful model parametrization and well-specified initial values and MCMC sampling scheme are crucial for achieving a fast-converging Markov chain and consequently good results. Simulation of MCMC must be followed by a thorough investigation of the convergence of the MCMC algorithm. Once you are satisfied with the convergence of the simulated chains, you may proceed with posterior summaries of the results and their interpretation. Below we discuss the three major steps of using bayesmh
and provide recommendations.
Model parameters are typically declared, meaning first introduced, in the arguments of distributions specified in options likelihood()
and prior()
. We will refer to model parameters that are declared in the prior distributions (and not the likelihood distributions) as hyperparameters. Model parameters may also be declared within the parameter specification of the prior()
option, but this is more rare.
bayesmh
distinguishes between two types of model parameters: scalar and matrix. All parameters must be specified in curly braces, { }
. There are two ways for declaring a scalar parameter: param
and {
eqname:
param}
, where param and eqname are valid Stata names.
The specification of a matrix parameter is similar, but you must use the matrix
suboptions: {
param,
matrix
}
and {
eqname:
param,
matrix
}
. The most common application of matrix model parameters is for specifying the variance-covariance matrix of a multivariate normal distribution.
All matrices are assumed to be symmetric and only the elements in the lower diagonal are reported in the output. Only a few multivariate prior distributions are available for matrix parameters: wishart()
, iwishart()
, and jeffreys()
. In addition to being symmetric, these distributions require that the matrices be positive definite.
It is your responsibility to declare all parameters of your model, except regression coefficients in linear models. For a linear model, bayesmh
automatically creates a regression coefficient with the name {
depvar:
indepvar}
for each independent variable indepvar in the model and, if noconstant
is not specified, an intercept parameter {
depvar}
. In the presence of factor variables, bayesmh
will create a parameter {
depvar:
level}
for each level indicator level and a parameter {
depvar:
inter}
for each interaction indicator inter; see fvvarlists. (It is still your responsibility, however, to specify prior distributions for the regression parameters.)
For example,
. bayesmh y x,
...
will automatically have two regression parameters: {y None:x}
and {y None}
, whereas
. bayesmh y x, noconstant
...
will have only one: {y None:x}
.
For a univariate normal linear regression, we may want to additionally declare the scalar variance parameter by
. bayesmh y x, likelihood(normal({sig2 None}))
...
We can label the variance parameter, as follows:
. bayesmh y x, likelihood(normal({var:sig2}))
...
We can declare a hyperparameter for {sig2 None}
using
. bayesmh y x, likelihood(normal({sig2 None})) prior({sig2 None}, igamma({df None},2))
...
where the hyperparameter {df None}
is declared in the inverse-gamma prior distribution for {sig2 None}
.
For a multivariate normal linear regression, in addition to four regression parameters declared automatically by bayesmh
: {y1 None:x}
, {y1 None}
, {y2 None:x}
, and {y2 None}
, we may also declare a parameter for the variance-covariance matrix:
{cmd:. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, matrix}))} ...
or abbreviate matrix
to m
for short:
{cmd:. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, m}))} ...
After a model parameter is declared, we may need to refer to it in our further model specification. We will definitely need to refer to it when we specify its prior distribution. We may also need to use it as an argument in the prior distributions of other parameters or need to specify it in the block()
option for blocking of model parameters; see BAYES bayesmhRemarksandexamplesImprovingefficiencyoftheMHalgorithm---blockingofparametersImproving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.
To refer to one parameter, we simply use its definition: {
param}
, {
eqname:
param}
, {
param,
matrix
}
, or {
eqname:
param,
matrix
}
. There are several ways in which you can refer to multiple parameters. You can refer to multiple model parameters in the parameter specification paramref of the prior(paramref, ...)
option, of the block(paramref, ...)
option, or of the initial(paramref #)
option.
The most straightforward way to refer to multiple scalar model parameters is to simply list them individually, as follows:
{param1 None} {param2 None}
...
but there are shortcuts.
For example, the alternative to the above is
{param1 param2}
...
where we simply list the names of all parameters inside one set of curly braces.
If parameters have the same equation name, you can refer to all of the parameters with that equation name as follows. Suppose that we have three parameters with the same equation name eqname
, then the specification
{eqname None:param1} {eqname None:param2} {eqname None:param3}
is the same as the specification
{eqname None}
or the specification
{eqname None:param1 param2 param3}
The above specification is useful if we want to refer to a subset of parameters with the same equation name. For example, in the above, if we wanted to refer to only param1
and param2
, we could type
{eqname None:param1 param2}
If a factor variable is used in the specification of the regression function, you can use the same factor-variable specification within paramref to refer to the coefficients associated with the levels of that factor variable; see fvvarlists.
For example, factor variables are useful for constructing multilevel Bayesian models. Suppose that variable id
defines the second level of hierarchy in a two-level random-effects model. We can fit a Bayesian random-intercept model as follows.
. bayesmh y x i.id, likelihood(normal({var}))
prior({y None:i.id}, normal(0,{tau None}))
...
Here we used {y None:i.id}
in the prior specification to refer to all levels of id
.
Similarly, we can add a random coefficient for a continuous covariate x
by typing
. bayesmh y c.x##i.id, likelihood(normal({var}))
prior({y None:i.id}, normal(0,{tau1 None}))
prior({y None:c.x#i.id}, normal(0,{tau2 None}))
...
You can mix and match all of the specifications above in one parameter specification, paramref.
To refer to multiple matrix model parameters, you can use {
paramlist,
matrix
{cmd:} to refer to matrix parameters with names paramlist and {
eqname:
paramlist,
matrix
}
to refer to matrix parameters with names in paramlist and with equation name eqname.
For example, the specification
{eqname None:Sigma1,m}
{eqname None:Sigma2,m}
{cmd:{Sigma3,m}} {cmd:{Sigma4,m}}
is the same as the specification
{eqname None:Sigma1 Sigma2,m}
{Sigma3 Sigma4,m}
You cannot refer to both scalar and matrix parameters in one paramref specification.
For referring to model parameters in postestimation commands, see BAYES bayesmhpostestimationRemarksandexamplesDifferentwaysofspecifyingmodelparametersDifferent ways of specifying model parameters in [BAYES] bayesmh postestimation.
You may use substitutable expressions in bayesmh
to define nonlinear expressions subexpr, arguments of outcome distributions in option likelihood()
, observation-level log likelihood in option llf()
, arguments of prior distributions in option prior()
, and generic prior distributions in prior()
's suboptions density()
and logdensity()
. Substitutable expressions are just like any other mathematical expression in Stata, except that they may include model parameters.
To specify a substitutable expression in your bayesmh
model, you must comply with the following rules:
1. Model parameters are bound in braces: {mu None}
, {var:sigma2}
, {cmd:{Sigma, matrix}}, and {Cov None:Sigma, matrix}
.
2. Linear combinations can be specified using the notation {
eqname:
varlist}
. For example,
{
xb:mpg price weight}
is equivalent to
{xb_mpg None}*mpg
+
{xb_price None}*price
+ {xb_weight None}*weight
3. There is a small caveat with using the {
eqname:
name}
specification 2 when name corresponds to both one of the variables in the dataset and a parameter in the model. The linear-combination specification takes precedence in this case. For example, {eq None:var}
will be expanded to {eq_var None}*var
, where var
is the variable in a dataset and eq_var
is the coefficient corresponding to this variable. To refer directly to the coefficient, you must use {eq_var None}
.
4. Initial values are given by including an equal sign and the initial value inside the braces, for example, {cmd:{b1=1.267}}, {cmd:{gamma=3}}, etc. If you do not specify an initial value, that parameter is initialized to one for positive scalar parameters and to zero for other scalar parameters, or it is initialized to its MLE, if available. The initial()
option overrides initial values provided in substitutable expressions. Initial values for matrices must be specified in the initial()
option. By default, matrix parameters are initialized with identity matrices.
Setup
webuse oxygen
Bayesian normal linear regression with noninformative priors
set seed 14
bayesmh change age group, likelihood(normal({var}))
Bayesian normal linear regression with normal and inverse-gamma priors
set seed 14
bayesmh change age group, likelihood(normal({var}))
Bayesian normal linear regression with multivariate Zellner’s g-prior
set seed 14
bayesmh change age group, likelihood(normal({var}))
Update parameter {var}
separately from other model coefficients
set seed 14
bayesmh change age group, likelihood(normal({var}))
Use Gibbs sampling for parameter {var}
and display the summary about blocks
set seed 14
bayesmh change age group, likelihood(normal({var}))
Bayesian logistic regression model with a noninformative prior
webuse hearthungary
set seed 14
bayesmh disease restecg isfbs age male, likelihood(logit)
Bayesian ordered probit model including hyperparameter {lambda None}
webuse fullauto
replace length = length/10
set seed 14
bayesmh rep77 foreign length mpg, likelihood(oprobit)
Setup
sysuse auto, clear
replace weight = weight/1000
replace length = length/100
replace mpg = mpg/10
Bayesian multivariate normal model including matrix parameter {Sigma None}
for the covariance matrix
set seed 14
{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100))
{cmd: prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons})
{cmd: block({Sigma,m}) dots}{p_end}
Request additional burn-in and more frequent adaptation
set seed 14
{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100))
{cmd:prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons})
{cmd:block({Sigma,m}) dots} burnin(5000) adaptation(every(50))
Request Gibbs sampling for covariance matrix {Sigma None}
set seed 14
{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100))
{cmd:prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons})
{cmd:block({Sigma,m}, gibbs) dots}{p_end}
Setup
webuse pig, clear
Bayesian linear random-intercept model
set seed 14
fvset base none id
bayesmh weight week i.id, likelihood(normal({var_0})) noconstant
Bayesian linear random-intercept model using the reffects()
option
set seed 14
bayesmh weight week, reffects(id) likelihood(normal({var_0})) noconstant
Setup
webuse coal
Analysis of a change point problem with target MCMC sample size of 20,000
set seed 14
bayesmh count = ({mu1 None}*sign(year<{cp None})+{mu2 None}*sign(year>={cp None})),
bayesmh
stores the following in e()
:
Scalars | ||
e(N) |
number of observations | |
e(k) |
number of parameters | |
e(k_sc) |
number of scalar parameters | |
e(k_mat) |
number of matrix parameters | |
e(n_eq) |
number of equations | |
e(mcmcsize) |
MCMC sample size | |
e(burnin) |
number of burn-in iterations | |
e(mcmciter) |
total number of MCMC iterations | |
e(thinning) |
thinning interval | |
e(arate) |
overall AR | |
e(eff_min) |
minimum efficiency | |
e(eff_avg) |
average efficiency | |
e(eff_max) |
maximum efficiency | |
e(clevel) |
credible interval level | |
e(hpd) |
1 if hpd is specified; 0 otherwise |
|
e(batch) |
batch length for batch-mean calculations | |
e(corrlag) |
maximum autocorrelation lag | |
e(corrtol) |
autocorrelation tolerance | |
e(dic) |
deviation information criterion | |
e(lml_lm) |
log marginal-likelihood using Laplace-Metropolis method | |
e(scale) |
initial multiplier for scale factor; scale()
|
|
e(block #_gibbs)
|
1 if Gibbs sampling is used in #th block; 0 otherwise |
|
e(block #_reffects)
|
1 if the parameters in #th block are random effects; 0 otherwise |
|
e(block #_scale)
|
#th block initial multiplier for scale factor | |
e(block #_tarate)
|
#th block target adaptation rate | |
e(block #_arate_last)
|
#th block AR from the last adaptive iteration | |
e(block #_tolerance)
|
#th block adaptation tolerance | |
e(adapt_every) |
adaptation iterations adaptation(every())
|
|
e(adapt_maxiter) |
maximum number of adaptive iterations adaptation(maxiter())
|
|
e(adapt_miniter) |
minimum number of adaptive iterations adaptation(miniter())
|
|
e(adapt_alpha) |
adaptation parameter adaptation(alpha())
|
|
e(adapt_beta) |
adaptation parameter adaptation(beta())
|
|
e(adapt_gamma) |
adaptation parameter adaptation(gamma())
|
|
e(adapt_tolerance) |
adaptation tolerance adaptation(tolerance())
|
|
e(repeat) |
number of attempts used to find feasible initial values |
Macros | ||
e(cmd) |
bayesmh |
|
e(cmdline) |
command as typed | |
e(method) |
sampling method | |
e(depvars) |
names of dependent variables | |
e(eqnames) |
names of equations | |
e(likelihood) |
likelihood distribution (one equation) | |
e(likelihood #)
|
likelihood distribution for #th equation | |
e(prior) |
prior distribution | |
e(prior #)
|
prior distribution, if more than one prior() is specified |
|
e(priorparams) |
parameter specification in prior()
|
|
e(priorparams #)
|
parameter specification from #th prior() , if more than one prior() is specified |
|
e(parnames) |
names of model parameters except exclude()
|
|
e(postvars) |
variable names corresponding to model parameters in e(parnames) |
|
e(subexpr) |
substitutable expression | |
e(subexpr #)
|
substitutable expression, if more than one | |
e(wtype) |
weight type (one equation) | |
e(wtype #)
|
weight type for #th equation | |
e(wexp) |
weight expression (one equation) | |
e(wexp #)
|
weight expression for #th equation | |
e(block #_names)
|
parameter names from #th block | |
e(exclude) |
names of excluded parameters | |
e(filename) |
name of the file with simulation results | |
e(scparams) |
scalar model parameters | |
e(matparams) |
matrix model parameters | |
e(pareqmap) |
model parameters in display order | |
e(title) |
title in estimation output | |
e(rngstate) |
random-number state at the time of simulation | |
e(search) |
on , repeat() , or off
|
Matrices | ||
e(mean) |
posterior means | |
e(sd) |
posterior standard deviations | |
e(mcse) |
MCSE | |
e(median) |
posterior medians | |
e(cri) |
credible intervals | |
e(Cov) |
variance-covariance matrix of parameters | |
e(ess) |
effective sample sizes | |
e(init) |
initial values vector |
Functions | ||
e(sample) |
mark estimation sample |