Help for bayesmh

Title

[BAYES] bayesmh Bayesian regression using Metropolis-Hastings algorithm

Syntax

Univariate linear models

bayesmh depvar [indepvars] [if] [in] [weight], likelihood(modelspec) prior(priorspec) [reffects(varname) options]

Multivariate linear models

Multivariate normal linear regression with common regressors

bayesmh depvars = [indepvars] [if] [in] [weight], likelihood(mvnormal(...)) prior(priorspec) [options]

Multivariate normal regression with outcome-specific regressors

bayesmh ([eqname1:]depvar1 [indepvars1])

([eqname2:]depvar2 [indepvars2]) [...] [if] [in] [weight],

likelihood(mvnormal(...)) prior(priorspec) [options]

Multiple-equation linear models

bayesmh (eqspec) [(eqspec)] [...] [if] [in] [weight],

prior(priorspec) [options]

Nonlinear models

Univariate nonlinear regression

bayesmh depvar = (subexpr) [if] [in] [weight],

likelihood(modelspec) prior(priorspec) [options]

Multivariate normal nonlinear regression

bayesmh (depvar1 = (subexpr1))

(depvar2 = (subexpr2)) [...] [if] [in] [weight],

likelihood(mvnormal(...)) prior(priorspec) [options]

The syntax of eqspec is

varspec [if] [in] [weight], likelihood(modelspec) [noconstant]

The syntax of varspec is one of the following:

for single outcome

[eqname:]depvar [indepvars]

for multiple outcomes with common regressors

depvars = [indepvars]

for multiple outcomes with outcome-specific regressors

([eqname1:]depvar1 [indepvars1]) ([eqname2:]depvar2 [indepvars2]) [...]

subexpr, subexpr1, subexpr2, and so on are substitutable expressions; see Substitutable expressions for details.

The syntax of modelspec is

model [, modelopts]

model Description
Continuous
normal(var) normal regression with variance var
lognormal(var) lognormal regression with variance var
lnormal(var) synonym for lognormal()
exponential exponential regression
mvnormal(Sigma) multivariate normal regression with covariance matrix Sigma
Discrete
probit probit regression
logit logistic regression
logistic logistic regression; synonym for logit
binlogit(n) binomial regression with logit link
oprobit ordered probit regression
ologit ordered logistic regression
poisson Poisson regression
Generic
llf(subexpr) substitutable expression for observation-level log-likelihood function
exposure(varname_e) include ln(varname_e) in model with coefficient constrained to 1; allowed only with poisson
noglmtransform do not apply GLM-type transformation to the linear predictor or nonlinear specification; allowed only with exponential, poisson, or binlogit()
A distribution argument is a number for scalar arguments such as var; a variable name, varname (except for matrix arguments); a matrix for matrix arguments such as Sigma; a model parameter paramspec; an expression expr; or a substitutable expression, subexpr. See BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions. {synoptset 22}{nobreak None} {marker modelopts}{nobreak None} {synopthdr None:modelopts} {synoptline None} {synopt None}offset(varname_o)include varname_o in model with coefficient constrained to 1; not allowed with normal() and mvnormal()

The syntax of priorspec is

paramref, priordist

where the simplest specification of paramref is

paramspec [paramspec] [...]]

Also see Referring to model parameters for other specifications.

The syntax of paramspec is

{[eqname:]param[, matrix]}

where the parameter label eqname and parameter name param are valid Stata names. Model parameters are either scalars such as {var}, {mean}, {scale:beta}, or matrices such as {Sigma, matrix} and {Scale:V, matrix}. For scalar parameters, you can use param=# to specify an initial value. For example, you can specify, {var=1}, {mean=1.267}, or {shape:alpha=3}.

priordist Description
Univariate continuous
normal(mu,var) normal with mean mu and variance var
lognormal(mu,var) lognormal with mean mu and variance var
lnormal(mu,var) synonym for lognormal()
uniform(a,b) uniform on (a,b)
gamma(alpha,beta) gamma with shape alpha and scale beta
igamma(alpha,beta) inverse gamma with shape alpha and scale beta
exponential(beta) exponential with scale beta
beta(a,b) beta with shape parameters a and b
chi2(df) central chi-squared with degrees of freedom df
jeffreys Jeffreys prior for variance of a normal distribution
Multivariate continuous
mvnormal(d,mean,Sigma) multivariate normal of dimension d with mean vector mean and covariance matrix Sigma; mean can be a matrix name or a list of d means separated by comma: mu1, mu2, ..., mud
mvnormal0(d,Sigma) multivariate normal of dimension d with zero mean vector and covariance matrix Sigma
mvn0(d,Sigma) synonym for mvnormal0()
zellnersg(d,g,mean,{var}) Zellner's g-prior of dimension d with g degrees of freedom, mean vector mean, and variance parameter {var}; mean can be a matrix name or a list of d means separated by comma: mu1, mu2, ..., mud
zellnersg0(d,g,{var}) Zellner's g-prior of dimension d with g degrees of freedom, zero mean vector, and variance parameter {var}
wishart(d,df,V) Wishart of dimension d with degrees of freedom df and scale matrix V
iwishart(d,df,V) inverse Wishart of dimension d with degrees of freedom df and scale matrix V
jeffreys(d) Jeffreys prior for covariance of a multivariate normal distribution of dimension d
Discrete
bernoulli(p) Bernoulli with success probability p
index(p1,...,pk) discrete indices 1, 2, ..., k with probabilities p1, p2, ..., pk
poisson(mu) Poisson with mean mu
Generic
flat flat prior; equivalent to density(1) or logdensity(0)
density(f) generic density f
logdensity(logf) generic logdensity logf
Dimension d is a positive #.
A distribution argument is a number for scalar arguments such as var, alpha, beta; a Stata matrix for matrix arguments such as Sigma and V; a model parameter, paramspec; an expression, expr; or a substitutable expression subexpr.
{marker generic_f} f is a nonnegative number, #; an expression expr; or a substitutable expression, subexpr.
{marker generic_logf} logf is a number, #; an expression, expr; or a substitutable expression, subexpr.
When mvnormal() or mvnormal0() of dimension d is applied to paramref with n parameters (n!=d), paramref is reshaped into a matrix with d columns, and its rows are treated as independent samples from the specified mvnormal() distribution. If such reshaping is not possible, an error is issued. See example 25 for application of this feature.
Options Description
Model
noconstant suppress constant term; not allowed with ordered and nonlinear models
* likelihood(modelspec) distribution for the likelihood model
* prior(priorspec) prior for model parameters; this option may be repeated
dryrun show model summary without estimation
Model 2
redefine(label:i.varname) specify a random-effects linear form
xbdefine(label:varlist) specify a linear form
block(paramref[, blockopts]) specify a block of model parameters; this option may be repeated
initial(initspec) initial values for model parameters
nomleinitial suppress the use of maximum likelihood estimates as starting values
Simulation
mcmcsize(#) MCMC sample size; default is mcmcsize(10000)
burnin(#) burn-in period; default is burnin(2500)
thinning(#) thinning interval; default is thinning(1)
rseed(#) random-number seed
exclude(paramref) specify model parameters to be excluded from the simulation results
Adaption
adaptation(adaptopts) control the adaptive MCMC procedure
scale(#) initial multiplier for scale factor; default is scale(2.38)
covariance(cov) initial proposal covariance; default is the identity matrix
Reporting
clevel(#) set credible interval level; default is clevel(95)
hpd display HPD credible intervals instead of the default equal-tailed credible intervals
batch(#) specify length of block for batch-mean calculations; default is batch(0)
nomodelsummary suppress model summary
noexpression suppress output of expressions from model summary
blocksummary display block summary
dot display dots every 100 iterations and iteration numbers every 1,000 iterations
dots(#[, every(#)]) display dots as simulation is performed
noshow(paramref) specify model parameters to be excluded from the output
showreffects(paramref) specify random-effects parameters to be included in the output
notable suppress estimation table
noheader suppress output header
title(string) display string as title above the table of parameter estimates
saving(filename[, replace]) save simulation results to filename.dta
display_options control spacing, line width, and base and empty cells
Advanced
search(search_options) control the search for feasible initial values
corrlag(#) specify maximum autocorrelation lag; default varies
corrtol(#) specify autocorrelation tolerance; default is corrtol(0.01)
* Options likelihood() and prior() are required. prior() must be specified for all model parameters.
Options prior() and block() can be repeated.
indepvars and paramref may contain factor variables; see fvvarlist.
With multiple-equations specifications, a local if specified within an equation is applied together with the global if specified with the command.
Only fweights are allowed; see weight.
With multiple-equations specifications, local weights or (weights specified within an equation) override global weights (weights specified with the command).
See [BAYES] bayesmh postestimation for features available after estimation.
blockopts Description
gibbs requests Gibbs sampling; available for selected models only and not allowed with scale(), covariance(), or adaptation()
split requests that all parameters in a block be treated as separate blocks
reffects requests that all parameters in a block be treated as random-effects parameters
scale(#) initial multiplier for scale factor for current block; default is scale(2.38); not allowed with gibbs
covariance(#) initial proposal covariance for the current block; default is the identity matrix; not allowed with gibbs
adaptation(adaptopts) control the adaptive MCMC procedure of the current block; not allowed with gibbs
Only tarate() and tolerance() may be specified in the adaptation() option.
adaptopts Description
every(#) adaptation interval; default is every(100)
maxiter(#) maximum number of adaptation loops; default is maxiter(25) or max{25,floor(burnin()/every())} whenever default values of these options are modified
miniter(#) minimum number of adaptation loops; default is miniter(5)
alpha(#) parameter controlling acceptance rate (AR); default is alpha(0.75)
beta(#) parameter controlling proposal covariance; default is beta(0.8)
gamma(#) parameter controlling adaptation rate; default is gamma(0)
* tarate(#) target acceptance rate (TAR); default is parameter specific
* tolerance(#) tolerance for AR; default is tolerance(0.01)
* Only starred options may be specified in the adaptation() option specified within block().

Statistics > Bayesian analysis > Estimation

Description

bayesmh fits a variety of Bayesian models using Metropolis-Hastings (MH) algorithm. It provides various likelihood models and prior distributions for you to choose from. Likelihood models include univariate normal linear and nonlinear regressions, multivariate normal linear and nonlinear regressions, generalized linear models such as logit and Poisson regressions, and multiple-equations linear models. Prior distributions include continuous distributions such as uniform, Jeffreys, normal, gamma, multivariate normal, and Wishart and discrete distributions such as Bernoulli and Poisson. You can also program your own Bayesian models; see [BAYES] bayesmh evaluators.

Options

Model

noconstant suppresses the constant term (intercept) from the regression model. By default, bayesmh automatically includes a model parameter {depname:_cons} in all regression models except ordered and nonlinear models. Excluding the constant term may be desirable when there is a factor variable, the base level of which absorbs the constant term in the linear combination.

{marker dist_spec} likelihood(modelspec) specifies the distribution of the data in the regression model. This option specifies the likelihood portion of the Bayesian model. This option is required. modelspec specifies one of the supported likelihood distributions. A location parameter of these distributions is automatically parameterized as a linear combination of the specified independent variables and needs not be specified. Other parameters may be specified as arguments to the distribution separated by commas. Each argument may be a real number (#), a variable name (except for matrix parameters), a predefined matrix, a model parameter specified in { }, a Stata expression, or a substitutable expression containing model parameters; see BAYES bayesmhRemarksandexamplesDeclaringmodelparametersDeclaring model parameters and BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.

For some likelihood models, option likelihood() provides suboptions subopts in likelihood(..., subopts). subopts is offset(), exposure(), and baseoutcome().

offset(varname_o) specifies that varname_o be included in the regression model with the coefficient constrained to be 1. This option is available with probit, logit, binlogit(), oprobit, ologit, and poisson.

exposure(varname_e) specifies a variable that reflects the amount of exposure over which the depvar events were observed for each observation; ln(varname_e) with coefficient constrained to be 1 is entered into the log-link function. This option is available with poisson.

noglmtransform requests that the GLM-type transformation not be applied to linear and nonlinear predictors of Poisson, exponential, and binomial logistic regression models. If this option is not specified, Poisson, exponential, and binomial logistic regressions are assumed, which apply the corresponding GLM transformation to the linear predictor to model the mean function. The GLM transformation is mu = E(Y|X) = h(X'beta), where h() is an inverse link function. For example, for Poisson and exponential regression models, h(x) = exp(x). For generalized nonlinear models, mu = h{f(X,beta)}, where f() is a nonlinear function of predictors. If this option is specified, the linear and nonlinear specifications are assumed to directly model the mean mu of the Poisson distribution, the scale parameter b=1/mu of the exponential distribution, and the probability of success mu/n of the binomial distribution. This option is useful with constant-only models for modeling outcome distributions directly; see BAYES bayesmhRemarksandexamplesBetabinomialmodelBeta-binomial model under Remarks and examples in [BAYES] bayesmh.

prior(priorspec) specifies a prior distribution for model parameters. This option is required and may be repeated. A prior must be specified for each model parameter. Model parameters may be scalars or matrices but both types may not be combined in one prior statement. If multiple scalar parameters are assigned a single univariate prior, they are considered independent, and the specified prior is used for each parameter. You may assign a multivariate prior of dimension d to d scalar parameters. Also see Referring to model parameters below and BAYES bayesmhRemarksandexamplesSpecifyingargumentsoflikelihoodmodelsandpriordistributionsSpecifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.

All likelihood() and prior() combinations are allowed, but they are not guaranteed to correspond to proper posterior distributions. You need to think carefully about the model you are building and evaluate its convergence thoroughly.

dryrun specifies to show the summary of the model that would be fit without actually fitting the model. This option is recommended for checking specifications of the model before fitting the model.

Model 2

reffects(varname) specifies a random-effects variable, a variable identifying the group structure for the random effects, with univariate linear models. This option is useful for fitting two-level random-intercept models. A random-effects variable is treated as a factor variable with no base level. As such, you can refer to random-effects parameters or, simply, random effects associated with varname using a conventional factor-variable notation. For example, you can use {depvar:i.varname} to refer to all random-effects parameters of varname. These parameters must be included in a single prior statement, usually a normal distribution with variance specified by an additional parameter. The random-effects parameters are assumed to be conditionally independent across levels of varname given all other model parameters. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block() option. See example 23.

redefine(label:i.varname) specifies a random-effects linear form that can be used in substitutable expressions. You can use {label:} to refer to the linear form in substitutable expressions. You can specify {label:i.varname} to refer to all random-effects parameters associated with varname. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block() option. This option is useful for fitting multilevel models. See example 29.

xbdefine(label:varlist) specifies a linear form of the variables in varlist that can be used in substitutable expressions. You can use the specification {label:} to refer to the linear form in substitutable expressions. For any varname in varlist, you can use {label:varname} to refer to the corresponding parameter. This option is useful with nonlinear specifications when the linear form contains many variables and provides more efficient computation in such cases.

block(paramref[, blockopts]) specifies a group of model parameters for the blocked MH algorithm. By default, all parameters except matrices are treated as one block, and each matrix parameter is viewed as a separate block. You can use the block() option to separate scalar parameters in multiple blocks. Technically, you can also use block() to combine matrix parameters in one block, but this is not recommended. The block() option may be repeated to define multiple blocks. Different types of model parameters, such as scalars and matrices, may not be specified in one block(). Parameters within one block are updated simultaneously, and each block of parameters is updated in the order it is specified; the first specified block is updated first, the second is updated second, and so on. See BAYES bayesmhRemarksandexamplesImprovingefficiencyoftheMHalgorithm---blockingofparametersImproving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.

blockopts include gibbs, split, reffects, scale(), covariance(), and adaptation().

gibbs option specifies to use Gibbs sampling to update parameters in the block. This option is allowed only for specific combinations of likelihood models and prior distributions; see BAYES bayesmhMethodsandformulasGibbssamplingforsomelikelihood-priorandprior-hyperpriorconfigurationsGibbs sampling for some likelihood-prior and prior-hyperprior configurations in [BAYES] bayesmh. For more information, see BAYES bayesmhRemarksandexamplesGibbsandhybridMHsamplingGibbs and hybrid MH sampling in [BAYES] bayesmh. gibbs may not be combined with reffects, scale(), covariance(), or adaptation().

split specifies that all parameters in a block are treated as separate blocks. This may be useful for levels of factor variables.

reffects specifies that the parameters associated with the levels of a factor variable included in the likelihood specification be treated as random-effects parameters. Random-effects parameters must be included in one prior statement and are assumed to be conditionally independent across levels of a grouping variable given all other model parameters. reffects requires that parameters be specified as {depvar:i.varname}, where i.varname is the corresponding factor variable in the likelihood specification, and may not be combined with block()'s suboptions gibbs and split. This option is useful for fitting hierarchical or multilevel models. See example 25 for details.

scale(#) specifies an initial multiplier for the scale factor corresponding to the specified block. The initial scale factor is computed as #/sqrt{n_p None} for continuous parameters and as #/n_p for discrete parameters, where n_p is the number of parameters in the block. By default, # is equal to 2.38 (that is, scale(2.38)) is the default. If specified, this option overrides the respective setting from the scale() option specified with the bayesmh command. scale() may not be combined with gibbs.

covariance(matname) specifies a scale matrix matname to be used to compute an initial proposal covariance matrix corresponding to the specified block. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma = matname. By default, Sigma is the identity matrix. If specified, this option overrides the respective setting from the covariance() option specified with the bayesmh command. covariance() may not be combined with gibbs.

adaptation(tarate()) and adaptation(tolerance()) specify block-specific TAR and acceptance tolerance. If specified, they override the respective settings from the adaptation() option specified with the bayesmh command. adaptation() may not be combined with gibbs.

initial(initspec) specifies initial values for the model parameters to be used in the simulation. You can specify a parameter name, its initial value, another parameter name, its initial value, and so on. For example, to initialize a scalar parameter alpha to 0.5 and a 2x2 matrix Sigma to the identity matrix I(2), you can type

bayesmh ..., initial({alpha}0.5{Sigma,m} I(2)) ...

You can also specify a list of parameters using any of the specifications described in Referring to model parameters. For example, to initialize all regression coefficients from equations y1 and y2 to zero, you can type

bayesmh ..., initial({y1:} {y2:} 0) ...

The general specification of initspec is

paramref # [paramref # [...]]

Curly braces may be omitted for scalar parameters but must be specified for matrix parameters. Initial values declared using this option override the default initial values or any initial values declared during parameter specification in the likelihood() option. See BAYES bayesmhRemarksandexamplesSpecifyinginitialvaluesSpecifying initial values in [BAYES] bayesmh for details.

nomleinitial suppresses using maximum likelihood estimates (MLEs) starting values for regression coefficients. By default, when no initial values are specified, MLE values (when available) are used as initial values. If nomleinitial is specified and no initial values are provided, bayesmh uses ones for positive scalar parameters, zeros for other scalar parameters, and identity matrices for matrix parameters. nomleinitial may be useful for providing an alternative starting state when checking convergence of MCMC.

Simulation

mcmcsize(#) specifies the target MCMC sample size. The default MCMC sample size is mcmcsize(10000). The total number of iterations for the MH algorithm equals the sum of the burn-in iterations and the MCMC sample size in the absence of thinning. If thinning is present, the total number of MCMC iterations is computed as burnin() + (mcmcsize() - 1) x thinning() + 1. Computation time of the MH algorithm is proportional to the total number of iterations. The MCMC sample size determines the precision of posterior summaries, which may be different for different model parameters and will depend on the efficiency of the Markov chain. Also see BAYES bayesmhRemarksandexamplesBurn-inperiodandMCMCsamplesizeBurn-in period and MCMC sample size in [BAYES] bayesmh.

burnin(#) specifies the number of iterations for the burn-in period of MCMC. The values of parameters simulated during burn-in are used for adaptation purposes only and are not used for estimation. The default is burnin(2500). Typically, burn-in is chosen to be as long as or longer than the adaptation period. Also see BAYES bayesmhRemarksandexamplesBurn-inperiodandMCMCsamplesizeBurn-in period and MCMC sample size and BAYES bayesmhRemarksandexamplesConvergenceofMCMCConvergence of MCMC in [BAYES] bayesmh.

thinning(#) specifies the thinning interval. Only simulated values from every (1+k x #)th iteration for k = 0, 1, 2, ... are saved in the final MCMC sample; all other simulated values are discarded. The default is thinning(1); that is, all simulation values are saved. Thinning greater than one is typically used for decreasing the autocorrelation of the simulated MCMC sample.

rseed(#) sets the random-number seed. This option can be used to reproduce results. rseed(#) is equivalent to typing set seed # prior to calling bayesmh; see [R] set seed and BAYES bayesmhRemarksandexamplesReproducingresultsReproducing results in [BAYES] bayesmh.

exclude(paramref) specifies which model parameters should be excluded from the final MCMC sample. These model parameters will not appear in the estimation table, and postestimation features will not be available for these parameters. This option is useful for suppressing nuisance model parameters. For example, if you have a factor predictor variable with many levels but you are only interested in the variability of the coefficients associated with its levels, not their actual values, then you may wish to exclude this factor variable from the simulation results. If you simply want to omit some model parameters from the output, see the noshow() option.

Adaptation

adaptation(adaptopts) controls adaptation of the MCMC procedure. Adaptation takes place every prespecified number of MCMC iterations and consists of tuning the proposal scale factor and proposal covariance for each block of model parameters. Adaptation is used to improve sampling efficiency. Provided defaults are based on theoretical results and may not be sufficient for all applications. See BAYES bayesmhRemarksandexamplesAdaptationoftheMHalgorithmAdaptation of the MH algorithm in [BAYES] bayesmh for details about adaptation and its parameters.

adaptopts are any of the following options:

every(#) specifies that adaptation be attempted every #th iteration. The default is every(100). To determine the adaptation interval, you need to consider the maximum block size specified in your model. The update of a block with k model parameters requires the estimation of k x k covariance matrix. If the adaptation interval is not sufficient for estimating the k(k+1)/2 elements of this matrix, the adaptation may be insufficient.

maxiter(#) specifies the maximum number of adaptive iterations. Adaptation includes tuning of the proposal covariance and of the scale factor for each block of model parameters. Once the TAR is achieved within the specified tolerance, the adaptation stops. However, no more than # adaptation steps will be performed. The default is variable and is computed as max{25,floor(burnin()/adaptation(every()))}.

maxiter() is usually chosen to be no greater than (mcmcsize()+burnin())/adaptation(every()).

miniter(#) specifies the minimum number of adaptive iterations to be performed regardless of whether the TAR has been achieved. The default is miniter(5). If the specified miniter() is greater than maxiter(), then miniter() is reset to maxiter(). Thus, if you set maxiter(0), then no adaptation will be performed.

alpha(#) specifies a parameter controlling the adaptation of the AR. alpha() should be in [0,1]. The default is alpha(0.75).

beta(#) specifies a parameter controlling the adaptation of the proposal covariance matrix. beta() must be in [0,1]. The closer beta() is to zero, the less adaptive the proposal covariance. When beta() is zero, the same proposal covariance will be used in all MCMC iterations. The default is beta(0.8).

gamma(#) specifies a parameter controlling the adaptation rate of the proposal covariance matrix. gamma() must be in [0,1]. The larger the value of gamma(), the less adaptive the proposal covariance. The default is gamma(0).

tarate(#) specifies the TAR for all blocks of model parameters; this is rarely used. tarate() must be in (0,1). The default AR is 0.234 for blocks containing continuous multiple parameters, 0.44 for blocks with one continuous parameter, and 1/n_maxlev for blocks with discrete parameters, where n_maxlev is the maximum number of levels for a discrete parameter in the block.

tolerance(#) specifies the tolerance criterion for adaptation based on the TAR. tolerance() should be in (0,1). Adaptation stops whenever the absolute difference between the current and TARs is less than tolerance(). The default is tolerance(0.01).

scale(#) specifies an initial multiplier for the scale factor for all blocks. The initial scale factor is computed as #/sqrt{n_p None} for continuous parameters and #/n_p for discrete parameters, where n_p is the number of parameters in the block. By default, # is equal to 2.38; that is, scale(2.38) is the default.

covariance(cov) specifies a scale matrix cov to be used to compute an initial proposal covariance matrix. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma=matname. By default, Sigma is the identity matrix. Partial specification of Sigma is also allowed. The rows and columns of cov should be named after some or all model parameters. According to some theoretical results, the optimal proposal covariance is the posterior covariance matrix of model parameters, which is usually unknown.

Reporting

clevel(#) specifies the credible level, as a percentage, for equal-tailed and HPD credible intervals. The default is clevel(95) or as set by [BAYES] set clevel.

hpd specifies the display of HPD credible intervals instead of the default equal-tailed credible intervals.

batch(#) specifies the length of the block for calculating batch means, batch standard deviation, and MCSE using batch means. The default is batch(0), which means no batch calculations. When batch() is not specified, MCSE is computed using effective sample sizes instead of batch means. Option batch() may not be combined with corrlag() or corrtol().

nomodelsummary suppresses the detailed summary of the specified model. Model summary is reported by default.

noexpression suppresses the output of expressions from the model summary. Expressions (when specified) are reported by default.

blocksummary displays the summary of the specified blocks. This option is useful when block() is specified and may not be combined with dryrun.

dots and dots(#) specify to display dots as simulation is performed. dots(#) displays a dot every # iterations. During the adaptation period, a symbol a is displayed instead of a dot. If dots(..., every(#)) is specified, then an iteration number is displayed every #th iteration instead of a dot or a. dots(, every(#)) is equivalent to dots(1, every(#)). dots displays dots every 100 iterations and iteration numbers every 1,000 iterations; it is a synonym for dots(100), every(1000). By default, no dots are displayed (dots(0)).

noshow(paramref) specifies a list of model parameters to be excluded from the output. Do not confuse this option with exclude(), which excludes the specified parameters from the MCMC sample.

showreffects(paramref) is used with option reffects() and specifies a list of random-effects parameters to be included in the output. By default, all random-effects parameters introduced by reffects() are excluded from the output as if you have specified the noshow() option.

notable suppresses the estimation table from the output. By default, a summary table is displayed containing all model parameters except those listed in the exclude() and noshow() options. Regression model parameters are grouped by equation names. The table includes six columns and reports the following statistics using the MCMC simulation results: posterior mean, posterior standard deviation, MCMC standard error or MCSE, posterior median, and credible intervals.

noheader suppresses the output header either at estimation or upon replay.

title(string) specifies an optional title for the command that is displayed above the table of the parameter estimates. The default title is specific to the specified likelihood model.

saving(filename[, replace]) saves simulation results in filename.dta. The replace option specifies to overwrite filename.dta if it exists. If the saving() option is not specified, bayesmh saves simulation results in a temporary file for later access by postestimation commands. This temporary file will be overridden every time bayesmh is run and will also be erased if the current estimation results are cleared. saving() may be specified during estimation or on replay.

The saved dataset has the following structure. Variance _index records iteration numbers. bayesmh saves only states (sets of parameter values) that are different from one iteration to another and the frequency of each state in variable _frequency. (Some states may be repeated for discrete parameters.) As such, _index may not necessarily contain consecutive integers. Remember to use _frequency as a frequency weight if you need to obtain any summaries of this dataset. Values for each parameter are saved in a separate variable in the dataset. Variables containing values of parameters without equation names are named as eq0_p#, following the order in which parameters are declared in bayesmh. Variables containing values of parameters with equation names are named as eq#_p#, again following the order in which parameters are defined. Parameters with the same equation names will have the same variable prefix eq#. For example,

. bayesmh y x1, likelihood(normal({var})) saving(mcmc) ...

will create a dataset mcmc.dta with variable names eq1_p1 for {y:x1}, eq1_p2 for {y:_cons}, and eq0_p1 for {var}. Also see macros e(parnames) and e(varnames) for the correspondence between parameter names and variable names.

In addition, bayesmh saves variable _loglikelihood to contain values of the log likelihood from each iteration and variable _logposterior to contain values of log posterior from each iteration.

display_options: vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), and nolstretch; see [R] estimation options.

Advanced

search(search_options) searches for feasible initial values. search_options are on, repeat(#), and off.

search(on) is equivalent to search(repeat(500)). This is the default.

search(repeat(k)), k > 0, specifies the number of random attempts to be made to find a feasible initial-value vector, or initial state. The default is repeat(500). An initial-value vector is feasible if it corresponds to a state with positive posterior probability. If feasible initial values are not found after k attempts, an error will be issued. repeat(0) (rarely used) specifies that no random attempts be made to find a feasible starting point. In this case, if the specified initial vector does not correspond to a feasible state, an error will be issued.

search(off) prevents bayesmh from searching for feasible initial values. We do not recommend specifying this option.

corrlag(#) specifies the maximum autocorrelation lag used for calculating effective sample sizes. The default is min{500,mcmcsize()/2}. The total autocorrelation is computed as the sum of all lag-k autocorrelation values for k from 0 to either corrlag() or the index at which the autocorrelation becomes less than corrtol() if the latter is less than corrlag(). Options corrlag() and batch() may not be combined.

corrtol(#) specifies the autocorrelation tolerance used for calculating effective sample sizes. The default is corrtol(0.01). For a given model parameter, if the absolute value of the lag-k autocorrelation is less than corrtol(), then all autocorrelation lags beyond the kth lag are discarded. Options corrtol() and batch() may not be combined.

Remarks

Remarks are presented under the following headings:

Using bayesmh Declaring model parameters Referring to model parameters Substitutable expressions

Using bayesmh

The bayesmh command for Bayesian analysis includes three functional components: setting up a posterior model, performing MCMC simulation, and summarizing and reporting results. The first component, the model-building step, requires some experience in the practice of Bayesian statistics and, as any modeling task, is probably the most demanding. You should specify a posterior model that is statistically correct and that represents the observed data. Another important aspect is the computational feasibility of the model in the context of the MH MCMC procedure implemented in bayesmh. The provided MH algorithm is adaptive and, to a degree, can accommodate various statistical models and data structures. However, careful model parametrization and well-specified initial values and MCMC sampling scheme are crucial for achieving a fast-converging Markov chain and consequently good results. Simulation of MCMC must be followed by a thorough investigation of the convergence of the MCMC algorithm. Once you are satisfied with the convergence of the simulated chains, you may proceed with posterior summaries of the results and their interpretation. Below we discuss the three major steps of using bayesmh and provide recommendations.

Declaring model parameters

Model parameters are typically declared, meaning first introduced, in the arguments of distributions specified in options likelihood() and prior(). We will refer to model parameters that are declared in the prior distributions (and not the likelihood distributions) as hyperparameters. Model parameters may also be declared within the parameter specification of the prior() option, but this is more rare.

bayesmh distinguishes between two types of model parameters: scalar and matrix. All parameters must be specified in curly braces, { }. There are two ways for declaring a scalar parameter: param and {eqname:param}, where param and eqname are valid Stata names.

The specification of a matrix parameter is similar, but you must use the matrix suboptions: {param, matrix} and {eqname:param, matrix}. The most common application of matrix model parameters is for specifying the variance-covariance matrix of a multivariate normal distribution.

All matrices are assumed to be symmetric and only the elements in the lower diagonal are reported in the output. Only a few multivariate prior distributions are available for matrix parameters: wishart(), iwishart(), and jeffreys(). In addition to being symmetric, these distributions require that the matrices be positive definite.

It is your responsibility to declare all parameters of your model, except regression coefficients in linear models. For a linear model, bayesmh automatically creates a regression coefficient with the name {depvar:indepvar} for each independent variable indepvar in the model and, if noconstant is not specified, an intercept parameter {depvar}. In the presence of factor variables, bayesmh will create a parameter {depvar:level} for each level indicator level and a parameter {depvar:inter} for each interaction indicator inter; see fvvarlists. (It is still your responsibility, however, to specify prior distributions for the regression parameters.)

For example,

. bayesmh y x, ...

will automatically have two regression parameters: {y None:x} and {y None}, whereas

. bayesmh y x, noconstant ...

will have only one: {y None:x}.

For a univariate normal linear regression, we may want to additionally declare the scalar variance parameter by

. bayesmh y x, likelihood(normal({sig2 None})) ...

We can label the variance parameter, as follows:

. bayesmh y x, likelihood(normal({var:sig2})) ...

We can declare a hyperparameter for {sig2 None} using

. bayesmh y x, likelihood(normal({sig2 None})) prior({sig2 None}, igamma({df None},2)) ...

where the hyperparameter {df None} is declared in the inverse-gamma prior distribution for {sig2 None}.

For a multivariate normal linear regression, in addition to four regression parameters declared automatically by bayesmh: {y1 None:x}, {y1 None}, {y2 None:x}, and {y2 None}, we may also declare a parameter for the variance-covariance matrix:

{cmd:. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, matrix}))} ...

or abbreviate matrix to m for short:

{cmd:. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, m}))} ...

Referring to model parameters

After a model parameter is declared, we may need to refer to it in our further model specification. We will definitely need to refer to it when we specify its prior distribution. We may also need to use it as an argument in the prior distributions of other parameters or need to specify it in the block() option for blocking of model parameters; see BAYES bayesmhRemarksandexamplesImprovingefficiencyoftheMHalgorithm---blockingofparametersImproving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.

To refer to one parameter, we simply use its definition: {param}, {eqname:param}, {param, matrix}, or {eqname:param, matrix}. There are several ways in which you can refer to multiple parameters. You can refer to multiple model parameters in the parameter specification paramref of the prior(paramref, ...) option, of the block(paramref, ...) option, or of the initial(paramref #) option.

The most straightforward way to refer to multiple scalar model parameters is to simply list them individually, as follows:

{param1 None} {param2 None} ...

but there are shortcuts.

For example, the alternative to the above is

{param1 param2} ...

where we simply list the names of all parameters inside one set of curly braces.

If parameters have the same equation name, you can refer to all of the parameters with that equation name as follows. Suppose that we have three parameters with the same equation name eqname, then the specification

{eqname None:param1} {eqname None:param2} {eqname None:param3}

is the same as the specification

{eqname None}

or the specification

{eqname None:param1 param2 param3}

The above specification is useful if we want to refer to a subset of parameters with the same equation name. For example, in the above, if we wanted to refer to only param1 and param2, we could type

{eqname None:param1 param2}

If a factor variable is used in the specification of the regression function, you can use the same factor-variable specification within paramref to refer to the coefficients associated with the levels of that factor variable; see fvvarlists.

For example, factor variables are useful for constructing multilevel Bayesian models. Suppose that variable id defines the second level of hierarchy in a two-level random-effects model. We can fit a Bayesian random-intercept model as follows.

. bayesmh y x i.id, likelihood(normal({var})) prior({y None:i.id}, normal(0,{tau None})) ...

Here we used {y None:i.id} in the prior specification to refer to all levels of id.

Similarly, we can add a random coefficient for a continuous covariate x by typing

. bayesmh y c.x##i.id, likelihood(normal({var})) prior({y None:i.id}, normal(0,{tau1 None})) prior({y None:c.x#i.id}, normal(0,{tau2 None})) ...

You can mix and match all of the specifications above in one parameter specification, paramref.

To refer to multiple matrix model parameters, you can use {paramlist, matrix{cmd:} to refer to matrix parameters with names paramlist and {eqname:paramlist, matrix} to refer to matrix parameters with names in paramlist and with equation name eqname.

For example, the specification

{eqname None:Sigma1,m} {eqname None:Sigma2,m} {cmd:{Sigma3,m}} {cmd:{Sigma4,m}}

is the same as the specification

{eqname None:Sigma1 Sigma2,m} {Sigma3 Sigma4,m}

You cannot refer to both scalar and matrix parameters in one paramref specification.

For referring to model parameters in postestimation commands, see BAYES bayesmhpostestimationRemarksandexamplesDifferentwaysofspecifyingmodelparametersDifferent ways of specifying model parameters in [BAYES] bayesmh postestimation.

Substitutable expressions

You may use substitutable expressions in bayesmh to define nonlinear expressions subexpr, arguments of outcome distributions in option likelihood(), observation-level log likelihood in option llf(), arguments of prior distributions in option prior(), and generic prior distributions in prior()'s suboptions density() and logdensity(). Substitutable expressions are just like any other mathematical expression in Stata, except that they may include model parameters.

To specify a substitutable expression in your bayesmh model, you must comply with the following rules:

1. Model parameters are bound in braces: {mu None}, {var:sigma2}, {cmd:{Sigma, matrix}}, and {Cov None:Sigma, matrix}.

2. Linear combinations can be specified using the notation {eqname:varlist}. For example,

{xb:mpg price weight} is equivalent to

{xb_mpg None}*mpg + {xb_price None}*price + {xb_weight None}*weight

3. There is a small caveat with using the {eqname:name} specification 2 when name corresponds to both one of the variables in the dataset and a parameter in the model. The linear-combination specification takes precedence in this case. For example, {eq None:var} will be expanded to {eq_var None}*var, where var is the variable in a dataset and eq_var is the coefficient corresponding to this variable. To refer directly to the coefficient, you must use {eq_var None}.

4. Initial values are given by including an equal sign and the initial value inside the braces, for example, {cmd:{b1=1.267}}, {cmd:{gamma=3}}, etc. If you do not specify an initial value, that parameter is initialized to one for positive scalar parameters and to zero for other scalar parameters, or it is initialized to its MLE, if available. The initial() option overrides initial values provided in substitutable expressions. Initial values for matrices must be specified in the initial() option. By default, matrix parameters are initialized with identity matrices.

Examples


Setup

webuse oxygen

Bayesian normal linear regression with noninformative priors

set seed 14
bayesmh change age group, likelihood(normal({var})) 

Bayesian normal linear regression with normal and inverse-gamma priors

set seed 14
bayesmh change age group, likelihood(normal({var})) 

Bayesian normal linear regression with multivariate Zellner’s g-prior

set seed 14
bayesmh change age group, likelihood(normal({var})) 

Update parameter {var} separately from other model coefficients

set seed 14
bayesmh change age group, likelihood(normal({var})) 

Use Gibbs sampling for parameter {var} and display the summary about blocks

set seed 14
bayesmh change age group, likelihood(normal({var})) 

Bayesian logistic regression model with a noninformative prior

webuse hearthungary	
set seed 14
bayesmh disease restecg isfbs age male, likelihood(logit) 

Bayesian ordered probit model including hyperparameter {lambda None}

webuse fullauto
replace length = length/10
set seed 14
bayesmh rep77 foreign length mpg, likelihood(oprobit) 

Setup

sysuse auto, clear
replace weight = weight/1000
replace length = length/100
replace mpg = mpg/10

Bayesian multivariate normal model including matrix parameter {Sigma None} for the covariance matrix

set seed 14

{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100)) {cmd: prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons}) {cmd: block({Sigma,m}) dots}{p_end}

Request additional burn-in and more frequent adaptation

set seed 14

{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100)) {cmd:prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons}) {cmd:block({Sigma,m}) dots} burnin(5000) adaptation(every(50))

Request Gibbs sampling for covariance matrix {Sigma None}

set seed 14

{cmd:. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))} prior({mpg None:_cons} [weight] {length None:_cons}, normal(0,100)) {cmd:prior({Sigma,m}, iwishart(3,100,I(3)))} block({mpg None:_cons} [weight] {length None:_cons}) {cmd:block({Sigma,m}, gibbs) dots}{p_end}


Setup

webuse pig, clear

Bayesian linear random-intercept model

set seed 14
fvset base none id
bayesmh weight week i.id, likelihood(normal({var_0})) noconstant 

Bayesian linear random-intercept model using the reffects() option

set seed 14
bayesmh weight week, reffects(id) likelihood(normal({var_0})) noconstant 

Setup

webuse coal

Analysis of a change point problem with target MCMC sample size of 20,000

set seed 14
bayesmh count = ({mu1 None}*sign(year<{cp None})+{mu2 None}*sign(year>={cp None})), 

Stored results

bayesmh stores the following in e():

Scalars
e(N) number of observations
e(k) number of parameters
e(k_sc) number of scalar parameters
e(k_mat) number of matrix parameters
e(n_eq) number of equations
e(mcmcsize) MCMC sample size
e(burnin) number of burn-in iterations
e(mcmciter) total number of MCMC iterations
e(thinning) thinning interval
e(arate) overall AR
e(eff_min) minimum efficiency
e(eff_avg) average efficiency
e(eff_max) maximum efficiency
e(clevel) credible interval level
e(hpd) 1 if hpd is specified; 0 otherwise
e(batch) batch length for batch-mean calculations
e(corrlag) maximum autocorrelation lag
e(corrtol) autocorrelation tolerance
e(dic) deviation information criterion
e(lml_lm) log marginal-likelihood using Laplace-Metropolis method
e(scale) initial multiplier for scale factor; scale()
e(block#_gibbs) 1 if Gibbs sampling is used in #th block; 0 otherwise
e(block#_reffects) 1 if the parameters in #th block are random effects; 0 otherwise
e(block#_scale) #th block initial multiplier for scale factor
e(block#_tarate) #th block target adaptation rate
e(block#_arate_last) #th block AR from the last adaptive iteration
e(block#_tolerance) #th block adaptation tolerance
e(adapt_every) adaptation iterations adaptation(every())
e(adapt_maxiter) maximum number of adaptive iterations adaptation(maxiter())
e(adapt_miniter) minimum number of adaptive iterations adaptation(miniter())
e(adapt_alpha) adaptation parameter adaptation(alpha())
e(adapt_beta) adaptation parameter adaptation(beta())
e(adapt_gamma) adaptation parameter adaptation(gamma())
e(adapt_tolerance) adaptation tolerance adaptation(tolerance())
e(repeat) number of attempts used to find feasible initial values
Macros
e(cmd) bayesmh
e(cmdline) command as typed
e(method) sampling method
e(depvars) names of dependent variables
e(eqnames) names of equations
e(likelihood) likelihood distribution (one equation)
e(likelihood#) likelihood distribution for #th equation
e(prior) prior distribution
e(prior#) prior distribution, if more than one prior() is specified
e(priorparams) parameter specification in prior()
e(priorparams#) parameter specification from #th prior(), if more than one prior() is specified
e(parnames) names of model parameters except exclude()
e(postvars) variable names corresponding to model parameters in e(parnames)
e(subexpr) substitutable expression
e(subexpr#) substitutable expression, if more than one
e(wtype) weight type (one equation)
e(wtype#) weight type for #th equation
e(wexp) weight expression (one equation)
e(wexp#) weight expression for #th equation
e(block#_names) parameter names from #th block
e(exclude) names of excluded parameters
e(filename) name of the file with simulation results
e(scparams) scalar model parameters
e(matparams) matrix model parameters
e(pareqmap) model parameters in display order
e(title) title in estimation output
e(rngstate) random-number state at the time of simulation
e(search) on, repeat(), or off
Matrices
e(mean) posterior means
e(sd) posterior standard deviations
e(mcse) MCSE
e(median) posterior medians
e(cri) credible intervals
e(Cov) variance-covariance matrix of parameters
e(ess) effective sample sizes
e(init) initial values vector
Functions
e(sample) mark estimation sample