BIEMS: A Fortran 90 Program for Calculating Bayes Factors for Inequality and Equality Constrained Models

This paper discusses a Fortran 90 program referred to as BIEMS (Bayesian inequality and equality constrained model selection) that can be used for calculating Bayes factors of multivariate normal linear models with equality and/or inequality constraints between the model parameters versus a model containing no constraints, which is referred to as the unconstrained model. The prior that is used under the unconstrained model is the conjugate expected-constrained posterior prior and the prior under the constrained model is proportional to the unconstrained prior truncated in the constrained space. This results in Bayes factors that appropriately balance between model fit and complexity for a broad class of constrained models. When the set of equality and/or inequality constraints in the model represents a hypothesis that applied researchers have in, for instance, (M)AN(C)OVA, (multivariate) regression, or repeated measurements, the obtained Bayes factor can be used to determine how much evidence is provided by the data in favor of the hypothesis in comparison to the unconstrained model. If several hypotheses are under investigation, the Bayes factors between the constrained models can be calculated using the obtained Bayes factors from BIEMS. Furthermore, posterior model probabilities of constrained models are provided which allows the user to compare the models directly with each other.


Introduction
This paper discusses a program that can be used to determine which model (or hypothesis) receives most evidence from the data given a set of models that contain equality constraints and/or inequality constraints on the model parameters. This will be done using the Bayes factor, a Bayesian model selection criterion. The program BIEMS (Bayesian inequality and equality constrained model selection) is written in Fortran 90 using IMSL subroutines (Visual Numerics 2003). A user friendly interface has been developed for practitioners who are not familiar with the technical details of statistical modeling. The interface allows one to simply specify equality and inequality constraints on the parameters of interest. From the website http://tinyurl.com/informativehypotheses, it is possible to download a compiled version of the program, including the interface. Source codes are available along with this manuscript.
The program is designed for applied researchers, e.g., social or medical, who have specific expectations that can be formulated in terms of equality and inequality constraints between the model parameters. Hoijtink et al. (2008) refer to such expectations as informative hypotheses in contrast to hypotheses that do not contain any information about the relationship between the model parameters, such as the standard null and alternative in classical hypothesis testing. For references about the use of Bayes factors to test scientific theories, see Kass and Raftery (1995) and the references therein. The methodology that is used in BIEMS is based on Mulder et al. (2010) and Mulder et al. (2009). The first reference contains the technical details of the method, such as, prior specification. The latter focusses on repeated measurements and is more accessible for applied researchers.
In Section 2.1, the multivariate normal linear model is discussed. Section 2.2 describes how time-varying covariates can be included in the model. The formulation of model constraints is addressed in Section 2.3. Section 3 describes the specification of the unconstrained default prior which is the conjugate expected-constrained posterior prior (CECPP). In Section 3.1, two basic properties of this prior are discussed. Section 3.2 describes how linear restrictions are specified to obtain so-called constrained posterior priors. Subsequently, the hyper-parameters of the constrained posterior priors are calculated in Section 3.3 and the construction of the CECPP is given in Section 3.4 including the corresponding posterior distribution. In Section 3.5, it is discussed how minimal training samples are sampled which are used to construct the constrained posterior priors. In Section 4.1, it is described how the Bayes factor is estimated using MCMC (Markov chain Monte Carlo) sampling for an inequality constrained model versus the unconstrained model. Section 4.2 illustrates how the estimation error of the Bayes factor is determined. In Section 4.3, the procedure is discussed for estimating Bayes factors of models containing equality constraints versus the unconstrained model. Section 5 provides several short examples of model implementations for analysis of variance (ANOVA), linear regression, multivariate analysis of variance (MANOVA), and repeated measures. In Section 6, a short conclusion is given.

Constrained model specification 2.1. The multivariate normal linear model
The multivariate normal linear model can be written as In this model, a P -variate dependent variable y i , i = 1, . . . , N , is explained by a vector d i of size J containing group membership, i.e., d ij = 1 if the ith observation belongs to group j 0 otherwise, a K-variate explanatory variable x i , and ε i a multivariate Gaussian error with ε i ∼ N (0, Σ). Furthermore, M is a J × P matrix with group means where the (j, p)th element µ jp denotes the mean (or intercept) of the pth dependent variable for the jth group and A is a K × P matrix with regression coefficients where the (k, p)th element α kp denotes the coefficient of the kth covariate for the pth dependent variable.
For N observations, the model can be written as where The likelihood of the multivariate normal linear model (2) is given by where β is a vector of length D = P (J + K), which is the vectorization of B, i.e., vec(B) = β, β is the maximum likelihood estimator of β, witĥ B = (X X) −1 X Y, The least squares estimate of Σ, which is provided in the output of BIEMS, equalsΣ = S/(N − J − K). Expression (5) shows that the likelihood is proportional to a normal-inverse-Wishart distribution. For this reason, a multivariate normal-inverse Wishart prior for {β, Σ} is used in BIEMS which is generalized natural conjugate (Press 2005, p. 253-256), i.e., π 0 (β, Σ) = π 0 (β)π 0 (Σ). The unconstrained prior is discussed in Section 3.

Modeling time-varying covariates
When modeling repeated measurements using the multivariate normal linear model, timevarying covariates can be included in (2) according to with w iu is a vector of length P containing the scores on the uth time-varying covariates of the ith observation, G u = diag(γ u1 , . . . , γ uP ), where γ up is the regression parameter of the uth time-varying covariate measured at the pth time point, for u, = 1, . . . , U . The matrices G u are diagonal to ensure that the uth time-varying covariate measured at time point p is not regressed on the outcome variable measured at time point p = p. In the vectorization of B, i.e., β, which is important when specifying the (in)equality constraints of the model in (10), the zeros are omitted (see also Section 5.5). Note here that it is also possible to model more than one dependent variable as repeated measurement.
When time-varying covariates are present, the likelihood of the parameters in the structured parameter matrix B given Σ can be obtained as follows. First, an unstructured parameter matrix is substituted for B, which is denoted by where G * u , for u = 1, . . . , U , are unstructured parameter matrices which replace the diagonal matrices G u , for u = 1, . . . , U , B in (7). The vector with all off-diagonal parameters of the matrices G * u will be denoted by γ * which contains 1 2 U P (P − 1) elements. Therefore, the vectorization of B * is a permutation of the parameter vector (β , γ * ) , where β is the vectorization of B in (7) with zeros omitted. The likelihood of the unstructured parameter matrix B * and Σ is given in (5), and therefore, the likelihood of β, γ * |Σ is multivariate normal. Subsequently, the likelihood of β|Σ, γ * = 0 is also multivariate normal and given by whereβ andγ * are the maximum likelihood estimates, and cov(γ * , β), cov(β, γ * ), var(β), and var(γ * ) are the appropriate submatrices of the covariance matrix Σ⊗(X X) −1 . Subsequently, the conditional posterior of β|Σ, γ * = 0 can be obtained by combining this likelihood with the multivariate normal prior of β which is discussed in the next section.
An example of a model including a time-varying covariate is given in Section 5.5. More details about modeling time-varying covariates using the multivariate normal linear model can be found in Mulder et al. (2009).

Equality and inequality constrained multivariate normal linear models
The system of L t inequality constraints on the parameters β of the inequality constrained model M t can be written as where R t is a L t × D matrix containing inequality coefficients and r t is a vector of length L t of constants. The augmented matrix [R t |r t ] should be implemented in the file inequality_constraints.txt in the stand-alone version of BIEMS (Appendix B). Examples are provided in Section 5.
For an equality constrained model M t , the system of equalities is given by where Q t is a H t × D matrix containing H t equality coefficients and q t is a vector of length H t containing constants. The augmented matrix [Q t |q t ] should be given in the file equality_constraints.txt (Appendix B). When combining (8) and (9), we obtain a constrained model with inequalities and equalities.
The system of constraints of model M t will be denoted by where ≥ t denotes that each constraint of this system is either a ">" or a "=", depending on the model M t .

Default prior specification
In order to compute Bayes factors, priors distributions must be specified for β and Σ. Because all constrained models are nested in the unconstrained model M 0 : β ∈ R D , one only needs to specify a prior under the unconstrained model and use truncations of this prior under the constrained models, i.e., π t (β, Σ) = c −1 t π 0 (β, Σ)1 Mt (β) with c t = β∈Mt π 0 (β, Σ)∂β∂Σ, where π 0 denotes the prior under M 0 and π t denotes the prior under M t . This prior choice has two important advantages. First, only a single prior π 0 needs to be specified. Second, the Bayes factor B t0 of a constrained model against the unconstrained model can be obtained without computing the marginal likelihoods of the models. This is discussed in the Section 4. In this section, it will be described how a default prior under the unconstrained model can be obtained which results in Bayes factors that are effective for evaluating models containing equality and inequality constraints on the parameters.

Properties of the unconstrained prior
In Mulder et al. (2010) and Mulder et al. (2009), a prior specification was proposed that resulted in Bayes factors with an appropriate Occam's razor for testing equality and inequality constrained models due to an effective balance between model fit and model complexity. The implementation of this method in BIEMS is discussed in this section.
The CECPP is used as unconstrained prior for β ). The CECPP is based on constrained posterior priors, which are obtained as follows. For the covariance matrix Σ, Jeffreys' prior is used, i.e., π 0 (Σ) ∝ |Σ| −(P +1)/2 . This noninformative improper prior can be used because of arguments of orthogonal transformations of scale and location parameters (Jeffreys 1961).
The CECPP of β has the following three key properties.
1. It is based on minimal training samples which are subsets of the data that contain minimal information to obtain a proper posterior prior (e.g., Berger and Pericchi 1996). This property is essential for testing models with equality constraints on the model parameters. The Bartlett paradox (Bartlett 1957) is then avoided.
2. It is centered on the boundary of the constrained parameter space under investigation. The boundary of the constrained space under investigation is defined as This property is essential for testing models containing inequality constraints. Note that B must be nonempty which implies that the models under investigation must be comparable (Mulder et al. 2010). For instance, M 1 : β = 0 and M 2 : β > 1 are not comparable but M 1 : β = 0 and M 3 : β > 0 are comparable. In this case, the CECPP of β is centered at β = 0. Note that B = {0} in this case. In the interface of BIEMS (Appendix A), it is automatically checked whether the constrained models are comparable when using the default prior specification setting.
3. The parameters that are compared with each other in the constrained models have equal variances in the CECPP. This is essential for incorporating the complexity of inequality constrained models with unequal weights on the parameters of interest, e.g., M 1 : β 1 > 2β 2 > 0. This will be elaborated in Section 3.3.
In the remaining part of this section, it will be discussed how the CECPP and the corresponding posterior distribution is obtained.

Linear restrictions for constrained posterior priors
The CECPP can be seen as an average over all constrained posterior priors (CPPs). In contrast to standard posterior priors which are obtained by updating a noninformative improper prior with a minimal training sample, CPPs are obtained by updating a noninformative improper prior with a minimal training sample under certain linear restrictions. The linear restrictions ensure that the constrained posterior priors are located on the boundary of the constrained parameter space. Note that standard posterior priors are located around the likelihood, and therefore, generally will not satisfy the second property in Section 3.1. Mulder et al. (2010) showed that the resulting intrinsic Bayes factors may not be effective for testing inequality constrained models. In this section, it will be discussed how the linear restrictions are specified to obtain constrained posterior priors that satisfy the second property.
Two sets of linear restrictions will be constructed: One for the CPP means, say [R 0,M |r 0,M ], and one for the CPP variances, say [R 0,V |r 0,V ]. The reason that different sets of restrictions are needed can for instance be seen when testing M 1 : β = 0 versus M 0 : β ∈ R 1 . The restriction for the CPP mean will be β = 0 to ensure that the CPP mean of β equals 0 (to satisfy the second property), while for the CPP variance of β no restriction in needed. Note that if β = 0 would be a restriction for the CPP variance, the CPP variance of β would become equal to zero which is undesirable. On the hand, when testing M 2 : β 1 > β 2 > β 3 versus M 0 : β ∈ R 3 , the restriction for the CPP means and CPP variances are identical, namely, so that the CPP means are identical and the CPP variances are identical for β 1 , β 2 , and β 3 , for every randomly selected minimal training sample.
The restrictions for the CPP means, [R 0,M |r 0,M ], are identical to the row echelon form of [R 0 |r 0 ] where the zero rows are omitted. This ensures that the CPP means are located on the boundary B. The use of the row echelon form with zero rows omitted is important for obtaining the hyperparameter of CPPs which is discussed in the following section.
The restrictions for the CPP variances are obtained from the row echelon form of [R * 0 |0] where the zero rows are removed. Hence, the vector of constants does not play a role when restricting the CPP variances, i.e., r 0,V = 0. The matrix R * 0 is obtained fromR 0 as follows.
If time-varying covariates are also incorporated in the model, additional restrictions of the form γ * = 0 are automatically added to the mean restrictions [R 0,M |r 0,M ] and variance restrictions [R 0,V |0]. The mean and variance restrictions that were used to obtain the CPPs can be found at "View/Edit Default Prior" in the interface. Via "Specify restrictions manually", a user can also specify the linear restrictions for β manually. However, we recommend to use the default setting as specified above.
In the stand-alone version of BIEMS, the restriction matrix for the CPP means can be manually specified in the file restriction_matrix.txt. The restrictions for the CPP variances are based on the rules stated above. By default, the system of restrictions is equal to the row echelon form of the system of constraints of the model M t , i.e., [R t |r t ]. Note that the same restrictions must be used when computing B t0 . This is automatically done when using the BIEMS interface.

Hyper-parameters of the constrained posterior priors
Based on the linear restrictions for the means and the linear restrictions for the variances, CPPs are obtained from randomly selected minimal training samples, [Y l |X l ] where l is the training sample index, using restricted regression (Judge et al. 1980). The CPP of β is multivariate normally distributed, whereβ R l is the restricted maximum likelihood (RML) estimate that is obtained from the minimal training sample and the restrictions [R 0,M |r 0,M ] andΨ R l is the diagonal covariance matrix that is obtained from the minimal training sample and the linear restrictions [R 0,V |0].
The RML estimate of β maximizes the likelihood of [Y l |X l ] under the system of restrictions R 0,M β = r 0,M , which yieldŝ whereX l = I P ⊗ X l andβ l is the unrestricted ML estimate of β given [Y l |X l ]. This is explained in more detail in Mulder et al. (2010). Because R 0,Mβ R l = r 0,M holds, for all training samples, all CPPs are centered on the boundary under investigation B.
Based on the minimal training sample and the restrictions R 0,V β = 0, restricted regression is used to obtain the covariance matrix, which is given bŷ where 1 K+P U is a vector of length K + P U with ones, I D is an identity matrix of size D = P (J + K + P U ),Σ R l is the RML estimate of the error covariance matrix Σ, and vec(B R l ) =β R l . See Mulder et al. (2010) for a thorough explanation of (15). In (15),Ψ R l is the covariance matrix that is obtained when using restricted regression under the assumption of a diagonal covariance matrix Σ and independent variables in β in the constrained posterior prior. Because the training sample [Y l |X l ] of size N l = J + K + P U β 1 β 2 β1 2 β = 2 π 3 π 2 π 1 Figure 1: Contours of different priors: π 1 represents a uniform prior for (β 1 , β 2 ), π 2 represents identical normal priors for (β 1 , β 2 ), and π 3 represents normal priors with var(β 1 ) = 4var(β 2 ). The complexity of the inequality constrained model M 1 : β 1 > 2β 2 > 0 corresponds to the relative size of the grey areas. Hence, c 1 1 = 0.0625, c 2 1 = 0.0738, and c 3 1 = 0.125.
is minimal under the unrestricted model, and not under the restricted model, the variances are rescaled to a single observation by multiplyingΨ Subsequently, when multiplying this with 1/N * l , the variances are rescaled according to a training sample size of size N * l . By default, the value of N * l is chosen equal to the minimal training sample size under the restricted model, which is equal to the number of free parameters under the restricted model. This is equal to the number of free parameters in the unrestricted model D minus the number of pivots in [R 0,M |r 0,M ], plus the number of diagonal elements of Σ, which equals P . The user can adjust the scale N * l to construct more or less informative constrained posterior priors. This can be changed in the interface under "Scale of prior variance".
In the CPP covariance matrixΨ R l , the variances of parameters are equal which are compared with each other in the constrained models under investigation, i.e., var(β d 1 ) = var(β d 2 ) under the restriction v 1 β d 1 = v 2 β d 2 for ∀v 1 , v 2 = 0. Note that this differs from the CPP variance proposed by Mulder et al. (2010) where v 2 1 var(β d 1 ) = v 2 2 var(β d 2 ) under the same restriction. To justify this choice, we consider an example of an inequality constrained model with nonequal weights on the parameters of interest, e.g., M 1 : β 1 > 2β 2 > 0. In Mulder et al. (2010), the complexity of an inequality constrained model was defined as the prior probability mass in the inequality constrained space assuming a uniform prior on (− , ) for → ∞. For model M 1 , this implies that a measure of complexity that is equal to 1/16 = 0.0625 according to this definition. In the Bayes factor based on CPPs, model complexity of M 1 is incorporated as the CPP probability mass in the inequality constrained space {β|β 1 > 2β 2 > 0}. Under the linear restrictions β 1 = β 2 = 0 for the CPP means and β 1 = β 2 for the CPP variances, such that var(β 1 ) = var(β 2 ) and E(β 1 ) = E(β 2 ) = 0 in the CPP, model complexity is incorporated as approximately 0.0738. The probabilities 0.0625 and 0.0738 correspond to an appropriate measure of complexity, based on, respectively, identical uniform priors or identical normal priors for β 1 and β 2 when evaluating M 1 : β 1 > 2β 2 > 0. This is not the case for the variance restriction β 1 = 2β 2 (suggested by Mulder et al. 2010) yielding var(β 1 ) = 4var(β 2 ) and a substantial different measure for the complexity for M 1 : β 1 > 2β 2 > 0, namely 0.125. This is illustrated in Figure 1.
Note that when time-varying covariates are also incorporated in the model, the restrictions γ * = 0, which fix the dummy parameters to zero, are also present in [R 0,V |0]. As a result, the variances of the parameters γ * are equal to zero inΨ R l . Therefore, the CECPP covariance matrix of the parameters β * , i.e., the full parameter vector β with the dummy parameters γ * omitted, can be obtained by removing the rows and columns inΨ R l that belong to the dummy parameters γ * .

CECPP and posterior
The CECPP of β, given by is the conjugate form of the empirical expected-posterior prior (Pérez and Berger 2002) when using constrained posterior priors instead of standard posterior priors. Note here that Jeffreys' prior is used for the covariance matrix Σ. The hyper parameters of the CECPP of β are robustly estimated from 1,000,000 draws based on 1,000 draws from the constrained posterior priors of 1,000 randomly selected training samples according to the scheme in Table 1.
In step (vi), the CECPP means were calculated as the sample means of the RML estimatesβ R l when R 0,M contains rows with more than two non-zero elements. The reason is that, although the medians are more robust to outliers, the vector of medians (median l β R l,1 , . . . , median l β R l,D ) will generally not satisfy the mean restrictions [R 0,M |r 0,M ] in this case. Consequently, the second key property (Section 3.1) would not be satisfied when using the medians. In all other cases, the CECPP means are calculated as the sample medians. As a result, the CECPP will be located on the boundary of the constrained space of interest.
Due to sampling fluctuations and restrictions on the mean with unequal weights, e.g., β 1 = 2β 2 , an additional modification of the CECPP standard deviations was performed at the end of step (vii) which was based on restricted regression as in (14). This ensures that the parameters that are compared with each other have equal CECPP variances. Consequently, the CECPP variances will satisfy the third key property (Section 3.1).
The conditional posterior distributions of β|Σ, Y and Σ|β, Y can be determined using Bayes' law from the CECPP and the likelihood in (5), (i) Initialize the training sample size to N l = J + K + P U + 1.
(ii) Randomly draw a minimal training sample [Y l |X l ] of size N l .

Issues with minimal training samples
Because the β's are modeled independently in the constrained posterior prior (Ψ R l is diagonal), training samples of size J + K + P U + 1 generally result in appropriate constrained posterior priors. However, in some cases, e.g., data with many identical observations, the diagonal elements of these matrices may be (approximately) zero. This problem can be overcome by increasing the training sample size such that the variance in the training data increases. In BIEMS, the training sample size is increased by 1 after 5 training samples have been sequentially selected for which the estimate of the generalized error variance based on the training sample, i.e., det(Σ R l /N l ), is smaller than 0.05 times the estimate of the generalized error variance of the complete data matrix, i.e., det(S/N ), whereΣ R l is the estimate of Σ based on the RML estimate of B of the lthe training sample given in (16). Therefore, the scheme in Table 2 is used where the roman numbers refer to the steps in Table 1. Note that the constrained posterior prior variance of β is not influenced by the size of the minimal training sample because the constrained posterior prior variances of the β's are scaled according to the number of free parameters under the restriction [R 0,M |r 0,M ], which remains unchanged.

Initialize
2. Set N l = N l + 1, V = 0, and the training sample counter to 1.

Calculating Bayes factors
When the CECPP is used as prior under the unconstrained model M 0 and the prior under the inequality constrained model M t is equal to the CECPP truncated in the parameter space of the constrained model M t , denoted by M t , the Bayes factor of model M t against model M 0 can be be expressed by where π 0 (β|Y) is the marginal posterior of β that is based on the CECPP and the complete data Y. The derivation can be found in Klugkist and Hoijtink (2007). They refer to the unconstrained prior as the encompassing prior.
BIEMS automatically computes all Bayes factors B t0 of each constrained model M t against the unconstrained model M 0 . Because the same CECPP is used for calculating the Bayes factors B t0 , for t = 1, . . . , T , the Bayes factor between two constrained models M 1 and M 2 can be computed by For an inequality constrained model M t , BIEMS automatically computes the Bayes factor of model M t versus its complement, i.e., M t : β ∈ M t , which equals In the BIEMS interface (Appendix A), the posterior model probabilities can be computed by pressing the "Compute" button in the "Posterior model probabilities" frame. These are calculated according to

Estimating f t and c t for an inequality constrained model
The quantities f t and c t are estimated using a Gibbs sampler (e.g., Gelman et al. 2004). The complexity c t can be estimated by sampling S draws from the multivariate normal CECPP of β in (17). The estimate is equal to the proportion of draws satisfying the inequality constraints of model M t , i.e.,ĉ where β (s) 0 is the sth draw from the CECPP, for s = 1, . . . , S (S = 20, 000 by default, modifiable in the interface, see Appendix A, or in the file input_BIEMS.txt, see Appendix B), and I(·) is the identity function. Similarly, f t can be estimated as the proportion of draws from the unconstrained posterior in (18) that satisfy the inequality constraints of model M t , i.e.,f where β (s) N is the sth draw from the unconstrained posterior. Based on the posterior conditionals of β|Σ, Y and Σ|β, Y in (18), a sample from the unconstrained posterior is obtained using the MCMC procedure in Table 3.
In the first step, the initial value of Σ is obtained using the posterior estimateB N , with

More efficient estimation for small c t
Estimating c t and f t using the above described method can be computationally intensive for models with small values for these probabilities. For example, for c 1 = 1/10! ≈ 2.76e−7 corresponding to a model M 1 : β 1 > . . . > β 10 based on the CECPP with identical priors for each β. A sample of at least 1e9 draws is needed to obtain a fair estimate of c 1 . For this reason, c t and f t are estimated in steps of two inequalities. For model M 1 , the probabilities 4. Repeat steps 2 and 3 for s = 1, . . . , S. c t and f t can be expressed stepwise as P (β 9 < β 10 |β 1 < . . . < β 9 ) = c 1,1 · . . . · c 1,5 for the prior (23) where the second index of c and f denotes the step.
Based on the CECPP, these (conditional) probabilities are equal to 1 6 , 1 20 , 1 42 , 1 72 , and 1 10 , respectively. The first probability P (β 1 < β 2 < β 3 ) can be estimated by directly sampling from the unconstrained CECPP and posterior and the proportion of draws satisfying β 1 < β 2 < β 3 are the estimates c 1 and f 1 , respectively. The estimates of the conditional probabilities are equal to the proportion of draws satisfying each pair of inequality constraints based on a sample from the conditional distributions of where m β d and s 2 β d are the conditional mean and variance of β d , respectively, and β L and β U are the lower and upper bound of β d , respectively. If , the conditional distribution is normal. Sampling from the truncated normal distribution can be done using inverse probability sampling (Gelfand et al. 1992). The number of draws to estimate these conditional probabilities is 20,000 by default. Thus, the total number of draws to estimate c 1 is 100,000. The same methodology is used to calculate the posterior probability f 1 .

Burn-in and convergence
BIEMS does not provide explicit MCMC output because sampling the mean parameter vector β and the covariance matrix Σ is relatively straightforward in the multivariate normal linear model using the Gibbs sampler presented in Table 3. By starting at the posterior expectation of Σ, no burn-in period is needed for obtaining a posterior sample, and consequently, convergence is not an issue. The same holds for obtaining a sample from the CECPP of β which is multivariate normal and independent of Σ.
When sampling under inequality constraints as was discussed above, the starting values for β are equal to the ML estimates. When the starting values are not located in the allowed parameter space, e.g., the ML estimate equals (β 1 , . . . ,β 5 ) = (1, 1, 1, 1, 0) when calculating P (β 3 < β 4 < β 5 |β 1 < β 2 < β 3 ), a burn-in period of 1,000 iterations is used to obtain a starting value for β and Σ in the highest probability density region under the restrictions at hand. The length of this burn-in period was always sufficient during the debugging of the program, also in extreme situations.
Furthermore, by default different seeds are used for every MCMC run which always resulted in accurate estimates of the Bayes factor during the debugging of the program. If necessary, the user can increase the MCMC sample size to obtain an estimate of the Bayes factor with higher accuracy. The computation of the sampling error of the Bayes factor is discussed later in this section.

Inverse probability sampling when the normal density is approximately zero
If the posterior density of β p is approximately zero in the allowed parameter space, say (β L , β U ), inverse probability sampling (Gelfand et al. 1992), according to β (s) = Φ −1 (u (s) ), with u (s) ∼ U (Φ(β L ), Φ(β U )) where Φ is the conditional posterior cdf of β, does not work because Φ(β L ) ≈ Φ(β U ) ≈ 0 or 1. This occurs for instance when the constraints badly fit the data.
When the allowed parameter space is an interval (β L , β U ) with approximately zero density, which is the case when the conditional posterior is β ∼ N (m β , s 2 β ) with m β β L or m β β U , the conditional posterior of β in the interval (β L , β U ) is approximated using an exponential distribution, i.e., π(β) = λ exp(−λβ). If m β β L , the parameter λ is obtained by first determining the distance ξ from the lower bound β L for which the posterior density is halved, i.e., Because the exponential density is halved at log(2)/λ, it holds that λ = log(2)/ξ with ξ given above. Based on this λ, a draw is obtained from the exponential distribution, In the case of m β β U , the procedure is similar.

Sampling error of the estimate of the Bayes factor
The estimation of the probabilities f t and c t as proportions of, respectively, the posterior and prior draws that satisfy the inequality constraints results in an estimation error of the Bayes factor. This estimation error is proportional to the reciprocal of the sample size. In BIEMS, this estimation error is reported in the output file. In this section, it is illustrated how the standard error of the estimate is calculated.
In BIEMS, the estimates of these probabilities are obtained as the proportion of hits in the CECPP and posterior, for c 1,h and f 1,h , h = 1, . . . , 5, respectively, satisfying the inequality constraints. Hence, each probability has a beta distribution given by beta where h is the number of draws satisfying the inequality constraints in step h from a total of S draws from CECPP or posterior. Subsequently, a sample of the Bayes factor B 10 can be obtained by sampling from these beta-distributions. The standard error of the estimated Bayes factor B 10 is equal to the standard deviation of this sample.
A small simulation study was conducted to determine the relationship between the sample size per two inequalities and the estimation error of the Bayes factor. A data set of size N = 10 was generated from a population with means β = (0, 0, 0, 1, 1, 1, 2, 2, 2, 2) and covariance matrix Σ = I 10 . The Bayes factor of M 1 : β 1 < . . . < β 10 versus M 0 : β ∈ R 10 was equal to 2.0e6. In Figure 2, a graph is displayed where the standard error of the Bayes factor is plotted versus the sample size per two inequalities. Given the large outcome of the Bayes factor, these standard errors are reasonably low. The default sample size is equal to 20,000 iterations per two inequalities. The sample size can be adjusted in the interface (Appendix A) and in the file input_BIEMS.txt (Appendix B) if a more accurate estimate is needed.

Equality constrained models
For constrained models with equality constraints between the model parameters (possibly in addition to inequalities), f t and c t cannot be estimated by directly using the above described method because sampled values are never exactly equal. In this section, it is described how the Bayes factor of such models versus the unconstrained model is estimated in BIEMS based on the methodology given in the previous section. The idea was suggested by (Laudy 2006, p. 115).
First, the equality constraints in model M t are replaced by approximate equalities, e.g., The evaluation bound δ 0 is chosen equal to the CECPP standard error of β 1 . Thus, the constrained model M t is approximated by an inequality constrained model, which is denoted by M t,δ 0 . The Bayes factor of model M t,δ 0 versus M 0 , i.e., B (t,δ 0 )0 , can be estimated using the method of the previous section.
Subsequently, the Bayes factor is estimated between the inequality constrained model M t,δ 1 and the inequality constrained model M t,δ 0 , with δ 1 = δ 0 /2, as the ratio of proportion of draws from the posterior and CECPP truncated at the inequality constraints of model M t,δ 0 that satisfy the inequality constraints of model M t,δ 1 . CECPP and posterior samples under Figure 3: Draws from the unconstrained posterior of (β 1 , β 2 ) (black), and subsequent constrained posteriors by |β 1 − β 2 | < δ w , with δ a+1 = δ a /2, δ 0 = 0.9. The dashed line is β 1 = β 2 . The data was generated for two groups of size 100 each with means β = (0, 1.5) and error variance σ 2 = 1. the inequality constrained model M t,δ 0 can be obtained using inverse probability sampling (Gelfand et al. 1992) and the method described in Section 4.1.

The Savage-Dickey density ratio
Consider the simple selection problem of an equality constrained model M 1 : β = m versus the unconstrained model M 0 : β ∈ R 1 . The Bayes factor in (19) is then equal to the Savage-Dickey density ratio (Dickey, 1971), which is given by  (26). The Bayes factors provided by BIEMS using the above described mechanism were equal to B 10 = 5.35 (with sampling error 7.4e−2) for the first data set and B 10 = 0.24 (with sampling error 4.6e−3). Hence, the true Bayes factors lie well within the confidence intervals provided by BIEMS from which it can be concluded that the methodology can also be used for models containing equality constraints between the parameters. The effectiveness of this procedure was also shown by van Wesel et al. (2011).

Short examples
In this section several commonly used models are briefly discussed including two constrained models of interest. In each example, the parameter matrix B, the vector β, and different constrained models are given. Also the specification of the restriction matrix is discussed in each example. In Section 5.6, five empirical studies are briefly discussed which were analyzed using BIEMS. The analyses of these studies can be found on the internet site http://tinyurl.com/informativehypotheses.
Note here that the order of the µ's in B and β depends on the coding of the groups 1 to 6 in data.txt. Subsequently, the following model selection problem tests the presence of an interaction effect in a specific direction The constraints of each model needs to be separately specified in the interface. For instance for M 1 , the inequality constraints are µ 11 − µ 21 > µ 12 − µ 22 and µ 12 − µ 22 > µ 13 − µ 23 . The set of model constraints is then For M 2 , the inequality ">" is replaced by an equality sign "=". The default set of restrictions for the specification of the prior distribution are equal to the row echelon forms of for the CPP means and for the CPP variances. The resulting CECPP means and variances also satisfy these restrictions.
The (in)equality constraints in each model must be separately specified in the model, e.g., α 1 > α 2 , α 1 > α 3 , α 1 > α 4 , and α 1 > α 5 for model M 2 . The default set of restrictions are obtained by taking the row echelon form of the combination of the constraints of both models.

This results in an identical set of restrictions for the CPP means and variances which is given by
Based on these restrictions, the CECPP of α 1 , . . . , α 5 are identical. After obtaining the Bayes factors B 10 and B 20 , the Bayes factor between these two models is equal to B 12 = B 10 /B 20 . The posterior model probabilities can also be computed in the interface of BIEMS (Appendix A).
When the data is not standardized, the constraints of models M 1 and M 2 are evaluated for unstandardized regression coefficients. Generally this is not recommendable because covariates may be measured on different scales. For this reason, it is recommended to standardize the data which can be specified in Step 2 in the interface (Appendix A) and in the file input_biems.txt (Appendix B). As a result, the model constraints are evaluated for standardized regression coefficients.

Regression 2: Variable selection
Again, consider the regression model above with five exploratory variables and the following model selection problem This is a variable selection problem where model M 1 only incorporates the first covariate to explain the dependent variable, model M 2 incorporates the first, second, and fourth covariate, and the unconstrained model M 0 incorporates all five covariates. Similarly as in the previous section, we recommend to standardize the data.
In the interface where both models are simultaneously implemented, the default set of restrictions for the CPP means are given by In the stand-alone version where each model is separately evaluated, this set of restrictions must be manually set for model M 2 (for M 1 , these restrictions correspond with the default setting). There are no restrictions for the CPP variances in the default setting, and therefore the CECPP variances will generally differ between the α's.

MANCOVA
Consider an experiment where three outcome variables are measured for two different groups with one covariate, i.e., B =   µ 11 µ 12 µ 13 µ 21 µ 22 µ 23 α 11 α 12 α 13   , and β = (µ 11 , µ 21 , α 11 , . . . , α 13 ) , and the following models M 1 : µ 11 > µ 21 , µ 12 < µ 22 , µ 13 > 2µ 23 > 0 Under model M 1 , the third measurement mean of the first group µ 13 is more than twice as large as the measurement mean of the second group µ 23 , which is larger than zero. Such constraints can occur in practice when the µ's are the means of measurement differences. We also recommend to standardize the explanatory variables in a MANCOVA. The Bayes factor of model M 1 against its complement M 2 is automatically calculated, which can be viewed by pressing the button "Detailed output" after the Bayes factors have been calculated.

Empirical studies
Here, we briefly discuss five empirical studies which were analyzed using BIEMS. Replication materials are available along with the manuscript and the data sets can also be found on the website http://tinyurl.com/informativehypotheses. The first data set was discussed by Kammers et al. (2009), the second data was discussed by Mulder et al. (2009), and the third, fourth, and fifth data sets were discussed by Mulder et al. (2010).

Repeated measurements 1: Rubber hand illusion
In an experiment discussed by Kammers et al. (2009b), the occluded right index finger and the visible index finger of a rubber hand were stroked either synchronously (illusion condition) or asynchronously (control condition). After this stimulation period, one of five perceptual localization responses was collected for N = 14 subjects. The following constrained models were of interest where µ p is the measurement mean of the perceived location of the occluded hand, for p = 1, . . . , 5. Under model M 1 , the location of the occluded hand is unaffected by the amount of new proprioceptive information. Under model M 2 , the perceived location of the occluded hand is affected when moving the occluded hand, although it is independent of the amount of movements. Under model M 3 , the perceived location of the hand is influenced by the number of movements of the occluded hand.

Repeated measurements 2: CONAMORE
The conflict and management of relationships study (CONAMORE; Meeus et al. 2004) is an ongoing longitudinal study in the Netherlands. The goal of the study is to asses relationships of adolescents with parents and best friend as well as the emotional states of the participants. In this example, the data consisted of N = 340 Dutch participants who were measured once a year in a period of five years (from the age of 13 until 17). This was a part of a larger data set described in Selfhout et al. (2009).
The outcome variable was self-reported satisfaction with peer relations. There were two groups: boys and girls (coded by j = 1 and 2 in data.txt, respectively) and one time-varying covariate "generalized anxiety disorder" which was measured at each wave. Four models were analyzed, where µ gp and µ bp is the mean of self-reported satisfaction of girls and boys at the age of p adjusted for generalized anxiety disorder, respectively. Model M 1 states that there is no difference in self-reported satisfaction between boys and girls at each wave. Model M 2 states that girls score their satisfaction with peer higher than boys at each wave. Model M 3 states that girls score higher on self-reported satisfaction than boys and that the difference increases at each wave. Finally, model M 4 states that that there is no difference between boys and girls at the age of 13 and 14, but from the age of 15 girls score higher than boys and the difference increases at each wave.

Multivariate one-sided testing: Cell counts of HIV-positive newborn infants
In this example, researchers were interested whether ritonavir therapy had a positive effect on the immunoreconstruction of 36 HIV-positive newborn infants (Sleasman et al. 1999). The analysis was limited to the cell counts of CD45RA T and CD45RO T, which were measured at birth and after 24 weeks of therapy (Larocque & Labarre, 2004). The models under investigation were where µ = (µ 1 , µ 2 ) which contain the mean differences between the cell counts (after 24 weeks minus at birth) of the two cell types, and 0 = (0, 0) .

Multivariate analysis of variance: Does vinylidene fluoride cause liver damage?
Vinylidene fluoride is suspected of causing liver damage. To investigate this, 4 groups of 10 male Fischer-344 rats received different dosages of this substance by inhalation. Three serum enzyme levels (SDH, SGOT, and SGPT), which often indicate liver damage, were measured for each rat. The data, printed in (Silvapulle and Sen 2005, p. 10), appeared in a report of Litton Bionetics Inc. (1984).
The data can be analyzed with a MANOVA model. When denoting µ j = (µ j1 , µ j2 , µ j3 ) with µ jp the mean of the pth serum level of the jth group, for j = 1, . . . , 4, where the dosage of vinylidene fluoride increases from group 1 to 4, the following two models were of interest: Model M 1 assumes that vinylidene fluoride has no influence on the three serum levels. Model M 2 assumes that each serum level increases when increasing the dosage of vinylidene fluoride.

Multivariate regression: Cigarette burn and sugar percentage
The data that was provided by (Anderson and Bancroft 1952, p. 205) consisted of chemical components of 25 tobacco leaf samples consisting of six explanatory variables: nitrogen (x 1 ), chlorine (x 2 ), potassium (x 3 ), phosphorus (x 4 ), calcium (x 5 ), and magnesium (x 6 ), and two dependent variables: rate of cigarette burn in inches per 1,000 seconds (y 1 ) and per cent sugar in the leaf (y 2 ). After standardizing the dependent variables and covariates, the following models with inequality and equality constraints on the standardized regression coefficients α kp with k = 1, . . . , 6 and p = 1, 2 (inspired by the results of Bedrick and Tsai 1994) were analyzed M 1 : α 11 = . . . = α 61 , α 12 = . . . = α 62 M 2 : 0 < (α 11 , α 31 , −α 41 , α 51 , −α 61 ) < −α 21 , and 0 < (α 12 , α 32 , α 42 , α 52 , −α 62 ) < α 22 , Model M 1 assumes all standardized regression coefficients corresponding to a dependent variable to be equal, model M 2 assumes that chlorine dominates the other explanatory variables in predicting the two dependent variables and that all standardized regression coefficients are either smaller than zero or larger than zero, and model M 3 assumes that each dependent variable is explained by only two specific explanatory variables where chlorine is dominant over the other explanatory variable.

Conclusions
The program that was discussed in this paper can be used to determine the Bayes factor of a multivariate normal linear model with equality and/or inequality constraints between the parameters against the unconstrained model. The outcome can be interpreted as the relative amount of evidence that is provided by the data in favor of the constrained model versus the unconstrained model. Due to the use of the CECPP ), which is based on certain linear restrictions and minimal training samples, the obtained Bayes factors appropriately balance between model fit and complexity for a broad class of equality and inequality constrained models.
A user-friendly interface (Appendix A) can be downloaded from the website http://tinyurl. com/informativehypotheses, which will allow researchers who are not familiar with the technical details of statistical modeling to use the proposed methodology. Also a stand-alone version (Appendix B) can be downloaded from this website. Because the Bayes factor of each constrained model against the unconstrained model is computed separately in the stand-alone version, it is important that the same set of restrictions (Section 3.2, Appendix B), and therefore the same CECPP, is used for each computation. In the interface, this is automatically done. The interface also provides posterior model probabilities of all models under investigation. Based on the posterior model probabilities, the constrained models can be compared directly with each other concerning the amount of evidence provided by the data for each constrained model.

A. User manual of the BIEMS interface
The BIEMS interface can be obtained from http://tinyurl.com/informativehypotheses. After downloading and unzipping the folder containing this software package, the executable BIEMS.exe will be found after opening the folder. The purpose of this appendix is to describe how to use the interface. Only a data file data.txt needs to be provided, all the other information can be entered using the the BIEMS Windows interface. Execution of BIEMS will render output that can be viewed within the Windows interface and will be placed in the subfolder BIEMS_calfBF.

A.1. Data
The data in data.txt should be formatted as whitespace-or tab-separated values without header and with no missing values, where each column corresponds to a variable. BIEMS uses all variables in the data file, so only the variables of interest should be included. In addition, the variables in the data file need to be in the correct order. Consider an example with three dependent variables, one explanatory variable, one time-varying covariate, and two groups (i.e., using the notation introduced in Section 2, P = 3, K = 1, U = 1, J = 2). As was described in Sections 2.1 and 2.2, the observed variables in the data must ordered in data.txt as follows that is, first the dependent variables, then the explanatory variables, than the time-varying explanatory variables and finally the variable containing group membership. The group column is required, although it does not need to be sorted. If there is no grouping in the data, it should contain all 1's.

A.2. Step 1: Data selection
After clicking on BIEMS.exe the opening screen ( Figure 5) will unfold. Clicking the button Select or Generate that can be found under the label Step 1 in the upper left hand corner of the screen, renders the choice to generate data (which is discussed in the last section of this manual) or to import your own data. After importing your own data, these will be displayed in a new screen ( Figure 6). In this screen you have to provide the Number of dependent variables, Number of explanatory variables and Number of time varying explanatory variables. After pressing Ok, a screen ( Figure 7) will pop up that allows you to specify the constrained models of interest. This screen will be discussed in the next section. A.3.
Step 2: Specifying models In the upper left hand corner of the model specification screen (Figure 7) you find a frame which is entitled 1: Compose. The tool in this section allows you to specify model constraints. If a model of interest is, for example, µ(1, 1) > µ(2, 1) > µ(3, 1) > µ(4, 1) with respect to four means in a one-way ANOVA, this can be specified by subsequently entering: µ(1,1) > µ(2,1) followed by a click on Add to list, µ(2,1) > µ(3,1) followed by a click on Add to list, µ(3,1) > µ(4,1) followed by a click on Add to list.
In the bottom left hand corner of the screen displayed in Figure 7, there is a frame 2: List of Inequalities and Equalities. Clicking on Define as Model places the composed model in the top right hand corner in the frame List of Models. Once this has been done, another model can be specified using the tool in the frame 1: Compose. To provide some examples of inequality and equality constraints one can specify: µ(1, 1) − µ(2, 1) = µ(3, 1) − µ(4, 1), µ(1, 1) − 2 * µ(2, 1) + µ(3, 1) > 0, µ(1, 1) − µ(2, 1) > 2.5, Figure 6: Data file selector screen. The model must be specified be setting the appropriate number of dependent variables, number of explanatory variables, and number of time-varying explanatory variables.
Note that the Edit and Delete buttons in the frames 2: List of Inequalities and Equalities and 3: List of Models can be used to modify or delete constraints in frame 2 and models in frame 3, respectively. As soon as the models of interest are specified the default prior distribution can be generated by clicking the button Generate Default Prior which can be found in the top of the screen. Before discussing this, some additional features in the model specification screen will be elaborated.

A.4. Additional features in the model specification screen
In the bottom right hand corner of the model specification screen (Figure 7), the frame 4: Settings can be found. Here the following default values for the computation of Bayes factors and posterior model probabilities can be modified: Ticking the circle in front of Specify restrictions manually allows manual specification of the restrictions used to construct the prior distribution using the tool in frame 1: Compose that was previously used to compose models. The restrictions that are used to specify the prior distribution are discussed in Section 3.2 of the paper. The scale of the prior variance can manually be adjusted by ticking the circle in front of Manually. The desired scale can be entered in the box located on the right of Manually.
The default value of the scale of the prior distribution is discussed in Section 3.3.
The default MCMC sample size can be adjusted. The MCMC sample size is discussed in Section 4.1 and in Section 4.2.
The default number of Max BF steps can be adjusted. This number of steps is denoted by A in Section 4.3.
The last item, which can be found in the lower right hand corner of this screen, is important. Here it can be indicated whether or not the dependent variables and the (time-varying) explanatory variables should be standardized. Note that regression coefficients in a (multivariate) multiple regression are only comparable if both the dependent and independent variables are standardized (Section 5.2). Note also that in the context of an (M)ANCOVA, i.e., there are groups and continuous explanatory variables in the model, the continuous predictors have to be standardized (Section 5.4).

The button
Step Back brings you back to the previous screen and deletes the information in the current screen. The button Close will end the current session with BIEMS. A.5.
Step 3: Generate the default prior distribution As soon as the models of interest are specified, the default prior distribution can be generated by clicking the button Generate Default Prior which can be found in the top of the screen under Step 3. A screen will open (Figure 8) showing the prior means and covariance matrix, prior scale and training sample size, and matrices containing the restrictions used for the computation of the prior means and variances. More information with respect to the content of this screen can be found in Sections 3.3 and 3.4. The screen can be closed by clicking the Ok button. Note however that it is possible to change the prior means and prior covariance matrix by typing new values for the old values. A.6.
Step 4: View/edit default prior Pressing the button View/Edit Default Prior which can be found in the top of the screen under Step 4, renders a return to the screen discussed in the previous section.

A.7. Step 4: Calculate Bayes factors
Pressing the button Calculate Bayes factors, which can be found in the top of the screen under Step 4, leads to a screen ( Figure 9) where the Bayes factors are calculated. These will be displayed when the MCMC computation is done. For each model, detailed output can be inspected by clicking the buttons Detailed output. Furthermore, by pressing (In)equality constraints, the (in)equality constraints of each model can be viewed. The main features of the output will be discussed in the next section. Finally, the posterior model probabilities will be computed after pressing Calculate in the appropriate frame. These will be displayed in the frame below.

A.8. Output
The results of an analysis are given in the file output_BIEMS.txt. At the top of the file under the label ***Bayes factor***, the Bayes factor for the constrained model M t against the unconstrained model M 0 is given (Section 4), along with its standard error (Section 4.2): 69.76646 If a model only contains inequality constraints, the estimated model complexity c t and estimated model fit f t (Section 4.1) are provided, as well as the Bayes factor for the model against its complement (Section 4). Under the label ***hits/BF in substeps***, the details of the computation of f t , c t and the Bayes factor in each substep are provided (Section 4.1 under the label 'More efficient estimation for small c t ').
In the output under the label ***Descriptive Statistics***, the maximum likelihood estimate of the parameter matrix B and the least squares estimate of Σ are given (Section 2.1). Under the label ***posterior***, the posterior means and variances of B are displayed. Under the label ***Prior***, the CECPP mean β EC 0 is displayed in its matrix form B EC 0 , followed by CECPP variances which are the diagonal elements of the prior covariance matrix Ψ EC 0 in (17), and finally the complete covariance matrix Ψ EC 0 is provided. These hyperparameters are ordered in essentially the same way as the matrix B, e.g., CECPP mean of B 1.294 1.294 1.294 (group 1) 1.294 1.294 1.294 (group 2) 4.038 4.692 4.557 (covariate 1) 1.354 2.134 2.808 (time-varying covariate 1) in the case of three dependent variables, two groups, one explanatory variable and one time varying covariate. Note that the row containing the estimates for the regression coefficients of the time-varying covariate correspond with the estimates of the diagonal elements of the matrix G 1 in Section 2.2. Finally, under the label ***Prior Restrictions***, the restrictions that were used for the computation of the hyperparameters of the constrained posterior priors (CPPs) are presented. First, the restrictions for the CPP means are presented, followed by the restrictions for the CPP variances. These restrictions are elaborated in Sections 3.2 and 3.3. A.9.
Step 1: Data generation After clicking on BIEMS.exe, the opening screen ( Figure 5) will unfold. Clicking the button Select or Generate renders the choice to generate data or to import your own data. If Generate is chosen, a screen will open ( Figure 10) that allows the specification of a population from which data will be generated. The population is described by the multivariate normal linear model presented in Equation 1 in this paper, extended with a multivariate normal In the opening screen, the Number of dependent variabels, Number of Groups and Number of explanatory variables have to be given. By clicking the button Use Settings, a frame will be opened where the following characteristics must be given: The sample size of each group; The error covariance matrix Σ which is introduced a few lines below Equation 1; The group means M and the regression coefficients A which can be found in Equation 1 in this paper; The means and covariance matrix of the explanatory variables in each group.
After all parameters are specified, two options are available: simulate a data set from this population or generate a data set from the population that has sample parameters of the error covariance matrix, group means, regression coefficients, means and covariance matrix of the explanatory variables that are equal to the population parameters. If the latter is desired, the box labeled Generate exact data has to be ticked before clicking the button Generate.
Data generated from a population with known properties can be useful to run tests with BIEMS, for example, to get an impression of the sample sizes needed in order to be able to evaluate the models of interest.

B. User manual of BIEMS (stand-alone version)
The purpose of this appendix is to describe the practical details of the stand-alone version of BIEMS which can be downloaded from the web site http://tinyurl.com/ informativehypotheses. Unlike the interface, this version only computes the Bayes factor of one constrained model against the unconstrained model. It is not possible to specify multiple constrained models. For this reason, when more than one model is under investigation, say M 1 and M 2 , it is important that the same linear restrictions and therefore the same CECPP is used when computing the Bayes factors of each constrained model against the unconstrained model, B 10 and B 20 . Subsequently, the two constrained models can directly be compared with each other by calculating the Bayes factor of M 1 against M 2 by B 12 = B 10 /B 20 .
A running example will be used throughout this appendix for illustration. The BIEMS program consists of one executable file biems.exe, a data file data.txt, a settings file input_BIEMS.txt and three further input files containing the constraints of the model and the restrictions on the prior necessary for calculating the Bayes factor of the constrained model against the unconstrained model (equality_constraints.txt, inequality_constraints .txt and restriction_matrix.txt). To calculate the Bayes factors of a model, set the model specification by modifying these text files with a plain text editor and run biems.exe. This will generate a file output_BIEMS.txt containing the output for the model. Note that this will overwrite previous versions of output_BIEMS.txt when present.

B.1. Data
The data in data.txt should be formatted as whitespace-or tab-separated values without header and with no missing values, where each column corresponds to a variable. BIEMS uses all variables in the data file, so only the variables of interest should be included. In addition, the variables in the data file need to be in the correct order. Throughout this appendix we consider an example with three dependent variables, one explanatory variable, one time-varying covariate, and two groups (i.e., P = 3, K = 1, U = 1, J = 2 when using the notation of Section 2.1 and 2.2). The observed variables in the data must ordered in data.txt as follows

B.2. General settings
The general settings for BIEMS are specified in input_BIEMS.txt, which is divided into five input lines. On the first input line, the main properties of the data are defined: These are the number of dependent variables, the number of explanatory variables, the number of time-varying covariates and the number of observations, P , K, U and N respectively (Sections 2.1 and 2.2). The iseed value is the seed of the random number generator, which can be set to any positive integer. It can also be set to -1. In that case a different random seed is used each time.
The second input line contains the number of constraints and restrictions that are placed on the model. We consider a model with 3 inequality constraints and 4 equality constraints (given in the next section), i.e., Input 2: #inequalities #equalities #restrictions 3 4 -1 These values should correspond to the number of equality constraints and inequality constraints of the model defined in the files equality_constraints.txt and inequality_constraints.txt, respectively. When the value for #restrictions is set to -1, the default set of restrictions on the prior is used which is equivalent to the row echelon form of the set of model constraints (see also Section 3.2). When the restrictions are manually specified in the file restriction_matrix.txt (discussed in the following section), for instance when multiple constrained models are under investigation and the default sets of restrictions are not identical across the models, the number of restrictions should then be stated in #restrictions.
The third input line allows the user to change some default values used by BIEMS: Input 3: sample size maxBF steps scale of CPP variance -1 -1 -1 Each of these options are set to -1, in which case the program's defaults are used. The first option specifies the number of draws used to estimate the (conditional) probabilities for each set of two inequality constraints, which defaults to 20,000 (see also Section 4.1 and 4.2). The second option gives the maximum number of steps used to approximate the Bayes factor for equality constraints (Section 4.3). By default, BIEMS uses as many steps as it takes for the Bayes factors between adjacent steps to become approximately one. The third option gives the scale of the variances of the constrained posterior prior for β. By default this is equal to the number of free parameters under the restricted model (Section 3.3).
The fourth and fifth input lines allow the user to standardize the dependent and explanatory variables: Input 4: standardize DV/IV 0 2 Input 5: covariates to standardize (if standardize IV = 2) 1 0 The first value on input line 4 indicates whether to standardize the dependent variables, the second value whether to standardize the explanatory variables and time-varying covariates (0 = no, 1 = yes). The second value can also be set to 2, to indicate that only some of the explanatory variables and/or time-varying covariates should be standardized. This is specified on input line 5, which should then contain a line of 0's and 1's of length K + U , indicating for each explanatory variable and time-varying covariate whether it should be standardized or not. As specified above for the example, only the explanatory variable is standardized. When testing constraints on the regression coefficients, it is recommendable to standardize the dependent variable(s) and the (tim-varying) explanatory variable(s), so that actually the constraints on the standardized regression coefficients are evaluated. Also in the case of a (M)ANCOVA, we recommend to standardize the explanatory variables.

B.3. Constraints and restrictions
The constraints of the model and restrictions on the prior are set in the files equality_constraints.txt, inequality_constraints.txt and restriction_matrix.txt.
If #restrictions is set to −1 in input_BIEMS.txt, the latter file does not need to be specified. The first two of these files contain the matrices [Q t |q t ] and [R t |r t ], as defined in Section 2.3. All three files have the same structure, with each row corresponding to a constraint or restriction.
The model under evaluation has the following constraints: Thus, it is expected that the group means do not change over time and the mean of group 2 is higher than the mean of group 1 at each time point. This is specified in the input files as follows (parameter names have been added here for convenience): equality_constraints.txt µ 11 µ 21 α 11 γ 11 µ 12 µ 22 α 12 γ 12 µ 13 µ 23 α 13 γ 13 In this example, the default set of restrictions (i.e., #restrictions = -1 in input_BIEMS.txt) can be used for constructing the construct the conjugate expected-constrained posterior prior (CECPP). This is equal to the row echelon form of the combination of inequality and equality constraints of the model under evaluation. When the constraints in inequality_ constraints.txt and equality_constraints.txt are combines and row operations are performed, the system is containing two rows with zeros. After removing these two rows, the following set of restrictions is used restriction_matrix.txt µ 11 µ 21 α 11 γ 11 µ 12 µ 22 α 12 γ 12 µ 13 µ 23 α 13 γ 13

B.4. Output
The results of an analysis are given in the file output_BIEMS.txt. At the top of the file the Bayes factor for the constrained model against the unconstrained model is given, along with its standard error (see Section 4.2): Bayes factor of the constrained model M_t versus the unconstrained model M_0: If only inequality constraints were used, the estimated model complexity c t and estimated model fit f t are provided, as well as the Bayes factor for the model against its complement (Section 4). Following this are some more details for each of the substeps.
At the bottom of the file some descriptive statistics are given: the maximum likelihood (ML) estimate of the parameter matrix B (when time-varying covariates are present the restricted ML estimate of B is provided given that γ * = 0 (Section 2.2)), the posterior means and variances of B, as well as the means, variances and covariances for the conjugate expectedconstrained posterior prior. The values given under ***prior*** correspond to the symbols in equation (5.15) as follows: under CECPP means of B are the values of β EC 0 (in matrix form B), CECPP variance of B contains the CECPP variances of the elements in B, and CECPP covariance matrix of beta represents the prior covariance matrix Ψ EC 0 of β. Parameters for these and the earlier descriptive statistics are ordered in essentially the same way as the matrix B, e.g., CECPP mean of B where the columns represent the measurements at three different time points. As a result of the CECPP restrictions, the prior means of the adjusted measurement means µ are identical in the above matrix. The complete matrix Ψ EC 0 is given as CECPP covariance matrix of beta, and the order of the elements in B is identical to the order of the elements in β = vec(B). Finally under ***Prior Restrictions***, the restriction matrices are given that were used to obtain the CECPP means and the CECPP variances of β.