R Package OBsMD for Follow-Up Designs in an Objective Bayesian Framework

Fractional factorial experiments often produce ambiguous results due to confounding among the factors; as a consequence, more than one model is consistent with the data. The practical problem is then how to choose additional runs in order to discriminate among the rival models and to identify the active factors. The R package OBsMD solves this problem by implementing the objective Bayesian methodology proposed by Consonni and Deldossi (2016). The main feature of this approach is that the follow-up designs are obtained through the use of just two functions, OBsProb() and OMD(), without requiring any prior specifications, so that it is fully automatic. OBsMD thus provides a simple tool for designing experiments to solve real-world problems.


Introduction
Fractional factorial designs (often referred to as screening experiments) are generally used during the early stage of an investigation when there is limited scientific knowledge. Unfortunately, such designs often do not lead to unequivocal conclusions regarding the model structure and the combination of factors that influence the response variable. From a Bayesian perspective, this means that the posterior distributions will be flat on both model and factor spaces. In these circumstances, a follow-up design is needed, as the aim is to choose extra runs in order to resolve this ambiguity.
The strategy proposed by Consonni and Deldossi (2016) and implemented in the OBsMD package (Nai Ruscone and Deldossi 2020) for R (R Core Team 2020) is explained in Section 2 and illustrated via a tutorial in Section 5.
This proposal represents the objective Bayesian version of the criterion proposed by Meyer, Steinberg, and Box (1996), implemented in the R package BsMD (Barrios 2020) and based on the Fortran 90 bundle mdopt written by Meyer (1996). The core of OBsMD is a Fortran 90 bundle obtained by suitably modifying the mdopt package of Meyer (1996) to accommodate the objective priors. The program was converted to subroutines called from the R package. The package is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=OBsMD. The adoption of an objective Bayesian approach (Berger and Pericchi 2001), where the priors are derived by default methods based on the statistical model, implies that no prior specifications are required from the user, so that the approach is fully automatic.
The paper is organized as follows. After a brief recall of classical strategies for augmenting an experimental design, Section 2 presents the objective Bayesian approach implemented in the R package OBsMD through the use of an example. A discussion of the choice of some relevant arguments for the functions OBsProb() and OMD() is reported in Section 3. Section 4 schematically introduces the functions OBsProb() and OMD() and their arguments. Finally a tutorial using the MetalCutting dataset included in OBsMD is reported in Section 5.

Follow-up design in an objective Bayesian framework
Several different techniques and strategies exist for augmenting an experimental design: foldover (Montgomery and Runger 1996), semifolding (Mee and Peralta 2000), the D-optimal design approach to augmentation (Goos and Jones 2011), and a Bayesian strategy in the context of model discrimination (Meyer, Steinberg, and Box 1996; later Consonni and Deldossi 2016). The package OBsMD implements the last of these proposals. To introduce the reader to the approach, Section 2.1 briefly explains, through the use of an example, when the researcher needs to augment a design and how he/she can choose the additional runs. The main results of Consonni and Deldossi (2016) are recalled in Section 2.2, together with the main output of the package OBsMD applied to the previous example. Finally, Section 2.3 illustrates how the original mdopt program written by Meyer (1996) has been modified because of the adoption of the objective priors.

Aim of the package
Screening experiments are employed in the initial stage of an investigation to identify the significant factors, i.e., those which affect the response variable, from a list of many potential ones. We refer to these as active factors; see Box, Hunter, and Hunter (1978) or Montgomery (2006) for references on this topic.
Consider for instance a response variable which is potentially influenced by k categorical factors. To discover which factors are active, the experimenter usually considers 2^k designs, i.e., designs where each of the k factors can take on only two values (low = −1, high = +1). A run specifies the level of each factor at which the experiment is conducted. Ideally, one would want to experiment with all possible 2^k runs (i.e., all combinations of the levels of the k factors). However, as the number k of factors increases, the amount of experimental effort required may be too high for an initial experiment. For example, a complete replicate of the 2^6 design requires 64 runs. In this design only 6 of the 63 degrees of freedom correspond to main effects, and only 15 to two-factor interactions. The remaining 42 degrees of freedom are associated with interactions of order higher than 2. If the experimenter can reasonably assume that certain high-order interactions are negligible, information on main effects and low-order interactions may be obtained by running only a "fraction" of the complete factorial experiment.
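The degrees-of-freedom bookkeeping above follows directly from binomial coefficients: a 2^k design has C(k, j) effects of order j. A quick check in Python (illustrative only, independent of the R package):

```python
from math import comb

# Effect degrees of freedom in a full 2^k factorial: C(k, j) effects
# of each interaction order j = 1, ..., k, summing to 2^k - 1.
k = 6
dfs = {j: comb(k, j) for j in range(1, k + 1)}

print(dfs[1])                                # 6 main effects
print(dfs[2])                                # 15 two-factor interactions
print(sum(dfs[j] for j in range(3, k + 1)))  # 42 higher-order interactions
print(sum(dfs.values()))                     # 63 = 2^6 - 1
```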
These are the so-called fractional factorial experiments 2^(k−q), corresponding to a one-half fraction when q = 1, a one-quarter fraction when q = 2, and so on. Refer, e.g., to the package FrF2 (Grömping 2014, 2020) to create and analyze fractional factorial 2-level designs. The reduction of runs (from 2^k to 2^(k−q)) introduces aliasing structures in the estimation process, with the result that main effects are confounded with some interaction terms; the interested reader is referred to Box et al. (1978) and Montgomery (2006). As a consequence, this kind of design works well and gives clear results as long as the interactions among factors, especially those of high order, can be considered negligible. When this is not the case, the aliasing structure underlying these designs can lead to ambiguous conclusions regarding which combinations of factors are really active. It is then necessary to augment the design with n_f extra runs. This kind of experiment is called a "follow-up design".
Consider for instance the dataset MetalCutting included in the OBsMD package. This dataset was used in Mønness, Linsley, and Garzon (2007) and re-examined in Edwards, Weese, and Palmer (2014) with the aim of comparing methods to design follow-up experiments. It is introduced here for illustrative purposes and is analyzed in Section 5 with the aim of identifying the oracle model and the active factors.
It is a six-factor (A, B, C, D, E, F) full factorial experiment designed to determine effects having an impact on metal surface finish in a cutting operation. Table 1 displays the k = 6 factors involved in the experiment and their levels. The full factorial design is composed of 2^6 = 64 runs obtained by combining in all possible ways the levels of the 6 factors. It is reported in Table 2 together with the value of the block variable and that of the response variable (Ytransformed), obtained by a reciprocal transformation of the original y; see Mønness et al. (2007). Observe that in Table 2 the variable block is constant (it presents only one level, equal to −1). This happens because Table 2 refers to the full experiment performed on a single batch; when it is impossible to perform all the runs under homogeneous conditions, this circumstance can be taken into account by setting the block variable at a different level (for instance equal to +1). Each row of the MetalCutting dataset represents a run which defines, apart from the block, the level of each factor at which the experiment is conducted. For instance, run = 1 with (−1, −1, −1, −1, 1, −1) defines the experiment done when the fifth factor (E) is fixed at the high level, while all the other factors (A, B, C, D, F) are set at the low level. Often the cost per run and time constraints limit the number of combinations that can be included in the experiment, so the researcher usually performs only a fraction of the full factorial experiment; as a consequence, he/she will know the value of the response variable only for this fraction of runs. Consider for instance the fractional factorial design 2^(6−2) of resolution IV with generators E = ABC and F = ABD, where only n = 16 of the 64 runs are performed; see Edwards et al. (2014). The corresponding runs and the values of the response variable Ytransformed are reported in Table 3.
These 16 runs are identified according to a rule defined by the generator functions. Thus, the experiments corresponding to runs 62, 28, 51, 16, 64, 21, 26, 42, 44, 23, 39, 1, 14, 49, 37, 3 of Table 2 are performed in order to observe the response variable (see Table 3). This design has resolution IV, meaning that the main factors are aliased with three-factor interactions. Thus, if we suspect that three-factor interactions are not negligible, i.e., that their effect is not null, the results can lead to ambiguous conclusions regarding which of the main factors is really active, since the estimates of E and F also include the effects of the interactions ABC and ABD, respectively. Accordingly, augmenting the design may help in providing clear answers.
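The aliasing statement can be verified directly in the ±1 coding used above. The following sketch (plain Python, independent of the OBsMD package) builds the 16 runs generated by E = ABC and F = ABD and checks that E and F carry the same contrasts as ABC and ABD:

```python
from itertools import product

# 2^(6-2) fraction in -1/+1 coding: A, B, C, D form a full 2^4 design;
# E and F are fixed by the generators E = ABC and F = ABD.
runs = [(a, b, c, d, a * b * c, a * b * d)
        for a, b, c, d in product((-1, 1), repeat=4)]
print(len(runs))  # 16

# The defining words ABCE and ABDF equal +1 on every run, so the column
# for E is identical to the ABC contrast (and F to ABD): the effects of
# E and ABC (F and ABD) cannot be separated from these data.
aliased = all(a * b * c * e == 1 and a * b * d * f == 1
              for a, b, c, d, e, f in runs)
print(aliased)  # True
```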
Several different techniques and strategies exist for augmenting an experimental design; their selection depends essentially on the results of the initial experiment and the availability of resources.
Foldover is a classic approach for design augmentation; see Montgomery and Runger (1996). It consists of a follow-up design with a number of runs equal to that of the initial experiment (n f = n), where one or more factors have opposite levels compared with those of the initial experiment.
Semifolding (Mee and Peralta 2000) is a strategy where only half of the foldover runs are performed (n f = n/2). It is usually obtained by selecting from the foldover design the runs related to a desirable level of a prominent factor revealed in the initial experiment.
The D-optimal design approach to augmentation (Goos and Jones 2011) is driven by the best model identified in the screening experiment. Conditionally on this model, the additional n_f > 0 follow-up runs are chosen to minimize the generalized variance of the parameter estimates. One criticism of this approach is that the follow-up runs are chosen solely to improve estimation for the single model identified as the best one in the initial experiment.
To overcome this limitation, Meyer et al. (1996) and later Consonni and Deldossi (2016) proposed a Bayesian strategy in the context of model discrimination based on predictive densities for competing models. The idea is the following: since, due to confounded and/or limited data, there are generally multiple models (see Section 2.2) that provide similar or identical response predictions, follow-up runs have to be selected so as to allow maximum discrimination among all these plausible models. Within the Bayesian construct this means searching for runs able to produce a strong diversification of posterior model probabilities. This result is obtained by maximizing a model discrimination criterion. To compute the posterior probability of each model, one requires a prior on the model space, as well as a prior on the space of parameters, conditionally on each single model. The two proposals (Meyer et al. 1996; Consonni and Deldossi 2016) differ in the choice of these priors. The R package OBsMD implements the objective Bayesian methodology introduced by Consonni and Deldossi (2016) for this aim. The main feature of this approach is that it does not require prior specifications, being fully automatic.

Basic formulation, objective priors, main outputs
In the context of experimental designs, it is customary to assume that the response is normally distributed with expectation having a linear regression structure. Specifically, consider k categorical factors, and n experimental runs extracted from a full factorial experiment.
Let M_i be a model which specifies a set of f_i active factors (0 ≤ f_i ≤ k). Following Consonni and Deldossi (2016), we assume the effect forcing assumption (Chipman and Hamada 1996), according to which an interaction AB is active, and hence included in the model, if and only if both its corresponding main effects are active. As a consequence, each model M_i will contain a certain number of main factors and all their interactions up to a certain fixed order, typically not higher than three. The assumption of effect forcing was adopted by Consonni and Deldossi (2016) to simplify the illustration of their theory, and also for comparison with the work of Meyer et al. (1996). The package could be extended by relaxing this assumption, thereby considering a more flexible approach. For instance, two common ways of specifying heredity relationships are strong heredity, where an interaction can be active with probability p ≤ 1 if both its corresponding main effects are active (under the effect forcing assumption p = 1), and weak heredity, where all interactions have at least one parent main effect in the model; see Chipman and Hamada (1996). Another possibility is to assume that interactions may be present in a model only with a certain probability, as advocated in Bingham and Chipman (2007). In this way the user is able to incorporate prior opinions on structural aspects of effects; see also Wolters and Bingham (2011).
Notice that the total number of distinct models M_i will be equal to 2^k, including the null model M_0 (no factor is active) and the full model (all factors and interactions are active). For instance, consider the case of k = 3 active factors {A, B, C}. Then the 2^3 = 8 distinct models are those reported in Table 4: the null model, three models each with only one main effect, three models each with two main effects and the related two-factor interaction, and the full model that contains, besides the three main effects, also all the two-factor interactions AB, AC, BC, as well as the three-factor interaction ABC, if interactions up to the third order have been chosen. Thus a model is fully defined by fixing only its potentially active main factors.

Table 4: List of the total number of possible distinct models under the effect forcing assumption with k = 3 and interactions up to the third order.

We will assume that under model M_i

y | β_0, β_i, σ², M_i ∼ N_n(X_0 β_0 + X_i β_i, σ² I_n),

where X_0 represents an n × t_0 design matrix containing all columns which are common to all models (typically the intercept and possibly the block factors), while X_i represents an n × t_i matrix whose columns are related to each specific effect under model M_i, i.e., the main factors and their interactions up to the desired order.
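Under effect forcing, each model is indexed by its subset of active main factors, and the number of model-specific terms t_i follows by counting interactions. A Python sketch enumerating the k = 3 case of Table 4 (illustrative only):

```python
from itertools import combinations
from math import comb

factors = ("A", "B", "C")
k, max_order = len(factors), 3

# Each model = a subset of active main factors plus all interactions
# among them up to max_order (effect forcing).
models = {}
for f in range(k + 1):
    for active in combinations(factors, f):
        # t_i = number of model-specific terms: C(f, j) effects of order j
        models[active] = sum(comb(f, j)
                             for j in range(1, min(f, max_order) + 1))

print(len(models))              # 8 = 2^3 distinct models
print(models[()])               # 0 terms in the null model
print(models[("A", "B")])       # 3 terms: A, B, AB
print(models[("A", "B", "C")])  # 7 terms: 3 mains + AB, AC, BC + ABC
```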
With reference to the prior on the model space, it is customary to assume that each factor is included in any particular model with some probability π independently of the other factors.
If M_i contains f_i specific active factors, f_i ∈ {0, 1, . . . , k}, then

P(M_i | π) = π^{f_i} (1 − π)^{k − f_i}.   (2)

In the full objective Bayesian perspective of Consonni and Deldossi (2016), π should be regarded as an uncertain quantity with its own distribution, which they assume to be π ∼ Beta(a, b). Integrating (2) with respect to this prior yields

P(M_i) = B(a + f_i, b + k − f_i) / B(a, b),   (3)

where B(·, ·) is the usual beta function.
This kind of prior has been studied by Scott and Berger (2010), who show that it enjoys the attractive property of "multiplicity adjustment", i.e., unlike prior (2), it provides control over the addition of spurious covariates when performing variable selection in a linear model.
Usually the values (a = 1, b = 1), corresponding to the uniform distribution, are adopted. The alternative choice (a = 1, b = k + 1) has been advocated to achieve a stronger sparsity effect, since it gives more weight to more parsimonious models relative to (a = 1, b = 1).
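Integrating the Bernoulli inclusion probability π against a Beta(a, b) prior gives the beta-binomial model prior P(M_i) = B(a + f_i, b + k − f_i)/B(a, b), the form studied by Scott and Berger (2010). A numerical sketch (plain Python, for illustration) comparing the two choices of (a, b) discussed in the text:

```python
from math import lgamma, exp, comb

def beta_fn(x, y):
    # beta function via log-gamma, for numerical stability
    return exp(lgamma(x) + lgamma(y) - lgamma(x + y))

def prior_model(f_i, k, a, b):
    # P(M_i) = B(a + f_i, b + k - f_i) / B(a, b)
    return beta_fn(a + f_i, b + k - f_i) / beta_fn(a, b)

k = 6
for a, b in ((1, 1), (1, k + 1)):
    # sanity check: mass over all 2^k models sums to one;
    # the sparse choice (1, k + 1) puts more mass on the null model
    total = sum(comb(k, f) * prior_model(f, k, a, b) for f in range(k + 1))
    print(round(total, 6), round(prior_model(0, k, a, b), 4))
```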
At this point, conditionally on a specific model M_i, a prior distribution on the parameter space is needed. In the following, the intercept β_0 and the error variance σ² are regarded as "common" parameters across models, while β_i is the model-specific vector of regression parameters.
Consonni and Deldossi (2016) adopt a robust prior that does not require the user to specify any tuning parameters and that satisfies all the desiderata a prior should have from an objective model selection perspective; see Bayarri, Berger, Forte, and García-Donato (2012). According to their approach, a non-informative prior proportional to 1/σ is given to the common parameters β_0 and σ, whereas the prior on the model-specific parameters β_i is a mixture of normal distributions whose mixing weight is given by a new proper prior distribution. In the analytic expression of this hierarchical g-prior for model selection, p(β_0, σ) denotes the prior on the common parameters shared by all models, and 1_A(t) = 1 if t ∈ A and 0 otherwise; refer to Consonni and Deldossi (2016) for further details.
Combining the prior on the model space with the prior on the parameters, the posterior distribution of each model is obtained in closed form as

P(M_j | y) = P_{j0} BF_{j0}(y) / Σ_i P_{i0} BF_{i0}(y),   (5)

where P_{j0} is the prior odds of model M_j relative to M_0 implied by (3), and BF_{j0}(y) is the Bayes factor of M_j against M_0. Its expression (6) involves the standard hypergeometric function 2F1 (see Abramowitz and Stegun 1965, p. 555) and the ratio Q_{j0}(y) = SSE_j(y)/SSE_0(y) of the sums of squared errors of models M_j and M_0.

Summing the probabilities P(M_i | y), i = 1, . . . , 2^k, over all models that include a specific factor, it is possible to obtain the posterior probability that this factor is active. For instance, the posterior probability that factor j is active is

P_j(y) = Σ_{i: factor j ∈ M_i} P(M_i | y).   (7)

Function OBsProb() of the R package OBsMD computes the posterior probabilities of models and factors based on objective priors, implementing formulas (5) and (7), respectively. Two further outputs of OBsProb() are:

(a) The value of the normalized Shannon entropy index (hereafter named "Shannon index"),

H(y) = − [ Σ_i P(M_i | y) log P(M_i | y) ] / log s,

where s is the total number of distinct models. The Shannon index will assume values close to zero when there is low model uncertainty, i.e., when there is one model with high posterior probability (thus, in the model discrimination framework, low values of the Shannon index are better).
(b) The value of the coefficient of variation (hereafter named "CV"), i.e., the ratio of the standard deviation to the mean of the posterior probabilities of the factors (see (7)). The CV among posterior probabilities of factors is high when there is clear evidence in favor of some factors (thus, high values of CV are better).
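Both diagnostics are elementary functions of the posterior probabilities. A small Python sketch (illustrative; the population standard deviation is assumed for the CV):

```python
from math import log

def shannon_index(post):
    # normalized Shannon entropy over s model posterior probabilities,
    # ranging from 0 (one dominant model) to 1 (uniform posterior)
    s = len(post)
    return -sum(p * log(p) for p in post if p > 0) / log(s)

def cv(probs):
    # coefficient of variation of the factor posterior probabilities
    m = sum(probs) / len(probs)
    sd = (sum((p - m) ** 2 for p in probs) / len(probs)) ** 0.5
    return sd / m

# one dominant model -> index near 0; flat posterior -> index equal to 1
print(round(shannon_index([0.97, 0.01, 0.01, 0.01]), 3))
print(shannon_index([0.25, 0.25, 0.25, 0.25]))  # 1.0
# sharply separated factor probabilities -> high CV
print(cv([0.9, 0.9, 0.1, 0.1]) > cv([0.5, 0.5, 0.4, 0.4]))  # True
```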
To give an idea of how OBsProb() works, suppose we have performed the 2^(6−2) fractional factorial design of resolution IV of Table 3, so that we know the values of the response variable only for these 16 combinations of factor levels.
R> es3.OBsProb <- OBsProb(X = X, y = y, abeta = 1, bbeta = 7, blk = 0,
+    mFac = 6, mInt = 2, nTop = 64)

By changing the mInt input to 3, interactions up to the third order will be included. We refer the reader to Section 4 for an extensive illustration of the arguments of OBsProb(). Using the print method one can inspect the OBsProb() output; by suitably setting the argument nTop, which is the number of top-ranked models to print, one gets a glimpse of the output.
With regard to the previous example, we can observe that none of the models has a posterior probability greater than 0.5. In particular, the same mass of probability (0.329) is concentrated on two models, the first containing as active factors D, E, F (namely 4, 5, 6 in the output) and the second the factors C, D, E (3, 4, 5 in the output). The Shannon index is equal to 0.401. Looking at the factor probabilities reported in Figure 1, we could conclude that factors D and E are active (their posterior probabilities are greater than 0.8), but we have less evidence in favor of C and F, since their posterior probabilities are around 0.35. The CV index is 0.844. Thus, augmenting the design is recommended.
To choose the n_f follow-up runs which best discriminate between competing models, we look for those runs corresponding to the maximum value of the following model discrimination criterion (MD):

MD = Σ_{i ≠ j} P(M_i | y) P(M_j | y) KL( m(· | y, M_i), m(· | y, M_j) ),

where m(· | y, M_i) is the (posterior) predictive density for the vector of follow-up observations under model M_i, and

KL(f, g) = ∫ f(t) log( f(t) / g(t) ) dt

is the Kullback-Leibler (KL) divergence of the density f from g. Notice that MD is a weighted average of the KL-divergences between all pairs of predictive distributions for the follow-up observations, as introduced by Meyer et al. (1996). The closed-form expression (10) for the objective MD (OMD) is obtained by Consonni and Deldossi (2016) adopting a standard reference prior; it involves X*_i, the design matrix for the additional runs.
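The structure of the MD criterion is easy to reproduce in a toy setting. The sketch below (plain Python; univariate normal predictive densities assumed, whereas the package works with the multivariate closed form (10)) shows that candidate runs on which rival models disagree more yield a larger MD:

```python
from math import log

def kl_normal(m1, s1, m2, s2):
    # Kullback-Leibler divergence KL( N(m1, s1^2) || N(m2, s2^2) )
    return log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def md(post, preds):
    # MD = sum over ordered pairs i != j of P(M_i|y) P(M_j|y) KL(m_i, m_j)
    return sum(pi * pj * kl_normal(mi, si, mj, sj)
               for i, (pi, (mi, si)) in enumerate(zip(post, preds))
               for j, (pj, (mj, sj)) in enumerate(zip(post, preds))
               if i != j)

post = [0.6, 0.4]                            # posteriors of two rival models
close = md(post, [(0.0, 1.0), (0.5, 1.0)])   # predictions nearly agree
far = md(post, [(0.0, 1.0), (3.0, 1.0)])     # predictions far apart
print(close < far)  # True: the second run discriminates better
```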
Function OMD() of the R package OBsMD computes the n f extra-runs according to this approach.
Continuing with the example of Table 3, we want to select the optimal n_f = 4 follow-up experimental runs using function OMD(), with the aim of discriminating among the nMod = 57 estimable models. The data of the initial design are extracted from the MetalCutting dataset as follows:

R> X <- MetalCutting[c(62, 28, 51, 16, 64, 21, 26, 42, 44, 23, 39, 1, 14,
+    49, 37, 3), 1:7]
R> y <- MetalCutting[c(62, 28, 51, 16, 64, 21, 26, 42, 44, 23, 39, 1, 14,
+    49, 37, 3), 8]

The list of the possible follow-up runs, in decreasing order according to the value of the criterion (10), can be obtained by calling OMD(); again, we refer to Section 4 for an extensive illustration of the function's arguments.

Top 10 runs:
       OMD  r1 r2 r3 r4
 1  11.530  12 36 52 59
 2  11.530   4 43 52 60
 3  11.530  28 36 43 52
 4  11.529  20 36 43 60
 5  11.529   4 36 43 60
 6  11.529  12 36 43 52
 7  11.528   4 28 43 52
 8  11.528   4 12 52 59
 9  11.527  12 20 36 59
10  11.527  28 36 36 43

Looking at the output, we can see that the n_f = 4 runs corresponding to the highest value of OMD are (12, 36, 52, 59). After the augmentation of the design, function OBsProb() may be called again, and the posterior probabilities of the models and of the factors recomputed. Looking at the factor probabilities, also displayed in Figure 2 (on the left side), we can observe more clearly which factors appear to be active (D, E, and F) and which not (C). Furthermore, there is only one model with posterior probability greater than 0.5, and it is precisely the one containing factors D, E, and F (4, 5, 6 in the output). Observe that now the Shannon index is lower (0.207 against 0.401) and the CV is greater (0.954 against 0.844), reflecting a reduced heterogeneity among competing models and an increased concentration of probability on the really active factors, respectively.
The same whole procedure replicated for mInt = 3 (OBsProb() plus OMD(), with the augmentation of runs 4, 11, 52, 59; see the code in the supplementary material) generates the posterior probabilities of factors shown in Figure 2 (on the right side). We can observe that the results do not change and that they are consistent with those reported in Figure 5 of Edwards et al. (2014), who adopted different augmentation approaches for the same dataset. In conclusion, this package allows the user, within the objective Bayesian approach, to reach two different goals. First, it produces the posterior probabilities of models and factors using the data observed in an initial experiment; then, it provides the identification of the optimal follow-up runs in order to discriminate among competing models.

Changes in the Fortran code
Our Fortran code is the result of an extensive modification of the original mdopt program written by Meyer (1996). In order to adopt the objective priors we had to substantially modify both the expressions of the posterior probability of the models and the model discrimination criterion.
Moreover, in the objective Bayesian approach the model matrix [X_0 X_i] is assumed to be of full column rank, so that the number of linearly independent terms in the regression structure cannot exceed n. Hence only estimable (non-aliased) interactions can be introduced in the models besides the main effects, subject to the constraint n > t_0 + t_i. This constraint is not present in the original code. Thus only the general structure of the original program is maintained, i.e., the computation of the OLS estimates of β_i and the corresponding residual sum of squares SSE_i under each model M_i, and the exchange algorithm.
As far as the prior assumptions are concerned, we only need to fix the parameters a and b of the Beta distribution on the model space (see (3)). In contrast, the approach of Meyer et al. (1996) requires the user to assign a value to π in expression (2) (with the recommended choice being π = 0.25, to induce factor sparsity in model selection). Furthermore, with reference to the parameter space, they assign independent N(0, γ²σ²) priors to the components of the model-specific vector of regression parameters β_i, γ being a scale parameter that needs to be fixed.
As a consequence, their expression for the model posterior probability (see Formula (A.1) in Meyer et al. 1996) is different from our expressions (5) and (6), and their model discrimination criterion (see Formula (A.9) in Meyer et al. 1996) is different from (10), since we adopt a reference prior for prediction purposes. Thus, with our proposal, the well-known problem of sensitivity to the tuning parameters (π and γ) is overcome.

How the package works
In this section the main arguments of the functions OBsProb() and OMD() are introduced together with a discussion of the theoretical implication of their setting.
Assume that an experimenter has to identify which factors, among k factors each at two levels, are active on the basis of n < 2^k runs. Due to the effect forcing assumption, he/she knows that there are 2^k distinct models able to capture the dependence between the response variable and the potentially active factors.
To obtain the posterior probability of the competing models through the function OBsProb(), the experimenter needs to fix:

• The parameters a and b of the Beta distribution for the prior on the model space.

The standard assumption is a = 1 and b = 1; this choice (equivalent to the uniform distribution) implies that (3) will be equal to

P(M_i) = 1 / [ (k + 1) C(k, f_i) ],   (11)

where C(k, f_i) is the binomial coefficient. If we think of the competing models as grouped in k + 1 boxes, each defined according to the number of active factors (from 0 to k), Equation 11 means that a box is selected uniformly at random from the k + 1 boxes available; next, given the chosen box, a model is selected uniformly at random from those in that box (Scott and Berger 2010). In other words, this prior gives the same probability to all the boxes as well as to all the models within each box. Recently, the alternative choice (a = 1, b = k + 1) has been advocated to achieve a stronger sparsity effect. This option gives more weight to more parsimonious models; see the discussion in Consonni and Deldossi (2016).
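The boxes interpretation can be checked numerically. Taking P(M_i) = 1/((k + 1) C(k, f_i)) as in the uniform case, each of the k + 1 boxes indeed receives total mass 1/(k + 1) (plain Python, for illustration):

```python
from math import comb

k = 6
# mass of the box with f active factors: C(k, f) models, each with
# prior probability 1 / ((k + 1) * C(k, f))
box_mass = [comb(k, f) / ((k + 1) * comb(k, f)) for f in range(k + 1)]

print(all(abs(m - 1 / (k + 1)) < 1e-12 for m in box_mass))  # True
print(round(sum(box_mass), 10))  # 1.0: the prior is proper
```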
• The maximum order of interaction among factors to be considered in the model (typically 2 or 3).
This has to be set according to considerations related to the number k of factors and the dimension n of the screening design. In fact, as mentioned above, in the objective Bayesian approach the model matrix [X_0 X_i] is assumed to be of full column rank, so that the number (t_0 + t_i) of linearly independent terms in the regression structure cannot exceed n. This condition, along with the effect forcing assumption, implies that it is sometimes not possible to compute the posterior probability of all the 2^k different models. The number of models for which this is possible also depends on the maximum order of interactions, since it influences the value of t_i. For instance, if k = 5 and the initial screening experiment consists of a fractional design 2^(5−2), then depending on whether the maximum order of interactions is set at 2 or 3, we can estimate 26 or 16 distinct models, respectively. In fact, the ten models with three active factors are not estimable when the maximum order of interactions is set equal to 3, because then t_0 + t_i = 8 = n.
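The model counts in this example can be reproduced by combining the estimability constraint n > t_0 + t_i with the effect forcing term count (a Python sketch, assuming an intercept-only X_0, i.e., t_0 = 1):

```python
from math import comb

def n_estimable(k, n, t0, max_order):
    # count the models whose matrix [X_0 X_i] can have full column rank:
    # a model with f active factors has t_i = sum_j C(f, j) terms for
    # j = 1, ..., min(f, max_order), and is estimable when n > t0 + t_i
    total = 0
    for f in range(k + 1):
        t_i = sum(comb(f, j) for j in range(1, min(f, max_order) + 1))
        if n > t0 + t_i:
            total += comb(k, f)
    return total

# k = 5 factors, initial 2^(5-2) screening design: n = 8 runs, t0 = 1
print(n_estimable(5, 8, 1, max_order=2))  # 26
print(n_estimable(5, 8, 1, max_order=3))  # 16: the 10 three-factor models drop out
```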
In this context also the presence of a block factor -incrementing the value of t 0 -could represent a restriction in the exploration of potential different models whenever the dimension of the initial experiment n is low.
Analyzing the output of OBsProb(), the researcher can observe one of the following situations.

1. None of the factors stands out either as clearly active or clearly inactive. Extra runs are needed in order to resolve this ambiguity. In this case it is possible to apply the function OMD() in order to identify, through the OMD criterion (10), the optimal n_f runs that allow maximum discrimination among the competing models and factors.
2. Some factors have zero posterior probability (inert factors). In this case the user can decide to drop the inert factors and collapse the initial fractional design on the remaining factors, thereby obtaining a replicated fractional factorial design defined only on those factors identified as active. At this point OBsProb() has to be called again in order to obtain the posterior probabilities of models and factors for the experiment based only on these factors. If the result does not resolve the ambiguity, the OMD() function can then be applied (see the example reported in Consonni and Deldossi 2016). Dropping inactive factors can reduce or eliminate the need for extra runs. This strategy has not been applied in this paper, in order to preserve the comparison with the follow-up designs in Mønness et al. (2007).
3. Some factors have posterior probability greater than 0.5. In this case the user can decide to stop the procedure, or to continue until all factors have been determined to be either active or inactive.
To obtain the optimal follow-up design, some results obtained as output of OBsProb() are required. By default they are passed as input to the function OMD(), together with the matrix Xcand containing the list of the 2^k possible combinations of the levels of the k factors to be considered as potentially active after the previous step. Furthermore, the following relevant arguments have to be fixed:

• the dimension n_f of the follow-up design, i.e., the number of additional runs;

• the number of competing models among which the experimenter wants to discriminate. Generally the investigator chooses the models with the highest posterior probabilities by examining the output of the function OBsProb(), where they are ranked in decreasing order.
Moreover the researcher can decide to identify the optimal follow-up design corresponding to the highest OMD value using either the exhaustive search strategy or the exchange algorithm.
• Adopting the exhaustive search over all the possible follow-up designs of n_f runs chosen from the 2^k candidates, the experimenter has to give as input to the function OMD() the matrix Mbest containing all the combinations (with replication) of n_f runs chosen from Xcand. Mbest is a matrix of dimension p × n_f obtained using the R function combinations().
• Adopting the exchange algorithm (proposed in Meyer et al. 1996), it is possible to avoid the intensive computation required by the previous choice. According to this option, a design is generated at random from the set of candidate points and the corresponding OMD value (10) is computed. This initial design is then improved by adding the run which most increases the value of OMD and removing the one whose deletion causes the smallest reduction in the criterion. This add/remove procedure is continued until the algorithm converges.
In the OMD() function the exchange algorithm is the default option. For general problems there will be a large number of possible follow-up designs, and the adoption of the exhaustive search would be excessively time-consuming.
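The two search strategies can be contrasted in a toy setting. The sketch below (plain Python, not the package's Fortran implementation) first counts the candidate designs an exhaustive search would face for the MetalCutting example, then runs a generic add/remove exchange on a small problem; the criterion here is a simple D-optimality score det(X'X) standing in for OMD, and the starting design and stopping rule are illustrative assumptions:

```python
from itertools import product
from math import comb

# Exhaustive search would score every multiset of n_f = 4 runs drawn
# from the 2^6 = 64 candidates:
print(comb(64 + 4 - 1, 4))  # 766480 candidate follow-up designs

def det(m):
    # determinant by Gaussian elimination with partial pivoting
    m = [row[:] for row in m]
    n, d = len(m), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            return 0.0
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return d

def gram_det(design):
    # toy criterion standing in for OMD: det(X'X) of the chosen runs
    cols = range(len(design[0]))
    return det([[sum(x[a] * x[b] for x in design) for b in cols] for a in cols])

def exchange(cands, start, iters=50):
    # generic add/remove exchange: add the run that most increases the
    # criterion, then drop the run whose removal hurts it least; stop
    # when a full add/remove pass no longer improves the criterion
    des, cur = list(start), gram_det(start)
    for _ in range(iters):
        des.append(max(cands, key=lambda c: gram_det(des + [c])))
        i = max(range(len(des)), key=lambda j: gram_det(des[:j] + des[j + 1:]))
        des.pop(i)
        new = gram_det(des)
        if new <= cur + 1e-9:
            break
        cur = new
    return des

cands = list(product((-1, 1), repeat=3))          # 2^3 candidate runs
start = [cands[0], cands[0], cands[1], cands[2]]  # a deliberately poor start
follow_up = exchange(cands, start)
print(len(follow_up), gram_det(follow_up) >= gram_det(start))  # 4 True
```

The add/remove pass never decreases the criterion (removing the run just added restores the previous design), so the final design is at least as good as the start.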
OMD() produces the optimal follow-up designs and the corresponding OMD values, ranked in decreasing order. After having collected the value of the response variable corresponding to the follow-up design, the function OBsProb() can be re-applied in order to obtain the new posterior probabilities of models and factors.
Our procedure should appeal to practitioners because the number n_f of additional runs can be freely chosen. Furthermore, it is fully automatic, requiring no prior specifications, unlike the Meyer et al. (1996) approach, whose sensitivity to the tuning parameters (π and γ) may be problematic.

Package description
In this section a detailed list of all the arguments of the functions OBsProb() and OMD() is provided.
Function OBsProb() produces posterior probabilities of models and factors based on the objective priors defined in Section 2.2; its arguments are listed in Table 5. The function returns an 'OBsProb' class object. Function OMD() produces follow-up designs, i.e., the extra runs which maximize the model discrimination criterion (10), a weighted average of Kullback-Leibler divergences between all pairs of models; its arguments are listed in Table 6. The function returns an 'OMD' class object. In order to compute OMD() we require some output from OBsProb(). As previously observed, in using OMD() the researcher can decide to adopt either the exhaustive search strategy or the exchange algorithm. To adopt the exchange algorithm, the following setting is required:
• mIter > 0;
• nStart = number of random starts to initiate the algorithm (default value is 25);
• startDes = NULL.
On the other hand, to explore all possible follow-up runs the matrix Mbest is needed beforehand. It is obtained using the R function combinations() included in the OBsMD package. Furthermore, the following setting is required:
• mIter = 0;
• nStart = nrow(Mbest);
• startDes = Mbest.
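Putting the two options side by side, the calls look roughly as follows. The object es.OBsProb and the values of nFac, nMod and nFoll are illustrative placeholders; argument names follow Table 6.

```r
## Exchange algorithm (default): random starts, no enumerated design list.
## All values below are illustrative, not a prescribed configuration.
omd.exch <- OMD(OBsProb = es.OBsProb, nFac = 6, nBlk = 0,
                nMod = 5, nFoll = 4, Xcand = Xcand,
                mIter = 20, nStart = 25, startDes = NULL)

## Exhaustive search: enumerate every follow-up design in Mbest first.
Mbest <- as.matrix(combinations(64, 4, 1:64, repeats = TRUE))
omd.exh <- OMD(OBsProb = es.OBsProb, nFac = 6, nBlk = 0,
               nMod = 5, nFoll = 4, Xcand = Xcand,
               mIter = 0, nStart = nrow(Mbest), startDes = Mbest)
```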

Tutorial using the MetalCutting dataset
This tutorial uses the dataset MetalCutting included in the OBsMD package. See Section 2.1 for its description. It corresponds to a full 2^6 factorial design, which a researcher rarely has at his/her disposal due to the cost per run and time constraints. In the following we consider and analyze this dataset only with the aim of having a benchmark to compare the performance of our approach with that of other methods for specifying follow-up designs (see Edwards et al. 2014).
The data reported in Table 2 are loaded with

R> library("OBsMD")
R> data("MetalCutting", package = "OBsMD")

The ID column indicates the row number of the MetalCutting dataset in Table 2 and specifies the run number. The first column of the MetalCutting dataset corresponds to the block variable. It is generally used when it is impossible to perform all the runs under homogeneous conditions, for instance on the same day or using a single batch of raw material.
In Table 2 all the levels of the block variable are set equal to "−1", meaning that all the runs are performed under the same conditions. It follows that we do not consider it in the analysis of the full design, so we set blk = 0 as input of the OBsProb() function and define X as columns 2 to 7 of the MetalCutting dataset.
This example exhibits most of the output of the OBsProb() function. The function calls a Fortran subroutine, and most of the output is collected in OBsProb's output list, an object of class 'OBsProb' with print, plot and summary methods. The design factors are accommodated in the matrix X, while y contains the values of the response variable. As reported in Table 5, in the call of the function OBsProb() the experimenter can fix:
• The parameters abeta and bbeta of the Beta prior distribution on model space. These parameters are set by default equal to 1 (equivalent to the uniform distribution).
• The number blk of blocks in the experiment; this parameter is set by default equal to 0.
• The maximum number mFac of factors that the model may contain (in our example mFac = 6, i.e., we are interested in models containing up to all 6 factors (A, B, C, D, E, F)).
• The maximum order of interactions mInt among factors to be considered in the model (in this case mInt = 2, i.e., two-factor interactions (2FIs); it is also possible to set mInt = 3 if three-factor interactions (3FIs) are of interest).
• The number nTop of models for which the posterior probability is reported in the output. This number cannot be greater than the value of totMod, shown in the output, which represents the number of models for which the posterior probability has been computed. Observe that totMod is less than or equal to 2^nFac, where nFac represents the number of factors involved in the experiment (in our example totMod = 2^nFac = 64).
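As a quick check of this bound: every model corresponds to a subset of the nFac factors, so summing binomial coefficients over all subset sizes recovers 2^nFac. This is a base R illustration, not package code.

```r
# Each model = a subset of the nFac factors, so the number of
# candidate models is at most 2^nFac (64 when nFac = 6).
nFac <- 6
totMod_upper <- sum(choose(nFac, 0:nFac))  # sum over model sizes 0..nFac
totMod_upper == 2^nFac
```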
M2  0.220  0.014  3  4,5,6
M3  0.000  0.011  5  1,3,4,5,6
Shannon index: [1] 0.129
CV: [1] 0.717

From the posterior probabilities of factors we can observe that factors C, D, E, F are active while A and B are inert, since P_A = P_B = 0, while P_C = 0.77 and P_D = P_E = P_F = 1. These posterior probabilities can be displayed as in Figure 3 using the code

R> plot(es2.OBsProb, main = "mxint = 2")

where es2.OBsProb is the output of the function OBsProb(). Observe that only two models have a non-null posterior probability; specifically, model M1 has the highest posterior probability (0.779). It contains the active factors, enumerated in the output from 3 (C) to 6 (F).
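The Shannon index and CV printed above can be read as diversity summaries of the posterior model probabilities: a low Shannon index and a high CV both indicate that the posterior mass is concentrated on few models. The exact definitions used by the package, in particular the entropy normalization, are assumptions in this sketch.

```r
# Hypothetical reconstruction of the printed summaries; the package's
# exact formulas may differ (assumption: entropy normalized to [0, 1]).
shannon <- function(p) {
  p <- p[p > 0]                       # drop models with null probability
  -sum(p * log(p)) / log(length(p))   # normalized Shannon entropy
}
cv <- function(p) sd(p) / mean(p)     # coefficient of variation

post <- c(0.779, 0.220, 0.001)        # illustrative posterior probabilities
shannon(post)   # near 0: mass concentrated on few models
cv(post)        # large: very unequal probabilities
```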
Herein we assume that model M1 is the "oracle" model and that the posterior probabilities of the factors' importance are those reported in Figure 3, since they are obtained from a full factorial design. As a matter of fact, such a full experiment is very rarely conducted in practice due to the high number of trials involved and the consequently required amount of time and resources. Thus researchers usually perform a fractional factorial design, reducing the number of runs. This kind of design often produces ambiguous results, and thus extra runs are required to identify the active factors. The collection of such additional runs defines a follow-up experiment. The question then becomes: how to efficiently choose such extra runs?
Table 7: Runs, design matrix X and response variable Ytransformed corresponding to the 2^(6-3)_III fractional factorial design with generators D = ABC, E = BC, F = AC.
In order to illustrate the OBsMD package, we consider as initial experiment the same one used for comparing different follow-up design methods in Edwards et al. (2014). This will allow us to compare the results obtained adopting the objective Bayesian framework with the other approaches analyzed there (foldover, semifoldover, D-optimal designs, MD-criterion of Meyer et al. 1996).

First 2^(6-3)_III fractional factorial design
We begin by considering a 2^(6-3)_III fractional factorial design with generators D = ABC, E = BC, F = AC. Given the generators, the experimenter knows the aliasing structure of the plan and its resolution, here equal to III, meaning that main factors are aliased with two-factor interactions. The runs (ID) of Table 2 corresponding to this fractional factorial design are (2, 25, 37, 62, 15, 24, 44, 51). The design matrix X and the related response variable are shown in Table 7.
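The fraction can be reproduced in base R directly from the generators; this is an illustration independent of OBsMD, and the run IDs listed above then follow from matching these rows against Table 2.

```r
# Build the 2^(6-3) fraction from its generators D = ABC, E = BC, F = AC:
# start from the full 2^3 design in A, B, C and derive D, E, F.
base <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
frac <- within(base, {
  F <- A * C      # generator F = AC
  E <- B * C      # generator E = BC
  D <- A * B * C  # generator D = ABC
})[, c("A", "B", "C", "D", "E", "F")]
frac  # the 8 runs of the 2^(6-3)_III fraction
```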
After having fixed the number of runs of the follow-up design, we compute the function OMD() for all the possible combinations with replacement of n_f runs from the full 2^6 design. For instance, if n_f = 4, the number of all the possible different sets of additional runs is 766480. An automatic routine allows us to list all the possible combinations of n_f follow-up runs and allocate them in the matrix Mbest:

R> Mbest <- as.matrix(combinations(64, 4, 1:64, repeats = TRUE))
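As a sanity check, the number of such designs is the multiset coefficient for choosing n_f = 4 runs with replacement from 64 candidates, which matches the 766480 quoted above.

```r
# Number of follow-up designs of nf runs chosen with replacement
# from the 2^6 = 64 candidate runs: the multiset coefficient C(n+nf-1, nf).
n  <- 64  # candidate runs of the full 2^6 design
nf <- 4   # follow-up runs
choose(n + nf - 1, nf)  # 766480
```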
To implement the procedure we first need to provide the matrix Xcand, which contains the runs of the original 2^6 design. When the argument blk in the OBsProb() function is not set equal to 0, Xcand will include the blocking factors in its first columns.

R> Xcand <- MetalCutting[, 2:7]
As reported in Table 6, in the call of the function OMD() several arguments have to be fixed coherently with those of OBsProb(), such as OBsProb, nFac, nBlk and Xcand. In contrast, the arguments mIter, nStart and startDes are to be set according to whether or not the exchange algorithm is used (see Section 4).
Furthermore, the researcher has to fix the number of additional runs n_f to use as input (argument nFoll) of OMD() and the number nMod of competing models for which the discrimination criterion OMD has to be computed.
In this example nFoll = 4 and nMod = 42, since 42 models have non-null posterior probabilities according to the output of OBsProb(). Obviously we could choose to discriminate only among the models in the list with a posterior probability greater than a certain value. For instance, we could consider it relevant to discriminate only among the models with posterior probability greater than or equal to 0.025 (nMod = 5).
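This screening step amounts to a simple threshold on the posterior probabilities. The vector below is an illustrative placeholder, not the actual OBsProb() output.

```r
# Count the models whose posterior probability meets a threshold,
# to be used as the nMod argument of OMD().
# (Illustrative probabilities, not the real OBsProb() output.)
post <- c(0.779, 0.220, 0.030, 0.028, 0.025, 0.010, 0.005)
nMod <- sum(post >= 0.025)
nMod  # number of competing models passed to OMD()
```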
To adopt the exchange algorithm, some arguments of the function OMD() (in this example es7.pp_omd) have to be changed. In particular:
• mIter has to be set greater than 0, since it represents the maximum number of iterations of the exchange algorithm;
• nStart represents the number of random starts of the exchange algorithm (and not the number of rows of the matrix Mbest);
• startDes has to be set equal to NULL (and not to Mbest).

A preliminary analysis is conducted using the following code:

R> es8.OBsProb <- OBsProb(X = X, y = y, abeta = 1, bbeta = 1, blk = 0,
+    mFac = 6, mInt = 2, nTop = 64)
R> print(es8.OBsProb, nTop = 10)

The left side of Figure 5 indicates that factors D, E, F are active. Then, with the exception of factor C, the posterior probabilities obtained with OBsProb() seem to highlight the true active factors (see Figure 3, representing the oracle of the effects' importance). Note that this resolution III screening design gives a very different result compared to the previous one. The reason is that we are considering a highly fractionated factorial design; as a consequence, the choice of generators may influence the analysis of the screening design. Because D = EF, E = DF, and F = DE, some caution is needed before considering factors other than D, E, and F as inert and dropping them. In order to understand the potential relevance of the other factors, a follow-up experimental design might be run. Having fixed the dimension n_f of the follow-up design equal to 4 (nFoll = 4), the function OMD() is computed for all the possible designs of nFoll runs chosen from the 2^6 candidates of the full design (Xcand).

2^(6-2)_IV fractional factorial design one-run-at-a-time
In this section we consider the same 2^(6-2)_IV fractional experimental design analyzed in Section 2.2 (abeta = 1, bbeta = 7 and mInt = 2), but assuming that the follow-up experimentation is performed in a one-run-at-a-time fashion.
Clearly the choice between running the follow-up design as a block or sequentially will depend on cost, feasibility, resources, and timeliness.

Conclusions
In this paper, the background and a tutorial are presented for the OBsMD package for the R environment. OBsMD implements the objective Bayesian methodology proposed in Consonni and Deldossi (2016) for identifying optimal follow-up runs to resolve model ambiguity. Unlike other implementations, packages, or strategies, OBsMD should appeal to practitioners because of its flexibility and the use of the objective Bayesian approach which does not require specification of the prior parameters.
When applied to real data, it produces follow-up runs which discriminate among competing models better than the current methodology. Because the tutorial uses the dataset MetalCutting described in Mønness et al. (2007), readers can compare the results obtained with our approach (and package) with those proposed in Edwards et al. (2014). Through the analysis of this dataset, we have shown the performance and usefulness of the OBsMD package. We believe that the availability of such a method will be appreciated by other R users as well.
Three examples have been discussed, along with some theoretical and practical issues. Flexibility is achieved by providing the user with many options. For example, with regard to the prior on the model space, we adopted the values (a = 1, b = 1), corresponding to the uniform distribution. The alternative choice (a = 1, b = k + 1) has been advocated to achieve a stronger sparsity effect: it gives more weight to more parsimonious models relative to (a = 1, b = 1); however, the optimal follow-up runs are broadly similar in the two cases. The possibility of adding one run at a time represents a further key point since, as shown in Section 5.3, we can discover which factors are active simply by augmenting the design with 1 or 2 runs.