B2Z : An R Package for Bayesian Two-Zone Models

A primary issue in industrial hygiene is the estimation of a worker’s exposure to chemical, physical and biological agents. Mathematical modeling is increasingly being used as a method for assessing occupational exposures. However, predicting exposure in real settings is constrained by lack of quantitative knowledge of exposure determinants. Recently, Zhang, Banerjee, Yang, Lungu, and Ramachandran (2009) proposed Bayesian hierarchical models for estimating parameters and exposure concentrations for the two-zone diﬀerential equation models and for predicting concentrations in a zone near and far away from the source of contamination. Bayesian estimation, however, can often require substantial amounts of user-deﬁned code and tuning. In this paper, we introduce a statistical software package, B2Z , built upon the R statistical computing platform that implements a Bayesian model for estimating model parameters and exposure concentrations in two-zone models. We discuss the algorithms behind our package and illustrate its use with simulated and real data examples.


Introduction
The estimation of a worker's exposure to chemical, physical and biological agents is one of the key responsibilities of industrial hygienists (Ramachandran 2005). Statistical and mathematical modeling allows hygienists to systematically evaluate retrospective exposure when past monitoring data are poor or non-existent, to predict current and future exposure in the absence of the working process or operation, and to estimate exposure with only a small number of air samples with possibly high variability. Indeed, Nicas and Jayjock (2002) have argued that modeling may provide more precise estimates of exposure than monitoring with only a few data points. With advances in computational methods and inexpensive software implementation, formal modeling is set to become an indispensable tool in the industrial hygienists' armory. Zhang et al. (2009) recently proposed Bayesian models for estimating model parameters and exposure concentrations for a two-zone model. Their model predicts concentrations in a zone near and far away from the source of contamination. Their model also estimates the contamination rate, air ventilation rate through the system, and the air flow between near and far fields. In their simulation study, they show that the predictions of near field concentration concord with the true values, indicating that the two-zone model assumptions agree with the reality to a large extent and the model is suitable for predicting the contaminant concentration.
It is also well recognized in the statistics literature that spatial hierarchical models offer additional richness by building dependencies in different stages. These models follow the Bayesian paradigm of statistical inference (see, e.g., Carlin and Louis 2008;Gelman, Carlin, Stern, and Rubin 2003), where analysis is based upon sampling from the posterior distributions of the different model parameters. Hierarchical models are especially advantageous with data sets having several lurking sources of variation and dependence, where they can estimate much richer models with less stringent assumptions.
In applied research, providing software with a proposed model encourages other researchers to explore the proposed model, detect potential issues and advance methodological research. An exciting prospect in recent times that helps bring such sophisticated statistical methodology to the users is the R project (R Development Core Team 2011). R is a language and environment for statistical computing and graphics that offers several built-in functions for mathematical computations. A convenient feature of R is the ability to create packages (libraries) that implement the new model. In addition, for computationally-intensive tasks, C, C++ and Fortran programs can be linked and invoked by R at run time.
The present paper introduces a R package called B2Z -available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=B2Z -that implements the Bayesian two-zone model proposed by Zhang et al. (2009). This package obtains random samples from the posterior distribution of the parameters and exposure concentrations for the Bayesian two-zone model. Currently, three different sampler algorithms are available to do such task: the sampling importance resampling, the incremental mixture importance sampling, and the Metropolis-within-Gibbs sampler. In addition, the package also offers approximate Bayesian estimation using the Bayesian central limit theorem. Section 2 recounts the Bayesian two-zone modeling framework. Section 3 briefly describes the sampler algorithms implemented in B2Z. Section 4 illustrates the use of B2Z with simulated and real data examples. Finally, Section 5 concludes the paper with some discussion and thoughts. rest of the room is another well-mixed box that completely encloses the near field box. This box is called the far field and there is some amount of air exchange between the two boxes.
Customarily, it is assumed that each field is a well mixed box, i.e., two distinct places that are in the same field have equal exposure concentration levels. In addition, it is assumed that the contaminant's total mass is emitted at rate G and that there is an airflow rate between the near field and far fields equal to β. The final assumption considers that there are supply and exhaust flow rates which are taken to be the same and equal to Q. Figure 1 is a schematic depiction of the dynamics of the system, where V N and V F denote the volumes at the near and far field, respectively. In this context, the occupational hygienist seeks to model the exposure concentrations at the near and far fields based upon observations collected over a period of time. Figure 1, along with the assumptions, yields the following system of differential equations for the two-component model: The functions C N (θ 1 ; t) and C F (θ 1 ; t) are the exposure concentrations in the near and far fields at time t, respectively. Equation 1 is the matrix representation of a linear system of ordinary differential equations. When A(θ 1 ) is nonsingular, the solution for (1) has the matrix representation where I 2 is the 2 × 2 identity matrix and exp (tA (θ 1 )) is the matrix exponential (see, e.g., Laub 2005). Assuming that C N (θ 1 ; 0) = C F (θ 1 ; 0) = 0, (2) can be simplified to yield the following unique solution for the far field and near field concentrations: B2Z: An R Package for Bayesian Two-Zone Models where λ 1 and λ 2 are the eigenvalues of A (θ 1 ). In fact, these are available in closed form as: The solutions for the near and far field zones make intuitive sense. Firstly, notice from (4) that both λ 1 and λ 2 are negative numbers. Thus, the exponential terms in (3) decay asymptotically to zero at large values of t. Consequently, the steady state solution for the far field is G/Q, which is the same as the steady state solution for a one-box model. Also, the steady state solution for the near field is G/Q + G/β. Therefore, the model predicts relatively higher exposure intensity near the emission source compared to the one-box well mixed room model in steady state conditions. Secondly, if β is less than or equal to Q, then the steady state concentration in the far field is less than twice the steady state concentration in the near field. In general, Q increases relative to β as the room size increases. Thus, the model draws a distinction between exposures of workers near the source and those farther away from the source.
From (3), we also see that the solution of the system in (1) depends upon several parameters. Customarily, V N and V F are considered fixed and known, while β, Q and G are regarded as unknown parameters and will need to be estimated. Let Y(t) = (Y N (t), Y F (t)) be a 2 × 1 vector corresponding to the natural logarithm of the exposure concentration at time point t.
The observed value of Y(t) is a combination of two components: 1. Systematic component: C(θ 1 ; t) = (C N (θ 1 ; t), C F (θ 1 ; t)) , the solution of the system of differential equations in (1) at time t; 2. Measurement error process component: (t) = ( N (t), F (t)) , where N (t) and F (t) are the measurement error processes corresponding to the near and far field, respectively.
This leads to the following measurement model: where log C(θ 1 ; t) = (log C N (θ 1 ; t), log C F (θ 1 ; t)) . Following Zhang et al. (2009), we assume Gaussian measurement error and, more specifically, the following two possibilities: In the independent model, the measurement errors at the near and far field are assumed to be uncorrelated, while the dependent model relaxes this assumption. For both models, it is assumed that the measurement errors across time are independent and identically distributed.
Let Y = Y(t 1 ) , . . . , Y(t n ) denote the 2n × 1 vector of observed log-concentrations from the near and far fields at n time points. Letting θ = {θ 1 , Σ} be the collection of unknown parameters, (5) and the assumptions made on the measurement errors produce the likelihood where Σ is the covariance matrix of the measurement error process. We assume that the components β, Q, G and Σ are independent, so that the prior distribution for θ is p(θ) = p(β)p(Q)p(G)p(Σ). For the independent model, we assume that where a N and a F are shape parameters and b N and b F are scale parameters for the inverse Gamma distribution. For the dependent model, where S is a scale matrix and v is the degrees of freedom for the inverse Wishart distribution. The parameterizations of the inverse gamma and inverse Wishart here are the same as in Gelman et al. (2003). The parameters β, Q and G can have any prior distribution with positive support, i.e., they do not assign positive probabilities to any negative value. Based upon the above assumptions, the posterior distribution of θ can be computed using Bayes rule as proportional to p(θ)×p(Y | θ). However, the posterior distribution may not have a closed form precluding analytical inference. Our package B2Z has three different sampler algorithms available to obtain samples from the posterior distribution of θ. The algorithms are discussed in the next section.

Bayesian estimation
In this section we briefly discuss the three sampling algorithms and the approximation using Bayesian central limit theorem that are available in our package B2Z. We also present some algorithmic implementation details.

Sampling importance resampling
The sampling importance resampling (SIR; Dijk, Hop, and Louter 1987;Rubin 1987Rubin , 1988) is a fairly straightforward algorithm used to obtain random samples from a probability distribution, here the posterior distribution p(θ | Y). Several variants of this algorithm exist (see, e.g., Robert and Casella 2004), but the basic idea is to sample θ from an easily tractable distribution (e.g., the prior distribution) so that the SIR tends to choose θ i 's corresponding to higher values of the likelihood. This sampler is described in the following algorithm: 1. Obtain m i.i.d samples from the prior distribution p(θ). Denote each sample by θ i , i = 1, . . . , m ; 2. For each sample θ i , evaluate the likelihood l i = p(Y | θ = θ i ); 3. Compute the importance weights as: 4. From the m samples obtained at the first step, select m samples (with replacement) using the weights w i 's.

In
Step (3), the l i 's can be very close to zero so that a large proportion of the importance weights are close to zero as well. To assuage this issue, we implement in our package the following computational trick. We replace the computation of the importance weights in Step (3) for: where l i = log (p(Y | θ = θ i )) + C, and C is a large positive constant. While this may not fully resolve the issue of small weights, it does considerably increase the number of non-zero weights. Nevertheless, for the SIR to sample well from the posterior distribution, m must be large (thousands or even millions) which can be computationally expensive. In fact, in our examples, we often discovered the SIR to be returning very few distinct values, even with an m of size 50,000. This arises due to an inadequate exploration of the parameter domain. Also, it is important to the SIR that the prior distribution agrees with the likelihood. Otherwise very few distinct sampled points from the tails of the prior distribution have sizeable importance weights causing the final sample to have few unique points. The incremental mixture importance sampling, described next, attempts to circumvent these problems.

Incremental mixture importance sampling
In contrast to the SIR, at each iteration the incremental mixture importance sampling (Steele, Raftery, and Emond 2006; Raftery and Bao 2010, IMIS;) adds samples from a multivariate normal distribution, centered at the point with the highest importance weight, to the current importance sampling distribution. This covers sections of the posterior distribution with high importance weights that are normally underrepresented by the importance sampling distribution. The IMIS algorithm is presented below: 1. Initial stage: (a) Draw N 0 i.i.d. samples θ 1 , . . . , θ N 0 from the prior distribution of θ; (b) For each θ i , evaluate the likelihood l i = p(Y | θ = θ i ) and compute its importance weight as: 2. Importance sampling stage: k = 1. While some stopping criterion (see below) is not satisfied do: (a) Denote by µ (k) the input with highest importance weight among the current importance sample up to iteration k; (b) Find the B inputs with smallest Mahalanobis distance to µ (k) . The distances are calculated with respect to the prior covariance matrix of θ, denoted by V θ . More precisely, the Mahalanobis distance of an input x to µ (k) with respect V θ is given by: (c) Denote by u 1 , . . . , u B the importance weights of the B inputs selected in the previous step; (d) Denote byΣ (k) the estimated weighted covariance matrix using the selected B inputs. The weight of the input j is given by: (f) Compute the likelihood for each new input from the previous step, and combine the new inputs with the previous ones; (g) Update: N k = N 0 + Bk; (h) Compute the mixture sampling distribution q (k) at iteration k, given by: is the prior distribution of θ and N d (· | m, S) denotes the multivariate normal density with vector mean m and covariance matrix S; (i) Calculate the importance weights using the following formula: where c is chosen so that the weights sum to 1; (j) k = k + 1.
3. Resample stage: Once the stopping criterion (see below) at the importance sampling stage is satisfied, use the importance weights w N K to draw, with replacement, M inputs from the importance sample θ 1 , . . . , θ N K , where K is the total number of iterations at the importance sampling stage.
Stopping criterion: Raftery and Bao (2010) suggest ending the importance sampling step when the expected fraction of unique points in the resample is at least 0.632. B2Z follows this suggestion. However, the user can provide a maximum number of iterations at the importance sampling stage in case the stopping criterion takes too long to be met. Raftery and Bao (2010) also suggest that a good choice for the input parameters is: N 0 = 1000d, B = 100d and M = 3000. Recall that if the independent model is considered d = 5, otherwise d = 6.

Metropolis-within-Gibbs sampling
Gibbs sampling (Geman and Geman 1984;Gelfand and Smith 1990) is a popular Markov chain Monte Carlo (MCMC) algorithm that samples from the full conditional distributions for each parameter. This is attractive in our context since the full conditional distributions for τ N and τ F in the independent model and for Σ in the dependent model are respectively given by: 1. Independent model : However, the full conditional distribution of θ 1 does not have a closed form and we sample from its full conditional distribution using the Metropolis algorithm (see Metropolis algorithm section below). This is called the Gibbs sampler with Metropolis step (or Metropolis-within-Gibbs). The algorithm is as follows: , Y, and denote it as Σ (k) Using Metropolis sampler, draw a sample from θ 1 | Σ (k) , Y and denote by θ (k) . end for

Metropolis algorithm
Metropolis (Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller 1953;Hastings 1970) is a well known MCMC sampling algorithm. Here, at each iteration we sample a candidate from a proposal distribution and then decide whether the candidate should be accepted or not. This decision is based on the ratio of the posterior distribution evaluated at the candidate and the previously accepted candidate. Since this is a ratio one needs to evaluate the posterior distribution only up to a proportionality constant. Several variants of the Metropolis sampler exist (see, e.g., Robert and Casella 2004). The one that is currently implemented in B2Z is the random-walk Metropolis algorithm with normal proposals and is described as follows: The input parameters for the Gibbs sampler with the Metropolis step algorithm are the number of updates N , the vector initial value θ (0) 1 , and a covariance matrix V for the proposal distribution. An approach that usually works well in practice estimates the posterior mode and uses it as an initial value. For V, we use the negative inverse of the hessian matrix of the log posterior distribution evaluated at the posterior mode. This approach is implemented in B2Z as the default for setting initial values and specifying the proposal covariance matrix V.

Bayesian central limit theorem
Differently from the previous sections where we discussed about samplers algorithms, in this section we briefly discuss the Bayesian central limit theorem (BCLT). This theorem states that under some assumptions we can use a Gaussian approximation to the posterior distribution Consider a Taylor expansion of ln(f (θ)) centered on the posterior mode θ 0 . At θ 0 the gradient ∇f (θ) will vanish. Thus the expansion around θ 0 is given by where H is the negative Hessian matrix of the log posterior distribution evaluated at the posterior mode. Exponentiating both sides in Equation 7, we obtain From (8), f (θ) is seen to be approximately equal to a multivariate normal density with mean θ 0 and covariance matrix H −1 . Since the posterior density, p(θ | Y), is proportional to f (θ), it too is approximately equal to the multivariate normal density. We note this approximation assumes that the prior distribution of θ and the likelihood must be positive and twice differentiable near the posterior mode. For further details, see Bishop (2006).
To compute estimates of the parameters using the BCLT, we use the R built-in function called nlminb. This function implements constrained and unconstrained optimizations using PORT routines (Gay 1990), allowing us to estimate the posterior mode numerically. Subsequently, we use the R function hessian, from the package numDeriv (Gilbert 2011) to calculate a numerical approximation to the Hessian matrix of the log posterior function at the estimated posterior mode.

Algorithmic implementation details
B2Z is an R package that performs sampling-based Bayesian inference for the two-zone model described in Section 4.2. Currently, three sampling algorithms are available: (a) MCMC, (b) IMIS and (c) SIR. In addition, the package also offers approximate Bayesian estimation using the (d) BCLT. The Bayesian two-zone model can be fitted using the function B2ZM, where the desired sampling algorithm is specified as an argument to this function. Another option is to use one of the following functions directly: B2ZM_BCLT, B2ZM_MCMC, B2ZM_IMIS and B2ZM_SIR. In either one of the cases, the output is a valid input for the functions summary and plot. For instance, suppose fit is an output from B2ZM. Then, the line command summary(fit) returns the following: Some posterior summaries for each of the parameters θ: -Posterior median, mean, standard deviation; -100(1 − α)% credibile intervals, where α is specified by the user; -Posterior covariance matrix.
Sample quality measurements that depend on the sampler algorithm. Specifically, -SIR: Effective sample size (ESS), proportion of unique points in the sample, maximum importance weight; -IMIS: ESS, maximum importance weight, variance of the re-scaled importance weights, entropy of importance weights relative to uniformity, expected fraction of unique points and expected number of unique points after re-sampling; -MCMC: effective sample size and MCMC acceptance rate.
The package coda (Plummer, Best, Cowles, and Vines 2006) offers several other diagnostics measures. We show in Section 4.1 how to integrate the packages B2Z and coda. For details on some of the above quantities (e.g., DIC and ESS) see Carlin and Louis (2008).
The line command plot(fit) produces some graphical summaries of the estimated model. In particular, this line command returns: 100(1 − α)% posterior predictive interval along with the posterior median of the log concentrations at the near field over the observed period of time, where α is specified by the user; 100(1 − α)% posterior predictive interval and the posterior median of the log concentrations at the far field over the observed period of time, where α is specified by the user; Empirical posterior distributions for each parameter in the model; If Metropolis-within-Gibbs is selected, autocorrelation function (ACF) and trace history of the sampling of each parameter is also plotted.
Due to the domain of the parameters in the model, we actually implement the algorithms cited previously (except SIR) on a transformation of θ. After sampling from the posterior distribution of the transformed variables, we back transform to obtain a sample from the posterior distribution of θ. In particular, consider the dependent model and denote X 1 = β, X 2 = Q, X 3 = G, X 4 = τ N , X 5 = τ F and X 6 = τ N F . Suppose the priors for β, Q and G have supports (a 1 , b 1 ), (a 2 , b 2 ) and (a 3 , b 3 ), respectively, where 0 ≤ a i < b i < ∞ for all i = 1, . . . , 3. Consider the following transformations given by h i (·) for i = 1, 2, . . . , 6: Therefore, the domain of U = (U 1 , U 2 , . . . , U 6 ) is in R 6 . Notice that if the independent model is considered, the variable U 6 is not needed in U, and therefore its domain is R 5 . Thus, the density of U is given by: where h −1 (u) = (h −1 1 (u 1 ), h −1 2 (u 2 ), . . . , h −1 6 (u 6 )) and |J| is the jacobian of the transformation with J being the 6 × 6 matrix whose (i, j)-th element is J ij = ∂X i ∂U j . Regardless of the model choice, the matrix J is a lower triangular matrix, therefore |J| is the product of the diagonal elements, which is (1 + e U 6 ) 2 .
(dependent model) The transformation h i (·) makes the support of the U i the real line, which improves the algorithmic efficiency. Also, when we back transform, the sampled values for β, Q and G are within their respective domain, and the covariance matrices composed by the sampled values for τ N , τ F and τ N F are positive definite, since h 4 and h 5 guarantee that τ N and τ F are positive, and h 6 ensures that the covariance inequality is held, that is, τ 2 N F ≤ τ N τ F . When any of the domains of β, Q or G is (0, ∞), the corresponding U i is the natural logarithm of X i and the computation of the jacobian is still very similar to the one presented in Equation 10.

Illustrating B2Z
In this section we illustrate B2Z using two synthetic datasets and a real dataset. The simulated exposure concentrations at the near and far fields, over n time points, were generated according to the following algorithm: 1. Choose the values of the parameters θ 1 and Σ as desired. Recall that Σ is a diagonal matrix in the independent model, or a matrix with non-null entries in the off diagonal for the dependent model. In any case, Σ must be a positive definite matrix.

For (i in 1 : n) (a) Using the fixed parameters in
Step 1, find the solution of the system of differential equations in (1). Denote this solution by (c) The log exposure concentrations in the near and far fields at time t i are (d) The exposure concentrations in the near and far fields at time t i are exp(Y(t i )).
The dataset in Section 4.1 was generated considering dependent measurement errors, i.e., τ N F = 0. Section 4.2 presents an application of the Bayesian two-zone model to a simulated dataset where the measurement errors are independent, while Section 4.3 applies the model to a real exposure dataset. We started each sampler with a seed set to 2011.

Simulated data 1
Consider a simulated dataset that contains 100 exposure concentrations equally-spaced between times 0 and 4 minutes. Following the study simulation in Zhang et al. (2009) To fit the Bayesian two-zone model, we need to specify the prior distributions for the unknown parameters. We assume that β ∼ Unif (0, 10), Q ∼ Unif (11, 17) and G ∼ Unif (281, 482). The dependent model is used in this section. Therefore, we assume that Σ ∼ IW 10 0 0 10 , 4 . The example code below illustrates how to specify the model information and the sampling algorithm desired using B2Z.   The argument data is a 3-column matrix such that the columns are time, exposure concentrations at the near field and at the far field, respectively. The argument mcmc.control is a list that contains the input parameters for the Metropolis-within-Gibbs algorithm. Similarly, there are control input arguments to BCLT, IMIS and SIR as well, which are bclt.control, imis.control and sir.control, respectively. More details about the arguments in B2ZM can be found in R using the line command help(B2ZM).
As discussed in Section 3, the sampling algorithms require some input parameters. Table 1 presents the input parameters provided for each sampling algorithm in this example. The BCLT implemented in the B2Z package requires two input parameters: m and sample_size. In particular, to estimate the posterior mode (needed in the BCLT), the function nlminb is used, which depends on the starting parameter values. The input m is the number of sampling values from the prior distributions of β, Q and G. Therefore, the vector among the m sampled with largest likelihood value is used as starting parameter values. The other input parameter sample_size is the size of the sample from the approximate posterior distribution of the parameters in the model according to the BCLT. We use m = 8000 and sample_size = 2000. Table 2 presents several posterior summaries for each parameter in the dependent model obtained by using the function B2ZM within the package B2Z. The IMIS and Metropolis-within-Gibbs algorithms provide similar estimates for the parameters in the model. In addition, the posterior means obtained by these algorithms fairly estimate the parameters in the model, except for β and G that were estimated using the SIR algorithm. The 95% credible intervals cover the true values of the parameters, except for τ F when using the SIR algorithm.
In this example, the SIR algorithm samples poorly from the posterior distribution. In fact, the proportion of unique points in the sample is very low (0.062%), which explains the strange behavior in the standard deviation estimates. On the other hand, IMIS and Metropoliswithin-Gibbs algorithms perform better. In particular, the IMIS has an expected fraction of unique points equaling 58.7% and the ESS for the Metropolis-within-Gibbs the acceptance rate is 51.27%.
The following figures are produced using the line command plot(fit.depend), where the output fit.depend is an object from the Bayesian two-zone model fitted using the Metropoliswithin-Gibbs algorithm. The analogous figures for the SIR and IMIS algorithms, and BCLT are not shown in this paper. However, they can be produced by running the example codes in the tutorial of B2Z.  Gaussian kernel density curve. Figure 4 shows the Metropolis history plot and ACF for the parameters in the model.
The Bayesian two-zone model fitting was done on a PC Intel Core Duo CPU P8600 with 2.40GHz and 4.00GB of Memory RAM. The computational time (in seconds) obtained for the SIR, IMIS, 133.30,67.40,and 18.64, respectively. In this example, the computational time for the Metropolis-within-Gibbs also includes the time spent estimating the starting values and the covariance matrix needed for the proposal distribution.
B2Z can also interact with the package coda. For instance, Gelman and Rubin's convergence diagnostic can be computed very easily using the function gelman.diag provided by the package coda. To compute that measure we need to fit the model more than one time.
In particular, we fit the model three times using Metropolis sampler and denote them by fit.depend1, fit.depend2 and fit.depend3. The following code shows how to compute the Gelman and Rubin's convergence diagnostic in this example.

Simulated data 2
Here we consider another simulated dataset with most parameters that generated exposure concentrations being the same as those in Section 4.1. The only difference is that now the measurement errors at the near and far field are considered independent. In particular, we set Σ = 1 0 0 0.64 . As in the previous section, we assume β ∼ Unif (0, 10), Q ∼ Unif (11, 17) and G ∼ Unif (281, 482). However, now we fit the independent model and therefore we assume that τ N ∼ IG(5, 4) and τ F ∼ IG (5, 7).
The example code below shows how to specify the modeling information and the sampling algorithm desired using B2Z. The input parameters used by the algorithms are the same as the ones presented in Table 1, except for the IMIS algorithm, which uses N 0 = 5000 and B = 500 in this section. Table 3 presents posterior summaries for each parameter in the independent model. The parameters estimate using IMIS and Metropolis-within-Gibbs sampler are similar. Considering the parameters β,G and τ F , the BCLT posterior means were closer to the true values than using the other algorithms.
The three samplers and the BCLT have very similar posterior summaries. All the 95% credibility intervals cover the true values of the parameters. Compared to the previous example, the performance of SIR is slightly better; the proportion of unique sampled points is 2.5%. The IMIS and Metropolis-within-Gibbs algorithms sample fairly well from the posterior distribution. As a matter of fact, the fraction of unique points for IMIS is 0.654 and the acceptance rate for the Metropolis-within-Gibbs is 49.23%.
The computational times (in seconds) for SIR, IMIS, Metropolis-within-Gibbs and BCLT were 29.50, 83.50, 41.01 and 8.05 respectively. Again, the computational time for Metropoliswithin-Gibbs includes the time to estimate the covariance matrix of the proposal distribution.
In the next section we illustrate an application of B2Z to a real experimental data set.

Experimental two-zone study
In this section we fit the Bayesian two-zone model to the data set used in the experimental two-zone study in Zhang et al. (2009). Here, exposure concentrations of Toluene over a period of time were observed. These measurements were made in four directions (east, west, north and south) on three horizontal parallel planes at 5 different distances (10cm, 15cm, 20cm, 30cm, and 40cm) from the contamination source, where the source was located on the middle plane and the exposure concentrations were measured every 5 seconds for at least 15 minutes in each location. Although combinations of factors such as presence of a worker's body, body movement and heat were also included in the experimental study, here we consider only the plain experimental data, i.e, the measurements that do not include any of those factors. A very detailed explanation of this experiment can be found in Zhang et al. (2009).
To illustrate B2Z using this real data set, we use the observed exposure concentrations on the middle plane and north direction. The measurements at 10 cm and 15 cm from the contamination source represent the exposure concentrations at the near and far fields, respectively. The volumes of the near and far fields are π × 10 −3 m 3 and 3.8m 3 , respectively. There are 140 observed time points equally spaced between 5 and 700 seconds.
The DIC for the fitted independent model is 115.392, which indicates that the independent model fits the data better than the dependent model. However, the estimates in Table 5 are not substantially different from those in Table 4.

Discussion
In this paper we have introduced our user-friendly R package B2Z. We have described the underlying models and a suite of algorithms to perform Bayesian inference on the two-zone models in occupational hygiene (e.g., Zhang et al. 2009). In particular, we have demonstrated the main function called B2ZM, where the output from this function is a valid input for the functions summary and plot. Currently B2Z implements three different samplers: SIR, IMIS, and the MCMC. In addition, the package also offers approximate Bayesian inference using the Bayesian central limit theorem.
Our illustrative examples show that IMIS, MCMC and BCLT obtain similar posterior summaries. The SIR's performance was somewhat inferior, but it is easier to implement and can be useful as an initial tool for exploring approximate posteriors. Our examples also show that the Bayesian two-zone model can be fitted within a reasonable time (ranging from 8 to 140 seconds). In particular, IMIS is the slowest algorithm while BCLT has the fastest computational time. In ongoing work we focus upon including new features: adaptive MCMC samplers, maximum likelihood estimators, and a function called predict where users can define time intervals over which concentration predictions are sought. The B2Z 1.4 is already available for download from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=B2Z.