Bayesian Analysis for Food-Safety Risk Assessment: Evaluation of Dose-Response Functions within WinBUGS

Bayesian methods are becoming increasingly popular in the ﬁeld of food-safety risk assessment. Risk assessment models often require the integration of a dose-response function over the distribution of all possible doses of a pathogen ingested with a speciﬁc food. This requires the evaluation of an integral for every sample for a Markov chain Monte Carlo analysis of a model. While many statistical software packages have functions that allow for the evaluation of the integral, this functionality is lacking in WinBUGS . A probabilistic model, that incorporates a novel numerical integration technique, is presented to facilitate the use of WinBUGS for food-safety risk assessments. The numerical integration technique is described in the context of a typical food-saftey risk assesment, some theoretical results are given, and a snippet of WinBUGS code is provided.


Introduction
The billions of servings of food and water consumed in developed countries are generally free of pathogenic bacteria, with illness from prepared food occurring at rates often on the order of one illness per 100,000 servings for even the most common pathogens (e.g., Lammerding 2006). The field of microbial food-safety risk assessment deals with the rare event that some number of pathogenic organisms, such as Salmonella, Campylobacter and E. coli O157:H7, are ingested by an individual and the infection progresses to serious illness. The models used for these risk assessments are typically referred to as "farm-to-table" models. They attempt to model processes mechanistically, such as microbial contamination, growth during transportation and storage, cross contamination, and the inactivation of organisms that occurs during freezing and cooking.
These models are often complex Monte Carlo simulations that require considerable development effort. Nevertheless, a probabilistic model underlying such a risk assessment is surprisingly simple and consists of only two components. These being: 1. The level of microbial contamination (i.e., dose) D of all servings at the point of consumption is distributed as D ∼ f (·) where f is a function of unknown parameters.
2. A dose-response function that estimates the probability of illness occurring given exposure to D, denoted R(D) = P (ill|D).
The average probability of illness is estimated by integrating the product of these two components across the range of possible doses. Monte Carlo models are computationally intensive when second-order modeling is employed to capture both variability in the stochastic system as well uncertainty in the parameters used in the model (Frey and Burmaster 1999;Vose 2008).
Recent research interest has focused on the replacement of Monte Carlo models with a Bayesian approach that uses Markov chain Monte Carlo methods (Hald, Vose, Wegener, and Koupeev 2004;Parsonsa, Ortona, D'Souza, Mooreb, Jonesc, and Dodd 2005;Albert, Grenier, Denis, and Rousseau 2008). Along with this proposed approach comes the inevitable suggestion that the models be built using WinBUGS (Vose 2008).
The estimation of the average probability of illness can be a difficult task because the integration of the product of the exposure distribution and dose-response function rarely results in a closed-form solution. Numerical integration is typically used to evaluate the dose-response function, but some computing environments that are designed for implementing Markov chain Monte Carlo methods, such as WinBUGS (Lunn, Thomas, Best, and Spiegelhalter 2000), are not well suited to the evaluation of this integral. This article presents a probabilistic framework and a numerical integration technique that can be used for microbial food-safety risk assessment using the WinBUGS environment.

Outline of food-safety risk assessment
A biologically plausible model for microbial contamination assumes that average pathogen counts per serving vary according to a lognormal distribution (Limpert, Stahel, and Abbt 2001). Standard practice in microbiology is to express the mean and variance parameters describing microbial count data in log 10 , rather than log e , scale. The conversion between the two different scales is generally necessary for computation and is achieved by division of the µ and σ parameters by log e (10). The natural-log scale will be used for the remainder of this presentation.
The lognormal model holds for pathogen levels that are generally measured at the end of the production process (e.g., Campylobacter levels on poultry at the end of the slaughter process). Pathogen levels at consumption are generally unknown, so the risk assessment models the change in levels between production and consumption. Average pathogen levels per serving at the point of consumption can be modeled as the product of a large number of independent random variables describing growth and attenuation processes. When the central limit theorem is applied, this product is asymptotically a lognormal distribution (Mitzenmacher 2003).
By making this assumption, the dose distribution is defined by determining the parameters of the lognormal distribution (i.e., D ∼ Lognormal(µ, σ 2 )) (Haas, Rose, and Gerba 1999). Given the low average pathogen levels per serving at the point of consumption, the µ parameter is generally much less than zero. The rare instances of high levels of contamination necessitate large σ 2 values. These parameter values require large sample sizes for accurate Monte Carlo approximation.
A number of different dose-response models have been proposed, with one of the most common models for describing the probability of microbial illness being the beta-Poisson dose-response function (Haas et al. 1999). In this case the number of ingested organisms (dose) follows a Poisson distribution with rate parameter D. Inter-individual variability in susceptibility is modeled by a beta distribution with parameters α and β. Furumoto and Mickey (1967) derive the approximation and show, via a Taylor's series expansion, that this approximation is sufficiently accurate for β >> α. This model produces a curve that is sigmoidal when plotted in log space and the interpretation of the two parameters is that α controls the slope of the curve while β controls the location. The ultimate goal is to determine the probability of an illness occurring across the distribution of possible contamination levels. Averaging the dose-response function across the distribution of all possible doses provides an estimate of the average probability of illness per random serving of the food of interest, i.e., Calculation of the integral in (2) represents the central component of many food-safety risk assessments. For example, multiplication of P (ill) by the number of servings of a food commodity provides an estimate of the total number of illnesses. The effectiveness of a change in production practices, designed to reduce the level of contamination, will generate a new estimate of the total number of illnesses. A comparison between the two estimates can then be used to predict the marginal benefit of the change in production practice.

Solution techniques
Solution of the integral in (2) can be approximated using a number of different techniques. A common solution is to draw a large sample of values from the dose distribution and use a Monte Carlo-based approximation. For example, if a sample of size M dose values is drawn from the lognormal distribution and a beta-Poisson dose-response function is appropriate, then This is a reasonable approach that forms the basis of nearly all Monte Carlo risk assessments. Nevertheless, accurate approximation to the integral with the lognormal distributions typically found in food-safety risk assessments requires values for M on the order of millions of observations to adequately explore the tails of the distribution, which is generally beyond the capabilities of WinBUGS.
A more computationally efficient approach is to replace the integral with a partitioned, discrete equation that provides an analytic solution that can be incorporated into WinBUGS. The proposed approximation to (2) is motivated by considering the following: note that Then partitioning the range of doses yields The proposed integration technique exchanges the order in which the dose-response and expected value are evaluated. That is This equation can also be written as: is the proportion of doses in the interval, and N + 1 is the number of intervals used to partition the range of doses from 0 to ∞. This is a type of Newton-Côtes quadrature approximation (Givens and Hoeting 2005, Chapter 5).
The distribution describing D|D i < D < D i+1 is known as a truncated distribution (Johnson, Kotz, and Balakrishnam 1994), where one of the common areas of application for these distributions is in the actuarial sciences (Klugman, Panjer, and Willmot 2004). The expected value of a truncated random variable is given bȳ where F (D) is the cumulative density function of D.
For the lognormal distribution, the mean in the interval with upper bound D i is and Φ is the standard normal cumulative density function (Klugman et al. 2004). Equation 7 is used to determineD D 1 0 while the expected value for the remaining intervals is calculated by the weighted difference of successive intervals usinḡ The proposed approximation is beneficial only for probability distributions that have closedform solutions, or good approximations, toD as well as software that supports the necessary intrinsic functions for the evaluation of bothD and F (D i ) (e.g., normal and gamma distributions).
Given Equations 7-8, it is possible to construct an accurate approximation to the integral of the dose-response function using a limited number of terms. Dose intervals need not be equal in size or consistent in pattern. Selection of dose intervals can be informed by considering the uncertainty about the true dose distribution or the shape of the dose-response function. A simplistic, yet reasonable, partitioning of the range of doses is based on the symmetry of the normal distribution. Specifically, the range of log transormed dose can be defined as (µ − kσ, µ + kσ), where k describes the number of standard deviations considered on each side of the mean of the log transformed dose. Choosing k to be from 4 to 6 and setting the width of the partition in the range of 0.05 to 0.1 generally provides estimates of P (ill) that match estimates derived using the adaptive quadrature routine QUADPACK in Netlib to within the absolute error bounds values reported from QUADPACK (Piessens, deDoncker Kapenga, Uberhuber, and Kahaner 1983). In contrast, estimates of P (ill) derived using (3) with M = 3, 000, 000 Monte Carlo samples exhibited discrepancies that were often an order of magnitude larger.
More complex and accurate methods for numerical integration, such as Gaussian quadrature, are available (Givens and Hoeting 2005, Chapter 5). Nevertheless, there are a number of applications in risk assessment where user defined truncation points are important (e.g., a D i value that represents a maximum safe dose or other food-safety metric (Crouch, Labarre, Golden, Kause, and Dearfield 2009)), while in other applications the mean of the truncated distribution is important for the partitioning of overall risk into categories (Williams, Ebel, and Vose 2011). These applications are natural extensions of the proposed integration technique. Given these practical considerations, the proposed solution represents a good trade-off between simplicity and accuracy and it works well within the limitations inherent in Win-BUGS.

Theoretical properties
For a fixed partition of the dose range, the proposed numerical integration scheme is not an unbiased estimator of the true probability of illness. Bias and convergence properties of the proposed integration technique can be determined using results from basic probability theory.
To assess the bias, consider the relationship between E[R(D)|D i ≤ D < D i+1 ] and R(E[D|D i ≤ D < D i+1 ]) for an arbitrary interval i. The direction of the bias in the estimated probability of illness can be determined by examining the second derivative of the dose-response function and applying Jensen's inequality (Ross 1988). For typical dose-response functions, such as the beta-Poisson and exponential (Haas et al. 1999), the second derivative of R(D) is negative for all D, so the function is concave and in every interval. ConsequentlyP (ill) > P (ill). Though a formal proof is not provided, this same result can be used to show thatP (ill) is asymptotically unbiased in the limit as the interval width goes to zero. To demonstrate, note that E All that remains is to show that R() approaches a linear function as the interval width goes to zero.
A closed-form estimate of the bias in each interval, as well as the total bias, can be derived. To facilitate the discussion note that a weighted sum notaion can be use to describe the probability of illness measures as where D 0 = 0, D N +1 = ∞, and the weights are as defined in (5) Taking the expected value of R(D|D i ≤ D < D i+1 ) yields The contribution to the bias associated with interval i is Expressing the bias across the range of doses leads to the weighted sum

For the lognormal distribution, a closed-form solution for
The formulas for the calculation of D D i+1 D i 2 are provided in Equations 7-8.
The approximation to the bias can be used to select the boundaries of the dose intervals (D i , D i+1 ) so as to minimize the bias for a given number of intervals. This might, however, be difficult in this application because the uncertainty in the parameters of the exposure distribution and dose-response function causes the optimal grid spacing to vary from one iteration of the model to the next.

Example
The primary function of the provided code is to implement the proposed integration technique as it might appear as part of a food-safety risk assessment. The level of detail is more simplistic than an actual food safety risk assessment. The primary difference is that many of the parameters are treated as fixed values.
To motivate the example, suppose the public health agency, responsible for the food product, wishes to predict the effect of a risk management option (e.g., an intervention such as mandatory incorporation of a chemical antimicrobial agent) that reduces the mean level of Campylobacter contamination at the end of production by one-half log e unit.
Consider the case where the baseline, or current, distribution of Campylobacter per serving at the point of consumption can be modeled as f (D) ∼ Lognormal(µ, σ). It is impossible to collect representative data at the point consumption, so the µ and σ parameters would typically be derived using data collected during the production and processing of the food item and then predictive microbiology models (Haas et al. 1999), estimates from previous risk assessments, and possibly expert opinion would be used to determine pathogen levels at the point of consumption, as well as measures of uncertainty in these parameters. For this example, let the parameters and their uncertainties be modeled as µ ∼ Normal(−9.5, 1) 1 and σ ∼ Gamma(200, 100) and assume the µ and σ are independent.
This example assumes a new intervention during the production of carcasses has an estimated proportional reduction of 0.65 (i.e., one-half log reduction or e 1/2 ) in the mean level of contamination per carcass. The effectiveness of this hypothetical intervention and its uncertainty, in the log scale, is modeled as δ ∼ Gamma(200, 100). The distribution of Campylobacter per serving at the point of consumption that reflects the reduction in contamination can be modeled as f new (D) ∼ Lognormal(µ − δ, σ).
The parameters of the beta-Poisson dose-response function (1) are taken from Black, Levine, Clements, Hughes, and Blaser (1988) and are given by (α = 0.145, β = 7.59). Fixed values are used for these parameters in the example, but in practice additional uncertainty could be incorporated.
A microbial risk assessment model can reasonably assume that the observed number of sporadic illnesses are distributed as a Poisson random variable (Vose 2008) where the rate pa-rameter is the number of servings multiplied by the probability of illness. Suppose for the population of interest there were N servings = 10, 000, 000 servings consumed annually of the food product of interest and that a total of N ill = 540 illnesses were attributed to consumption of the product.
In this setting, the likelihood function in the WinBUGS environment would be N ill ∼ Poisson(N servings P (ill)).
In practice N ill would not be known. Note, however, that in contrast to chemical toxins, estimated total illnesses per annum can be empirically based for microbial pathogens because surveillance systems are in place to detect a portion of such illnesses (Allos, Moore, Griffin, and Tauxe 2004). An additional conversion factor would be required to scale between the observed and the total number of illnesses (e.g., Mead, Slutsker, Dietz, McCaig, Bresee, Shapiro, Griffin, and Tauxe 1999).
A primary objective for the types of risk assessments considered here is the prediction of the number of illnesses that would be avoided (N avoided ), assuming the implementation of a new policy or intervention that reduces contamination. This statistic assumes a reduction in pathogens for each serving and models the number of illnesses avoided as N avoided ∼ Poisson(N servings (P (ill) − P new (ill)), where Figure 1 depicts the resulting posterior distribution for N avoided , with the red lines indicating the boundaries of the 95 % posterior prediction intervals. The number of illnesses avoided can then be used to assess the economic benefit of implementing the policy. These estimated saving can then be compared to the expected cost to industry of implementing the policy to determine if potential benefits exceed the cost of implementation.
The following WinBUGS code uses the proposed numerical integration technique for the estimation of the number of human illnesses avoided.
The first line of code inputs the number of intervals (num.bins), interval width (int.width), number of standard deviations about the mean of the dose distribution (k), number of servings consumed (N.servings) and number of illnesses attributed to consumption of the product (N.ill). The number of intervals used here represents partitioning the interval of approximately 6 standard deviations about the mean into sub-intervals of about 0.1 units, i.e., num.bins = 2 * k * sigma/interval.width = 2 * 6 * 2/(.1) = 240.
list(num.bins = 240, int.width = 0.1, k = 6, N.servings = 10000000, N.ill = 540) In the model step, the parameters of a Beta-Poisson dose-response model for Campylobacter from Black et al. (1988) are used (alpha.dr = 0.145 and beta.dr = 7.59). These parameters are fixed values, but assuming prior distributions for these parameters would be even more appropriate in the Bayesian context.
The mu and sigma steps below specify the prior distributions describing the uncertainty in the mean and standard error of the dose distribution at the point of consumption. Recall that the parameterization of the normal distribution in WinBUGS is based on precision rather than standard deviation. The next steps perform the numerical integration of the dose-response model. First, we set the upper bound, mean dose, and proportion of servings in the first interval for the baseline and reduced contamination scenarios.
u[1] <-exp(mu -k * sigma) D.bar.lo[1] <-0 motivation for this decision is to provide access to these techniques for users of both WinBUGS and OpenBUGS, particularly given the current shift of development efforts to OpenBUGS (Lunn, Spiegelhalter, Thomas, and Best 2009). Testing of the code snippet under WinBUGS and OpenBUGS produced essentially equivalent inferences.
The provided code snippet is specific to the lognormal distribution. Nevertheless, the technique relies only on the existence of a closed-form solution to the expected value of a truncated random variable. Additional distributions that satisfy this condition can be found in Johnson et al. (1994) and Klugman et al. (2004). Examples include the logistic, normal, and gamma distributions. The adaptation of this numerical integration technique to these distributions may facilitate the use of the WinBUGS environment in other applications.