Beyond the Gaussian Model in Diﬀusion-Weighted Imaging: The Package dti

Diﬀusion weighted imaging (DWI) is a magnetic resonance (MR) based method to investigate water diﬀusion in tissue like the human brain. Inference focuses on integral properties of the tissue microstructure. The acquired data are usually modeled using the diﬀusion tensor model, a three-dimensional Gaussian model for the diﬀusion process. Since the homogeneity assumption behind this model is not valid in large portion of the brain voxel more sophisticated approaches have been developed. This paper describes the R package dti . The package oﬀers capabilities for the analysis of diﬀusion weighted MR experiments. Here, we focus on recent extensions of the package, for example models for high angular resolution diﬀusion weighted imaging (HARDI) data, including Q-ball imaging and tensor mixture models, and ﬁber tracking. We provide a detailed description of the package structure and functionality. Examples are used to guide the reader through a typical analysis using the package. Data sets and R scripts used are available as electronic supplements.


Introduction
The basic principles of diffusion weighted imaging (DWI) have been introduced in the 1980's (Le Bihan and Breton 1985;Merboldt et al. 1985;Taylor and Bushell 1985). Since then it has evolved into a beneficial technique for in-vivo investigation of tissue micro-structure in the human brain (Le Bihan et al. 2001), spinal cord (Clark et al. 1999) and muscle tissue (Sinha et al. 2006) as it probes microscopic structures well beyond typical image resolutions through water molecule displacement. Diffusion weighted data is usually measured using the pulsed gradient spin echo sequence (PGSE, Stejskal and Tanner 1965) with a number of specified magnetic field gradients in different directions q. Due to diffusion in tissue this signal S( q) is attenuated compared to the signal S 0 acquired without diffusion gradient application. For a recent discussion of the physics of image acquisition, relations to white matter anatomy, clinical use and problems in neuroscience that can be addressed by DWI we refer to Johansen-Berg and Behrens (2009). A good introduction into the basic principles of magnetic resonance imaging is given in Callaghan (1991). Requirements on DTI acquisition schemes are discussed e.g., in Smith et al. (2007).
In this paper we discuss the modeling and analysis of single subject DWI data after image reconstruction using the package dti (Tabelow and Polzehl 2011) for the R environment for statistical computing (R Development Core Team 2011). An earlier version (0.6-0) of the package that was restricted to diffusion tensor imaging (DTI) and adaptive smoothing for DTI (Tabelow et al. 2008) has been described in Polzehl and Tabelow (2009).
The focus of this paper is on models for estimating the orientation distribution function (ODF), see Tuch et al. (2002); Wedeen et al. (2005); Aganj et al. (2010) and Barnett (2009), that overcome the limitations of the tensor model. Such models are of special interest for high angular resolution diffusion weighted imaging (HARDI) data. Section 2.2 briefly reviews the diffusion tensor model and discusses its limitations. We then introduce both Q-ball imaging (Tuch 2004;Anderson 2005;Hess et al. 2006;Descoteaux et al. 2007) and tensor mixture models (Tuch 2002;Tournier et al. 2004;Alexander 2005;Özarslan et al. 2006;Alexander 2006;Behrens et al. 2007;Leow et al. 2009;Tabelow et al. 2011b). Section 4 introduces the design of the package and uses several examples to illustrate a typical data analysis. Data sets and R scripts are provided in electronic form.
2. Modeling diffusion weighted data 2.1. The diffusion weighted signal DWI effectively measures the diffusion of water molecules in a direction determined by external magnetic field gradients. Let P ( r, r , τ ) denote the probability density for a particle (spin) to diffuse from position r to r in time τ . With DWI an aggregate value of P over a volume (voxel) V can be determined, where p( r ) is the initial probability density of the location of the particles. A voxel is typically of a size of about 1 − 8 mm 3 . The relation to the normalized diffusion weighted signal E( q, τ ) = S( q, τ )/S 0 , is given by the three-dimensional Fourier transform F and its inverse: E( q, τ ) e −2πi q R d q = F(E( q, τ )), E( q, τ ) = R 3 P ( R, τ ) e 2πi q R d R = F −1 (P ( q, τ )).
(2) Figure 1 provides a snapshot of a three dimensional data visualization generated by Figure 1: 3D Data visualization for a row of four voxel in the artificial tensor data using the package rgl (Adler and Murdoch 2010). For each voxel the values E( q, τ ) are coded as distances to voxel centers. Diffusion gradients are characterized by their spherical coordinate directions and additional color coding using red for left-right, green for anterior-posterior and blue for inferior-superior directions.
Within a DWI experiment n grad diffusion weighted images S( q i , τ ) and at least one image S 0 without application of a diffusion gradient are acquired (Smith et al. 2007).

The diffusion tensor model
Assuming homogeneity of tissue within a voxel the diffusion tensor model is derived for the special case of Gaussian diffusion 1 , that is fully characterized by a three dimensional tensor D, Equation 2 can then be expressed as with the b value depending on q and the effective diffusion time τ (Basser et al. 1994a). Within this model diffusion is completely characterized by the positive definite symmetric 3×3 matrix D, the diffusion tensor. For simplicity of notation we henceforth omit the diffusion time τ .
The components of the diffusion tensor clearly depend on the orientation of the object in the scanner frame. Rotationally invariant quantities derived from the diffusion tensor circumvent 1 Free diffusion is usually isotropic except in some liquid crystals. However, the Gaussian model of anisotropic diffusion can be considered as the low spatial frequency approximation to the diffusion propagator in case of restricted diffusion  this dependence and are usually used for further analysis, mainly based on the eigenvalues λ i (i = 1, 2, 3) of D with λ i > 0 for positive definite tensors (Basser and Jones 2002). The eigenvector e 1 corresponding to the principal eigenvalue λ 1 determines the main diffusion direction used for fiber tracking.
The simplest quantity based on the eigenvalues is the trace of the diffusion tensor Tr(D) = 3 i=1 λ i which is related to the mean diffusivity λ = Tr(D)/3. The anisotropy of the diffusion can be described using higher moments of the eigenvalues λ i . The widely used fractional anisotropy (FA) is defined as with 0 ≤ FA ≤ 1, where FA = 0 indicates equal eigenvalues and hence isotropic diffusion.
The resulting FA-maps together with a color-coding scheme are used for visualization and interpretation. Clinical use focuses on diagnostics of neuronal disease, see e.g., Kellinghaus et al. (2006); Kleffner et al. (2008); Deppe et al. (2008); Duning et al. (2010). The principal eigenvector e 1 = (e 1x , e 1y , e 1z ) is used for assigning each voxel a specific color, interpreting e 1x , e 1y , and e 1z weighted with the value FA as red (left-right), green (anterior-posterior), and blue (inferior-superior) contribution Note, that in contrast to the eigenvalues and the FA the principal eigenvector, and hence the color coding, depends on the orientation of the object in the scanner frame.
The main problem of the diffusion tensor model arises from the assumption of tissue homogeneity within a voxel. DWI aims to detect structures with an extension of up to 15 cm and a diameter of about 30 µm. A high percentage of voxels contain structures with different orientations. In such a situation partial volume effects may lead to invalid or non-informative tensor estimates.

The orientation distribution function
The drawbacks of the diffusion tensor model may be avoided by a more detailed analysis of Equation 1. The main concern in DWI is to access properties of the tissue like neuronal fiber bundles. A special interest is in the orientation properties of the probability distribution P (its projection onto the unit sphere S 2 ) with R = r u and u ∈ S 2 and neglecting the dependence on τ . Equation 4 is the orientation distribution function (ODF) as proposed in Wedeen et al. (2005) and used in Aganj et al. (2010). For a detailed discussion on how this differs from the original definition of Tuch et al. (2002) see Barnett (2009). This definition is intrinsically normalized and defines a probability distribution on the sphere S 2 .
In case of elliptically symmetric distributions P ( R, τ ) = C τ (det D) −1/2 h τ ( R D −1 R) with zero mean, a three dimensional tensor D as shape parameter, some function h τ and a normalization constant C τ the ODF is given by , (see Tabelow et al. 2011b). This is known as the angular central Gaussian distribution on the sphere (Mardia and Jupp 2000). One example for an elliptically symmetric distribution P is the anisotropic Gaussian diffusion assumed for the tensor model.
Using spherical coordinates q = q u = (q, θ, φ) and inserting (2) into (4) the orientation distribution function may be expressed as a function of the signal E( q): (Aganj et al. 2010).

Q-ball imaging
Q-ball imaging (QBI, Tuch 2004) aims at the reconstruction of the ODF from signals E( q 0 ) measured on one q-shell, i.e., for one fixed b value corresponding to the value q 0 . This is in contrast to alternative methods like diffusion spectrum imaging (DSI, Wedeen et al. 2005) or hybrid diffusion imaging (HYDI, Wu and Alexander 2007) which measure diffusion weighted images on a Cartesian grid of diffusion gradients or sample gradients from spheres with different b values, respectively. Both methods effectively require significantly more diffusion weighted images.
QBI requires an extrapolation of the signal E( q) to other locations in q-space in order to evaluate the integral over q. Here, we use the formulation of QBI as in Aganj et al. (2010) which differs from the original proposal by Tuch (2004) by the r 2 weighting factor in the definition of the ODF. The extrapolation of E( q 0 ) is performed by assuming a mono-exponential decay of the signal in q E(q u) = E(q 0 u) (q/q 0 ) 2 .
Under this assumption the ODF may be expressed as denotes the Funk-Radon transformation (Aganj et al. 2010).
Efficient implementations of QBI make use of the spherical harmonic basis functions Y m k ( u), particularly in the form of the a symmetric modified basis. For selected maximum order l, Figure 2: True ODF and expected reconstruction by spherical harmonics expansion of order 2, 4, 6 and 8 (from left to right) for a voxel with tree mixture components. even order k ≤ l, and −k ≤ m ≤ k we define an index j = k(k + 1)/2 + m + 1 and a real function where k j is the order associated with j. The use of the expansion leads to an approximation of the ODF by where P k j are the Legendre polynomials of order k j (Aganj et al. 2010). Estimation of the ODF is now a linear problem where estimates of the parameters c j , j = 1, . . . , N are obtained from the observed noisy signals E( u i ) = S( q i )/S 0 , i = 1, . . . , n grad by solving the regularized least squares problem (Descoteaux et al. 2007;Aganj et al. 2010) Regularization is used to reduce the variability of the estimated ODF, with the parameter λ reg effectively determining a balance between variance and bias of the estimated ODF. Figure 2 illustrates effects of ODF reconstruction by spherical harmonic expansions for a voxel where the true ODF is a mixture of three angular central Gaussian distributions. Reconstruction was based on 136 gradient directions in a noise-free situation. Figure 2 is generated using the following script.

Tensor mixture models
Although the estimation of the ODF by spherical harmonics expansion of the observed signal is computationally appealing it also has several drawbacks. The form of the ODF reconstruction depends on the order l of the approximation and the regularization parameter λ reg . The approximation by orthogonal basis functions aims for a minimal mean squared error of the density estimates and may lead to a bias in the location of the maxima of the estimated ODF.
Finally, in the case of QBI, the result depends on the assumption of a mono-exponential decay of the signal E with q. One may therefore consider alternative models for DWI.
Let us assume a voxel to contain M compartments covering a fraction w m of the voxel such that M m=1 w m = 1 and anisotropic Gaussian diffusion characterized by the tensor D m in compartment m, m = 1, . . . , M . This leads to a signal that does not exhibit a mono-exponential decay but instead satisfies The corresponding ODF is a mixture of the ODF's for the compartments This additivity of the ODF is not preserved when using Equations 5 and 6 due to the nonlinearity of the operators ln(− ln(·)).
The mixture model of M diffusion tensors D m in its general form (7) is too flexible, leading to severe identifiability and numerical problems (Tuch 2002;Johansen-Berg and Behrens 2009). Suggestions to resolve the problem include the restriction of the number of components to M = 2 (Alexander 2005(Alexander , 2006Özarslan et al. 2006), the approximation of the solution by estimating the ODF spherical harmonics expansion and subsequent deconvolution (Tournier et al. 2004) or the use of restrictions for the tensor eigenvalues (Leow et al. 2009).
In Tabelow et al. (2011b) we show that a reduction of the complexity of the model 7 accompanied by model selection using BIC (Schwarz 1978) is practical. We assume the diffusion tensors D m to have a spectral decomposition Basser et al. 1994b) and to differ only in their main direction d m .
After re-parameterization the reduced model has the form with , cos(φ m )) the model has 2M + 1 nonlinear parameters and M linear parameters with constraints. Estimation in model 8 still requires to solve a non-convex optimization problem and the use of suitable initial estimates. For a detailed description of parameter estimation and a strategy to select the number of mixture components M we refer to Tabelow et al. (2011b). The tensor mixture model is characterized by the number of compartments M opt estimated by BIC, the fractional anisotropy of a prolate tensor model , and, assuming w 1 ≥ · · · ≥ w Mopt , the effective order The definition of FA is a straightforward generalization from the DTI model. The value of EO ranges from 1 for w 1 = 1 to M opt in case of equal mixture coefficients w m ≡ 1/M opt .
In general estimation in model 8 is computationally expensive and may require multiple starting points in optimization to provide suitable results.

Fiber tracking
The estimated diffusion tensors and ODF's may be used to infer the underlying neuronal fiber structure. This is performed using fiber tracking algorithms on vector fields or tensor orientation functions extracted from the estimated objects. There exists a large variety of fiber tracking algorithms that can be roughly classified into deterministic and probabilistic approaches and are based either on local or global criteria. For an overview of existing approaches, see e.g., Mori and van Zijl (2002)

Data sets
With this document we provide four data sets that have been used within the text and may be used to explore the capabilities of the R package dti. The data are contained in the electronic appendix of this paper in form of directories containing NIfTI files. The data may be freely used under the terms of the GPL ≥ 2 license.
MRI images were obtained from a healthy male volunteer in the age group 40 -45 within an Institutional Review Board approved research protocol at Weill Cornell Medical College. Images were acquired on a 3.0 Tesla General Electric Excite MRI scanner using an 8-channel receive-only head coil. First, a localizer scan was obtained to prescribe the position of the subsequent DWI scan. For the DWI scan, a single-shot spin-echo echo planar imaging (EPI) sequence with 10 images without diffusion weighting and 140 diffusion gradient directions, which were approximately isotropically distributed over the sphere, was used, with an echo and repetition time of TE = 73.2 ms and TR = 14000 ms, respectively. 66 axial slices were scanned with no gap and an acquisition matrix size of 128 × 128. Images were zero-filled to an image matrix size of 256 × 256, yielding an effective resolution of 0.898 × 0.898 × 1.800 mm 3 . The b value in the diffusion weighted images was 1000 s / mm 2 , the parallel imaging acceleration factor was 2, and the total scan time for this scan was 36 min.
Experimental data 1: This data set contains a subset of the DWI data described above. The data covers parts of the genu of the corpus callosum (GCC), the anterior thalamic radiation (ATR) and the superior fronto-occipital-fasciculus (SFO).
Data directory: Artificial tensor data: An artificial data set created by demo("dti_art") with default settings. For this data set the tensor model is adequate. Estimated fiber tracks: This data set contains fiber tracks identified from whole brain data recorded within an diffusion weighted experiment using a tensor mixture model of maximal order 3. (The experimental data sets 1 and 2 contain part of data from this experiment.) The data set contains fibers extending over at least 100 voxels. Access data using load("tracks3_100.rsc").

Analyzing DWI data: The R package dti
The R package dti has been designed to perform an analysis of single subject diffusion weighted imaging data using S4 classes (Chambers 2008). The concept of S4 classes enables the use of principles from object oriented programming like inheritance, methods and polymorphism in R. Consistency of the work flow for analyzing DWI data ( Figure 4)
(7) (7) reduceFibers Remove redundant fiber tracks.  defined data structures and methods. Table 1 provides an overview of currently implemented classes. Table 2 lists the available functions and the classes of objects they generate. Table 3 gives a list of available methods and the respective classes of objects they act on and create. An extract method is provided to access specific information from the S4-objects, see Example 4.3. The information to extract is specified by an argument what. For detailed information we refer to the documentation of the package. The use of the package in DTI and our approach to adaptive smoothing within this context is described in detail in Polzehl and Tabelow (2009);Tabelow et al. (2008).
Information on the package and its classes and methods is obtained within an R session by R> help(dti) R> class?class-name R> methods?method-name The package includes two data sets: "polyeder" contains a description of regular polyhedra that are refinements of the icosahedron and are used for visualization and "optgradients" contains sets of optimal gradient directions.
Currently there are two comprehensive demos, demo("dti_art") for modeling within the context of the diffusion tensor model and demo("mixtens_art") illustrating the work flow for analyzing HARDI data. In both demos a variety of configurations may be specified both concerning underlying true fiber structures as well as the number of gradients and signal-tonoise ratio (SNR).

Example: Tensor estimates
To illustrate the capabilities of the package in DTI in an adequate situation we use the artificial tensor data. The true structure has, in the anisotropic parts, main eigenvectors of the diffusion tensors along circular bands, and in the center, in vertical direction. Voxel-wise tensor estimates are obtained without and with structural adaptive smoothing (Tabelow et al. 2008;Polzehl and Tabelow 2009) using a maximal bandwidth of 4. Tensor characteristics are computed and fiber tracking is performed.

Example: The effect of regularization in Q-ball imaging
The next example illustrates the effect of regularization in ODF reconstruction with spherical harmonics using the experimental data set 1. The data are modeled using a tensor mixture model of maximum order 4 and the ODF-estimate (5, 6) applying the regularization proposed in Aganj et al. (2010) when estimating the coefficients in (5). The regularization parameter is chosen as λ reg = {2.5 · 10 −3 , 1 · 10 −2 , 4 · 10 −2 }, respectively.
ODF reconstruction using spherical harmonics is, as a solution of a linear problem, an appealing method. This comes at the cost of a biased estimate due a violation of its mono-exponential decay assumption. Adequate modeling may be achieved using multi-shell acquisition schemes that lead to a multi-exponential decay assumption (Descoteaux et al. 2010;Aganj et al. 2010). The method requires one to choose appropriate parameters l and λ reg .

Example: Tensor mixture models
Here, we illustrate capabilities of the tensor mixture model in comparison to the single tensor model. We first analyze the experimental data set 2 with both models.
Parameter estimation in the tensor mixture model 8 requires one to solve a non-convex optimization problem. In such problems results depend on the appropriate choice of initial values for the parameters. The method dwiMtImprove allows for possible improvements using results from neighboring voxel for improved initial estimates. The method dwiMtCombine is used for a voxel-wise comparison of results from two dwiMixtensor-objects selecting the best reconstruction with respect to the specified model selection criterion (default: "BIC"). In the example these methods are effectively used to select, in each voxel, a best estimate using different initial parameters.
The method extract is now used to access components from the objects containing the tensor and tensor mixture estimates.
provides characteristics of results obtained in the analysis. The left image shows the FA for the second axial slice obtained using the tensor model. The other four images provide FA, estimated number of mixture components, effective order and maximal eigenvalues (from left to right) for the same slice obtained employing a tensor mixture model with specified maximal number of compartments of 4 (see Tabelow et al. 2011b, for definitions and details). We observe an increase of FA in comparison with the tensor model, especially in regions adjacent to tissue borders. Note the spatial homogeneity observed in all characteristics for the tensor mixture model.

Example: Fiber tracking
The object tracks3.100 provided in file tracks3_100.rsc has been obtained analyzing high resolution DWI data recorded by H.-U. Voss at Weill Medical College, Cornell University NY. 140 gradient directions were used and images of 256 × 256 × 66 voxel with spatial resolution of approximately 0.9 mm ×0.9 mm ×1.8 mm where recorded. Fiber tracking is performed by the method tracking. The R code R> load("tracks3_100.rsc") R> summary(tracks3.100) provides a complete history of the object tracks3.100 including the applied functions and methods for the tracking and the basic characteristics of the object:

Conclusions
With the package dti we provide a toolbox for the analysis of diffusion weighted MR data within the R language and environment for statistical computing (R Development Core Team 2011). The package includes functions for reading image data in DICOM and NIfTI format. The data can be analyzed using diffusion tensor models, tensor mixture models and ODF The upper slices of the brain were not scanned in favor of spinal cord slices. models using spherical harmonics expansions. Fiber tracking may be performed using a deterministic streamline algorithm. The package has extensive 2D (based on the adimpro package, Polzehl and Tabelow 2007) and 3D (based on the rgl package, Adler and Murdoch 2010) visualization capabilities that can be used to produce publication ready illustrations. A special feature of the package are functions for model-based adaptive smoothing of DWI data. Further improvements will include a model free adaptive smoothing method and an OpenMP based parallel implementation. Alternatives for reading and writing image data are provided by oro.dicom (Whitcher 2011), oro.nifti , tractor.base (Clayden 2010;Clayden et al. 2011) and Rniftilib (Granert 2010). Interface functions to use capabilities of these packages are planned. We refer to Tabelow et al. (2011a) and the Medical Imaging task view (Whitcher 2010) for a comprehensive overview of related activities in R.
This paper discusses modeling for diffusion weighted MR experiments, describes the package structure and provides guidance for a typical analysis using several instructive examples. We encourage the reader to use the example R scripts and data provided as an electronic appendix to follow the analysis and to get an impression of the 3D visualization capabilities. The R scripts may also serve as a template for analyzing your own data.

Appendices
The electronic appendix contains the data sets used in directories data1, data2, data3, data4a and data4b as well as scripts containing the R code used to produce the figures ant to perform the analysis in the examples.