Spatio-Temporal Multiway Data Decomposition Using Principal Tensor Analysis on k-Modes: The R Package PTAk

The purpose of this paper is to describe the R package PTAk and how the spatio-temporal context can be taken into account in the analyses. Essentially PTAk() is a multiway multidimensional method to decompose a multi-entry data array, seen mathematically as a tensor of any order. This PTAk-modes method proposes a way of generalizing the SVD (singular value decomposition), as do some other well known methods included in the R package, such as PARAFAC or CANDECOMP and the PCAn-modes or Tucker-n model. The example datasets cover different domains with various spatio-temporal characteristics and issues: (i) medical imaging in neuropsychology with a functional MRI (magnetic resonance imaging) study, (ii) pharmaceutical research with a pharmacodynamic study with EEG (electro-encephalographic) data for a central nervous system (CNS) drug, and (iii) geographical information system (GIS) with a climatic dataset that characterizes arid and semi-arid variations. All the methods implemented in the R package PTAk also support non-identity metrics, as well as penalizations during the optimization process. As a result of these flexibilities, together with pre-processing facilities, PTAk constitutes a framework for devising extensions of multidimensional methods such as correspondence analysis, discriminant analysis, and multidimensional scaling, also enabling spatio-temporal constraints.


Introduction
Multiway data are common in different scientific and non-scientific domains where modelling interactions is crucial for better understanding of the studied phenomena. By "multiway" it is understood that observations are described by a series of characteristics dependent within the

Basics of multiway data analysis
Spatio-temporal measurements naturally give rise to multiway data. For example, the Tunisian climatic data analyzed in later sections consist of 10 climatic indicators measured over a spatial domain, a pixel grid of size 2599, and summarized by their 12 monthly averages over 50 years.
In R this can be stored in a multiple-entry table, an "array" object, here of dimension 2599 × 12 × 10, where the first entry refers to space, the second to month, and the third to indicator. Multiway data occur in other contexts as well, usually when repeating the same measurements on some statistical units, spatially, at different times, and/or under different conditions. For the CNS drug data one obtains an array with 5 entries: subject, drug dose, time, electrode, and EEG spectral band; the interest of this pharmaco-dynamic study lies in identifying differences between doses with a spatial zone of activation for a specific EEG band pattern. Multiway data, stored in an "array" object, can be collapsed to a "matrix" object, allowing the use of multivariate methods, inferential or descriptive such as multidimensional analysis, or even into a single "vector" to use univariate models such as ANOVA, taking into account the complex design as covariates in the inferential procedure.

Models for multiway interactions
Multiway data analysis acknowledges the multiple interactions within the data. ANOVA (analysis of variance), which deals with decomposition and interactions, is not a multidimensional method; pursuing this kind of approach, FANOVA methods (F for factorial) nonetheless added a factorial decomposition for two modes (Gollob 1968).
In mathematical algebra, an array can be seen as a multilinear form or tensor (Lang 1984). The properties of tensor algebra make it possible to derive a multiple-entry table calculus, thereby extending matrix calculus (Franc 1992; Leibovici 1993; Dauxois et al. 1994; Leibovici and Sabatier 1998). A multiway multidimensional method, or multiway method for short, deals directly with the multilinear aspects of multiple-entry array data by proposing a tensorial decomposition, i.e., a multilinear decomposition of the form:

A(x, y) = Σ_{u=1}^{r} P_u(x, y) + ε(x, y)    (1)

for a matrix A (a tensor of order 2) acting as a bilinear form on vectors x, y of appropriate dimensions, which extends as:

B(x, y, z) = Σ_{v=1}^{r} T_v(x, y, z) + ε(x, y, z)    (2)

for the three-way array B (a tensor of order 3) acting as a trilinear form. The structures of the P_u and T_v express the model by being some "simple elements" of the tensor space, usually rank-one tensors (see further). The number of components r is part of the model in terms of approximation level, and ε is the residual.
According to different model optimizations and constraints, one gets different forms of decomposition and properties for the elementary tensors (the P_u or the T_v). For example in the PCA optimization, the components take the form P_u = σ_u ψ_u t(φ_u), where ψ_u is a principal component normed to 1 and φ_u is the corresponding factor loadings or factor component, also normed to 1 (notation: t(v) is the "row" vector, transpose of a "column" vector v). The value σ_u corresponds to the square root of the variance associated with this principal component, and we have:

P_u(x, y) = σ_u (t(x) ψ_u)(t(φ_u) y),    (3)

which is equal to σ_u if x = ψ_u and y = φ_u. Equation 3 enables the study of the properties of Â = Σ_{u=1}^{r} σ_u ψ_u t(φ_u) as an approximation of A. For trilinear or higher order forms, the notation used in Equation 3 becomes:

T_v(x, y, z) = σ_v (ψ_v ⊗ φ_v ⊗ θ_v) .. (x ⊗ y ⊗ z),    (4)

where ⊗ and .. are respectively called tensor product and contraction. The tensor product is also known as the outer product, and the contraction generalizes the operation performed when transforming a vector by a matrix.
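As a small numerical illustration of the order-2 decomposition above, using base R's svd() rather than the package (the variable names here are illustrative, not from the paper):

```r
# Sketch: the order-2 model A = sum_u sigma_u psi_u t(phi_u), via base R's svd().
set.seed(1)
A <- matrix(rnorm(20), 5, 4)
s <- svd(A)
# first principal tensor: sigma_1 (psi_1 tensor phi_1), a rank-one matrix
P1 <- s$d[1] * s$u[, 1] %o% s$v[, 1]
# summing all the principal tensors recovers A (up to numerical error)
Ahat <- Reduce(`+`, lapply(seq_along(s$d),
                           function(u) s$d[u] * s$u[, u] %o% s$v[, u]))
all.equal(A, Ahat)
# TRUE
```

Keeping only the first few terms of the sum gives the rank-r approximations discussed in the text.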
The models PARAFAC/CANDECOMP (refer to Carroll and Chang 1970; Harshman 1970), PARAFAC-orthogonal and PTAk-modes (Leibovici 1993; Leibovici and El Maâche 1997) follow this generic presentation, as does PCAn-modes (or Tucker-n model) (Kroonenberg and De Leeuw 1980; Kroonenberg 1983; Kapteyn et al. 1986), but the latter is usually presented in a condensed way using tensor products of matrices, see further, Equation 10. The estimation procedure is usually an alternating least squares (ALS) optimization, i.e., after initialization, one set of components is optimized at a time, the others being fixed at their previously optimized values.
Setting no particular constraints between vector components within each mode or entry of the table, PARAFAC/CANDECOMP, where the number of components for each mode has to be equal, performs the optimization by alternating multivariate regression techniques. Generic PCAn-modes does not impose equality of the number of components for each mode but, imposing orthogonality within each mode, performs the optimization by alternating eigen-decompositions of a particular symmetric matrix (Leibovici 1993). PARAFAC-orthogonal can be seen as a PARAFAC/CANDECOMP where orthogonality between the components of each mode is imposed, or as a PCAn-modes where the core tensor C (see Equation 10), expressing cross-links between components, is imposed to be hyper-diagonal (only the C_iii are non-zero). PARAFAC-orthogonal can be obtained using a PTAk-modes, by keeping only "main" principal tensors (see further). PTAk-modes also proceeds using the ALS (alternating least squares) technique, but step by step instead of optimizing the full set of components at each optimization. The algorithm involved in the PTAk-modes model is explained in more detail in Section 3, in Equations 8, 9 and 11; the expressions of the CANDECOMP/PARAFAC model and the PCAn-modes model are also explicit in Equations 9 and 10.

[1] TRUE
Notice the above matrix is of rank one. The tensor product of any number of vectors gives what is called a rank-one tensor; in fact any matrix resulting from collapsing such an array will always be of rank one.
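A minimal reconstruction of this kind of check (the vectors chosen here are arbitrary):

```r
# The tensor (outer) product of two vectors always gives a rank-one matrix.
z <- c(1, 2, 3) %o% c(4, 5)
qr(z)$rank == 1
# [1] TRUE
```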
When storing a dataset into an "array" object it is also essential to know that the left index runs fastest: try array(1:24, c(2, 3, 4)). Performing a contraction of a tensor of dimension (30, 10, 4, 2) by a vector of dimension 4 can be done by collapsing the tensor into a matrix of dimension (600, 4), performing the multiplication of this matrix by the vector, and then reforming the array of dimension (30, 10, 2). This kind of operation is facilitated by the packages tensor (Rougier 2002) and tensorA (van den Boogaart 2007).
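The steps just described can be sketched in base R (a toy illustration; the packages cited do this directly and more generally):

```r
# Contract a (30, 10, 4, 2) tensor by a length-4 vector on its third mode:
# permute the contracted mode last, collapse to a 600 x 4 matrix, reform.
X  <- array(rnorm(30 * 10 * 4 * 2), c(30, 10, 4, 2))
v  <- rnorm(4)
Xp <- aperm(X, c(1, 2, 4, 3))          # put mode 3 last
M  <- matrix(Xp, nrow = 30 * 10 * 2)   # collapse: a 600 x 4 matrix
Y  <- array(M %*% v, c(30, 10, 2))     # reform the contracted array
dim(Y)
# [1] 30 10  2
```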
The outer product concatenates dimensions and multiplies the left matrix by each element of the right matrix; the Kronecker product multiplies dimensions and multiplies the right matrix by each element of the left matrix:

R> A <- matrix(1:8, 4, 2)
R> B <- matrix(c(1, 2, 0, 1), 2, 2)
R> dim(B %o% A)
[1] 2 2 4 2
R> dim(A %x% A)
[1] 16 4

Note that the essential tool for data analysis is the array() function to store the dataset, together with its related methods such as aperm() to permute the dimensions of the array. The operators briefly described above will be used within the methods to decompose the multi-entry table.

Extension of PCA as proposed by PTAk-modes
The PTAk model approach is similar to a step by step PCA, but for tensors. In order to describe the generalization proposed with the PTAk-modes model, let us first rewrite the PCA method within a tensorial framework.

PCA of tensor of order 2
For a given matrix X of dimension n × p, the first principal component is a linear combination (given by a p-dimensional vector ϕ_1) of the p columns ensuring a maximum sum of squares of the coordinates of the n-dimensional vector obtained. The square root of this sum of squares is called the first singular value σ_1. One has: t(Xϕ_1)(Xϕ_1) = σ_1², and Xϕ_1/σ_1 is the principal component normed to 1. This maximization problem can be written either in matrix or tensor form:

σ_1 = max_{‖ψ‖=1, ‖ϕ‖=1} t(ψ) X ϕ = max_{‖ψ‖=1, ‖ϕ‖=1} X .. (ψ ⊗ ϕ).    (5)

In Equation 5, X represents either the data matrix or the data tensor of the same data table.
Another easy way of understanding computationally the algebraic operators ".." and "⊗" is to see them as the following operations: ψ_1 ⊗ ϕ_1 is an np vector made of n blocks of the p-vectors ψ_1i ϕ_1, i = 1, ..., n (this is the computational description using the Kronecker product); "..", called a contraction, generalizes the multiplication of a matrix by a vector and, in the case of equal dimensions (as above), corresponds to the natural inner product (X is then also seen as an np vector). ψ_1 is termed the first principal component, ϕ_1 the first principal axis, and (ψ_1 ⊗ ϕ_1) is called the first principal tensor.
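This computational reading can be checked directly in base R (note that as.vector() on a matrix stacks columns, so the row-wise flattening as.vector(t(X)) is what matches the n blocks of p values):

```r
# The contraction X .. (psi tensor phi) computed two ways:
# as an inner product of np-vectors, and in matrix form t(psi) X phi.
set.seed(2)
n <- 3; p <- 2
X   <- matrix(rnorm(n * p), n, p)
psi <- rnorm(n)
phi <- rnorm(p)
lhs <- sum(kronecker(psi, phi) * as.vector(t(X)))  # n blocks of psi_i * phi
rhs <- drop(t(psi) %*% X %*% phi)                  # matrix form
all.equal(lhs, rhs)
# TRUE
```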
Notice that a tensor of order 2, a bilinear map, described here as associated with a matrix, is usually associated with a single linear map. The duality diagram (Cailliez and Pagès 1976; Escoufier 1987; Dray and Dufour 2007) completes the association with another linear map, defined on the dual spaces involved and expressed by the transposed matrix.
The contraction, "..", is implemented within the function CONTRACTION() and it uses the R package tensor (Rougier 2002).

PTAk-modes of a tensor of order k > 2
Now if X is a tensor of higher order, say 3 here, we can look for the first principal tensor associated with the first singular value with the optimization form:

σ_1 = max_{‖ψ‖=1, ‖ϕ‖=1, ‖φ‖=1} X .. (ψ ⊗ ϕ ⊗ φ).    (6)

This is a direct extension of Equation 5, as expressed by the practical schemas in Figure 1, with contractions made either on a matrix table or on a tensor of order 3. The further extension to k > 3 is straightforward. CONTRACTION.list() is convenient relative to Equations 5 and 6 as it performs the contraction without computing the tensor product of the vectors in the first place, as algebraically:

X .. (ψ ⊗ ϕ ⊗ φ) = ((X .. ψ) .. ϕ) .. φ.    (7)

The function SINGVA() computes the best rank-one approximation of the given tensor X together with its singular value, given by Equation 6 (and a similar equation for higher orders). The algorithm therein, called RPVSCC in Leibovici (1993), is inspired by the algorithm of "reciprocal averaging" (Hill 1973), also known as the "transition formulae" in modern correspondence analysis and, in the signal processing community, as the "power method".
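A toy sketch of such a power-method iteration for an order-3 tensor follows; it is only an illustration of the principle, not the package's SINGVA() (whose RPVSCC implementation is more general), and all names here are made up:

```r
# Contract an array by a vector v along mode m (collapse-and-multiply).
contract_mode <- function(X, v, m) {
  d <- dim(X)
  perm <- c(setdiff(seq_along(d), m), m)          # move mode m last
  M <- matrix(aperm(X, perm), prod(d[-m]), d[m])
  array(M %*% v, d[-m])                           # contracted tensor
}
# Alternating "power method" for a best rank-one approximation, order 3.
rank_one <- function(X, iter = 50) {
  d <- dim(X)
  u <- lapply(d, function(n) rep(1, n) / sqrt(n)) # normed starting vectors
  for (it in seq_len(iter)) {
    for (m in 1:3) {                              # alternate over the modes
      o <- setdiff(1:3, m)
      w <- contract_mode(contract_mode(X, u[[o[2]]], o[2]), u[[o[1]]], o[1])
      u[[m]] <- w / sqrt(sum(w^2))
    }
  }
  list(sigma = sum((u[[1]] %o% u[[2]] %o% u[[3]]) * X), u = u)
}
a <- c(1, 2, 2) / 3; b <- c(3, 4) / 5; d3 <- c(0.6, 0.8)
res <- rank_one(5 * (a %o% b %o% d3))   # an exactly rank-one tensor
res$sigma                               # recovers sigma = 5 (numerically)
```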
Adding an orthogonality constraint to Equation 6 allows us to carry on the algorithm to find the second principal tensor and so on. The optimization becomes:

σ_2 = max_{‖ψ‖=‖ϕ‖=‖φ‖=1, ψ⊥ψ_1, ϕ⊥ϕ_1, φ⊥φ_1} X .. (ψ ⊗ ϕ ⊗ φ),    (8)

which is equivalent to working on P_{ψ_1^⊥ ⊗ ϕ_1^⊥ ⊗ φ_1^⊥} X: the orthogonal projection of X onto the tensorial orthogonal of the first principal tensor, i.e., ψ_1^⊥ ⊗ ϕ_1^⊥ ⊗ φ_1^⊥; this projector can also be written as (Id − P_ψ1) ⊗ (Id − P_ϕ1) ⊗ (Id − P_φ1). Following this algorithmic schema, given in Equation 8, the PTAk-modes decomposition offers a way of synthesizing the data according to uncorrelated sets of components. Within this schema, implemented for the functions PTA3() and PTAk(), one can distinguish main principal tensors from associated principal tensors. The latter are associated with main principal tensors as they share one or more components of the main principal tensor in their sets of components. The associated principal tensors are obtained by a PTA(k − 1)-modes decomposition once the k-modes data array has been "contracted" by the given component.

Figure 2: Output summary from the function summary() on a "PTAk" object, here the climatic data described in Section 4.3: (a) is the first principal tensor, (c) represents all the principal tensors associated with the first one, of which (b) are the spatial-mode associated principal tensors, (d) corresponds to a PTAk-modes decomposition of the initial data tensor projected onto the tensorial orthogonal of the first principal tensor.
This makes the algorithm recursive, following the procedure given in Equation 9 (written here for k = 3). The notation ⊗_i means that the vector on the left hand side takes the ith place, among the k places here, in each full tensorial product, e.g., ϕ_1 ⊗_2 (α ⊗ β) = α ⊗ ϕ_1 ⊗ β. More details on the properties of the method and on each function of the R package are given in Leibovici and Sabatier (1998) and Leibovici (2009). Figure 2 illustrates the multi-hierarchical decomposition obtained with the PTAk-modes model. In Figure 2, in almost the same way as for PCA, one gets a hierarchy of principal tensors corresponding to a hierarchy of sums of squares, i.e., the squares of the singular values (σ) under the column -Sing Val associated with each principal tensor. It is a multilevel hierarchy in agreement with Equation 8. The percentages of variability associated with each principal tensor can be used to retain the main variability within the data tensor X. These percentages are in the -Global Pct column of Figure 2, whereas -local Pct is relative to the sum of squares given in column -ssX, linked to the current tensorial optimization as defined in Equation 8. Plots of the vector components of a particular principal tensor allow the description of the extracted variability for each principal tensor.
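The orthogonal tensor projection of Equation 8 can be sketched in base R by deflating each mode in turn (a toy illustration with made-up names, not the package's PROJOT()):

```r
# Apply (Id - P_u) on mode m of an order-3 array: mode-m matricization,
# premultiplication by the projector, and reshaping back.
deflate_mode <- function(X, u, m) {        # u assumed normed to 1
  d <- dim(X)
  perm <- c(m, setdiff(seq_along(d), m))
  M <- matrix(aperm(X, perm), d[m])        # mode-m matricization
  P <- diag(d[m]) - tcrossprod(u)          # Id - u t(u)
  aperm(array(P %*% M, d[perm]), order(perm))
}
set.seed(3)
X  <- array(rnorm(2 * 3 * 4), c(2, 3, 4))
us <- lapply(c(2, 3, 4), function(n) { v <- rnorm(n); v / sqrt(sum(v^2)) })
Y  <- X
for (m in 1:3) Y <- deflate_mode(Y, us[[m]], m)
# the projected tensor is orthogonal to the rank-one tensor u1 x u2 x u3
sum(Y * (us[[1]] %o% us[[2]] %o% us[[3]]))   # effectively 0
```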

PROJOT() is the function within PTAk() performing the orthogonal tensor projection of Equation 8, but it can also be used with any structure or design associated with each mode to perform a linearly constrained analysis, in the same way as for PCAIV (principal component analysis on instrumental variables); see Leibovici (2000) for a full description of using PTAIVk(), and the PTAk manual for PROJOT(), where a quick implementation is given as an example.

A brief comparison of multiway models
Before detailing the R usage of the main methods within PTAk, a practical comparison of the multiway models already described is of use. The models behind the methods PTAk(), CANDPARA() (PARAFAC/CANDECOMP) and PCAn() (Tucker-n model) are equivalent when looking for the best rank-one approximation. This can be demonstrated from the expression of the models associated with these methods and can be understood from Equations 9 and 10. Using an example this would be:

R> library("PTAk")
R> PTAk(X, nbPT = 1, nbPT2 = 0) == CANDPARA(X, dim = 1)
R> PTAk(X, nbPT = 1, nbPT2 = 0) == PCAn(X, dim = rep(1, length(dim(X))))
R> CANDPARA(X, dim = 1) == PCAn(X, dim = rep(1, length(dim(X))))

This cannot be strictly verified using the package PTAk, as CANDPARA() and PCAn() in their implementation only accept rank approximations greater than one. A workaround using a tensor "nearly" of rank one is:

R> X <- c(1, 2, 3) %o% c(2, 4, 6) %o% c(3, 7) + rnorm(18, sd = 0.0001)
R> sol1 <- PTAk(X, nbPT = 2, nbPT2 = 0)
R> sol2 <- CANDPARA(X, dim = 2)
R> sol3 <- PCAn(X, dim = c(2, 2, 2))

This gives a numerical proof of the equivalence between PTAk, PARAFAC/CANDECOMP and Tucker-n when looking for the best rank-one approximation. Beyond rank one the methods differ, as does the rank definition attached to each model. PTAk will try to look for the best approximation according to the orthogonal rank, i.e., the rank-one tensors of the decomposition are orthogonal; Tucker-n or PCAn-modes looks for the best approximation according to the space-ranks, i.e., the ranks of every bilinear form deduced from the original tensor (folding the multi-array into a matrix), that is, the number of components in each space; PARAFAC/CANDECOMP looks for the best approximation according to the rank, i.e., the rank-one tensors are not necessarily orthogonal.
It is said here "PTAk will try" as it has been shown recently, on an example, that the orthogonal rank does not necessarily provide a nested decomposition as PTAk-modes implies (Kolda 2003). One can also notice that the PTAk model extends PARAFAC-orthogonal if one only retains in the decomposition the main principal tensors (not the associated ones), i.e., by setting nbPT2 = 0 in the PTAk() call or by ignoring them.
The function REBUILD() returns the approximated or filtered dataset according to the method used, either PTAk(), CANDPARA(), or PCAn(); the parameters of the method are the list of tensors and/or a global threshold for the percentage of variability explained by each elementary tensor. For PCAn() the function calls REBUILDPCAn(), which does not use these parameters.
R> Xapp <- REBUILD(sol1, nTens = c(1, 2), testvar = 1e-12)
--Variance Percent rebuilt X at 100 %
--MSE 4.378514e-09
--with 2 Principal Tensors out of 2 given
--compression 0 %

For PTAk() and CANDPARA(), the approximation is done according to the equation model (cf. Equation 9), here written for a tensor of order 4:

X̂ = Σ_{u∈ς} σ_u ψ_u ⊗ ϕ_u ⊗ φ_u ⊗ θ_u,

where ς is the set of the selected elementary tensors. The PCAn() rebuilt approximation is a direct generalization of the model from Kroonenberg and De Leeuw (1980):

X̂ = C .. (Ψ ⊗ Φ ⊗ Υ ⊗ Θ),    (10)

where the components here are matrices of components with as many columns in each mode-space as asked for during the optimization analysis (the space-ranks), and C is the core tensor with dimensions corresponding to the space-ranks.

Running a general PTAk
SINGVA() and PROJOT(), described above, are the building blocks of the main functions for 2-modes analysis, such as SVDGen(), and for k-modes analysis with PTAk(), CANDPARA() and PCAn(). They can also be used to devise new analyses. So once you have loaded or scanned the dataset from other sources or formats, and put it in a multi-array, an "array" object in R, you can run the PTAk() decomposition or the other multiway methods.

Structure of the PTAk package
The package can be summarized as four series of methods: (i) preprocessing: the methods Multcent(), IterMV(), Detren() for data preprocessing, but also some metrics preparation such as CauRuimet(); (ii) the multiway analysis methods PTAk(), FCAk(), CANDPARA(), and PCAn(), which output objects of class "PTAk" and appropriate subclasses given by the name of the analysis; (iii) the S3 methods associated with them, plot(), summary(), and REBUILD(); (iv) the other methods, which are either internal or used within the main methods but can also be used for developing further methods.
The principal argument of the preprocessing methods is an "array" object which has been prepared beforehand for data analysis: the array will be the multiway table of the measurements arranged by their modes, i.e., the "dimensions" deserving interest, e.g., time(s), variable(s), subject(s), space(s), countries, or condition(s). Whichever name is used to describe an entry of the table, it has a particular semantic according to the study. For example, time for the ecoclimatic study corresponds to 12 months, and for the pharmaco-dynamic study it is the hour and minutes of measurement. Some examples of preprocessing are given in Section 4.3; see the help files of the package for a detailed description of the other arguments.
The principal arguments for the analysis methods are, first of all, either an "array" object of the multiway dataset or a "list" object with $dat containing an "array" object and $met containing the metrics associated with each entry of the array, and then the "amount" of approximation chosen. A metric is a positive semi-definite symmetric matrix allowing one to perform a non-canonical scalar product, i.e., covariance, "sum of squares of products", within the corresponding vectorial space (see Section 6 for further explanation). The arguments related to this "amount" of approximation are: dim, an integer for CANDPARA() and a "vector" object of integers for PCAn(), fixing respectively, as described previously, the number of elementary tensors to fit and the size of the core tensor (therefore the number of components in each space); for PTAk() one chooses the number of principal tensors at each "level" of analysis by nbPT, the last level (2-modes analysis) being fixed by nbPT2. Note that for PTA3() nbPT has to be just an integer, but for k > 3 it can be a vector (of length (k − 2)) specifying this choice for each level above 2-modes analysis. The current version of the package doesn't give much support for other plots or interpretations for CANDPARA() and PCAn(). For example, the summary() method on a "PCAn" object doesn't properly describe the core tensor, and no joint-plot method (see Kroonenberg 1983) has been implemented yet in the package PTAk.

Practical example
This illustrative session uses the dataset related to an ecoclimatic delineation problem (Leibovici et al. 2007), where the dynamics over a typical year of 10 climatic indicators were analysed in the circum-Saharan zone, using their monthly average estimates. The problem explained in Leibovici et al. (2007) is to delineate homogeneous zones in relation to the ecoclimatic profile (rainfall, temperature, evapotranspiration, etc.). So finding main spatial patterns, via the spatial components associated with a climatic profile and a seasonal pattern, was the aim of this analysis. Here the studied zone has been limited to Tunisia; the shapefile contains a regular grid with the multivariate values. The dataset Zone_climTUN was obtained using the call read.shape("E:/R_GIS/R_GilHF/TUN/tunisie_climat.shp"), based on the read.shape() function from the maptools package. For replication, the data are also available in PTAk:

R> library("PTAk")
R> library("maptools")
R> data("Zone_climTUN")

The next command produces a plot of a Map object, not shown here:

R> plot(Zone_climTUN, ol = NA, auxvar = Zone_climTUN$att.data$PREC_OCTO,
+    nclass = 20)

The data are transformed into an "array" object:

R> Zone3w <- array(...(Zone_clim)), c(2599, 12, 10))
R> dim(Zone3w)
[1] 2599 12 10
R> dimnames(Zone3w) <- list(rownames(Zone3w), 1:12, c("P", "Tave", "ETo",
+    "PETo", "Tmax", "Tmin", "Q3", "Alt", "dM2T", "dMETo"))
R> Zone3w.PTA3 <- PTA3(Zone3w, nbPT = 3, nbPT2 = 3,
+    minpct = 0.1, addedcomment = "centrée réduite sur var")

The first principal tensor captures most of the variability, 97.6%, nearly as much as the decomposition up to 3 main principal tensors and 3 for each associated one, i.e., at each second-level analysis (a PCA). Notice that the listing should be 30 lines long, as for each main principal tensor 9 associated principal tensors are requested (nbPT2 = 3), but redundant tensors are removed automatically and, out of the 21 potential principal tensors, a selection has been performed here: Global Pct > 0.01%.
The summary() listing mentions "...over 15 PT" as, in the function call, the parameter minpct = 0.1 forces the algorithm to stop at a k ≥ 3 level (no sub-level analysis) if this percentage of variability is not met: this was the case here for vs333. The full description of the summary() output is given in Section 4.4, where the listing output has a more complete form.
This first PTAk analysis is not very useful, as the variations and ranges of values can be very different from one climatic variable to another. So the main variations captured by the principal tensors will be towards this differentiation, without necessarily expressing the interactions between the variables, and between the variables and the spatio-temporal domain, which may only be detected in some principal tensors (main or associated) with comparatively small singular values. As for PCA, preprocessing transformations such as centering and scaling the variables may be crucial as part of the modelling and analysis process.

Array data and preprocessing
In the ecoclimatic data example the variables of interest are in mode 3, the climatic indicators, as the other two modes are their support, the spatial locations and the months. How does one center and scale the 10 indicators? It depends on the variability of the data one puts the focus on, so this has to be considered as part of the model. Here we are focusing on the spatio-temporal dynamics of the indicators, so we are looking for spatial patterns and temporal patterns of the correlations of the variables. For a given spatial location or spatial trend one would like to detect the mean temporal patterns of evolution or seasonality; therefore it is not desirable to center and scale the indicators for each month over the spatial locations. It is also desirable to detect spatial mean patterns for a given month or temporal trend. Therefore, centering and scaling the indicators along the whole set of spatio-temporal observations seems appropriate. This would be the transformation to perform for a PCA on the indicators with spatial observations repeated over the 12 months.
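This transformation can be sketched in base R (toy dimensions here rather than the 2599 × 12 × 10 Tunisian array; in the package, Multcent() offers this kind of preprocessing):

```r
# Center and scale each indicator (mode 3) over all space x month
# observations, treating the spatio-temporal grid as one set of rows.
d  <- c(20, 12, 10)               # space x month x indicator (reduced sizes)
X  <- array(rnorm(prod(d)), d)
M  <- matrix(X, ncol = d[3])      # (space * month) observations x indicator
Xc <- array(scale(M), d)          # centred and scaled, back into an array
```

Each column of M, i.e., each indicator, now has mean 0 and standard deviation 1 over all spatio-temporal observations.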
Another interesting preprocessing would have been to perform a bi-centering along the spatial-location and month modes for each indicator, in order to emphasize only interactions and not marginal effects.
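Such a bi-centering amounts to a double centering of each indicator slice (a base R sketch; the helper name bicenter is made up, and Multcent() covers this case in the package):

```r
# For each indicator, remove the space and month main effects and the grand
# mean, keeping only the space x month interaction.
bicenter <- function(M) M - outer(rowMeans(M), colMeans(M), "+") + mean(M)
d  <- c(20, 12, 10)               # space x month x indicator (reduced sizes)
X  <- array(rnorm(prod(d)), d)
Xb <- X
for (k in seq_len(d[3])) Xb[, , k] <- bicenter(X[, , k])
```

After this, every slice Xb[, , k] has all row means and column means equal to zero.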
Centering and scaling can be done with the function Multcent(), which proposes a centering and/or a scaling along the by mode(s) combination, before and/or after some possible "multi-centering" along each bi mode combined with by. For example, a bi-centering on a three-way table, corresponding to an ANOVA way of removing the two first factor effects and the mean effect for each level of the third factor, can be done with this function. The simple centering and scaling mentioned before has been performed for the ecoclimatic data for Tunisia. This allows us to extract spatio-temporal trends and spatio-temporal interactions with the indicator mode.

For the EEG data, a preprocessing exploiting the ANOVA interpretation of the factors involved was used in Leibovici (2000). The EEG data consisted of a repeated cross-over design on 12 subjects with 3 verum doses (10mg, 30mg and 90mg) and 1 placebo. The EEG band quantification was available spatially at 28 electrodes, with repetitions over 10 times of measurement (before drug administration and every hour or so after, until 6 hours post-dosing). For this dataset the aim of the preprocessing was very much towards minimizing subject variability. Part of Figure 4 in Leibovici (2000) is reproduced here in Figure 3, showing a comparison of the first principal tensors obtained from a PTA4-modes of the subject × dose × electrode × time × EEG-band data with different centerings and scalings. The best result (c) was obtained after scaling each subject and then removing the subject effect and all two-way interactions with the subject factor. Would it have been better to remove subject 11 from the analysis?

The summary() listing qualifies the identifier of each principal tensor with a leading name. For main principal tensors the name is unique, like vs111 or vs222, where the number corresponds to an order from the top-level hierarchy (the repetition of the number is to emphasize the level of the hierarchy corresponding to the order of the current tensor analyzed).
The other names qualify associated principal tensors, expressing the dimension of the mode from which the association is made (contraction by the corresponding component vector). In 3-modes analysis, they also show the last two dimensions (on which a PCA is done).

Summary method on "PTAk" objects
For PTA4-modes the schema starts with vs1111; the second level corresponds to 3-modes analyses with names such as 12-vs222, where the component contracting the data is explicit for the dimension, 12, but implicit for the principal tensor it comes from, as here vs222 expresses the current 3-modes analysis. The third level brings in names such as 12-300 vs222 10 7, with the same meaning as in 3-modes analysis: associated with the PCA on the 10 × 7 table. This schema is then similar for all other higher modes. Notice that for the 4-modes analysis there is no 12-vs111, as in fact this SINGVA() optimization is redundant with the vs1111 solution. In the same manner, the first principal component of every 2-modes analysis is redundant with the solution just optimized within the 3-modes analysis. This comes from the fact that, in order to really benefit from the recursivity, it is easier in the implementation of Equation 8 to perform the PTA(k − 1)-modes analysis just on the contracted tensor, and not on its projection onto the tensorial orthogonal of the rest of the principal tensor. Computationally it is then easier to let the recursive algorithm produce all the associated solutions and discard the redundant ones. That is why there are gaps in the numbers of the column -no-, e.g., after the principal tensor -no- 1 there will never be a principal tensor -no- 2. The generic form of the PTAk() algorithm implemented in the R package is then given in Equation 11, where * means that the "top" solutions of each PTA(k − 1)-modes have to be discarded as redundant from the previous optimization.

Plotting and interpreting
A class "PTAk" object is a list of lists. For each mode of the tensor, the list, reachable by its mode number in the dimension of the array, contains few descriptors of the mode and the components stored in a matrix $v where the row numbers match the order (-no-) given in the summary(), while the columns are the names list given in $n (taken from the dimnames() of the array) for the corresponding mode. The list for last mode has extra summaries describing the analysis (applicable to all modes or the whole analysis), such as the singular values stored in $d:
Interpretations of the extracted features of the dataset, expressed in the principal tensors, can be derived from various plots of components which can be read simultaneously. For the ecoclimatic analysis one would read and interpret a map configuration (spatial-location component), an annual pattern (month component) and an axis describing associations and oppositions of the variables (indicator component), together expressing the monthly dynamics of the ecoclimatic characteristics. PTAk provides a plot() method for "PTAk" objects which basically overlays the scatter plots of components for the requested modes. For the spatial component here we used a modified version of plot.Map() (given in the file "v34i10-additions.R"). The following plot() calls produce Figure 4, which gathers the basic 3 plots related to the principal tensors 1 and 11: the first plot is a joint plot of their modes 2 and 3 (time and indicators), the other plots are their respective spatial modes (mode 1).
R> plot(Zone3w.PTA3, mod = c(2, 3), nb1 = 1, nb2 = 11,
+    lengthlabels = 5, coefi = list(c(1, 1, 1), c(1, -1, -1)))
R> plot(...

Looking closely at the given outputs, one sees that the principal tensor -no- 11, vs222, makes an opposition between the "number of dry months" indicators dM2T and dMETo (two different ways of deriving this ecoclimatic indicator), with positive weightings, and the other indicators, with negative weights. Nonetheless the plot() in Figure 4 shows the opposite signs; as a matter of fact the argument coefi, in the call of plot(), indicates this change for the tensor nb2 = 11 on modes 2 and 3. The reason for this ad hoc change can be understood, for example, from the identity ψ ⊗ ϕ ⊗ φ = ψ ⊗ (−ϕ) ⊗ (−φ). So, as in PCA, where a principal component and the corresponding variable loadings can be arbitrarily multiplied by (−1), a k-modes analysis with k > 2 sets of components will show different ways of distributing these sign changes. Therefore simultaneous or joint interpretation has to be cautious about this fact. Interpretation has to look either at components separately, giving a within-component description (association, opposition, etc.), or at all the component scores, giving a whole-principal-tensor interpretation, but not read only 2 out of 3 components for example.
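The sign indeterminacy just described can be checked directly (arbitrary vectors; the comment relates it to the coefi argument used above):

```r
# Flipping the signs of two components leaves the principal tensor, hence
# the fit, unchanged -- which is what coefi = list(c(1, 1, 1), c(1, -1, -1))
# compensates for when displaying the components of tensor 11.
a  <- c(1, 2); b <- c(3, -1, 2); cc <- c(0.5, 1, -2, 4)
T1 <- a %o% b %o% cc
T2 <- a %o% (-b) %o% (-cc)        # flip the signs on modes 2 and 3
all.equal(T1, T2)
# TRUE
```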
Using the associativity of the tensor product, a theoretical example of reading associations and oppositions for different components is given in Table 1; the interpretations are all compatible, and identical for any other equivalent tensor transformation.
In PCA, examining correlations of variables with the principal components is the traditional way of obtaining interpretations across components; the duality in 2-modes analysis makes this fully equivalent to reading and interpreting the factor loadings directly. With 3 modes, these correlations have to be computed between the variables, represented by the vectors of the tensor unfolded into a matrix, and the ψ i ⊗ φ i (the 2-modes variables). Similar considerations occur with k > 3 modes.

PTAk-modes with non identity metrics
As mentioned in the introduction, all multiway decomposition methods in PTAk allow the use of non-identity metrics for every space involved in the tensorial space, i.e., symmetric positive definite matrices used in the inner products within each space. (The canonical inner product, the sum of cross-products, corresponds to the identity metric, and the metric on the tensorial space is the tensorial product of the individual metrics; see for example Leibovici 1993; Dauxois et al. 1994.) Instead of feeding the methods with an "array" object X, one uses a list where $data contains the "array" object and $met is a list of "matrix" objects or "vector" objects (diagonal metrics) representing the metrics associated with the inner products in each space. Algebraically, within the tensor framework, this affects the contracted product and the norms of the vectors, and therefore the optimization of Equation 6 and its equivalent for any k. Going back to one of the definitions of the tensorial product gives a hint for this natural extension: where a and ψ belong to the metric space R^s with as metric the symmetric positive definite matrix D, s being the length of the vector a, with similar definitions for the other elements of Equation 13. If a, b, and c cover the canonical basis elements in each space and X is expressed naturally in this basis, it is possible to grasp, via the contraction of elementary tensors (rank-one tensors), the use of metrics in the contraction of any two tensors. Computationally one could also write: in which the use of non-identity metrics in the contracted product has been emphasized by a subscript ..m; a subscript with a metric means rearranging the given tensor into a matrix whose columns lie in that metric space, i.e., whose number of rows equals the dimension of the given metric space.
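The effect of per-mode metrics on the tensorial inner product can be checked numerically in base R (a hedged sketch with made-up data, not a PTAk call): with metrics D1 and D2 on the two component spaces, the metric on the tensorial space is their Kronecker product, and the inner product of the vectorized tensors agrees with the metric contracted product.

```r
# Sketch, base R only: per-mode SPD metrics D1 (mode 1) and D2 (mode 2)
# induce the metric D2 %x% D1 (Kronecker product) on the vectorized
# tensorial space.
set.seed(1)
s <- 3; t <- 4
X <- matrix(rnorm(s * t), s, t)
Y <- matrix(rnorm(s * t), s, t)
A1 <- matrix(rnorm(s * s), s); D1 <- crossprod(A1)  # SPD metric, mode 1
A2 <- matrix(rnorm(t * t), t); D2 <- crossprod(A2)  # SPD metric, mode 2
# Inner product in the tensorial space (c(X) is vec(X), column-major)...
lhs <- drop(t(c(X)) %*% (D2 %x% D1) %*% c(Y))
# ...equals the metric contracted product trace(t(X) D1 Y D2).
rhs <- sum(diag(t(X) %*% D1 %*% Y %*% D2))
```

The identity follows from (D2 ⊗ D1) vec(Y) = vec(D1 Y D2), so the two expressions coincide for any pair of tensors of matching dimensions.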
As in PCA or SVD, where one analyses a triplet (X, Q, D), the data and the metrics for the two modes, it is convenient to refer to (k + 1)-tuples of objects defining the analysis, that is, the tensor of order k as the data, and the k metrics associated with each mode of the tensor:

Use of metrics in PCA
A PCA of a triplet (X, Q, D), with X a data matrix n × p, Q a p × p metric on the rows (or in the column-space) and similarly D an n × n metric on the columns (or in the row-space), generalizes a standard PCA by diagonalizing, with Q-normed vectors, the matrix ᵗXDXQ. This is equivalent to diagonalizing the covariance matrix if X is column-centered, D = (1/n) Id_n and Q = Id_p, or the correlation matrix if instead of the identity metric Q = diag(1/var_1, ..., 1/var_p) (see also Dray and Dufour 2007, for more details).
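This triplet view can be verified in a few lines of base R (hypothetical data; not PTAk code): with a column-centred X, D = (1/n) Id_n and Q = Id_p, the eigenvalues of ᵗXDXQ are exactly those of the (1/n-divisor) covariance matrix.

```r
# Sketch: PCA of the triplet (X, Q, D) reduces to standard PCA when
# D = (1/n) I_n and Q = I_p on a column-centred data matrix.
set.seed(1)
n <- 20; p <- 4
X  <- matrix(rnorm(n * p), n, p)
Xc <- scale(X, center = TRUE, scale = FALSE)   # column-centred data
D  <- diag(1 / n, n)                           # metric on the columns
Q  <- diag(p)                                  # metric on the rows
ev_triplet <- eigen(t(Xc) %*% D %*% Xc %*% Q, symmetric = TRUE)$values
# cov() uses the 1/(n-1) divisor, so rescale to the 1/n covariance:
ev_cov <- eigen(cov(Xc) * (n - 1) / n, symmetric = TRUE)$values
```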

Choice of metrics for spatial data
A classical use of metrics is in discriminant analysis, when a known group structure is part of the experimental design and the goal of the analysis is either to assess or to minimize the impact of this structure on the variability. For example it would be possible to perform (i) a PTAIVk-modes to assess the known structure and (ii) an orthogonal-PTAIVk-modes (that is, a residual analysis) to minimize the structure in the analysis, where using as metric the inverse of the within-group variations would improve both analyses. When the structure is unknown, as is often the case, the goal of the analysis becomes to look for what is structuring the data. As mentioned by Caussinus and Ruiz (1990), a good strategy to reveal dense groups with generalized PCA is to reveal outliers first, using the metric W_o^{-1} given in Equation 17 with 0.03 ≤ β ≤ 0.05, and to remove them before using the metric W_l^{-1} of Equation 16 with a β ≥ 1. W_l is a robust estimate of the within covariance of the unknown structure, given by the function CauRuimet(): where Z is a data matrix, d²_{S⁻}(., .) is the squared Euclidean distance with S⁻ the inverse of a robust sample covariance, and G is a graph structure expressing some known proximity (when no knowledge of proximity is chosen, G_ij = 1 and one just has to set mo = 1 in the function). ker is a positive decreasing function providing a kernel for the different weightings: by default e^{-βu}, with choices of β. W_l corresponds to the definition of local variance (Lebart 1969; Caussinus and Ruiz 1990; Faraj 1994).
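The local-variance idea can be sketched in base R (a simplified stand-in for CauRuimet(), whose exact interface may differ: here S⁻ is a plain rather than robust inverse covariance, and G_ij = 1):

```r
# Sketch: W_l weights each pairwise difference by ker(u) = exp(-beta*u)
# of its squared (here non-robust) Mahalanobis distance, so only close
# pairs -- likely within-group pairs -- contribute to the estimate.
local_variance <- function(Z, beta = 1) {
  n <- nrow(Z); Sinv <- solve(cov(Z))
  W <- matrix(0, ncol(Z), ncol(Z)); tot <- 0
  for (i in 1:(n - 1)) for (j in (i + 1):n) {
    d <- Z[i, ] - Z[j, ]
    w <- exp(-beta * drop(t(d) %*% Sinv %*% d))   # kernel weight
    W <- W + w * tcrossprod(d)
    tot <- tot + w
  }
  W / (2 * tot)   # the 1/2 makes beta = 0 recover the usual covariance
}
```

With beta = 0 every pair gets weight 1 and local_variance(Z, 0) equals cov(Z); increasing beta discounts between-group pairs, which is what makes W_l a "within" estimate.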
To complete the different metrics or semi-metrics associated with known or unknown structure variances, the function can also give something analogous to a global variance: where Z̄ is the location vector of the multivariate distribution, i.e., a robust estimate of the mean. In the case of some known structure the semi-metric W_l^{gα-} = W_g^{1/α} W_l^{-1} W_g^{1/α} may be used to extract robust sub-structure, in the sense that the analysis will tend to minimize the local variance while being consistent at 1/α with the global variance, with convergence towards W_l^{-1} as α increases.
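The semi-metric and its limiting behaviour can be sketched in base R with a symmetric matrix power (W_g and W_l below are made-up SPD matrices, not outputs of CauRuimet()):

```r
# Sketch: W_l^{g,alpha-} = W_g^{1/alpha} %*% W_l^{-1} %*% W_g^{1/alpha}.
# As alpha grows, W_g^{1/alpha} tends to the identity, so the
# semi-metric converges to W_l^{-1}, as stated in the text.
mat_power <- function(M, p) {                  # symmetric matrix power
  e <- eigen(M, symmetric = TRUE)
  e$vectors %*% diag(e$values^p) %*% t(e$vectors)
}
semi_metric <- function(Wg, Wl, alpha) {
  Wg_a <- mat_power(Wg, 1 / alpha)
  Wg_a %*% solve(Wl) %*% Wg_a                  # symmetric by construction
}
set.seed(1)
Wg <- crossprod(matrix(rnorm(16), 4)) + diag(4)  # hypothetical global var
Wl <- crossprod(matrix(rnorm(16), 4)) + diag(4)  # hypothetical local var
W2 <- semi_metric(Wg, Wl, 2)
```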
So far, using these metrics is not specific to spatial data, as even the graph structure can reflect any proximity in attribute space, but it is also reminiscent of neighborhood graphs as in Lebart (1969) and Faraj (1994). In a spatial context, W_l^{gα-} is particularly interesting when looking for finer-scale structures compatible with some coarser-scale relationships represented by the G_ij (G is not taken into account in the W_g computations but is in those of W_l).
With the Tunisian climatic data example, we used W_ly^{-1} as metric in the indicators space, computed on the yearly averages across spatial locations, but also the same metric computed on the array concatenated across the spatial-location and month modes (noted below W_l^{-1}).
In a similar context, geographically weighted discriminant analysis (Brunsdon et al. 2007) takes a similar kernel approach with geographic distances, to take spatial proximity into account in the estimation of the variance-covariance matrix, which also plays the role of a metric. The approach of Borcard and Legendre (2002), which claims to account for a range of scales by using an eigenvalue decomposition of a truncated matrix of distances between sampling sites, is also applicable here to build a metric depending on spatial proximity.

Correspondence Analysis on k-way tables
The tensorial framework developed previously, using different metrics and particular datasets to perform a PTAk-modes, extends the framework of multidimensional analysis using PCA as a generic method (Escoufier 1987; Dray and Dufour 2007). A model of particular interest combines independence for the two-way margins of the three-way table with a three-way interaction term. Each two-way margin deviation from independence is reminiscent of (simple) correspondence analysis: showing that the PTA3-modes (Equation 21) simply retrieves the two-way lacks of marginal independence, in a natural way according to the algorithm schema 11. The inertia or sum of squares is: where the first (s = 0) principal tensor is 1I_I ⊗ 1I_J ⊗ 1I_K with σ_0 = 1, and its associated principal tensors relate to the two-way margin decompositions, i.e., each term of the second row of Equation 23. The use of the FCAk() function, implementing this particular PTAk(), is described in the next section for a particular spatial analysis.
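The two-way building block can be sketched in base R (a hypothetical contingency table, not FCAk() itself): simple correspondence analysis is the SVD of the standardized deviation from independence, and its total inertia is the table's mean-square contingency.

```r
# Sketch: simple CA as an SVD of the deviation-from-independence table.
set.seed(1)
N <- matrix(rpois(12, lambda = 20), 3, 4)   # 3 x 4 contingency table
P <- N / sum(N)                             # correspondence matrix
r  <- rowSums(P)                            # row margins
cc <- colSums(P)                            # column margins
S <- diag(1 / sqrt(r)) %*% (P - r %o% cc) %*% diag(1 / sqrt(cc))
sv <- svd(S)$d                              # CA singular values
inertia <- sum(sv^2)                        # total inertia (Phi^2)
```

The sum of squared singular values equals sum((P - r %o% cc)^2 / (r %o% cc)), the chi-square statistic divided by the grand total, which is the quantity the k-way decomposition generalizes.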

Analyzing multiple collocations
Looking at collocations of attributes with v ≥ 1 categorical variables issued from spatial-location processes (such as point processes) leads to analysing multiway tables of spatial cooccurrences. Order-two cooccurrences now have a long history within spatial pattern analysis (Ripley 1981; Diggle 2003) and can be explored within R, for example, with the R package spatstat (Baddeley and Turner 2005).
They can also be analyzed using correspondence analysis, whilst higher-order cooccurrences revealing spatial patterns can be extracted via FCAk-modes performed within PTAk using the method FCAk() (respectively the CAOO and CAkOO statistical methods defined in Leibovici et al. 2008). This is illustrated here using the dataset lansing from spatstat. This dataset comes from an ecological study where categories of trees and their positions in the studied area were recorded. Analyzing the pattern of categories will help the ecologist study tree associations in the development of the forest. A collocation of order 3 at distance 0.1 unit (square window of 1 unit × 1 unit) on the single point process, marked with a categorical variable describing the tree species, was used. The collocation function cOO3d1S() is given in the file "v34i10-additions.R". Other functions for other applications of the generic PTAk method can be found at Leibovici (2004).
The function cOO3d1S() in the following code computes the collocation of order 3 for one marked point process, keeping the "locations of collocations" in one entry of the output array (instead of marginalizing for each category). Each cell n_{s_i,j,k} contains the number of collocations, up to distance d = 0.1 unit, of categories i, j and k at location s_i of category i. The FCAk() listing of results, obtained by the summary() function on a "FCAk" object (inheriting from the "PTAk" class), differs from the PTAk() listing of the multi-hierarchical tree of singular values by the extra -FCA- percentage column, which is merely a percentage of variability computed without regard to vs111, the trivial singular value 1. Associated with this trivial principal tensor we get the marginal FCA's (or generally the marginal FCA(k − 1)-modes).
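A much-simplified base-R stand-in for cOO3d1S() (with random hypothetical points; the real function also keeps the locations of the collocations) counts, for every triple of mutually close points, the combination of their categories:

```r
# Sketch: order-3 cooccurrence counts for one marked point pattern,
# counting triples of points pairwise within distance d = 0.1.
set.seed(1)
n    <- 40
xy   <- cbind(runif(n), runif(n))                   # unit square window
mark <- sample(c("a", "b", "c"), n, replace = TRUE) # categorical marks
near <- as.matrix(dist(xy)) <= 0.1
cats <- sort(unique(mark))
N3   <- array(0L, dim = c(3, 3, 3), dimnames = list(cats, cats, cats))
trip <- combn(n, 3)                                 # all point triples
for (m in seq_len(ncol(trip))) {
  i <- trip[1, m]; j <- trip[2, m]; k <- trip[3, m]
  if (near[i, j] && near[i, k] && near[j, k])       # mutually close
    N3[mark[i], mark[j], mark[k]] <- N3[mark[i], mark[j], mark[k]] + 1L
}
```

The resulting 3 × 3 × 3 array of counts is the kind of cooccurrence tensor that FCAk() then decomposes.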
When analyzing cooccurrences of high order on the same categorical variable, as with the lansing data, one gets many symmetries within the tensor to be analyzed by FCAk(). A fully symmetric tensor can cause convergence problems for the current algorithm within SINGVA(). It is expected that the renewed interest in multiway tensor analysis, particularly for symmetric tensors (Ni and Wang 2007; Comon et al. 2008), will bring new algorithms for rank-one approximation (as in SINGVA() with the RPVSCC algorithm); see also Faber et al. (2003) for a discussion of recent PARAFAC algorithms and de Silva and Lim (2008) for a general discussion of low-rank approximation. Other multiway methods linked to higher-order cooccurrences, spatially or not, focus on generalizing dissimilarities or similarities for multidimensional scaling. For example, Bennani's thesis work, published in Heiser and Bennani (1997), looked at Euclidean approximation of 3-way dissimilarities using an approach generalizing unfolding metric multidimensional scaling.

Penalized optimization
The optimization algorithm within PTA3() or PTAk() can be seen as alternating unconstrained least squares. Using metrics can be seen as introducing linear constraints or linear smoothing, and this is pushed a bit further by allowing any smoothing of the components within each iteration step. This very simple constraint yields a penalized algorithm which is equivalent to using metrics or semi-metrics if the penalizing operators are linear in a separable Hilbert space, e.g., polynomial smoothing or spline smoothing (Leibovici and El Maâche 1997; Leibovici 2008; Besse et al. 2005). Nonetheless, even in that case, the structure of the algorithm is more flexible, as for a particular mode the smoothing parameter is a list of smoothers which can differ along the decomposition process. However, the increased flexibility of the optimization may lead to an invalid or even non-convergent PTAk(), even though nearly orthogonal decompositions may still be interesting. Work to fully describe the properties of such smoothing parameterization for the multiway methods in PTAk is still underway by the author.
Below is a comparison of two different PTA3 analyses for a verbal study on 12 subjects in which brain activity is measured by fMRI during the on/off verbal paradigm shown by the square curves on Figure 8. The first analysis (top of Figure 8) is a standard PTA3-modes on the brain × time × subject data; the analysis at the bottom uses penalization for time and space: Wav2D(), a 2-D wavelet smoothing for brain adapted from wavethresh, and for time a double kernel smoother, Susan1D(), provided in PTAk, which in contrast to traditional kernel smoothing preserves high peaks. The smoothing arguments in PTAk() or PTA3() are then included: PTA3(..., smoothing = TRUE, smoo = list(Wav2D, Susan1D, NA), ...). Beforehand, the data were detrended using Detren on the time mode and scaled for each subject using Multcent(). Only the first principal tensors are shown here; more results, especially combining metrics and penalization, can be seen at Leibovici (2004).

Figure 8: Verbal study data brain × time × subject with: canonical PTA3-modes (top), penalised PTA3-modes with smooth constraints on brain and time (bottom).
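The penalised alternation can be sketched in base R for the simplest (2-mode, rank-one) case, with a running mean standing in for a smoother such as Susan1D() (data and smoother are made up; PTAk's own iteration differs in its details):

```r
# Sketch: alternating least squares for a rank-one fit, smoothing the
# "time" component at every iteration (the penalisation step).
set.seed(1)
signal <- sin(seq(0, pi, length.out = 30))        # smooth time profile
X <- outer(signal, rnorm(10)) + 0.1 * matrix(rnorm(300), 30, 10)
running_mean <- function(v)                       # stand-in smoother
  as.numeric(stats::filter(v, rep(1 / 3, 3), circular = TRUE))
u <- rnorm(30)                                    # time component
for (it in 1:50) {
  v <- drop(crossprod(X, u)); v <- v / sqrt(sum(v^2))
  u <- running_mean(drop(X %*% v))                # smooth, then norm
  u <- u / sqrt(sum(u^2))
}
sigma <- drop(crossprod(u, X %*% v))              # penalised "singular value"
```

Dropping the running_mean() line gives back the plain power iteration for the leading singular triple; with it, u is constrained towards smooth profiles at every step, which is the mechanism the smoo argument exposes per mode.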

Conclusions and perspectives
As a possible way of extending multidimensional analysis of two-entry tables to multi-entry data analysis, PTAk allows multiway decompositions of high-order interactions and can efficiently describe multiple-domain interactions, including the spatio-temporal domain. Using specific metrics, linked to the input data or to a covariate structure, also extends well known 2-way analyses or principles, such as discriminant analysis and correspondence analysis, which have been shown to be of interest for contiguity analysis or for spatial data defined by collocation events. Metrics, linear constraints and, finally, penalizing components during optimization are made possible by the framework developed; they are efficient ways to take spatio-temporal intrinsic properties into account within multiway analysis. Even though we presented mainly the PTAk() generic method, two other multiway methods implemented in the R package, CANDPARA() and PCAn(), possess these generalized features.
Non-linear adjustment is desirable in multidimensional analysis and also in multiway analysis; PTAk gives some answers but remains fundamentally multilinear. Non-linear objective functions used with multilinear models can nonetheless be very powerful, as in independent component analysis (ICA) for noisy data (Leibovici and Beckmann 2001; Beckmann and Smith 2005).