Data Analysis with the Morse-Smale Complex: The msr Package for R

In many areas, scientists deal with increasingly high-dimensional data sets. An important aspect for these scientists is to gain a qualitative understanding of the process or system from which the data is gathered. Often, both input variables and an outcome are observed and the data can be characterized as a sample from a high-dimensional scalar function. This work presents the R package msr for exploratory data analysis of multivariate scalar functions based on the Morse-Smale complex. The Morse-Smale complex provides a topologically meaningful decomposition of the domain. The msr package implements a discrete approximation of the Morse-Smale complex for data sets. In previous work this approximation has been exploited for visualization and partition-based regression, which are both supported in the msr package. The visualization combines the Morse-Smale complex with dimension-reduction techniques for a visual summary representation that serves as a guide for interactive exploration of the high-dimensional function. In a similar fashion, the regression employs a combination of linear models based on the Morse-Smale decomposition of the domain. This regression approach yields topologically accurate estimates and facilitates interpretation of general trends and statistical comparisons between partitions. In this manner, the msr package supports high-dimensional data understanding and exploration through the Morse-Smale complex.


Introduction
Recent advances in computational topology have led to a multitude of algorithms to estimate the topological properties of a function from a finite sample (van Kreveld et al. 1997;Edelsbrunner et al. 2003;Carr et al. 2003;Gyulassy et al. 2007). These algorithms led to the analysis of data sets based on their topological properties and have shown promising re-sults in a wide variety of applications, including combustion simulations Mascarenhas et al. 2010), fluid dynamics (Laney et al. 2006), climate simulations (Gerber et al. 2010), protein folding (Weber et al. 2007) and molecular shape analysis (Cazals et al. 2003). This paper describes the R (R Development Core Team 2010) package msr (Gerber et al. 2011a) for data analysis with the Morse-Smale complex. The Morse-Smale complex provides a topologically meaningful decomposition of the domain of a scalar function. The basic functionality of this package rests on the computation of the Morse-Smale complex from unorganized scattered data. The domain decomposition induced by the Morse-Smale complex is used to build the exploratory visualization technique described in (Gerber et al. 2010) and the regression approach in (Gerber et al. 2011b). Gerber et al. (2010) exploit the Morse-Smale complex decomposition to form abstract twodimensional representations of high dimensional scalar functions. These representations present high-level summaries of the salient features of the functions and form the scaffolding for an exploratory data analysis tool. In regression, domain decompositions have been used successfully to build flexible non-parametric models such as regression trees (Breiman et al. 1984), MARS (Friedman 1991) and variations on those approaches (Chaudhuri et al. 1994;Alexander and Grimshaw 1996). These partition-based regression methods split the domain recursively into smaller partitions with low-order parametric models in each partition. The recursive partitioning is based on splitting rules that typically measure the quality of the resulting fit. In a similar fashion, Gerber et al. (2011b) use the domain of the Morse-Smale complex as the splitting rule for a partition-based regression scheme.
To take full advantage of the package presented in this paper, a high-level understanding of the properties of the Morse-Smale complex is necessary. Thus, a brief overview of the Morse-Smale complex, and some issues involving its discrete approximation, is included in this paper. For completeness, the visualization and regression techniques in Gerber et al. (2010) and Gerber et al. (2011b) are summarized.

The Morse-Smale complex
In Morse theory, the Morse-Smale complex provides a tool to examine the topology of a manifold M based on the critical points of a function f defined on M. Here, the interest is not on the topology of M but on the exploitation of the Morse-Smale complex to glean information about the geometry of the function f itself.
Informally, the Morse-Smale complex partitions the domain of a function f based on the critical points of f . The interior of each partition is a monotonic region, i.e contains no critical points, with a single local minimum and maximum on the boundary, ass illustrated in Figure 1.

Formal definition
Let M be a smooth, compact, p-dimensional manifold. A smooth function f : M → R is Morse if, for all critical points x of f , the Hessian matrix Hf (x) is not singular. An integral line, λ : R → M, is a curve in M with dλ ds (s) = ∇f (λ(s)), with α(λ) = lim s→−∞ λ(s) and ω(λ) = lim s→∞ λ(s). Note, α(λ) and ω(λ) are both, by definition, critical points of f . Define the ascending and descending, also referred to as stable and unstable, manifolds of a critical

Persistence
The Morse-Smale complex introduces a measure of the significance of each extremal point, called persistence. Persistence is a measure of the amount of change in the function f required to remove a critical point, and thus, to merge two or more partitions. Note, persistence describes the significance of an extremal point in geometric terms and not in the statistical sense of a hypothesis test. Figure 2 illustrates the concept of persistence simplification on a one-dimensional function.
Definition Let x i be the critical points of f . Define s(x i ) as the set of critical points that have a direct integral line connecting to . This definition is motivated by the amount of change of f in L ∞ required such that the critical point pair (x i , n(x i )) is either canceled or merged into a single critical point. Recursively removing the critical point with minimal persistence leads to a nested series of successively simplified Morse-Smale complicies, also called a filtration (Edelsbrunner et al. 2006;Chazal et al. 2009). At each level, some of the partitions induced by the Morse-Smale complex are merged into a single partition until the Morse-Smale partitioning consists of only a single partition (i.e., the entire input domain).
The sequential simplification by persistence leads to a hierarchy of Morse-Smale complicies. At the highest persistence level, the Morse-Smale complex only consists of the highest maximum and lowest minimum, and the segmentation corresponds to the entire domain. With decreasing persistence levels, more extremal points and corresponding partitions are introduced in order of their persistence level. Thus, persistence introduces a notion of scale at which the Morse-Smale complex of f is considered.

Morse-Smale complex approximation
The definition of the Morse-Smale complex, in terms of ascending and descending manifolds, leads to a direct algorithm. Given a set of observations O = {y i , x i } N 1 with y i = f (x i ), determine, for each data point x i , its α-and ω-limit and corresponding partition by following the gradient at x i . A direct approach would require the estimation of the gradient at x i , and to trace the integral line that passes through it. This would require essentially an estimate of f . However, the only information required to partition the domain is the α-and ω-limit of the integral line, which is computed using a discrete approximation of the integral line without having to first estimate f . Note, that the 0 through p − 1 dimensional components have measure zero. Thus, in practice, the points of a data set are, with probability one, part of the p-dimensional components of the Morse-Smale complex with integral lines ending in local minima and maxima.
In Gerber et al. (2010), the domain is approximated via a k nearest-neighbor graph. The algorithm follows paths of steepest ascent and descent based on the connectivity of the graph; essentially a variant of the quick shift algorithm (Vedaldi and Soatto 2008). For such an algorithm, let the adjacencies of point x i be adj( Then the approximate integral line is traced out by following the path of steepest ascent all neighbors have lower/higher function value (i.e., a local maxima/minima). This assigns each point to a p-dimensional component, identified by local minima/maxima association, of the Morse-Smale complex. Hence, the Morse-Smale approximation results in a partition C = {C 1 , . . . , C n }, with C i the set of points for partition i, such that i C i = {x i } n 1 and C j ∩ C i = ∅∀i = j. For computing the persistence-based hierarchy of the Morse-Smale complex, an approximation of saddle point values between neighboring partitions is required. Let s(p a , p b ) be the set of edges (x i , x j ) such that x i is assigned to partition p a and x j to p b . Then, the persistence of the minima a min of partition p a is defined as min k∈P (X) min e∈s(pa,p k ) max x i ∈e a min − x i . Again, as in the continuous case, this is recursively applied withP (X), the minimal persistence extrema, removed.
Depending on the number of nearest neighbors and sampling density, artificial extremal points can be introduced or true extremal points can be removed. Artificial extremal points are introduced if the k-nearest neighbors of a point connect only to points with lower function values. True maxima (or minima) can be removed if the k-nearest neighbors graph introduce a monotonous steepest ascent (or descent) path 1 to another maxima (minima) point. This arises mostly in the case of a large number of nearest neighbors, and/or a sparse sample of the domain.

Partition prediction
The Morse-Smale complex approximation provides the partition assignments, or labels, for the input data, but not the complete domain. In Gerber et al. (2011b), two approaches to extend this partitioning to the complete domain are considered.
The first approach estimates the probability distribution P (X|C = C i ) with a kernel density estimator over the observations s in partition i: From this, Bayes Theorem yields the partition probabilities P The second approach employs a support vector machine (SVM) to build a multi-class classifier using the one-against-one approach described in Hsu and Lin (2002). The SVM is trained on the partition assignments of the Morse-Smale complex. To estimate partition probabilities from the classifier, a logistic regression is fit to the decision values of the SVM as described in Platt (1999).

The Morse-Smale complex for scalar function visualization
In Gerber et al. (2010) the Morse-Smale complex is used to build a summary representation of the high-dimensional scalar function f : Ω → R, with Ω a compact connected subset of R n . Each partition C i of the Morse-Smale complex is summarized by an inverse regression curve, r i (y) = E[X ∈ C i |Y = y]. The regression estimates approximate a graph of curves connecting the minima and maxima as identified by the Morse-Smale segmentation; essentially a compressed representation of the domain. For visualization, the extremal points are projected, using principal components, into two-dimensions (2D). Then, each regression curve is separately projected into 2D, again using principal components. Finally, the projected curves are affine-transformed such that the end points match the corresponding projected extremal points. This concept is illustrated in Figure 3.
The visualization created by this method can help to gain a qualitative understanding of higher dimensional functions. In Gerber et al. (2010) this exploratory analysis approach is is demonstrated on • climate simulations outcomes.
• chemical compositions in combustion simulations.
• socio-economic variables in relation to crime rates. The approach consists of three steps to arrive at a 2D representation for visualization of the high-dimensional scalar function: 1. Morse-Smale approximation: Compute a segmentation using the Morse-Smale approximation of f .
2. Geometric summaries: Construct regression curves r i with kernel regression.
3. Dimension reduction: Embed the regression curves in 2D using a three-step dimensionreduction approach: i. Project extremal points onto its first two principal components.
ii. Project each regression curve onto its first two principal components.
iii. Affine-transform projected curves to match the corresponding projected extremal points.

Morse-Smale regression
The idea presented in Gerber et al. (2011b) is to use the Morse-Smale complex induced segmentation of the domain of a function f : Ω → R to combine simple linear models within each partition to approximate the underlying function f . Two approaches are considered. First, a piecewise linear modelf Second, a sum of weighted linear models, with weights based on the probabilistic interpretation of the segmentation induced by the Morse-Smale complex, i.e., a soft assignment of each point based on the partition probabilities.
with w j (x) = P (C = C i |X = x) estimated as described in Section 2.4.

Package description
The methods for approximating the Morse-Smale complex from a sample of a scalar function form the basis of the msr package (Gerber et al. 2011a). The regression and visualization approaches are built on this basic functionality. This section gives a brief overview of the package. Detailed descriptions of the available methods are in the manual pages contained in the package.

Morse-Smale complex approximation
The main functions for the Morse-Smale approximation is msc.nn. The functions msc.nn.kd and msc.nn.svm add predictive capacity to msc.nn by estimating the partition probabilities with the kernel density estimation or a multi-class SVM, respectively (see Section 2.4). All these methods return a basic Morse-Smale complex object msc.nn (potentially supplemented with predictive capabilities), which is the basic input for the regression and visualization methods.
The nearest neighbor graph is computed using the approximate nearest neighbor library ANN (Mount and Arya 2010). For performance, the kernel density estimation is implemented in C with an interface to R. For the SVM multi-class classifier the R package e1071 (Dimitriadou et al. 2010) is used, which provides an interface to the C++ library libsvm (Chang and Lin 2001).
For both msc.nn.kd and msc.nn.svm, the generic R method predict is implemented and returns a matrix with partition probabilities for each point, either of the original data x or, if the argument newdata is specified, for the locations of the new observations.
Two additional utility functions are supplied: • msc.level.ind extracts the indices into the original, supplied data x for a given partition index, Morse-Smale complex, and persistence level.
• msc.sublevels extracts a subset of the hierarchy of Morse-Smale complicies.

Visualization
The visualization functionality is accessed through plot.msc. This function takes in an msc.nn object, computes the necessary geometry for the plots, and creates the visualization. The package rgl Adler and Murdoch (2010) is used for three-dimensional display, and user interaction through mouse clicks bring up detailed 2D plots of the inverse regression curves. The plotting function returns an object that allows the user to modify what is displayed in the rgl device, such as the optional display of the standard deviation tubes, or the 3D axis. The visualization displays the graph of the regression curves in 3D. The regression curves are projected into 2D and the 3rd dimension is used to encode the function value, also redundantly encoded with a colormap. While color provides a quick overview of the location of minima and maxima, the 3rd dimension can be used to better perceive quantitative differences in minima and maxima and helps to visualize the function as a height field over the graph representation of the domain.
Opaque tubes surrounding the curves encode the standard deviation of the regression curves by mapping to the radius of the tubes. The standard deviation provides information about the shape of a partition, i.e. the extent of the level sets. Figure 4 shows the use of the 3rd dimension as well as the standard deviation tubes, which can be turned on or off through a user parameter. In addition, the user can inspect the behavior of the independent variables in each partition by clicking on the corresponding regression curve which then highlights that curve in an identifying color in a corresponding 2D graph, as shown in Figure 5, left. A separate graphics device is then opened, and plots the regression curve r i (y) = E[X ∈ C i |Y = y with d graphs of y as a function of r i (y) for each dependent variable in X (Figure 5, right). To compare various partitions simultaneously, the user can select multiple curves, which are displayed in the same plot using the matching color from the main display.

Regression
The piecewise linear model for the Morse-Smale regression is implemented in the two methods: • msc.lm: The piecewise linear model.
and for the weighted linear model: • msc.slm: The weighted linear model.
The methods take as input a Morse-Smale complex object and additional parameters that influence the fitting.

A complete tour
This section gives a complete tour of the package on a simple 2D function, for which it is easy to show visual results for all aspects of the package. The tour is based on the 2D function shown in Figure 1, which is defined by (4)

R> np <-length(ms$persistence) R> ms$persistence[np]
In the above example, the Morse-Smale complex was computed for a fixed persistence level. Alternatively, compute the first 15 persistence levels of the Morse-Smale complex R> ms 1],x = d[,2:3],nLevels = 15,knn = 15) and access individual levels. For both visualization and regression, predictLevel determines the level of the current Morse-Smale hierarchy at which prediction or visualization is performed. Now plot the partition probabilities for partition index 8.

Visualization
Using the Morse-Smale complex from the previous section show visualizations at level 2, 7, and 12.

Regression
The regression is invoked directly on the Morse-Smale complex object. Construct Morse-Smale complex with SVM prediction and fit a piecewise linear model with a cross-validation for choosing the prediction level.

R> p <
[1] 0.003848764 Instead of the piecewise linear model, fit a sum of weighted linear models. The persistence level is, as for the piecewise linear model, selected automatically with cross-validation.

Visualizing the UCI crime data
The communities and crime data set from the UCI machine learning repository 2 combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and crime data from the 1995 FBI UCR. The original data set contains 1,994 observations (communities) with 128 attributes. This analysis uses a reduced data set, with attributes that have missing entries removed, and investigates the crime rate in relation to 99 socio-economic variables.

R> data(uci_crime_subset)
First, construct the Morse-Smale complex and plot the persistences.  This shows that the Morse-Smale complex with 4 extrema requires a function change of 26 percent of the function range to introduce an additional extrema. This indicates that the Morse-Smale complex at this level is relatively stable and the extremal points are not very likely artifacts from noisy observations. Now plot the Morse-Smale based summary representation of the function at persistence level 3 (see Figure 6).

R> ms$predictLevel = 3 R> obj <-plot(ms)
This visualization is interactive and allows for deeper investigation of the function. By clicking on a regression curve, representing one partition of the Morse-Smale complex, a window with the inverse regression curves per variable is shown for that curve. In the case of the crime data, this opens multiple device windows to show the 99 variables of the domain. This can be cumbersome to analyze. To restrict attention to variables of interest, a plotList on the object is created which allows the user to select specific variables for plotting. <-c(1, 12, 27, 20, 32, 33, 97)

R> names <-colnames(crimes)[obj$plotList] R> print(names) R> plot(obj)
This allows for the easy comparison of user selected variables of interest. For example, the selected variables show that two peak crime rate configurations involve a large percentage of urban population, while the third high crime rate peak concerns rural areas, One peak crime rate is related to high unemployment rates while the other two have relatively low unemployment. This illustrates how the proposed visualization can be used to build hypotheses for further examination.

A Morse-Smale regression example
This example is based on the energy/objective function of a camera estimation problem.

R> ms
R> df <-data.frame (test, pID = as.factor(ms$level[[pLevel]]$partition)) R> lm1 <-lm (E~. * pID, data = df) Here, the different partitions are, except for a few (visible with summary(lm1), not shown), not significantly different. However, in general, this illustrates how the partition based regression can be used to detect different regions of the parameter space where the function shows different behaviours.
Next, extract a single persistence level Morse-Smale complex and estimate a piecewise linear model with an L 1 regularization to compare differences between the partitions.
Alternatively, employ a forward stepwise approach for a sparse regression estimate.

R> ms.slm <-msc.slm(ms)
The cross validation selected persistence level R> print (ms.slm$ms$predictLevel) [1] 12 with a mean squared test error of R> penergy <-predict(ms.slm, test) R> mean((test$E -penergy)^2) [1] 0.002840482 Note, that due to the averaging of different linear models the additive weighted approach is not suitable for interpretation, but typically yields better function approximations than the piecewise linear approach.

Conclusion
This paper described the R package msr for exploratory data analysis based on the Morse-Smale complex approximation of a function. The package is built in a modular fashion with the Morse-Smale complex approximation as the fundamental building block. A visualization and a regression approach are implemented on top of the Morse-Smale approximation. The regression and visualization facilitate a qualitative understanding of high-dimensional functions. However, the modular design enables to harness the Morse-Smale complex for data analysis approaches other than the two methods presented here.