Eﬃcient Code for Second Order Analysis of Events on a Linear Network

We describe eﬃcient algorithms and open-source code for the second-order statistical analysis of point events on a linear network. Typical summary statistics are adaptations of Ripley’s K -function and the pair correlation function to the case of a linear network, with distance measured by the shortest path in the network. Simple implementations consume substantial time and memory. For an eﬃcient implementation, the data structure representing the network must be economical in its use of memory, but must also enable rapid searches to be made. We have developed such an eﬃcient implementation in C with an R interface written as an extension to the R package spatstat . The algorithms handle realistic large networks, as we demonstrate using a database of all road accidents recorded in Western Australia.


Introduction
The study of events that occur along a network of lines, such as traffic accidents recorded on a road network, requires the development of advanced statistical techniques and computational algorithms (Okabe and Sugihara 2012;Ver Hoef, Peterson, and Theobald 2006;Baddeley, Rubak, and Turner 2015, Chapter 17). Because a linear network is not a homogeneous space, even elementary statistical tools can be difficult to implement. Kernel smoothing of point events, which is simple to define and very fast to compute in two dimensions (Diggle 1985), is mathematically complicated and can be extremely time-consuming to perform on a network (Okabe, Satoh, and Sugihara 2009). Similar difficulties arise in second-order (correlation) analysis of point patterns, which is straightforward in two dimensions using Ripley's Kfunction (Ripley 1977) and the pair correlation function (Okabe and Yamada 2001;Ang, Baddeley, and Nair 2012;Baddeley, Jammalamadaka, and Nair 2014). This geometrical complexity militates against the statistical analysis of real data sets of moderate size. Figure 1 shows the locations of road accidents recorded on the road network in the state of Western Australia for the year 2011. In this area, about 2000 km across, the network consists of 115, 169 road segments and there are 14, 562 accident locations. Kernel smoothing and second-order analysis of these accident data are prohibitively expensive (both in computer time and memory) using current implementations of these methods (Baddeley et al. 2015, Chapter 17; Okabe and Sugihara 2012) as we demonstrate below. Figure 2 shows a much smaller data set that can easily be handled with simple R code (R Core Team 2019): It contains 116 points on a network with 503 line segments. For kernel smoothing on a network, a fast algorithm capable of handling very large data sets was recently developed by McSwiggan, Baddeley, and Nair (2016) and is now implemented in the R package spatstat (Baddeley et al. 2015) as the function density.lpp.
In this paper, the main focus is the second-order (correlation) analysis of point patterns on a linear network. We develop efficient algorithms and open-source code for computing general second-order summary functions which include the K-function and pair correlation function.
Suppose we have observed point events x 1 , . . . , x p on a linear network L. Let d L (x i , x j ) denote the shortest-path distance between data points x i and x j in the network. The objective is to calculate second-order summary statistics of the general form where h is a chosen function. A simple example of Equation 1 is the empirical cumulative distribution function of the shortest-path distances where I(A) is the indicator that equals 1 when A is true, and 0 otherwise. Other examples of the general form (1) include the observed network K-function (Okabe and Yamada 2001;Okabe and Sugihara 2012, Chapter 6), empirical estimators of the geometrically-corrected network K-function and pair correlation function (Ang et al. 2012), and various generalizations involving spatially-varying weights, auxiliary variables and local statistics (Ang et al. 2012;Baddeley et al. 2014;Boots and Okabe 2007). A detailed description of these estimators and their applications can be found in Baddeley et al. (2015, Chapter 17). Computation of the summary statistics of the form (1) is important, not only for exploratory data analysis, but also for fitting models to point pattern data by maximum composite likelihood (Guan 2006;Tanaka, Ogata, and Stoyan 2008;Baddeley et al. 2015, Chapter 12).
Simple code for calculating any statistic of the form (1) is available in the R package spatstat (Baddeley and Turner 2005;Baddeley et al. 2015). Table 1 (under the column heading Adjacency matrix) shows the computation time (in seconds) and the total memory (sum of all memory allocation requests) used by this implementation to compute the geometricallycorrected K-function (Ang et al. 2012, Equation 12) for three small example data sets supplied in the spatstat package.  Table 1: Performance comparison of two algorithms for computing the geometrically-corrected K-function. Adjacency matrix: Algorithm M using adjacency matrices, described in Section 4, as implemented in spatstat; Linked list: Algorithm L using linked lists, described in Sections 5-6, as implemented in the supplied package spatstat.Knet. Row names refer to three example data sets supplied in the spatstat package. Column headings are as follows: Points: Number of data points; Lines: Number of line segments; Time: Elapsed computation time in seconds, reported by system.time; Memory: Sum of all memory allocations in megabytes, reported by the function profmem in package profmem (Bengtsson 2018).
Extrapolating to the Western Australian road accident data (Figure 1), under simple assumptions, gives a predicted computation time of at least 2 hours and total memory allocation of at least 1.4 terabytes. On a standard PC, such large amounts of memory are not available, and the algorithm will not run successfully.
For the simplest case of the network K-function (Okabe and Yamada 2001, Equation 7), which is equivalent to the computing in Equation 2, an efficient algorithm has been described by Okabe and Yamada (2001); see also Okabe and Sugihara (2012, Chapter 6). The code is not open-source, although compiled executables are available (Okabe, Okunuki, and Shiode 2006).
This paper presents an alternative, open-source, C implementation for computing any statistic of the general form introduced in Equation 1. Detailed pseudocode is included; the full source code is available within the R package spatstat.Knet  from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/ package=spatstat.Knet. Our implementation uses a simple data structure and efficient code, which are easily adaptable to different choices of the function h in Equation 1 including the geometrically-corrected K-function (Section 3.2). This implementation adapts many of the ideas of Okabe and Yamada (2001), including the key concept of the (extended) shortestpath tree.
The traffic accident data in Figure 1 have been included as a data set named wacrashes in the spatstat.Knet package. The following code can be used to create Figure 1.
R> library("spatstat.Knet") R> data("wacrashes", package = "spatstat.Knet") R> plot(wacrashes, cols = "red", cex = 0.5, main = " ") Figure 2 can be plotted as follows R> plot(unmark(chicago), cols = "blue", pch = 16, main = " ") Section 2 introduces necessary mathematical and computational structures such as adjacency matrices and linked-lists, for representing linear networks and events on networks. Section 3 gives some notation associated with the shortest-path distance on a network, different versions of the K-function, and their current implementation in spatstat. Section 4 describes the existing algorithm, which we call Algorithm M, for computing the K-function using the adjacency matrix. Sections 5 and 6 present a new algorithm, which we dub Algorithm L, using linked-lists. Section 7 gives a worked example of the key part of Algorithm L. Timings, as a function of the number of observed points on the network, are reported in Section 8. The Western Australian road accident data are analyzed in Section 9. We end with a discussion.

Representation of events on a linear network
In this section, we introduce the terminology associated with a linear network (Section 2.1), describe the traditional adjacency-matrix and adjacency-list data structures that are used for storing simple networks (Section 2.2), and propose a new data structure for storing point patterns on a linear network (Section 2.3).

Linear networks
A straight line segment in the two-dimensional plane with endpoints v and v is the set

the Euclidean distance between its endpoints.
A linear network is the union L = n i=1 s i of a finite collection of straight line segments s 1 , . . . , s n ; the total length of the network is |L| = n i=1 (s i ). The representation of L as a union of line segments is not unique: we assume that a representation is chosen so that any two distinct segments s i , s j with j = i either do not intersect, or intersect at a common endpoint of s i and s j . Then the network can be considered as an embedded planar graph, whose vertices are the endpoints of the segments.
In a planar graph setting, it is common to refer the line segments as edges and their endpoints as nodes. However, we make a small distinction between a segment and an edge, which will be explained in the next subsection. The set of nodes (vertices) is denoted by V = {v 1 , . . . , v m } and the set of segments by S = {s 1 , . . . , s n }. Both V and S are indexed sets with the subscript of an element representing its integer identifier in the set. Therefore, though the nodes v i ∈ L ⊂ R 2 , we often use the labeling 1, . . . , m to denote m nodes. In what follows, without loss of generality, we assume i < j in the representation [v i , v j ] of a segment in S.

Data structure for storing a network
Given a spatial linear network L we can construct a weighted undirected graph G = (V, E), where V is the set of nodes as before, and E is the set of weighted edges. Each weighted edge is an ordered triple (v, v , w), where v and v are endpoints of some segment s ∈ S and w is the positive weight associated with the segment s; in this paper, we take the weight to be the segment length w = (s). More details on the connection between linear networks and weighted graphs can be found in Kolaczyk and Csárdi (2014).
Note that, corresponding to each segment s = [v, v ] ∈ S, there are two weighted edges (v, v , w) and (v , v, w) in G. In the ordered representation (v, v , w) of an edge, we shall refer to v as the starting node and v as the ending node. Because all algorithms in this paper are developed assuming such double representation of the segments, in what follows, we use data structures that store information about all 2n weighted edges. Two standard data structures for representing a weighted graph are the adjacency list and adjacency matrix (see Cormen, Leiserson, Rivest, and Stein 2009, p. 589), which are discussed below.
When a network is represented by its adjacency matrix, software coding becomes relatively straightforward for computing functions on the network. For example, the task of finding the immediate neighbors (or adjacent nodes) of a given node can easily be implemented by extracting the relevant row of the adjacency matrix and finding all finite entries. However, the main drawback of the adjacency matrix is the high memory usage when it is represented as a full matrix with m 2 entries. For the Western Australian road network, shown in Figure 1, there are m = 88, 512 nodes; since m 2 > 2 32 , the full adjacency matrix would exceed the array size limits in many 32-bit software systems, and would be too large to fit into random-access memory (RAM) on a typical 64-bit PC.
For a weighted undirected graph constructed from a road network, the adjacency matrix is usually sparse in the sense that the number of edges is much less than m(m−1), the maximum possible number of finite entries in the matrix. In such cases one can use a sparse matrix representation (Wilkinson and Reinsch 1971;Tewarson 1973;Pissanetzky 1984;Golub and Van Loan 1996). In a sparse representation of an adjacency matrix A, only the finite positive entries of A are recorded, essentially as a list of triples (i, j, a ij ) giving the endpoints and weight associated with each weighted edge in G. This reduces the storage requirement to the minimum possible.  Table 2: Memory storage requirements (kb) for linear networks using the adjacency matrix representation in the spatstat package, with or without sparse matrix encoding. Network data are example network data sets provided in the spatstat and spatstat.Knet packages.

Element Description data
Pointer to a data object that is stored in the list. next Pointer to the next list entry in the linked-list.  Computations involving sparse matrices are supported by fast code in low-level languages, usually Fortran libraries (Dongarra, Moler, Bunch, and Stewart 1979;Anderson et al. 1999) with interfaces to higher-level languages such as R (Koenker and Ng 2003;Bates and Maechler 2019). However, most of the functionality provided by sparse matrix libraries is for linear algebra, rather than graph topology. Furthermore, the representation of a graph structure by an unstructured list of edges (i, j, a ij ) leads to computational inefficiencies. For example, the task of finding the immediate neighbors of a given node i * requires a search through the entire list to find all entries (i, j, a ij ) where either i = i * or j = i * . This can be accelerated by sorting the list appropriately, but the data structure is inherently inefficient for our purposes. Hopcroft and Tarjan (1973) advocated the adjacency list representation for graphs in terms of linked-lists. A linked-list is a standard data structure for representing a list of objects which are related in some way, e.g., nodes that are all connected to a given node or road accidents recorded on the same road segment. Each entry in a linked-list is a pair of pointers, namely, data and next, whose description is given in Table 3. The end of the list is indicated by assigning a null value to the pointer to the next entry (see Figure 4). List entries can easily be inserted or deleted at any position by changing the relevant pointers, so that linked-list data structures are well suited to applications where the connections between the list entries are required to be changed frequently (Cormen et al. 2009, p. 236;Louden 1999, Chapter 5).

Adjacency list
The adjacency list representation of a weighted graph G consists of a list of m linked-lists, one linked-list for each node in the graph. The linked-list corresponding to a particular node stores all its adjacent nodes, along with the associated edge weights. Recall that each line segment s = [v, v ] in L corresponds to two weighted edges in G; the edges (v, v , (s)) and (v , v, (s)) appear, respectively, in the adjacency lists for nodes v and v .   Figure 3. The first column represents the list of four nodes, and for a given node, the corresponding row represents the linked-list containing its adjacent nodes along with the edge weights. For ease of illustration, in Figure 5 we omitted the next elements of the linked-list structures and the crooked arrows in Figure 4 representing pointers were replaced with the straight arrows.
For sparse graphs, the adjacency list representation is compact, utilizing only O(n + m) space for storing the graph, and is more efficient than the adjacency matrix representation for implementing graph searching algorithms (Cormen et al. 2009;Even 1979;Tarjan 1983). An adjacency matrix representation is efficient only when the graph is dense, i.e., when the number of edges n is of the same order as m 2 (see Cormen et al. 2009;Even 1979).

Events on a network
A data set of events on a linear network L will be represented as x = {x 1 , . . . , x p }, where x i ∈ L is the location of the ith event on the network, and p ≥ 0 is the total number of points, which is not fixed in advance. Figure 6 depicts a simple illustrative example of data giving the spatial locations {x 1 , . . . , x 5 } of events along a network.
Although the planar Cartesian coordinates are sufficient to locate point events, the data structure will be more computationally efficient if it makes an explicit connection between the point event and the segment on which it lies. This can be done using the elements x, y, seg and tp corresponding to a point event; description of these elements are given in Table 4. The coordinate system with both (x, y) and (seg, tp) is mathematically redundant, but allows efficient addressing of different databases of spatial information. In road accident analysis, some explanatory variables, such as shoulder width, road curvature, and road condition, are part of a database of road information. For any event on the network, it is convenient to query this database using the road name or number (i.e., seg) and position along the road (i.e., tp). Other explanatory variables, such as terrain elevation, are spatially-referenced images, which are most conveniently queried using spatial coordinates (x, y).
There is an important distinction between explanatory variables and 'marks', which could be part of the database. An explanatory variable is potentially observable at any spatial location: examples are terrain height and road width. A mark, such as crash severity or the number of passengers, is an additional attribute of the observed event. For simplicity of presentation, this paper does not consider marks. Element Description (x, y) Planar Cartesian coordinates of the point event.
seg Integer label of the segment containing the event. tp Relative distance of the point event along the segment; tp values 0 and 1, respectively, correspond to the starting node and ending node of the segment.

Data structure for a linear network with events
For efficient statistical investigation of the point patterns on a linear network, we have developed a data structure that extends the adjacency list representation by including storage for events and additional data about them. The elementary components of this data structure, namely Adjlist, sNode, and aNode, are sketched in Figure 7, and Table 5 provides a brief description of the members of these different components. Figure 7 also includes the Crash structure, which is used for storing point events.
The sNode objects are used for storing all the nodes in a weighted graph. For a given node in V , the node member of sNode stores the integer identifier of the node, and the members d and parent are used for storing, respectively, the shortest-path distance estimates and information about some relevant nearest node while computing the shortest path route from some source point in the network (see Procedure 1 in Section 5 for details).
For a given node v ∈ V , the Adjlist object connects v to all its adjacent nodes and keeps track of the resulting edges. The data member is a pointer to the sNode object that stores the starting node v, and the adjacent member is a pointer that keeps reference of the start of the linked-list that stores all the adjacent nodes of v. In each entry of this linked-list, the data member is a pointer to the aNode structure that contains the integer identifier of the adjacent node and all information about the edge connecting v to that adjacent node.

Component Description sNode node
Integer identifier of a given starting node parent Pointer to the parent node in the shortest-path route d Shortest-path distance from some source aNode node Integer identifier of a given ending node eid Integer identifier of the edge weight Length of the edge crashlist List of point events (Crash structures) on the edge.

Adjlist data
Pointer to the starting node (sNode object) of an edge adjacent Pointer to the beginning of the list of all adjacent edges corresponding to the starting node Table 5: Components of the adjacency list data structure for storing events on a network.
At the top level, the entire network is accessed as a linked-list whose entries correspond to the nodes of the network. Figure 8 shows the data structure representing the network and events in Figure 6. The first column of the figure shows five linked-list entries corresponding to the five nodes in the network. For this top level linked-list, the data member of the ith list entry (i = 1, . . . , 5) points to an Adjlist object, which holds all the information related to the ith (starting) node v i and its outgoing edges. The third column in Figure 8 shows the use of sNode structures by Adjlist structures for storing information about the nodes of the network.
Columns four to six in Figure 8 show the use of the aNode structure in representing the adjacent nodes (corresponding to the weighted edges) of the network graph in Figure 6. Due to space limitations, we have omitted eid from the aNode structure in the figure. The crashlist is again a linked-list of Crash structures (see Figure 7, right) containing information about each point event on a given edge. The members seg and tp of the Crash structure are described in Table 4.

Time complexity of creating the data structure
Recall that for a network with point events on it, m, n, and p are the numbers of nodes, segments, and events respectively. In order to build the data structure (in Figure 8)

Inter-event distances and the K-function
In this section, we first introduce further terminology related to a linear network and point pattern data, then describe the K-functions introduced by Okabe and Yamada (2001) and

Shortest path distance
A path between two points w and w on a linear network L is a sequence π = (w 0 , w 1 , . . . , w N ) of points joining w 0 = w to w N = w such that each line segment [w i , w i+1 ] (i = 0, . . . , N −1) is a subset of some edge of the network. For most purposes w 1 , . . . , w N −1 can be taken as nodes of the network. The length of the path is the sum of the step lengths, (π) = N i=1 w i −w i−1 . The shortest-path distance d L (w, w ) between w and w is the minimum of the lengths of all possible paths from w to w . If there are no paths from w to w (implying that the network is not path-connected) then we define d L (w, w ) = ∞.
The disc of radius r > 0, with center u ∈ L, is the set of all points v in the network that lie no more than a distance r from the location u, by the shortest path: Figure 9 gives the street network around the University of Chicago (Ang et al. 2012). Bold lines show the disc of radius 425 feet centered at the location marked by the open circle.
The perimeter c L (u, r) = {v ∈ L : d L (u, v) = r} is the set of points lying exactly r units away from u by the shortest path. The perimeter count is the number of points on the perimeter c L (u, r). Points contributing to the count m(u, r) are displayed as filled-circle in Figure 9. Note that a segment of b L (u, r) may terminate without contributing to m(u, r) if its terminal endpoint lies at a distance less than r from u. Two such cases are visible in Figure 9.
A subnetworkL of L is a linear network with a collection of line segmentsS ⊂ S and set of nodesṼ ⊆ V . If the networkL is path-connected (i.e., if any two points onL can be joined by a path insideL), we call it a connected subnetwork.

K-functions on a linear network
A standard tool for the analysis of point patterns in two-dimensional space is the well-known K-function introduced by Ripley (1977). Several counterparts and generalizations of this function have been defined for linear networks (Okabe and Yamada 2001;Boots and Okabe 2007;Ang et al. 2012).
Suppose we are given a linear network L with observed events x = {x 1 , . . . , x p }. Okabe and Yamada (2001) defined the (empirical) network K-function as As explained in Ang et al. (2012), a severe drawback of K net (r) is its dependence on the network geometry, even for a completely random point pattern. This makes it difficult to compare point patterns on different networks. Ang et al. (2012) proposed the following geometrically-corrected empirical network K-function: where the m-function in the denominator, defined in (3), is the weight to compensate for the network geometry. The empirical function K L (r) is given as an estimator of the theoretical K-function K L (r) (Ang et al. 2012, p. 598) and it is shown that K L (r) = r (r > 0) for a homogeneous Poisson process. This property can be used to compare any point pattern to a completely random point pattern. It is also permissible to compare K-functions obtained from different networks (Ang et al. 2012). Ang et al. (2012) also introduced a version of (5) for inhomogeneous point processes: whereλ(·) is some estimate of the spatially varying intensity function of the point process on L.
In the analysis of two-dimensional point patterns, it is standard practice to restrict the computation of K(r) to distances r that are less than a specified maximum r max . This is due to the fact that bias and variance increase dramatically with r and that there is usually a maximum expected range of spatial dependence (Baddeley et al. 2015, Chapter 7). The same statements are true on a linear network. For example, in the Western Australian accident data, any spatial dependence between accident events is unlikely to extend over more than a few kilometers. Accordingly, we adopt the same practice: our algorithms are designed to evaluate the K-functions (4)-(6) for 0 ≤ r ≤ r max , where r max is considerably smaller than , called the inradius of the network L. We show in Section 8 that this restriction substantially increases the computational efficiency.

Existing implementations of K-function
For computing the network K-functions (4)-(6), the R package spatstat at present offers the only open source capabilities, to our knowledge. The spatstat function linearK computes K net and K L when the arguments correction="none" and correction="Ang" are provided, respectively; the default value for the argument correction is "Ang". The function linearKinhom computes the inhomogeneous K-function K ih L when the intensity estimatesλ are provided. See Baddeley et al. (2015, Chapter 17). Figure 10 shows the homogeneous and inhomogeneous K-functions computed and plotted by the following R code: R> library("spatstat.Knet") R> X <-unmark(chicago) R> plot(linearK(X)) R> fit <-lppm(X~polynom(x, y, 2)) R> plot(linearKinhom(X, fit)) At the time of writing, linear networks in spatstat are represented using adjacency matrices, and a matrix-based algorithm, which we refer to as Algorithm M, is used to compute the network K-functions (4)-(6). This is described in Section 4.

Algorithm M, using adjacency matrix
If the data set is small, or computer memory is unlimited, then it is possible to use the adjacency matrix representation of the network (Section 2.2) and a relatively straightforward algorithm can be used to compute second-order summary statistics of the form (1). This approach is followed in the spatstat package at the time of writing (spatstat versions 1.23-0 to 1.51-0). In this section, we give details of this approach for computing the geometricallycorrected K-function (5). Algorithms for (4) and (6) can be obtained as simple modifications.

Algorithm M specification
The algorithm is described in four sequential steps (M1 )-(M4 ).

(M1) Shortest-path distances between nodes
The first step is to compute the matrix of shortest-path distances d ij = d L (v i , v j ) between each pair of nodes. Recall that the adjacency matrix (a ij ) has entries a ij = v i − v j , if v i and v j are joined by an edge and a ij = ∞, otherwise. The algorithm initializes d ij = a ij , then iteratively applies the update where v ∼ v denotes a pair of nodes joined by an edge. This update is similar to the 'relaxation' step used in the famous Dijkstra shortest-path algorithm (Gallo and Pallottino 1988). This iterative procedure finishes after a finite time, giving the matrix (d ij ). Careful coding is needed to avoid numerical error associated with floating-point comparisons, which could otherwise cause the iterations to continue indefinitely.

(M2) Shortest-path distances between events
Next, for each pair (x i , x j ) of events, the algorithm computes the shortest-path distance h ij = d L (x i , x j ) as follows. We identify the segments s i and s j containing x i and x j , respectively. If s i = s j then h ij = x i − x j . Otherwise, we find the endpoints v, v of s i and the endpoints w, w of s j , and then compute where are the path lengths of the shortest paths from x i to x j passing through the specified endpoints.

(M3) Counting points on the disc perimeter
The shortest-path distance matrix h ij (i, j = 1, . . . , p) described above provides the necessary data for the network K-function (4). The geometrically corrected function (5) requires additional computation of the weighting factors m ij = m(x i , h ij ) for each pair (i, j). For any arbitrary point u on the network and a given distance r, the steps to compute m(u, r) are as follows: Figure 11: The bold lines give part of the disc b L (u, r) relevant to scenarios 3(a) (left) and 3(b) (right) of step M3 for counting perimeter points.
1. Find the endpoints v, v of the segment containing u.
2. For each node v k compute the shortest-path distance from u, Since v crosses the perimeter of the disc, and contains one perimeter point. This is illustrated in the left panel of Figure 11. The bold line in the figure gives a part of b L (u, r) where t k < r, v k ∼ v k , and t k > r. This gives one perimeter point on Then [v k , v k ] contains 0, 1 or 2 perimeter points according as c < 0, c = 0 or c > 0 respectively. This is illustrated in the right panel of Figure 11. The bold line in the figure gives a part of b L (u, r) where t k < r, v k ∼ v k , t k < r. In this case c > 0, giving two perimeter points on [v k , v k ].

(M4) Computation of K-functions
The final step is to compute the Okabe-Yamada network K-function K net defined in (4) or the geometrically corrected K-function K L defined in (5). For K net , we simply compute all the pairwise shortest-path distances h ij , form a histogram, compute the cumulative distribution function of the distances, and renormalize to obtain (4). For K L , we compute pairs (h, w) of distances h = h ij and corresponding weights w = 1/m(x i , h ij ), compute the weighted histogram, cumulate and renormalize to obtain (5).

Implementation in spatstat
Algorithm M was implemented in the R package spatstat. The initial implementation of steps M1 -M4 was coded in the R language using the basic facilities for matrix operations. This is very convenient for testing and cross-testing purposes, but is slow and memory-hungry. The algorithm was later accelerated by re-coding the update (step M1 ) and the perimeter-counting rule (step M3 ) in C to achieve the speeds reported in Table 1.
At the time of writing, spatstat represents a linear network using an adjacency matrix which may be either sparse or full (using the Matrix package of Bates and Maechler (2019) for sparse matrices). The sparse representation is memory efficient, even for the Western Australian data. However, the Algorithm M requires computation of the full matrix (d ij ) of distances between all pairs of nodes. Hence the fundamental limitation of this algorithm is the O(m 2 ) storage requirement.
In Section 5 and 6, we develop a new algorithm for computing summary statistics of the general form (1), in particular (4)-(6).

Tree structures for the new algorithm
As explained in Section 4, Algorithm M computes the shortest-path distances between events by first computing the matrix (d ij ) of pairwise distances between nodes. The new algorithm, which we refer to as Algorithm L, avoids this memory-intensive task of computing the distance matrix (d ij ); instead, it computes the shortest-path distances from x i (i = 1, . . . , p) to other events in the network by performing a network search starting at the source point x i . Searching a graph is a standard procedure, which is generally accomplished by constructing a tree from the source point, taking it as the root node (Cormen et al. 2009). In this section, we describe some important tree structures used in developing Algorithm L. A pseudocode for the Algorithm L is presented in the Section 6.
The adjacency list algorithm divides the problem of computing a network statistic of the general form (1) into sub-problems. In order to develop this idea for the network K-function K L defined in (5), we use the concept of the local K-function (Getis and Franklin 1987;Anselin 1995, Baddeley et al. 2015 adapted to linear networks. The network K-function can be decomposed as where are the local K-functions. Hence the computation of K L reduces to the computation of p local K-functions. Here we introduce some important graphical structures for the new algorithm. For the computation of local K-functions in (10), we need the notions of "breakpoint", "shortest-path tree", and "extended shortest-path tree" introduced by Okabe and Yamada (2001). The next three subsections give, respectively, a method of constructing an extended shortest-path tree (using a shortest-path tree and breakpoints) based on a subnetwork, a procedure for con- structing such a subnetwork from the original network, and the time complexity analysis of this procedure.

Shortest-path tree, breakpoints, and extended shortest-path tree
For a given starting point u ∈ L, a breakpoint (corresponding to u) is defined as a point v ∈ L for which the shortest path from u to v is not unique, i.e., there exist two different paths from u to v which achieve the minimum possible path length. There can only be finitely many breakpoints for any starting point u ∈ L.
Let L * u be the subset of L formed by the union of all segments which do not contain a breakpoint corresponding to u. Then L * u is a directed weighted graph without any loops, and is equivalent to a tree rooted at u, called the shortest-path tree. A data structure representing the shortest-path tree L * u can be built by first including all the nodes in the network along with the additional node representing the root u of the tree and then recursively adding adjacent edges to an adjacency-list data structure. Figure 12 gives a topologically equivalent representation of the network in Figure 6, and the bold lines in it represent the shortest-path tree rooted at the point event x 1 . The segment [v 3 , v 4 ] contains the breakpoint b (3,4) , so this segment is not included in the shortest-path tree.
When there is a breakpoint corresponding to some u ∈ L, an extended shortest-path tree can be constructed as follows. For each segment s = [v i , v j ] that contains a breakpoint b (i,j) , two new nodes b (i,j) , b (i,j) are created with the same spatial coordinates as b (i,j) but are treated as distinct nodes. Then the two weighted edges corresponding to s are replaced by the two new edges ( j) ), which are treated as having no common intersection. These edges are added to the shortest-path tree L * u by inserting b (i,j) and b (i,j) in the adjacency lists of v i and v j , respectively. Continuing the process for all breakpoints of u, the final tree, denoted as L * * u , thus obtained is called the extended shortest-path tree Figure 13: Extended shortest-path tree from point event x 1 in Figure 6. Six-pointed stars represent terminal nodes, which are duplicates of the breakpoints. Solid black lines represent the shortest-path tree in Figure 12.
To compute the local K-functions (10), first an extended shortest-path tree, L * * x i , needs to be constructed from x i for i = 1, . . . , p. Then, K L (x i , r) can be computed for any r ≤ t x i (L * * x i ), where for a connected subnetworkL(⊂ L) and u ∈L, t u (L) is defined by Note that t x i (L * * x i ) (denoted by t * x i hereafter) is the maximum possible shortest-path distance from x i to any point on L * * x i . However, as discussed in Section 3.2, in the case of a large network, the computation of K L (x i , r) is restricted to distances r < r max , for a prespecified value r max < min{t * x 1 , . . . , t * xp }.

Local subnetwork
Since the local K-functions K L (x i , r) (i = 1, . . . , p) are restricted to r ∈ [0, r max ), we do not require to construct the extended shortest-path trees based on the entire network L.
Here we note a straightforward but important fact that |L * * Hence, for the computation of the local K-function K L (x i , r) (0 ≤ r < r max ), it is sufficient to consider an extended shortest-path tree based on any connected subnetworkL such that b L (x i , r max ) ⊂L.
Accordingly, for each i = 1, . . . , p, we construct a local connected subnetwork L(x i , r max ) corresponding to x i as follows. First, we insert all the nodes that are within a distance r max from x i by the shortest path. Next, we insert all the edges connected to these nodes and all the nodes corresponding to the endpoints of these edges (if not already inserted). It can be shown that L(x i , r max ) is the smallest connected subnetwork of L with the property b L (x i , r max ) ⊂ L(x i , r max ). The use of these local subnetworks in the construction of the extended shortest-path trees greatly reduces the computing time for the K-function. This is evident in Figure 14 which plots the computation time of the K-function K L (r) against different r max values for the Western Australian network data.

Procedure for constructing local subnetwork
The following algorithm to construct L(x, r max ) corresponding to a point event x ∈ x is a modification of Dijkstra's shortest-path algorithm (Gallo and Pallottino 1988).

Procedure 1 (Local subnetwork construction).
Input data: Network-graph L and the point event x.

Identify the segment
2. Insert x as the (m + 1)th node in L, where m is the number of nodes in L.

Create an empty graph L(x).
6. Assign a distance value to d in the sNode structure of every node in L: set it to zero for x and infinity for all other nodes. Label all the nodes with color white signifying they are all unvisited. Assign the parent member of every node a label equal to NULL.
, and assign node v to the parent component of v adj .

Restore L to its original form after it was changed in Steps 2-4.
When Procedure 1 terminates, the empty graph L(x) becomes L(x, r max ), and every node that are within r max distance from x in this subnetwork stores its shortest-path distance from x and reference to its parent node in the shortest-path route. The shortest-path distances from x to these nodes in L(x, r max ) play a very important role in the computation of the m-function (3), the denominator in (10).

Time complexity in computing local subnetwork
Let m x denote the number of nodes that are within a distance r max from x by shortest path and n x denote the number of edges in L(x, r max ). At the center of Procedure 1 is a single conditional loop (Steps 7-11 ) that iterates m x times, once for each of the m x nodes. The conditional loop is originally set to iterate over all m nodes in the network L with the condition given in Step 8 of the Procedure 1 to exit the loop. Because the nodes that we encounter after first m x iterations all have shortest-path distances (from x) more than r max , the loop terminates after m x iterations.
Each iteration starts by selecting the node with the smallest shortest-path estimate among the nodes labeled as white, and the node is selected by traversing through m nodes in L and checking their distance estimates. This part of the iteration is O(mm x ).
Next, we visit the nodes adjacent to the selected node. As we visit each adjacent node, we update the distance estimates d and the parent label parent of the adjacent node (Step 11 ). The update process for an adjacent node v adj of v requires the distance estimate corresponding to v adj . It is obtained by going through m nodes in the node-list. For all the m x nodes, we go through the node-list n x times, once for each of the edges in L(x, r max ). Consequently, this part of the iteration is O(mn x ). Therefore, the main conditional loop overall is O(m(m x +n x )).

Algorithm L, using adjacency list
In this section, we describe Algorithm L for computing the geometrically-corrected K-function (5) based on the adjacency list structure in Figure 8 and the concepts developed in Section 5.
Implementations for (4) and (6) are simple modifications of this algorithm. One feature of our algorithm is that it computes K L (r) on a finite grid of distance values 0 ≤ r 1 , . . . , r l ≤ r max without additional computational cost. This efficiency is achieved by computing interval sums (defined in (12)), which are related to the local K-functions, for every point event in the network; details are in Section 6.1.
After constructing the extended shortest-path tree from the root node x i , two remaining steps in the computation of K L (r) are: (1) computing the weights 1/m(x i , d L (x i , x j )) corresponding to the neighboring point events x j of x i , and (2) computing the interval sums for a specified grid of distance values by performing a search from the root node x i . Section 6.2 outlines the computational details of m(x, r), required in (1) for x ∈ x, and Section 6.3 provides a depth-first search algorithm (Cormen et al. 2009;Even 1979;Tarjan 1983) for computing the interval sums in (2). The pseudocode for Algorithm L is given in Section 6.4.

Interval sums
Although K L (r) is a function of a continuous argument r, in practice we compute it on a finite grid with stepsize ε, obtaining K L (r j ), where r j = jε, for j = 0, . . . , l; l = r max /ε is the largest integer less than or equal to r max /ε.
For x ∈ x and j = 1, . . . , l, let be the interval sums corresponding to x. In what follows we assume I 0 (x) = 0 for all x ∈ x, corresponding to locations without multiple events. Then the local K-function and the Kfunction can be expressed as for i = 1, . . . , l.
An alternative way of expressing the K-function, by interchanging the sums in (14), is An intuitive way to compute the K-function is to use (14) by first computing the local Kfunctions using (13). However, it is more efficient to do this by summing the computed values Figure 15: Top: the extended shortest-path tree from point event x 1 in Figure 6, restricted to r max = 5.5. Below: the function m(x 1 , r) for computing local K-function of x 1 .
of x∈x I j (x) and using (15). We first compute the function m(x, r) and then compute I j (x) using a depth-first search algorithm.

Computation of perimeter count m(x, r)
We start with some notation used in the rest of this section. Let L * * x (r max ) denote the extended shortest-path tree that is constructed from the subnetwork L(x, r max ) defined in Section 5.2. Let V x (r max ) denote the set of nodes that are within r max distance from x by shortest path. Let V * x denote the set of all nodes in L * * x (r max ), except the nodes that do not have any adjacent node in V x (r max ).
It is easy to verify that V x (r max ) ⊂ V * x , and for any v ∈ V * x , d L (x, v) > r max if and only if v ∼ v for some v ∈ V x (r max ). Note that, V * x may not include all breakpoints in L * * x (r max ), e.g., in Figure 15 the extended shortest path tree L * * x 1 (r max ), for a given r max = 5.5, does not include the breakpoint b (3,4) .

Efficient Code for Analyzing Events on a Linear Network
To compute m(x, r), we first order the shortest-path distances based on the non-decreasing values of d L (x, v), v ∈ V * x . The sorted list of distances is given by  [j] denote the degree of the node v [j] , for j = 1, . . . , m * x . Then, for a given r determine j such that d [j] ≤ r < d [j+1] , and compute m(x, r) as The m-function is a step-function with possible jumps at distinct d [j] values. If there is a tie, e.g., d [j] = d [j+1] , the value of the m-function remains unchanged between d [j−1] and d [j+1] . j+k+1] for some j ≥ 2 and k ≥ 2, we have The m-function m(x, r) is stored in the computer memory using two arrays dval and mval of Therefore, for a given r, we have,

Depth-first search algorithm
Here we present a variant of the classical depth-first search algorithm (Tarzan 1972) for computing the interval sums I j (x) (j = 1, . . . , l) in (12) for a given x ∈ x. When searching a tree, the most recently visited node, say v, is called the current node, and its adjacent nodes away from the root node are the children nodes, with v as their parent node.
A depth-first search begins at the root node x, taking it as the current node, in the extended shortest-path tree L * * x (r max ). The search then explores an unvisited outgoing edge (a child node) of the current node and then updates that child node as the current node. When there are no unvisited outgoing edge from the current node, the search backtracks to the parent node, thereby updating the parent node as the current node. The search finishes when there are no more unvisited outgoing edges from x. In our variation, we backtrack from a node, without exploring outgoing edges from the node, if the shortest-path distance from x to that node is greater than r max and the shortest distance to its parent node from x is less than r max .
We have implemented the depth-first search algorithm in a recursive C function. To give an overview of our implementation, below we provide a pseudofunction intervalSum outlining the steps involved in computing the interval sums (12). The pseudofunction has six arguments, which are described in Table 6.

Check if there exists any adjacent node xAdj of x.
(a) IF (xAdj == NULL), then EXIT.  (G, D, M, xAdj, remainingDistance, K, r) iii. Go to the next adjacent node, and assign it to xAdj.

EXIT
The function intervalSum should be executed with arguments dist = r max and K equal to 0, an l-length array of zeros. The recursive calls of intervalSum update the array K upon The extended shortest-path tree is constructed for a given r max = 5.5. For this example, m * x 1 = 7, and the ordered distances are d [7] = 7.0, and the nodes corresponding to these ordered distances are The graph in Figure 15 gives a plot of the step-function corresponding to m(x 1 , r), whose computation is explained below using (17).
Let r 1 = 0.5, r 2 = 1.0, . . . , r 10 = 5.0 be the values of r for which the interval sums will be computed. Accordingly, we set the initial value for Isum as the zero vector of length 10. Then, the depth-first search routine as explained in Procedure 2 works as follows. 10. Finally, the search track backs to x 1 , and since there present no unexplored outgoing edges of x 1 , the algorithm terminates.
When the above search finishes, the jth element of the array Isum is equal to the interval sum I j (x 1 ) for j = 1, . . . , 10.

Execution time against the number of point events
The computational time of both algorithms M and L crucially depends on p, the number of events observed on the network. Simulation experiments in this section suggest that the computational times of algorithms M and L are, respectively, quadratic and linear functions of p (see Figure 16a). A heuristic behind these timings is that Algorithm M computes p(p − 1)/2 shortest-path distances for all distinct event pairs, whereas Algorithm L only iterates over p times (see Step 2 of Algorithm L).
Figure 16a compares execution times (in seconds) for the Algorithm M and the Algorithm L as a function of p. For this plot, first, we generated independent, uniformly-distributed random points on the chicago street network for p = 1000, . . . , 10000, and then for each simulated data, we recorded the execution times of both the algorithms, taken for computing K L . The plot confirms that our proposed algorithm's computation time is linear in p, while that of the adjacency matrix algorithm is quadratic in p. The other drawback of the Algorithm M is that the R-program breaks down, even for a small network such as the chicago network, due to memory allocation problem when the number of points are greater than or equal to 11, 000. Although the adjacency matrix algorithm has satisfactory performance for small data sets, it is not feasible for larger data sets such as the Western Australian road network in Figure 1. Not only is the time complexity of order p 2 , but also the memory storage requirement of Algorithm M is prohibitive, as discussed in Section 2.2. Based on the adjacency list structure in Figure 8 and the Algorithm L in Section 6.4, we created R interfaces for computing the summary statistics (4)-(6). This implementation is memory efficient; it can store and analyze large networks with more than 10 5 nodes and edges on a PC with only 8 Gb RAM.
The execution time of Algorithm L also depends on the chosen value of r max . However, the linear time complexity of the algorithm with respect to p holds true for any choice of r max . Figure 16b plots the computation times of the network K-function (5) against the number of (a) (b) Figure 16: (a) Execution times of the adjacency matrix and adjacency list algorithms, plotted against the number of point events used to compute K L (r) on the chicago street network. (b) Execution times of the adjacency list algorithm for computing K L (r) (on the chicago network) corresponding to different r max values. In both cases, uniform random points were generated on the network.
points on the chicago network for different choices of r max . It is evident from the plot that the choice of r max only affects the slope of the linear relationship.
To evaluate performance on real data, we used the large Western Australian road network shown in Figure 1, and generated data sets with different numbers of point events by sampling without replacement from the 2011 road accident record. Figure 17 plots the timings (in minutes) of computing K L (r) for the data sets with p = 1000, 3000, . . . , 13000. As expected, the plot shows an approximately linear relationship between the execution time and the number of points.

Western Australia road network and road accident data
Here we compute the homogeneous and inhomogeneous network K-functions using Algorithm L for the road accident data set shown in Figure 1. The geometrically-corrected homogeneous K-function K L (r), equation (5), for the Western Australian accident pattern is shown in Figure 18 (left panel). The summary function K L (r) is often compared with the theoretical K-function of the Poisson process (K L (r) = r) to assess whether the distribution of point pattern is different from a completely random point pattern.
In Figure 18, the large difference between the empirical K-function K L (r) and the theoretical benchmark value (dashed line) suggests a departure from the completely random point process model. One can use K L (r) to formally test whether the accident pattern exhibits clustering, using a one-sided Monte Carlo test based on simulation envelopes (Baddeley et al. 2015, Chapter 10). The following lines of code can be used to compute and plot the homogeneous K-function in Figure 18.
Although inference based on the homogeneous K-function is straightforward, its computation for the accident pattern assumes that the accident rate is constant across the entire road network. This assumption is clearly fallacious in this case, as the spatially varying accident rates are visible in Figure 1. If the underlying point process is inhomogeneous with a spatially varying intensity function λ(u), u ∈ L, the inhomogeneous K-function K ih L (r) defined in (6) is typically used to examine the second-order properties, such as clustering or interactions amongst points, of the pattern (Ang et al. 2012;Baddeley, Møller, and Waagepetersen 2000). This function adjusts for the varying intensity by using the intensity estimates at the event locations as weights for the estimator K ih L (r). The accuracy of estimation of K ih L depends on how well we estimate the intensity function λ(u). In case of two-dimensional point patterns, kernel smoothing is a standard nonparametric technique for estimating the intensity function. However, kernel smoothing is time-consuming on a linear network due to its non-homogeneous spatial structure and complex boundary configurations at different locations (McSwiggan et al. 2016). We estimated the spatially varying accident rates using fixed and variable bandwidth smoothing methods for point patterns on a linear network. Details on these methods will be given in a sequel paper . Figures 19a and 19b show, respectively, heatmaps of the fixed and variable bandwidth intensity estimates (with color map on a logarithmic scale) on the Western Australian road network. The estimated intensity values are provided in supplementary files waCrashIntensity.rda and waCrashIntensityAdaptive.rda, respectively. The following code plots both the heatmaps.
because it over-smooths the densely populated parts of the network while producing intensity estimates close to zero for network parts that are sparsely populated. This is evident from the over-smoothing of the densely-populated western part of the state and from several missing pixels in Figure 19a, appearing in the sparsely populated north western part of the network, which corresponds to intensity estimates very close to zero. The variable bandwidth intensity estimates in Figure 19b are computed after adjusting for the underlying density of the point events -relatively large bandwidths are used in the sparsely populated areas than the dense areas. This reduces the over-smoothed nature of the heatmap and decreases the number of zero-valued pixels.
If we contrast the two inhomogeneous K-functions in Figure 18, we observe that, although both plots suggest some form of clustering in the accident pattern, the K-function with the adaptively smoothed intensity estimates reveals a lesser degree of clustering than its counterpart with fixed bandwidth estimates. However, this is expected as the adaptive smoothing based on variable bandwidth provide better estimates of the underlying intensity function than the fixed bandwidth estimates.

Discussion
This paper examined two general approaches to the computation of statistical summaries of events on a network. An approach based on the incidence matrix (Algorithm M) is straightforward to implement, and quite fast to execute, but is severely limited by its very large memory requirements. The adjacency matrix is wasteful because the vast majority of entries are zero. A sparse matrix representation reduces storage requirements but is not efficient for graph topology operations.
The alternative approach using adjacency lists (Algorithm L) results in substantial memory savings. This is evident from Table 1, which gives a comparison of memory usage between these two implementations of the K-function in (5) for three different network data sets available in the R package spatstat (Baddeley and Turner 2005). This efficient use of memory allows Algorithm L to be applied to very large networks, such as the entire road network of Western Australia. Algorithm L also lends itself easily to the calculation of other quantities such as the local K-functions.
Although Algorithm L adapts some ideas from Okabe and Yamada (2001), a direct comparison between our implementation and that of Okabe and Yamada (2001) does not seem appropriate. The latter computes the uncorrected, unweighted K-function (4), whereas our implementation is designed to compute any summary function of the general form (1) with special emphasis on the geometrically-corrected empirical K function (5).
Many improvements are possible. In very large and complex networks it might be more efficient to use quad-trees or other geometric hashing methods to divide the network into manageable pieces (cf. Okabe and Sugihara 2012, Chapter 3). Furthermore, it would be possible to improve parts of Algorithm L by using a priority queue. The operation of extracting the minimum value from a priority queue is O(1), and maintaining the heap property of the priority queue is O(log M ), where M is the number of elements in the queue.
The representation of a road network as a graph is a substantial over-simplification of the real physical network (Okabe and Sugihara 2012). Roads have complicated geometry including curvature and camber, multiple lanes, complicated intersections, overpasses, and structures which separate different lanes. The analysis of road accidents must take into account many of these covariates associated with the road network. The aNode components could easily be extended in order to make these covariates accessible in the adjacency-list data structure.

Supplementary materials
The supplementary materials provide the source code of our implementation of Algorithm L; the data analyzed in the paper; and R scripts needed to reproduce our results.
Computation times reported in the paper were measured on a 2.67 GHz Windows laptop with 8 Gb RAM.
The R package spatstat.Knet contains our implementation of Algorithm L. The implementation is written in C with an R interface through functions named Knet and Knetinhom. The spatstat.Knet package also contains the point pattern data sets analyzed in the paper, and the two estimated intensity functions depicted in Figures 19a and 19b.
The file v90i01.R is a stand-alone R script for reproducing all the results and figures in the paper.