Beyond non-backtracking: non-cycling network centrality measures

Walks around a graph are studied in a wide range of fields, from graph theory and stochastic analysis to theoretical computer science and physics. In many cases it is of interest to focus on non-backtracking walks; those that do not immediately revisit their previous location. In the network science context, imposing a non-backtracking constraint on traditional walk-based node centrality measures is known to offer tangible benefits. Here, we use the Hashimoto matrix construction to characterize, generalize and study such non-backtracking centrality measures. We then devise a recursive extension that systematically removes triangles, squares and, generally, all cycles up to a given length. By characterizing the spectral radius of appropriate matrix power series, we explore how the universality results on the limiting behaviour of classical walk-based centrality measures extend to these non-cycling cases. We also demonstrate that the new recursive construction gives rise to practical centrality measures that can be applied to large-scale networks.

A key novelty in our work is to extend the concept of non-backtracking to non-triangulating, non-squaring and, more generally, the avoidance of all cycles up to a given length. To make the idea practical, we develop a recursive extension of the Hashimoto matrix construction that allows the required quantities to be computed via matrix powering and projection. We study theoretical properties of the resulting network centrality measures and show that they can be applied to large-scale datasets. Because the basic Hashimoto matrix construction is not standard in graph theory, and has been derived from different viewpoints in other fields, we give in §1a a simple motivating illustration. This allows us to explain the notation and set up the main combinatorial task.
(a) Illustration
Figure 1 shows an undirected, unweighted graph with five nodes. It is convenient to regard each undirected edge as a reciprocal pair of directed edges. We write i → j to denote the directed edge from node i to node j, so, for example, the connection between nodes 1 and 2 in figure 1 gives rise to 1 → 2 and 2 → 1. The adjacency matrix for this graph has the form
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \end{bmatrix}.$$
Here, for example, $a_{1,2} = 1$ because there is an edge 1 → 2 and $a_{1,3} = 0$ because there is no edge 1 → 3.
A walk around a graph is any route from node to node that makes use of the available edges. The adjacency matrix provides a convenient way to count walks. For example, the fourth power of A has its (1, 5) entry equal to 6 because there are six distinct walks of length four (that is, using four edges) starting at node 1 and finishing at node 5: these are 1 → 2 → 1 → 2 → 5, 1 → 2 → 3 → 2 → 5, 1 → 2 → 3 → 4 → 5, 1 → 2 → 4 → 2 → 5, 1 → 2 → 5 → 2 → 5 and 1 → 2 → 5 → 4 → 5. Generally, $(A^r)_{ij}$ counts the number of distinct walks of length r starting at node i and finishing at node j; see, for example, [26, theorem 2.2.1].
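The walk count quoted above can be checked numerically. The following is a minimal sketch, assuming the edge list of the five-node graph is the one implied by the walks listed in the text (figure 1 itself is not reproduced here); the names `EDGES` and `adjacency` are illustrative only.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the walks listed
# in the text (an assumption: figure 1 is not reproduced here).
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]

def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix for 1-based node labels."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = 1
        A[j - 1, i - 1] = 1
    return A

A = adjacency(EDGES, 5)

# Entry (1, 5) of A^4 counts walks of length four from node 1 to node 5.
A4 = np.linalg.matrix_power(A, 4)
walks_1_to_5 = int(A4[0, 4])
print(walks_1_to_5)
```

With this edge list the printed value agrees with the six walks enumerated in the text.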
An operation which is central to our work is the construction of the line graph [27]. Here, edges in the original graph are regarded as nodes in the corresponding line graph. Nodes i → j and k → l in this new line graph are connected if j = k, that is, if, together, they represent a walk of length two in the original graph. For illustration, we show in table 1 the entries in the adjacency matrix for the line graph of the graph in figure 1. Here we have chosen a specific ordering of the edges in the original graph. Zero entries have been left blank. We will denote this 12 × 12 matrix by W. Note that W is not symmetric; for example, $w_{1\to 2,\,2\to 3} = 1$ but $w_{2\to 3,\,1\to 2} = 0$. Essentially, W encodes the presence of walks of length two in the original graph. Its second power, $W^2$, then counts walks of length three. For example, by definition, $(W^2)_{1\to 2,\,3\to 4} = \sum_{a\to b} w_{1\to 2,\,a\to b}\, w_{a\to b,\,3\to 4}$, which reduces to 1 because the only nonzero product in the sum arises from $w_{1\to 2,\,2\to 3}\, w_{2\to 3,\,3\to 4}$, corresponding to the walk 1 → 2 → 3 → 4 in the original graph. Similarly, $W^3$ counts walks of length four in the original graph. For example, $(W^3)_{1\to 2,\,4\to 2} = \sum_{a\to b}\sum_{c\to d} w_{1\to 2,\,a\to b}\, w_{a\to b,\,c\to d}\, w_{c\to d,\,4\to 2}$, which equals 2 because of the existence of the two walks 1 → 2 → 3 → 4 → 2 and 1 → 2 → 5 → 4 → 2 in the original graph.
Generally, the rth power of W counts walks of length r + 1 in the original graph. Because of our choice of labelling, the second, third, fifth and seventh rows of $W^r$ record walks starting with an edge of the form 2 → ·, and the first, fourth, sixth and eighth columns record walks that end with an edge of the form · → 2. It follows that by taking linear combinations of the appropriate rows and columns we can recover the node-based counts for walks starting at node 2 and finishing at node 2. Similar remarks apply for all nodes, and hence an appropriate linear projection of $W^r$ recovers all the walk count information in $A^{r+1}$. This fact is formalized in part (i) of proposition 2.4.
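The line graph entries discussed above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the edge list is inferred from the walks in the text, and the directed edges are ordered lexicographically, which need not match the ordering used in table 1 (so row and column positions may differ, while the entries indexed by edge labels do not). The names `DIRECTED`, `INDEX` and `edge_matrix` are illustrative.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]

# Each undirected edge becomes a reciprocal pair of directed edges.
# Lexicographic ordering is an arbitrary choice for this sketch.
DIRECTED = sorted(EDGES + [(j, i) for i, j in EDGES])
INDEX = {edge: pos for pos, edge in enumerate(DIRECTED)}

def edge_matrix(directed):
    """W[e, f] = 1 when the target of edge e is the source of edge f."""
    m = len(directed)
    W = np.zeros((m, m), dtype=int)
    for e, (_, target) in enumerate(directed):
        for f, (source, _) in enumerate(directed):
            if target == source:
                W[e, f] = 1
    return W

W = edge_matrix(DIRECTED)
W2 = W @ W
W3 = W2 @ W

# Entries discussed in the text, addressed by edge label.
w2_entry = int(W2[INDEX[(1, 2)], INDEX[(3, 4)]])
w3_entry = int(W3[INDEX[(1, 2)], INDEX[(4, 2)]])

# Node-based count: sum the rows of edges leaving node 2 and the columns
# of edges entering node 2 to count closed walks of length three at node 2.
rows = [INDEX[e] for e in DIRECTED if e[0] == 2]
cols = [INDEX[e] for e in DIRECTED if e[1] == 2]
closed_walks_at_2 = int(W2[np.ix_(rows, cols)].sum())
```

The last quantity is the row-and-column aggregation described in the paragraph above, carried out for node 2.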
From the perspective of this work, a major benefit of the line graph setting is that we may modify the adjacency matrix W in a way that allows us to count only non-backtracking walks; that is, walks which never leave a node and then immediately return to it. In table 1 the starred entries represent reciprocated pairs of edges, such as 1 → 2 and 2 → 1. Replacing all such entries by zero, thereby creating the Hashimoto matrix or non-backtracking matrix [28] and calling this new matrix B, it follows that powers of B will automatically count non-backtracking walks, and the same projection method gives node-based results; see part (ii) of proposition 2.4.

The remainder of the manuscript is organized as follows. In §2 we set up the full notation, discuss relevant network centrality measures and describe the benefits that have been found to arise when non-backtracking is introduced. Section 3 then exploits the Hashimoto matrix approach in order to characterize non-backtracking centrality measures based on general Taylor series expansions. For such measures, it is of interest to characterize universality behaviour arising at the radius of convergence, and in §4 we study this issue. In §5 we then develop and analyse a recursive strategy that promotes non-backtracking into non-triangulating, non-squaring and, generally, the removal of all cycles up to a given length. Having derived the new construction, we consider computational complexity issues and then analyse the universality behaviour. Section 6 gives the results of computational experiments that illustrate the feasibility of non-cycling centrality measures on real networks.

Preliminary material
Our fundamental object of study is an undirected graph. However, as illustrated in §1a, the operations that we apply will typically generate new directed graphs. Hence, we give definitions for the general case of a directed graph G = (V, E), with unweighted edges, and no self-loops or multiple edges. We denote by n the number of nodes and by m the number of edges.
Remark 2.1. For undirected graphs, we interpret each undirected edge i − j as a pair of directed edges i → j and j → i, and we denote by m the total number of such directed edges.
The graph G can be represented by means of its adjacency matrix $A = (a_{ij}) \in \mathbb{R}^{n\times n}$, whose nonzero entries are $a_{ij} = 1$ if and only if $i \to j \in E$. The matrix A thus contains m nonzeros, one for each edge in the graph. We use I, $\mathbf{1}$ and $\mathbf{0}$ to denote the identity matrix, the vector of ones and the vector of zeros, respectively, and a subscript will indicate the dimension where this is not obvious. For every edge $i \to j \in E$ we will call i the source node of the edge and j the target node of the edge. The edge j → i will be referred to as the reciprocal of edge i → j. The number $d_i^{\mathrm{out}}$ of edges originating from node i will be referred to as the out-degree of node i, while the number $d_i^{\mathrm{in}}$ of edges targeting node i will be referred to as the in-degree of node i. For undirected graphs, $d_i^{\mathrm{out}} = d_i^{\mathrm{in}} =: d_i$ for all nodes $i \in V$ and this common value is usually referred to as the degree of node i. A bold font denotes a vector, so $d_i$ is the ith component of $\mathbf{d}$.

A walk of length r is a sequence of r + 1 nodes $i_1, i_2, \dots, i_{r+1}$ such that $i_\ell \to i_{\ell+1} \in E$ for all $\ell = 1, \dots, r$. A walk is said to be backtracking if it uses a consecutive pair of reciprocal edges and non-backtracking otherwise. We will use the acronyms NBT and NBTW for non-backtracking and non-backtracking walk. A path is a walk with no repeated nodes, with the only possible exception of the first and last nodes. If these coincide the path is then called a cycle. We mentioned in §1a that the entries of the rth power of A record the number of walks of length r; that is, $(A^r)_{ij}$ is the number of distinct walks of length r from node i to node j, for all r = 0, 1, . . .. Following [24], we denote by $p_r(A)$ the non-backtracking analogue of $A^r$, so the (i, j)th entry of $p_r(A)$ contains the number of NBTWs of length r from node i to node j. We use the convention that $p_0(A) = I$.
It is readily seen that $p_1(A) = A$ and
$$p_2(A) = A^2 - D,$$
where D is the diagonal matrix with entries $D_{ii} = (A^2)_{ii}$; for an undirected graph, D is the diagonal matrix of node degrees. It has been proved [5,10] that for all r ≥ 0 the matrices $p_{r+3}(A)$ satisfy a four-term recurrence when $A^T \neq A$:
$$p_{r+3}(A) = A\,p_{r+2}(A) + (I - D)\,p_{r+1}(A) - (A - S)\,p_r(A), \qquad S = A \circ A^T,$$
where ∘ denotes the Schur (entrywise) product.

In the undirected setting, where $A = A^T$ and hence S = A, this reduces to a three-term recurrence [6]: for all r ≥ 1,
$$p_{r+2}(A) = A\,p_{r+1}(A) + (I - D)\,p_r(A). \tag{2.1}$$

As in the example of §1a, we denote by $W \in \mathbb{R}^{m\times m}$ the adjacency matrix whose entries are $W_{i\to j,\,\ell\to h} = \delta_{j\ell}$, where $\delta_{j\ell}$ is the Kronecker delta. So W represents a network with m nodes, each corresponding to an edge in G, and a connection exists between two nodes if the corresponding edges in G are such that the target node of the first coincides with the source node of the second, and the two edges thus form a walk of length two in G. We refer to W as the edge-matrix. If G is undirected, then W is the adjacency matrix of the line graph corresponding to G. Finally, we denote by $B \in \mathbb{R}^{m\times m}$ the non-backtracking version of W; that is, the adjacency matrix of the network obtained by connecting two of the m nodes, each corresponding to an edge in G, if and only if the corresponding two edges form a NBTW of length two in G. We note that $B = W - W \circ W^T$. We may also write
$$B_{i\to j,\,\ell\to h} = \delta_{j\ell}\,(1 - \delta_{ih}). \tag{2.2}$$
The matrix B is often referred to as the Hashimoto matrix [28] or the non-backtracking edge-matrix.
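To make the non-backtracking counts concrete, the sketch below compares two ways of computing $p_r(A)$ for an undirected graph: a three-term recurrence of the form $p_{r+1}(A) = A\,p_r(A) + (I - D)\,p_{r-1}(A)$ for r ≥ 2, started from $p_0 = I$, $p_1 = A$, $p_2 = A^2 - D$ (this form is an assumption of the sketch, with D the diagonal degree matrix), and a brute-force enumeration that never steps straight back to the previous node. The edge list is the one inferred from §1a; all function names are illustrative.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]

def adjacency(edges, n):
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1
    return A

def nbtw_recurrence(A, rmax):
    """Return [p_0(A), ..., p_rmax(A)] for symmetric A via the recurrence."""
    n = A.shape[0]
    I = np.eye(n, dtype=int)
    D = np.diag(A.sum(axis=1))
    p = [I, A.copy(), A @ A - D]
    for r in range(2, rmax):
        p.append(A @ p[r] + (I - D) @ p[r - 1])
    return p[: rmax + 1]

def nbtw_brute_force(A, r):
    """Count non-backtracking walks of length r by explicit enumeration."""
    n = A.shape[0]
    P = np.zeros((n, n), dtype=int)

    def extend(start, prev, cur, steps):
        if steps == r:
            P[start, cur] += 1
            return
        for nxt in np.flatnonzero(A[cur]):
            if nxt != prev:  # forbid an immediate return to the previous node
                extend(start, cur, int(nxt), steps + 1)

    for s in range(n):
        extend(s, -1, s, 0)
    return P

A = adjacency(EDGES, 5)
p = nbtw_recurrence(A, 6)
agree = all(np.array_equal(p[r], nbtw_brute_force(A, r)) for r in range(7))
print(agree, int(p[4][0, 4]))
```

Of the six walks of length four from node 1 to node 5 listed in §1a, only 1 → 2 → 3 → 4 → 5 avoids backtracking, which is what the (1, 5) entry of `p[4]` records.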

(a) Equivalent centrality vectors
A central issue in network science is to determine the most important players within the graph. This activity has applications in a wide range of areas, ranging from social science, marketing and politics to epidemiology and well-being [27,29]. The problem may be tackled using centrality measures. These functions, which are invariant under relabelling of the nodes in the graph, assign to each node a non-negative number that quantifies its importance: the higher the value, the more important the node. We take the standard viewpoint that the value assigned to each node is not interesting per se; we are concerned with the ranking that arises. It thus follows that two centrality vectors that assign different values to the nodes but induce the same ranking are equivalent. In this sense, it is worth pointing out that neither shifting a centrality measure by a uniform vector nor multiplying a centrality measure by a positive scalar changes the ranking of the nodes. We may, indeed, define an equivalence relation among centrality vectors as follows. Let $u, v \in \mathbb{R}^n$ be two non-negative, non-zero vectors. Then
$$u \sim v \iff u = \alpha v + \beta \mathbf{1} \quad \text{for some } \alpha > 0 \text{ and } \beta \in \mathbb{R}. \tag{2.3}$$
Two different representatives of the same equivalence class yield the same node ranking. We note in passing that one could consider a smaller collection of equivalence classes, e.g. two measures are equivalent if they induce the same ranking. However, for the purposes of this work, restricting our study to (2.3) suffices to compare the rankings induced by parametric matrix functions and those induced by their limits.
Many centrality measures have been introduced over the years. In this work we focus on the very broad class of walk-based centrality measures induced by functions [30-32]
$$f(z) = \sum_{r=0}^{\infty} c_r z^r \in \mathcal{P}, \tag{2.4}$$
where $\mathcal{P}$ is the set of functions analytic in a neighbourhood of zero that can be expressed as a Maclaurin series with non-negative coefficients $c_r$ for all r = 0, 1, . . . Clearly, $f(z) = e^z$ and $f(z) = (1 - z)^{-1}$ belong to $\mathcal{P}$. We will denote by $\rho_f$ the radius of convergence of the series f(z), which can be finite or infinite.
For a function f ∈ P defined on the spectrum of a matrix A, we will refer to (f (tA)) ii as the f -subgraph centrality of node i, and to (f (tA)1) i as the f -total communicability of node i. Here, t > 0 is a parameter that we are free to choose, with the constraint that the power series must converge. From the power series expansion of f and from the fact that powers of A count walks of given lengths, it follows that the f -subgraph centrality of a node measures how strongly each node is involved in closed walks of any length; similarly, its f -total node communicability measures how well this node communicates with all the nodes in the network. We note that in the classic Katz case, where f (z) = (1 − z) −1 , the parameter t represents an attenuation factor that downweights walks of length k by a factor t k [33], and hence, in a message-passing setting, t may be viewed as the probability of successfully traversing an edge.
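As an illustration of these definitions, the sketch below evaluates the f-subgraph centrality and f-total communicability for the two functions mentioned above, the resolvent $f(z) = (1 - z)^{-1}$ (Katz) and the exponential $f(z) = e^z$, on the five-node graph of §1a. The edge list is inferred from the text, the choice $t = 0.8/\rho(A)$ is arbitrary within the admissible range, and all variable names are illustrative.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]
N = 5
A = np.zeros((N, N))
for i, j in EDGES:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0

rho_A = max(abs(np.linalg.eigvalsh(A)))
t = 0.8 / rho_A  # arbitrary choice with t < 1/rho(A), so the Katz series converges

# Katz case: f(tA) = (I - tA)^{-1}.
resolvent = np.linalg.inv(np.eye(N) - t * A)
katz_subgraph = np.diag(resolvent).copy()   # (f(tA))_ii
katz_total = resolvent @ np.ones(N)         # (f(tA) 1)_i

# Exponential case, using the spectral decomposition of the symmetric A.
lam, Q = np.linalg.eigh(A)
exp_tA = (Q * np.exp(t * lam)) @ Q.T
exp_subgraph = np.diag(exp_tA).copy()
exp_total = exp_tA @ np.ones(N)

print(np.argsort(-katz_total) + 1)  # node labels in order of decreasing score
```

Only the induced ranking matters for the purposes of this section, so the scores can be rescaled or shifted freely before comparison.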
These concepts can be extended to the framework of NBTWs, by defining the NBT f-subgraph centrality and NBT f-total (node) communicability of node i as
$$x(t)_i = \Big(\sum_{r=0}^{\infty} c_r t^r p_r(A)\Big)_{ii} \quad\text{and}\quad y(t)_i = \Big(\sum_{r=0}^{\infty} c_r t^r p_r(A)\,\mathbf{1}\Big)_i, \tag{2.5}$$
respectively, for non-negative coefficients $c_r$. It is intuitively reasonable that eliminating backtracking walks, and hence focusing on traversals that explore the network more widely, should lead to an improved centrality measure in applications where a message-passing or disease-spreading analogy is relevant. In the case of random walks, it is known that NBTWs mix faster [9]. For NBT generalizations of Katz centrality, three concrete benefits have been identified:

- Localization: Suppose we have a family of non-negative unit Euclidean norm vectors $x \in \mathbb{R}^n$, defined for all large n. Then, from [34], the inverse participation ratio is defined to be $S(x) := \sum_{i=1}^{n} x_i^4$. The family of vectors is said to be localized if S = O(1) and non-localized if S = o(1), as n → ∞. Intuitively, localization implies that the majority of the mass in the vector is confined to a finite subset of components. When x is a centrality measure, localization corresponds to the undesirable circumstance where the algorithm has focused almost exclusively on a subset of the network, and does not give useful information about the relative importance of the majority of nodes. In this context, the effect was first highlighted in [25]. Numerical tests in [19,24] showed that a NBT version of the standard Katz algorithm [33] avoids localization effects observed for Katz on a range of real networks. Furthermore, rigorous asymptotic analysis on specific network classes backed up these results; for example, in [19] for a directed windmill network with an arbitrary number of blades.

- Range of parameter values: The classic Katz centrality measure [27,29,33] assigns the value $((I - tA)^{-1}\mathbf{1})_i$ to node i. To produce a well-defined non-negative measure, the downweighting parameter t must lie in the range 0 < t < 1/ρ(A), where ρ(·) denotes the spectral radius. For the NBT version $y_i(t)$ in (2.5) with $c_r \equiv 1$, it was shown in [24] that t must be chosen in the range 0 < t < 1/ρ(C), where
$$C = \begin{bmatrix} A & I - D \\ I & 0 \end{bmatrix}, \tag{2.6}$$
with D the diagonal matrix of node degrees. By construction, since the NBT count cannot exceed the standard walk count, this radius of convergence cannot be smaller than the upper limit 1/ρ(A) for Katz. In practice, the difference can be significant, and hence NBT Katz can support a much greater choice of downweighting parameters, allowing global features of the network to have a stronger influence on the measure.

- Pruning: From a practical viewpoint, it appears that ρ(C) in (2.6) must be computed, or approximated, in order to determine an appropriate range of t values. However, it was shown in [19] that ρ(C) does not change when leaves, source nodes and dangling nodes are removed. (The equivalent statement is not true for standard Katz, where ρ(A) is not invariant to such deletions.) Moreover, these operations can be performed recursively until no such nodes exist. On realistic networks, these low-cost pruning steps were found to reduce the typical network size by around 30%, making NBT Katz more efficient than standard Katz.
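The wider parameter range for NBT Katz can be seen on a small example. The sketch below rests on two assumptions that should be read as such: the block companion form $C = [A,\ I - D;\ I,\ 0]$ for the matrix in (2.6), and the closed form $\sum_r t^r p_r(A) = (1 - t^2)\,M(t)^{-1}$ with $M(t) = I - tA + t^2(D - I)$, the deformed graph Laplacian attributed to [24]. The closed form is compared against a truncated series built from a three-term recurrence for $t^r p_r(A)$; the graph is the five-node example of §1a with its edge list inferred from the text.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]
N = 5
A = np.zeros((N, N))
for i, j in EDGES:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
I = np.eye(N)
D = np.diag(A.sum(axis=1))

# Assumed companion form of the matrix C in (2.6).
C = np.block([[A, I - D], [I, np.zeros((N, N))]])
rho_A = max(abs(np.linalg.eigvals(A)))
rho_C = max(abs(np.linalg.eigvals(C)))

# A value of t that is admissible for NBT Katz on this graph.
t = 0.9 / rho_C
M = I - t * A + t**2 * (D - I)                      # deformed graph Laplacian
nbt_katz = (1 - t**2) * np.linalg.solve(M, np.ones(N))

def nbt_series(A, t, terms):
    """Truncated sum of q_r = t^r p_r(A), built with the scaled recurrence."""
    n = A.shape[0]
    I = np.eye(n)
    D = np.diag(A.sum(axis=1))
    q_prev, q = t * A, t**2 * (A @ A - D)           # q_1 and q_2
    total = I + q_prev + q
    for _ in range(terms):
        q_prev, q = q, t * (A @ q) + t**2 * ((I - D) @ q_prev)
        total += q
    return total

series = nbt_series(A, t, 800) @ np.ones(N)
print(1 / rho_A, 1 / rho_C)
```

On this graph the chosen t lies above 1/ρ(A), so the classic Katz series would diverge there while the NBT version remains well defined.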
It is also of interest to characterize the form of these centrality measures at their radius of convergence; for example, it was also shown in [19] that the NBT eigenvector approach proposed in [25] arises as the limiting case t → 1/ρ(C) − in NBT Katz.
Our aim here is to show how NBTW-based measures can be studied, and generalized, by working in the edge space and then projecting. We will show that this approach allows us to (1) unify and extend the current theory, yielding results that hold for any analytic function, (2) describe limiting behaviour at the radius of convergence, and (3) extend to the case where walks avoid triangles, squares and, generally, all cycles up to any fixed length.

(b) Source and target matrices
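We now collect together some results that allow us to perform projections from the edge space onto the node space. Many of these results can be found in disparate areas of the literature, especially for the case of undirected graphs; see [35,36]. We present and justify them here because they form the core of our analysis.

Definition 2.2 (Source and target matrices). Let G be a directed graph with n nodes and m edges. The source matrix $L \in \mathbb{R}^{m\times n}$ and the target matrix $R \in \mathbb{R}^{m\times n}$ are defined entrywise by $L_{ei} = 1$ if edge e has source node i and $L_{ei} = 0$ otherwise, and $R_{ej} = 1$ if edge e has target node j and $R_{ej} = 0$ otherwise.

Note that both L and R have precisely one nonzero element equal to 1 in every row. This identifies the source/target node of the corresponding edge; hence, $L\mathbf{1}_n = R\mathbf{1}_n = \mathbf{1}_m$. We also note in passing that, for directed graphs, the matrix L − R is an incidence matrix. Proposition 2.3 recalls some basic properties of the source and target matrices [35,36].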

Proposition 2.3. Let G be an unweighted, directed, graph with no self-loops nor multiple edges. Then, in the above notation, $L^T R = A$, $R L^T = W$, $L^T L$ is diagonal with the out-degrees on the diagonal, and $R^T R$ is diagonal with the in-degrees on the diagonal. If the network is undirected, then
Proof. The results can be proved entrywise from the definition of the source and target matrices. For example, to confirm the first equality we note that for all $i, j \in V$ it holds that $(L^T R)_{ij} = \sum_{e=1}^{m} L_{ei} R_{ej} = 1$ if and only if there is an edge from node i to node j, and is 0 otherwise. So $L^T R = A$.

The following proposition summarizes useful properties of the source and target matrices and, in particular, shows that they can be used to move from the edge space to the node space.

Proposition 2.4. Let G be an unweighted, possibly directed, graph with no self-loops nor multiple edges. Then
Proof.
(i) We proceed by induction. The result has been proved for r = 0 in proposition 2.3. Suppose that $L^T W^{r-1} R = A^r$ up to a certain r ≥ 1, then from proposition 2.
The result follows directly from proposition 2.3.
(v) For all $i, j \in V$ we have that $(R^T B L)_{ij} = \sum_{e=1}^{m}\sum_{f=1}^{m} R_{ei} B_{ef} L_{fj}$ counts the number of NBTWs of length two formed by edges e and f, and such that e targets node i and f originates from node j. Clearly, this sum is always zero, unless i = j, and in this case the sum equals the number of NBTWs of length two through node i.
In the next section, we describe how to exploit the matrices L and R to compute the NBTW generating function induced by any analytic function. We emphasize that our basic object of study in the remainder of this work is an undirected network, so that A = A T and the matrices p r (A) satisfy the recurrence (2.1), but directed networks will arise when we use the Hashimoto construction and its extensions.
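The identities of proposition 2.3 are easy to confirm numerically. The sketch below builds source and target indicator matrices for the five-node example of §1a, with one row per directed edge and a single 1 marking the source (for L) or target (for R) node, and then forms the products that appear in the proposition. The edge list is inferred from the text, the edge ordering is an arbitrary choice, and the helper name `source_target` is illustrative.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]
N = 5
DIRECTED = sorted(EDGES + [(j, i) for i, j in EDGES])

def source_target(directed, n):
    """Row e of L (resp. R) flags the source (resp. target) node of edge e."""
    m = len(directed)
    L = np.zeros((m, n), dtype=int)
    R = np.zeros((m, n), dtype=int)
    for e, (i, j) in enumerate(directed):
        L[e, i - 1] = 1
        R[e, j - 1] = 1
    return L, R

L, R = source_target(DIRECTED, N)

A = L.T @ R                      # node-space adjacency matrix
W = R @ L.T                      # edge-matrix (line graph adjacency)
out_degrees = np.diag(L.T @ L)
in_degrees = np.diag(R.T @ R)

print(A)
print(out_degrees, in_degrees)
```

Because the example graph is undirected, the out-degrees and in-degrees coincide with the node degrees.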

Projection techniques for non-backtracking centralities
Consider a function $f(z) \in \mathcal{P}$ in (2.4), which we recall is analytic in a neighbourhood of zero, with $c_r > 0$ for all r and radius of convergence $\rho_f$. Define the linear operator ∂ acting on f as follows:
$$\partial f(z) := \sum_{r=0}^{\infty} c_{r+1} z^r.$$

Before stating our first results on projection techniques for computing non-backtracking walk-based centrality measures, let us remark that, since A is symmetric, its spectrum is real. Moreover, the spectrum of W will be real also, even though W is not symmetric in general. Indeed, from proposition 2.3 we know that $A = L^T R$ and $W = R L^T$, and Flanders' theorem [37, theorem 2] implies that the spectrum of W coincides with that of A, up to the multiplicity of 0.
Let us also recall that the spectrum of B coincides with the reversal (e.g. [38]) of that of the symmetric matrix polynomial [39] $M(t) = I - tA + t^2(D - I)$, i.e. that of the deformed graph Laplacian; e.g. [6,24] and references therein. We note that the reversal of the deformed graph Laplacian has been called Bethe-Hessian by some authors [40,41]. In [24] it was also shown that for every λ in the spectrum of M(t), we have |λ| ≥ 1/ρ(A). (Below, in proposition 5.9, we will improve this result and show that the inequality is always strict for a non-empty graph.)

In the remainder of this paper, we will often implicitly make use of the following classical result; see, for example, [42, theorem 4.7].

Theorem 3.1. Suppose f has a Taylor series expansion
$$f(z) = \sum_{k=0}^{\infty} a_k (z - \alpha)^k, \qquad a_k = \frac{f^{(k)}(\alpha)}{k!},$$
with radius of convergence r. If $A \in \mathbb{C}^{n\times n}$, then f(A) is well defined and is given by
$$f(A) = \sum_{k=0}^{\infty} a_k (A - \alpha I)^k$$
if and only if each of the distinct eigenvalues $\lambda_1, \dots, \lambda_s$ of A satisfies one of the conditions:
(i) $|\lambda_i - \alpha| < r$;
(ii) $|\lambda_i - \alpha| = r$ and the series for $f^{(n_i - 1)}(\lambda)$, where $n_i$ is the index of $\lambda_i$, is convergent at the point $\lambda = \lambda_i$, for i = 1, . . . , s.

Finally, let us state here the following simple consequence of the Cauchy-Hadamard theorem [43, theorem 3.39], which relates the radii of convergence of f and ∂f.

Lemma 3.2. Let $f \in \mathcal{P}$. Then ∂f has the same radius of convergence as f.

Using these remarks and the results from the previous section, we may prove the following.

Theorem 3.3. Let G be an unweighted, possibly directed, graph with no self-loops nor multiple edges. In the above notation, for
Proof. From the definition of ∂f it follows that implying by (i) and (ii) in proposition 2.4 that and thus the conclusion.

and $M(t)$ is the deformed graph Laplacian of the network. We note that the second equality in (3.2b) was proved in [24]. These results give an equivalence in the sense of (2.3) between Katz centrality on W projected through $L^T$ and Katz centrality on A (3.2a), and between NBT resolvent-based centrality on A and Katz centrality on B projected via $L^T$ (3.2b); indeed, since $R\mathbf{1}_n = \mathbf{1}_m$, we have and

More generally, theorem 3.3 implies that we can compute the NBTW generating function associated with (2.4) via (3.1), and thus rewrite (2.5), for appropriate values of t, as

This approach induces a duality operation on graphs as described in table 2, which, however, is not invertible; indeed, the dual of the dual graph is not the primal graph.
It was shown in [19,20,24] that there are more direct ways to compute NBTW centrality measures that do not rely on this projection technique. However, as we show below, this approach has the advantages of (1) being simpler to describe for a general f (z), (2) unifying the theory, so that universality results may be studied, and (3) extending to walks that do not allow for cycles up to any fixed length.
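The projection idea can be sketched in code for a general power series. The function below assembles $c_0 I + t\,L^T\big(\sum_{r\ge 0} c_{r+1}(tB)^r\big)R$ from a truncated coefficient list; this rests on the assumption, consistent with the discussion of §1a, that $L^T B^{r-1} R$ counts NBTWs of length r for r ≥ 1. For the Katz coefficients $c_r \equiv 1$ the result is compared with the resolvent form $I + t\,L^T(I - tB)^{-1}R$ and with $(1 - t^2)M(t)^{-1}$, where $M(t)$ is the deformed graph Laplacian. The graph is the five-node example with its edge list inferred from the text; `nbt_generating_function` and the other names are illustrative, not the paper's notation.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]
N = 5
DIRECTED = sorted(EDGES + [(j, i) for i, j in EDGES])
M_EDGES = len(DIRECTED)

# Source and target indicator matrices, edge-matrix and Hashimoto matrix.
L = np.zeros((M_EDGES, N))
R = np.zeros((M_EDGES, N))
for e, (i, j) in enumerate(DIRECTED):
    L[e, i - 1] = 1.0
    R[e, j - 1] = 1.0
A = L.T @ R
W = R @ L.T
B = W - W * W.T                  # remove reciprocal (backtracking) pairs

def nbt_generating_function(L, R, B, coeffs, t):
    """c_0 I + t L^T (sum_r c_{r+1} (tB)^r) R for a truncated coefficient list."""
    n, m = L.shape[1], B.shape[0]
    S = np.zeros((m, m))
    power = np.eye(m)            # holds (tB)^r
    for c in coeffs[1:]:
        S += c * power
        power = t * (power @ B)
    return coeffs[0] * np.eye(n) + t * (L.T @ S @ R)

t = 0.3                          # arbitrary value below 1/rho(B) for this graph
katz_coeffs = [1.0] * 400
via_series = nbt_generating_function(L, R, B, katz_coeffs, t)

# Closed forms for the Katz coefficients.
via_resolvent = np.eye(N) + t * (L.T @ np.linalg.solve(np.eye(M_EDGES) - t * B, R))
D = np.diag(A.sum(axis=1))
via_laplacian = (1 - t**2) * np.linalg.inv(np.eye(N) - t * A + t**2 * (D - np.eye(N)))
```

Passing other coefficient lists, for example $c_r = 1/r!$ for the exponential, gives the corresponding NBT generating function without any change to the routine; the diagonal and the row sums of the result then give the two centrality vectors.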

Limiting behaviour and universality
It is well known that the classic Katz centrality measure becomes equivalent in the sense of (2.3) to the so-called eigenvector centrality measure [44] as the downweighting parameter approaches its upper limit, and becomes equivalent to degree centrality as the downweighting parameter approaches zero [27]. In [45] the authors derived a general set of such results for walk-based centrality measures. Here, we show how to obtain non-backtracking versions of these results via the Hashimoto matrix construction. We begin by relating the (left and right) Perron eigenvectors of B and the NBT eigenvector; that is, the eigenvector of M(t) associated with the smallest eigenvalue. (Note that M(t) is symmetric, and hence its left and right eigenvectors are the same.) Throughout this work, $t \to t^\star$ for any $t^\star > 0$ is taken to be the limit from below, and t → 0 is taken to be the limit from above.
Proof. Observe first that by [24, proposition 7.5] μ is a simple eigenvalue of M(t) and I − tB (both seen as matrix polynomials), and that it is their smallest. Let us decompose both $M(t) = M(t)^T$ and I − tB via the respective analytic SVDs: Moreover, it holds that Multiplying (3.3b) by $\sigma_{M,n}(t)$ and taking the limit t → μ we see that there exists $\alpha \in \mathbb{R}$, $\alpha \neq 0$, such that $\omega = \alpha L^T w$. Moreover, since L, w, ω are all non-negative, we have α > 0. Similarly, multiplying the transpose of (3.3b) by $\sigma_{M,n}(t)$ and taking the limit t → μ we see that there exists β > 0 such that $\omega = \beta R^T z$. To conclude the proof, note that since w, z ≥ 0 and $\|w\|_1 = \|z\|_1 = 1$, the fact that L, R have precisely one element equal to 1 in each row yields $\|L^T w\|_1 = \|R^T z\|_1 = 1$, and thus α = β = 1.
We note that we are correcting here a typo in the proof of theorem 6.1 in [19]. We now consider the case where there is a single cycle present within the graph.
(ii) We have Proof.
(i) That 1 is the Perron eigenvalue of B, and that its algebraic multiplicity is two, is a consequence of [2, equation (2.3) and corollary 1] and [24, lemma 6.2]. Note that (BF) e1 is equal to the number of NBTWs of length two over edge e and either an edge that goes through the cycle counterclockwise or an edge that goes towards the cycle. This number is 1 if edge e either goes counterclockwise through the cycle or goes towards the cycle, and it is 0 otherwise. Similarly (BF) e2 counts NBTWs of length two that consist of edge e and an edge that either goes clockwise through the cycle or goes towards the cycle. This is 1 if edge e either goes clockwise through the cycle or points towards the cycle, and 0 otherwise. We conclude that BF = F. Moreover, manifestly F has rank two. Hence the geometric multiplicity of the eigenvalue 1 is exactly two, as this cannot exceed the algebraic multiplicity. (ii) By definition of L and F, the (i, 1)th element of L T F counts how many edges, among those either in the cycle and going counterclockwise or not on the cycle and going towards it, start from node i. There is precisely one such edge for all i. Replacing 'counterclockwise' with 'clockwise', the same argument shows that (L T F) i2 = 1.
We now prove a universality result for NBTW-based centralities that generalizes the Katz version in [24, theorem 10.1] and echoes the result presented in [45] for classical centralities. Recall that the equivalence relation is defined in (2.3).

Theorem 4.3. Let $t^\star = \rho_f/\rho(B)$. Then the NBT f-subgraph centrality vector x(t) in (2.5) and the NBT f-total communicability vector y(t) in (2.5) are such that where ℓ > 2 is the length of the shortest cycle in the graph (if any), $d^{(\ell)}$ is the vector whose ith entry is the number of cycles of length ℓ involving node i, and d is the vector of degrees. Moreover,
(i) if the graph contains at least two cycles, then where ω is as in theorem 4.1;
(ii) if the graph contains exactly one cycle, then $x(t \to t^\star)_i$ depends only on the distance of node i from the cycle and
(iii) if the graph is a tree, then $t^\star = \infty$ and where κ is the length of the longest non-backtracking walk in the graph.
Proof. We may obtain t → 0 limits directly from the series expansions. We begin by considering x(t). If the graph is a tree, there are no closed walks and thus $x(t \to 0) = c_0 \mathbf{1} \sim \mathbf{1}$. Suppose now that there is at least one cycle in the graph and that the length of the shortest cycle is ℓ > 2. Then working entrywise, for all i = 1, 2, . . . , n we have Letting $p^{(r)}$ be the vector whose ith entry is the element $p_r(A)_{ii}$ for all r ≥ 3, we have Finally, we note that since there are no cycles of length less than ℓ, then $p^{(\ell)} \sim d^{(\ell)}$. The result y(t → 0) ∼ d follows similarly.
We now prove the statements about the upper limit.
(i) If the graph contains at least two cycles, then ρ(B) > 1 [24]. From theorem 3.3 it follows that and similarly, using the fact that $R\mathbf{1}_n = \mathbf{1}_m$, By lemma 3.2 and standard results in matrix theory, the matrix function ∂f(tB) has the same radius of convergence as f(tB), that is, $t^\star = \rho_f/\rho(B)$.
(ii) If the graph contains precisely one cycle, then ρ(B) = 1 has geometric multiplicity 2 and, up to relabelling of the nodes, F defined as in lemma 4.2 is a basis for ker(B − I). Hence, for some $Z \in \mathbb{R}^{m\times 2}$, when t → 1 we have where Z ≥ 0 and $\Gamma \in \mathbb{R}^{2\times 2}$. Using $L^T F = [\mathbf{1}_n\ \mathbf{1}_n]$ from lemma 4.2, we see from (4.1b) that so $y(t \to 1) \sim \mathbf{1}_n$. For the f-subgraph centrality, note that Suppose that the unique cycle has length ℓ and that node i has a distance of k edges from the cycle (k = 0 if node i belongs to the cycle): then, for r ≥ 1, Hence,
(iii) If the graph is a tree, i.e. it does not contain any cycle, then ρ(B) = 0 and thus $t^\star = \infty$. Moreover, $p_r(A)_{ii} = 0$ for all r ≥ 1 and for all i, so that $x(t) = c_0 \mathbf{1}$ and hence $x(t \to \infty) \sim \mathbf{1}$.
Since the graph is a tree, it also follows that the matrix power series is a polynomial in t; let κ be the length of the longest non-backtracking walk in the graph, i.e. the diameter of the graph. Then $p_r(A) = 0$ for all r > κ and thus

Theorem 4.3 highlights very different behaviour of the two types of centrality. It is intuitively clear that the NBT constraint in f-total communicability should become irrelevant as t → 0; here walks of length one dominate, and these never backtrack. However, for f-subgraph centrality, the shortest closed walks under the NBT constraint are cycles of length ℓ > 2. Theorem 4.3 shows that in the generic case where the graph contains at least two cycles, as $t \to t^\star$ both centrality measures converge to an equivalent of the projection of the Perron eigenvector of M(t) obtained via $L^T$. In the specific cases when the graph either contains exactly one cycle or none, we again have a mismatch between the limiting behaviour of the two NBT f-centrality measures. The qualitatively different behaviour when there are two or more cycles is intuitively explained by the fact that the presence of at least two cycles allows us to 'change direction' when walking around the network. If the graph contains only one cycle, then in the edge-space we have two connected components, one corresponding to the cycle being visited clockwise and one corresponding to the cycle being visited counterclockwise. On the other hand, if we have two cycles, then in the edge-space we have one strongly connected component instead of two.
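The two limits for the total communicability can be observed numerically in the Katz case. The sketch below is illustrative only: it assumes the closed form $y(t) = (1 - t^2)M(t)^{-1}\mathbf{1}$ for NBT Katz, uses the five-node graph of §1a (which contains two triangles, so it falls in the generic case of at least two cycles), and compares y(t) for small t with the degree vector and y(t) for t just below 1/ρ(B) with the projection $L^T w$ of the right Perron eigenvector w of B. The tolerances and the offsets 1e-6 are arbitrary choices.

```python
import numpy as np

# Undirected edges of the five-node example, inferred from the text.
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]
N = 5
DIRECTED = sorted(EDGES + [(j, i) for i, j in EDGES])

L = np.zeros((len(DIRECTED), N))
R = np.zeros((len(DIRECTED), N))
for e, (i, j) in enumerate(DIRECTED):
    L[e, i - 1] = 1.0
    R[e, j - 1] = 1.0
A = L.T @ R
W = R @ L.T
B = W - W * W.T
D = np.diag(A.sum(axis=1))
degrees = A.sum(axis=1)

def nbt_katz(t):
    """NBT Katz total communicability via the deformed graph Laplacian."""
    M = np.eye(N) - t * A + t**2 * (D - np.eye(N))
    return (1 - t**2) * np.linalg.solve(M, np.ones(N))

# Limit t -> 0: after removing the constant shift, y(t) is proportional to d.
t_small = 1e-6
slope_at_zero = (nbt_katz(t_small) - 1.0) / t_small

# Limit t -> 1/rho(B): y(t) aligns with L^T w, w the right Perron vector of B.
vals, vecs = np.linalg.eig(B)
k = int(np.argmax(vals.real))
rho_B = vals[k].real
w = np.abs(np.real(vecs[:, k]))
omega = L.T @ w
omega = omega / omega.sum()

y_near = nbt_katz((1 - 1e-6) / rho_B)
y_near = y_near / y_near.sum()
print(np.round(omega, 4), np.round(y_near, 4))
```

Normalizing both vectors to unit sum removes the positive scaling allowed by the equivalence relation, so the printed vectors can be compared entry by entry.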

Beyond non-backtracking: non-k-cycling
The projection approach described in §2b is based on a duality relation on graphs that builds on the source and target matrices associated with the adjacency matrix A. We now show how this approach can be iterated to compute weighted sums of walks that do not backtrack and do not contain any cycle of length up to a given k.
We therefore define the matrices $p_{r;k}(A) \in \mathbb{R}^{n\times n}$, whose (i, j) elements count walks of length r from node i to node j which do not backtrack and do not allow for cycles of length up to k. Our aim is to study generalizations of (2.5) to the case of the non-backtracking and up to non-k-cycling f-subgraph centrality measure $x_k(t)$ and f-total communicability measure $y_k(t)$, defined entry-wise for i = 1, . . . , n as
$$(x_k(t))_i = \Big(\sum_{r=0}^{\infty} c_r t^r p_{r;k}(A)\Big)_{ii} \quad\text{and}\quad (y_k(t))_i = \Big(\sum_{r=0}^{\infty} c_r t^r p_{r;k}(A)\,\mathbf{1}\Big)_i. \tag{5.1}$$

Remark 5.1. Because a closed walk of length r must contain a cycle of length no more than r, it follows that the sum defining $(x_k(t))_i$ in (5.1) may be taken from r = k + 1; the terms from r = 0 to r = k are zero. So, although the subgraph centrality concept is based on counting closed walks, the non-cycling constraint rules out all such walks that are deemed to be too short.

Throughout this section, we will adopt the notation $(i_1, i_2, \dots, i_r)$ to denote the walk $i_1 \to i_2 \to \cdots \to i_r$ of length r − 1 in the original graph and we will denote by $\mathbf{i}$ the corresponding multi-index. We remark that open walks of length ℓ that do not backtrack and do not include any cycle are open paths of length ℓ.
The following matrix allows us to perform the iterative computations.

Definition 5.2 (Non-k-cycling matrix).
For k = 1, the non-k-cycling matrix, $P_k$, corresponds to the adjacency matrix A. For k = 2 the matrix $P_k$ corresponds to the Hashimoto matrix B in (2.2). More generally, for k > 2 the matrix $P_k$ has as many rows and columns as the number of open paths of length k − 1 in the original graph.

The next result shows how $P_k$ may be constructed. We emphasize that k = 2 corresponds to the NBTW setting; see also [35,36].

Theorem 5.3. Let $W_1 = A$ be the adjacency matrix of a simple graph. Then, for k ≥ 2 the non-k-cycling matrix $P_k$ can be recursively computed as
$$P_k = W_k - \Delta_k, \qquad W_k = R_{k-1} L_{k-1}^T, \qquad \Delta_k = W_k \circ (W_k^T)^{k-1},$$
where ∘ denotes the Schur product and $L_{k-1}$ and $R_{k-1}$ are the source and target matrix of the graph whose adjacency matrix is $P_{k-1}$.

Remark 5.4.
We note that the matrices W k in the statement of theorem 5.3 correspond to the adjacency matrices of the kth order De Bruijn graphs of paths in the network; see [46].
Proof. We first argue that the dimension of $W_k$ matches that of $P_k$. By construction $W_2$ corresponds to the edge-matrix, $W$, which contains as many rows and columns as the number of edges in the (directed) graph represented by $W_1 = A$. For $k > 2$, the matrix $W_k$ has as many rows and columns as the number of open paths of length $k - 1$ in the original graph. Moreover, each row in the matrix $L_{k-1}$ (resp., $R_{k-1}$) corresponds to a walk of length $k - 1$, say $(i_1, \dots, i_k)$, that neither backtracks nor contains cycles of length up to $k - 1$. Furthermore, such a row will contain a 1 in the entry corresponding to the column associated with the non-backtracking and up to non-$(k - 2)$-cycling path $(i_1, \dots, i_{k-1})$ (resp., $(i_2, \dots, i_k)$). The entries of $W_k = R_{k-1} L_{k-1}^T$ will then equal one if and only if the two paths of length $k - 1$ corresponding to the row and column indices under consideration are such that the last $k - 2$ edges of the first path coincide with the first $k - 2$ edges of the second path, and thus they form a walk of length $k$. In summary: for any two paths $\mathbf{i} = (i_1, \dots, i_k)$ and $\mathbf{j} = (j_1, \dots, j_k)$ of length $k - 1 \ge 1$ it holds that
$$(W_k)_{\mathbf{i}\mathbf{j}} = \begin{cases} 1 & \text{if } i_r = j_{r-1} \text{ for } r = 2, \dots, k, \\ 0 & \text{otherwise.} \end{cases}$$
By construction the matrix $W_k$ will contain a one where two paths of length $k - 1$ form a cycle of length $k$. It is therefore clear that the matrix $P_k$ will be obtained from $W_k$ by removing such entries. Hence, to complete the proof we must show that $\Delta_k = W_k \circ (W_k^T)^{k-1}$ identifies cycles of length $k$ in the original graph. To do so, note that
$$\big((W_k^T)^{k-1}\big)_{\mathbf{i}\mathbf{j}} = \sum_{\mathbf{h}^{(1)}, \dots, \mathbf{h}^{(k-2)}} (W_k)_{\mathbf{j}\mathbf{h}^{(1)}} (W_k)_{\mathbf{h}^{(1)}\mathbf{h}^{(2)}} \cdots (W_k)_{\mathbf{h}^{(k-2)}\mathbf{i}}.$$
Considering each term in the product on the right-hand side individually, we see from the definition of $W_k$ that the first term will equal 1 if and only if $j_r = h^{(1)}_{r-1}$ for $r = 2, \dots, k$. The second term will equal 1 if and only if $h^{(1)}_r = h^{(2)}_{r-1}$ for $r = 2, \dots, k$, and this also implies that the product of the first two terms will equal 1 if and only if $j_r = h^{(2)}_{r-2}$ for $r = 3, \dots, k$.

Proceeding in this way, the product of all terms except the last will equal 1 if and only if $j_r = h^{(k-2)}_{r-k+2}$ for $r = k - 1, k$. Finally, the last term equals one when $h^{(k-2)}_r = i_{r-1}$ for $r = 2, \dots, k$. Therefore, the product of all terms will equal one if and only if $j_r = i_{r-k+1}$ for $r = k$, i.e. when $j_k = i_1$. Moreover note that, for two given paths $\mathbf{i}$ and $\mathbf{j}$ there will be at most one non-zero product in the summation, since we only need to check that the final node in the first walk coincides with the first node in the second walk. Therefore, $\big((W_k^T)^{k-1}\big)_{\mathbf{i}\mathbf{j}} \in \{0, 1\}$, and it can equal one only if $j_k = i_1$. Exploiting the definition of $W_k$ and $(W_k^T)^{k-1}$ it follows that $\Delta_k$ will have a 1 in position $(\mathbf{i}, \mathbf{j})$ if and only if
$$i_r = j_{r-1} \text{ for } r = 2, \dots, k \quad \text{and} \quad j_k = i_1,$$
i.e. if the two paths form a (directed) cycle of length $k$ in the original graph. This concludes the proof.
It follows from the definition of P k that taking a step in its associated graph corresponds to taking a NBT and up to non-k-cycling walk of length k in the original network; more generally, taking r consecutive steps within the graph associated to P k corresponds to taking k + r − 1 steps in the original graph, while avoiding backtracking and cycles of up to length k.
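The recursion of theorem 5.3 can be sketched in a few lines of NumPy. This is a minimal dense illustration, not the implementation used for the experiments reported later: the function names `source_target` and `non_cycling_matrix` are hypothetical, the ordering of edges (row-major) is an arbitrary choice, and the symbol $\Delta_k$ for the cycle-detecting matrix is assumed. Dense arrays are used only for readability; a practical code would use sparse storage.

```python
import numpy as np

def source_target(P):
    """Return (L, R), one row per edge of the digraph of P, so that P = L.T @ R."""
    src, dst = np.nonzero(P)              # edges listed in row-major order
    m, n = len(src), P.shape[0]
    L = np.zeros((m, n), dtype=int)
    R = np.zeros((m, n), dtype=int)
    L[np.arange(m), src] = 1              # L[e, u] = 1 if edge e starts at u
    R[np.arange(m), dst] = 1              # R[e, v] = 1 if edge e ends at v
    return L, R

def non_cycling_matrix(A, k):
    """Apply P_j = W_j - Delta_j for j = 2, ..., k, starting from P_1 = A."""
    P = np.asarray(A, dtype=int)
    for j in range(2, k + 1):
        L, R = source_target(P)
        W = R @ L.T                       # pairs of consecutive (j-1)-paths
        closes = np.linalg.matrix_power(W.T, j - 1) > 0
        Delta = W * closes                # pairs that close a cycle of length j
        P = W - Delta
    return P

# Complete graph on four nodes, chosen only as a small test input.
K4 = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
P2 = non_cycling_matrix(K4, 2)            # Hashimoto matrix B
P3 = non_cycling_matrix(K4, 3)            # non-triangulating matrix
P4 = non_cycling_matrix(K4, 4)            # no open path of length 4 exists in K4
print(P2.shape, P3.shape, P4.shape, int(P4.sum()))
```

In this example every path on four nodes can only be extended by returning to its starting node, so all entries of $W_4$ are removed and $P_4$ is the zero matrix.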
Using the left and right projectors it is immediately clear that the following theorem holds.
Theorem 5.5. For all r = 0, 1, . . . and for any given k ≥ 2, we have

In order to obtain useful expressions for $x_k(t)$ and $y_k(t)$ in (5.1), we first study the generating function
$$\Phi_k(t) = \sum_{r=0}^{\infty} c_r t^r p_{r;k}(A). \qquad (5.3)$$
Note that, given a certain $k$, for walks of length $r \le k - 1$ it holds that $p_{r;k}(A) = p_{r;r}(A)$, since no cycles of length $k$ can be formed using fewer than $k$ edges. Therefore, our problem reduces to that of computing
$$\Phi_k(t) = \sum_{r=k-1}^{\infty} c_r t^r p_{r;k}(A), \qquad (5.4)$$
which implicitly yields all the $p_{r;k}(A)$ for the interesting case $r \ge k$.
Our procedure for computing Φ k (t) is the following. From theorem 5.5 it follows that and thus Hence, having obtained P k from the construction in theorem 5.3, we may obtain Φ k (t) as follows: Here, in the general case where ∂ k−1 f (tP k ) takes the form of a power series, it may be approximated by ignoring powers (tP k ) s+1 and higher, for some choice of s. This truncation corresponds to ignoring non-k-cycling walks of length greater than s in the original network. We emphasize, however, that in practice, for a specific choice of f , ∂ k−1 f (tP k ) is nothing but a matrix function [42] of P k ; more efficient techniques than truncating a Taylor series are typically available to compute a matrix function, or its action on a vector. As they depend on the specific function, a full discussion is beyond the scope of this paper, and we refer the reader to the monograph [42] and the references therein.
Let us briefly comment on the last step of the approach described above. First, we note that there is no need to add the term $c_0 \mathbf{1}$, as this addition would just produce a different representative of the same equivalence class under the relation in (5.1). The terms $t c_1 A\mathbf{1}$ and $t^2 c_2 p_2(A)\mathbf{1} = t^2 c_2 (A^2 - D)\mathbf{1}$ can be easily built from the data. As for the remaining terms $t^{\ell} c_{\ell} p_{\ell;k}(A)\mathbf{1} = t^{\ell} c_{\ell} p_{\ell;\ell}(A)\mathbf{1}$ for $\ell = 3, \dots, k - 2$, these can be computed during the process of building the matrix $P_k$. Indeed, it follows from theorem 5.5 and (5.2) that $p_{\ell;\ell}(A)\mathbf{1} = L_{\ell-1}^T P_{\ell} \mathbf{1}$. We observe that, although not necessarily tractable for large networks (see §5a), the procedure described above is mathematically well defined for any $k \ge 2$.
We finally describe how to compute the non-backtracking and up to non-k-cycling f -subgraph centrality measure x k (t) in (5.1). It is readily seen from (5.4) that for all i = 1, 2, . . . , n: As observed in theorem 4.3, the behaviour of the f -subgraph communicability is very different from that of the f -total communicability, in the sense that the former lacks 'memory'; indeed, as mentioned in remark 5.1, the vector x k (t) only considers closed walks in assigning importance to the nodes in the network, and completely ignores closed walks whose length is less than k since they have already been removed at previous steps.

(a) Remarks on complexity
The recursion can be continued until we have removed cycles of length n (which is the length of the longest possible cycle). At that point, we have a method to count paths, which can be used to define a path centrality.
Counting paths is #P-complete [47]. Therefore, if we had an algorithm for computing the p n;k (A) matrices in polynomial (in n) complexity even for k = n, this would imply that P = NP. Unfortunately for the authors, we do not have such an algorithm.
Observe that the size of the matrix $P_k$ is equal to the number of k-tuples of nodes in the input graph such that there is a path of length $k - 1$ through them. The worst case scenario is given by the complete graph with $n$ nodes, for which there are $O(n^k)$ such k-tuples. Therefore, even if all the subsequent steps are implemented in a complexity which is linear in the size, for $k = n$ the method would yield an exponential complexity algorithm.
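To make the worst-case growth concrete, the following sketch evaluates the number of ordered k-tuples of distinct nodes in the complete graph, which is the falling factorial $n(n-1)\cdots(n-k+1)$. The helper name `worst_case_dimension` and the choice $n = 20$ are illustrative assumptions only.

```python
import math

def worst_case_dimension(n, k):
    """Dimension of P_k for the complete graph on n nodes:
    the number of directed open paths visiting k distinct nodes."""
    return math.perm(n, k)   # n * (n - 1) * ... * (n - k + 1)

# Growth of the dimension with k for a complete graph on 20 nodes.
for k in (1, 2, 3, 4, 8):
    print(k, worst_case_dimension(20, k))
```

For $k = 1$ and $k = 2$ this recovers the number of nodes and the number of directed edges, respectively.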
It should be noted, though, that it is entirely conceivable that for real-life networks, which are typically extremely sparse, this worst-case growth might not be relevant. This issue is followed up in §6.

(b) Convergence
In this subsection we study the radius of convergence of the power series (5.4). This analysis will be used in §5d, where we study universality properties of the centrality measures defined via the generating functions.
We begin by showing that the node space and generalized edge space series behave similarly.

Lemma 5.7.
For all $k$, the series $\Psi_k(t) = \sum_{r=0}^{\infty} c_r t^r P_k^r$ and $\Phi_k(t)$ in (5.3) have the same radius of convergence.
Proof. Denote by ρ Ψ and ρ Φ the radii of convergence of Ψ k (t) and Φ k (t), respectively. Let t < ρ Ψ . Therefore, by theorem 5.5 c r t r p r;k (A).
Hence, the (i, j) entry of the sum Φ k (t) is equal to a finite sum plus an infinite sum. The latter is a linear combination of the entries of the absolutely convergent sum Ψ k (t). It follows that Φ k (t) converges, and hence, ρ Φ ≥ ρ Ψ .
Suppose now $t > \rho_\Psi$ and let $(r_0, s_0)$ be such that $(\Psi_k(t))_{r_0 s_0}$ diverges. Moreover let $(i_0, j_0)$ be such that $(L_{k-1})_{r_0 i_0} = (R_{k-1})_{s_0 j_0} = 1$; note that $(i_0, j_0)$ is uniquely determined because $L_{k-1}$ and $R_{k-1}$ in (5.2) have precisely one nonzero element in each row. Observe that

The next theorem characterizes the radius of convergence of (5.4) (via lemma 5.7) in terms of the number of cycles of length greater than $k$.

Theorem 5.8. For $k \ge 2$ the spectral radius of the non-k-cycling matrix $P_k$ of a simple and connected graph $G$ satisfies the following properties:
(i) $\rho(P_k) \le \rho(P_{k-1})$;
(ii) if $G$ contains no cycles of length $k$, then $\rho(P_k) = \rho(P_{k-1})$;
(iii) if $G$ contains no cycles of length greater than $k$, then $\rho(P_k) = 0$;
(iv) if $G$ contains exactly one cycle of length greater than $k$, then $\rho(P_k) = 1$;
(v) if $G$ contains at least two cycles of length greater than $k$, then $\rho(P_k) > 1$.

Before proving this result, let us point out that any undirected cycle of length $k$ in the original graph can be regarded as two directed cycles: one where the nodes are visited clockwise and one where the nodes are visited counterclockwise. It is readily seen that each of these two walks will also appear in the graphs associated to the matrices $P_\ell$ for all $\ell < k$.

Proof.
(i) Elementwise it holds $p_{r;k}(A) \le p_{r;k-1}(A)$ for all $r, k$ and thus the statement follows from lemma 5.7.
(ii) This is a corollary of Flanders' theorem [37]. Since the graph contains no cycles of length $k$, then $\Delta_k = 0$ and thus the matrices $P_{k-1} = L_{k-1}^T R_{k-1}$ and $P_k = W_k = R_{k-1} L_{k-1}^T$ have the same spectrum, up to the multiplicity of 0.
(iii) If there are no cycles of length greater than $k$, then the maximal length of non-k-cycling walks is finite. It follows that $\Phi_k(t)$ in lemma 5.7 has only a finite number of nonzero addends, and hence it converges for all $t$. Thus, $\Psi_k(t)$ also converges for all $t$ (and for all allowed choices of $f$), implying $\rho(P_k) = 0$.
(iv) By the Gelfand formula, for any matrix norm $\|\cdot\|$, $\rho(P_k) = \lim_{r \to \infty} \|P_k^r\|^{1/r}$; e.g. [48, corollary 5.6.14]. Note first that $\max_{i,j} |(P_k^r)_{ij}| \ge 1$, as there are two cycles in the graph of $P_k$ and hence there exist walks of arbitrary length. We claim that, for $r$ large enough, there exists a constant $c \ge 1$, independent of $r$, such that $\max_{i,j} |(P_k^r)_{ij}| \le c$. Since the latter is a matrix norm (not depending on $r$) of $P_k^r$, it follows that $\rho(P_k) = 1$. It remains to prove the claim. Take $r > n - k$, where $n$ is the number of nodes in the original graph. Then, any walk counted in $P_k^r$ contains at least one cycle. Only two cycles of length $> k$ exist in the graph associated with $P_k$, one corresponding to the cycle in the original graph being visited clockwise, and one corresponding to it being visited counterclockwise; clearly it is not possible for a walk to go from one to the other, since this would imply the existence of either (1) another cycle, longer than the one of length $> k$ existing in the original graph, or (2) a cycle of length $\le k$ in the graph associated with $P_k$. Case (1) leads to a contradiction, while (2) cannot happen because those walks have been removed at previous steps.
This means that the walks we are considering must contain a number of consecutive circuits round one of the two cycles. Fix now i and j, two paths of length k − 1, and consider (P r k ) ij . This quantity is bounded above by a number c ij that can be constructed as the number of ways to enter one of the two cycles in the graph associated to P k from i, times the number of ways to go from such cycle to j. These two numbers are finite, and do not depend on r but only on i and j. Taking c = max i,j c ij completes the argument.
(v) We first consider the case where two of the cycles of length greater than $k$ in the original graph share at least one vertex, and denote their lengths by $\ell_1 \ge \ell_2 > k$. Fix some integer $\kappa \ge \ell_1 + \ell_2$ and let s be the integer satisfying We will show that there is at least one entry of $P_k^{\kappa}$ that is bounded below by $2^s$, so that then ρ(P k ) ≥ lim around the first cycle and s times around the second cycle. Hence, for $\kappa$ large enough, at least one entry of $P_k^{\kappa}$ is bounded below by $2^s$. Hence the conclusion. Suppose now that no pair of cycles share a vertex, then take any two cycles of length $\ell_1 \ge \ell_2 > k$. These cycles are connected by (at least) one walk, whose length we denote by $d$. Fix now some $\kappa \ge 2(\ell_1 + \ell_2 + d)$ and let s be the unique integer such that Let us count non-k-cycling walks that (1) start within the first cycle, (2) go precisely $2s$ times around the first cycle, $2s$ times around the second cycle, and $s$ times back and forth on the bridge, and (3) end somewhere in the first cycle. There are at least such walks, and this gives a lower bound for at least one element of $P_k^{\kappa}$. It follows that

After some further analysis, we will go on to prove theorem 5.14, which shows that the converse of item (ii) in theorem 5.8 also holds. As the case $k = 2$ (note that 'cycles of length exactly two' are reciprocal edges, which are always present in a non-empty undirected graph) admits an easier proof, we treat it separately here.

Proposition 5.9. For any non-empty (i.e. there is at least one edge) simple graph with adjacency matrix A and Hashimoto matrix B, it holds that ρ(B) < ρ(A).
Proof. If the graph is a forest then $\rho(B) = 0 < \rho(A)$. Assume that the graph is not a forest. Then, in view of the results in [24], we may assume that the graph is connected, and $\rho(B)$ is equal to the largest finite eigenvalue of the matrix polynomial $I t^2 - A t + D - I$. Moreover, by [24, theorems 4.7 and 6.1], the latter is invariant under iteratively removing all the leaves from the graph. On the other hand, the spectral radius of $A$ can decrease by removing the leaves, but it cannot increase: hence, there is no loss of generality in assuming the graph has no leaves. We now argue similarly to [24, proof of theorem 4.8] and observe that

It is an immediate consequence of theorem 5.8 that for different values of $k$ the ranges of the parameter $t$ for which the generalized Katz centralities based on non-k-cycling walks, obtained via the procedure described in §5, are well defined consist of a sequence of nested intervals of the form
$$(0, 1/\rho(P_1)) \subseteq (0, 1/\rho(P_2)) \subseteq \cdots \subseteq (0, 1/\rho(P_k)) \subseteq \cdots,$$
with the convention $1/0 = \infty$. Moreover, these intervals are strictly included in $(0, 1)$ for all the values of $k$ for which the (connected) graph contains at least two cycles of length greater than $k$; they are equal to $(0, 1)$ for the values of $k$ for which there is precisely one such cycle; and they are equal to $(0, \infty)$ for all values of $k$ such that there are no such cycles.
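As a numerical sanity check of proposition 5.9 and of the three regimes for $k = 2$, the sketch below compares $\rho(A)$ and $\rho(B)$ on three small graphs and reports the upper end $1/\rho(B)$ of the admissible Katz parameter range. The graphs and the helper names (`adjacency`, `hashimoto`, `spectral_radius`) are illustrative choices; the Hashimoto matrix is built directly from its definition rather than through the source and target matrices.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of a simple undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A

def hashimoto(A):
    """B[e, f] = 1 if edge f starts where edge e ends and f is not the reverse of e."""
    src, dst = np.nonzero(A)
    follows = dst[:, None] == src[None, :]
    backtracks = src[:, None] == dst[None, :]
    return (follows & ~backtracks).astype(int)

def spectral_radius(M):
    return float(np.abs(np.linalg.eigvals(M)).max()) if M.size else 0.0

graphs = {
    "path (no cycle)": adjacency(4, [(0, 1), (1, 2), (2, 3)]),
    "5-cycle (one cycle)": adjacency(5, [(i, (i + 1) % 5) for i in range(5)]),
    "bowtie (two cycles)": adjacency(
        5, [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]),
}

results = {}
for name, A in graphs.items():
    rho_A = spectral_radius(A)
    rho_B = spectral_radius(hashimoto(A))
    upper = np.inf if rho_B < 1e-3 else 1.0 / rho_B   # tolerance for rounding
    results[name] = (rho_A, rho_B, upper)
    print(f"{name}: rho(A) = {rho_A:.4f}, rho(B) = {rho_B:.4f}, t < {upper:.4f}")
```

The tolerance used to detect a nilpotent $B$ is an ad hoc choice, since computed eigenvalues of a nilpotent matrix need not be exactly zero.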

(c) Generalized pruning
In this subsection, we show how the spectrum of $P_k$ is invariant under certain pruning operations. These results are of direct interest, since they quantify the range of allowable values for the parameter $t$. They will also be used in the next subsection, where we study limiting behaviour. Let $G = (V, E)$ be a simple and connected graph, and let $k \ge 3$ be a fixed path length. For the goals of this subsection, we partition the set of nodes for such fixed $k$ into two subsets: $V = B \cup C$. Here $B$ is the minimal subset of nodes such that (1) given a fixed cycle-length $k \ge 3$, all the cycles of length $> k$ only visit nodes belonging to $B$ and (2) each connected component in the subgraph spanned by $C$ (if any) is connected to just one node in $B$ (multiple connections to the same node in $B$ are however allowed). We omit the trivial proof that, given $k$, such a partition exists and is unique, although in some cases one may have $B = \emptyset$ or $C = \emptyset$.
Let us point out that paths that originate in B and end in C cannot be prolonged without introducing cycles of length ≤ k to return to B, as this would imply the existence of a cycle of length > k outside B that we can use to cycle back.
Below, we will for simplicity use the verb 'prolong' to mean 'prolong without introducing cycles of length ≤ k', as above. Consider now the following labelling of the open paths of length (k − 1) in G, i.e., of the row and column indices of the non-k-cycling matrix P k : (i) paths in the original graph that start and end in B (and thus never leave B) and that can be prolonged into arbitrarily long walks; (ii) paths in the original graph that start and end in B (and thus never leave B) and that cannot be prolonged into arbitrarily long walks; (iii) all those paths that do not entirely take place within B and cannot be prolonged into arbitrarily long walks, thus cannot 'return to B'; and finally (iv) all the other paths: these do not entirely take place within B but can be prolonged into arbitrarily long walks, and hence will return to B in the limit.
With the described labelling, the matrix $P_k$ can be written as a $4 \times 4$ block matrix; more specifically
$$P_k = \begin{bmatrix} Q_k(B) & * & * & 0 \\ 0 & N_1 & * & 0 \\ 0 & 0 & N_2 & 0 \\ * & * & * & N_3 \end{bmatrix}, \qquad (5.5)$$
where $N_i$ for $i = 1, 2, 3$ are nilpotent matrices of appropriate size and $*$ marks blocks on which the argument below places no constraint. Indeed, the entries of the (1, 1) block, i.e. the entries of the matrix $Q_k(B)$ in (5.5), are non-zero if and only if there are two walks within the subgraph spanned by the nodes in $B$ that can be concatenated and indefinitely prolonged. The (2,2) and (3,3) blocks correspond to paths that are not indefinitely prolongable.
Since it is possible to concatenate two such paths, the matrices in these blocks are not the zero matrix, in general. However, given the type of walks we are considering, sufficiently high powers of these matrices are zero, since the walks are not indefinitely prolongable. Thus the matrices in blocks (2,2) and (3,3), denoted by $N_1$ and $N_2$, are nilpotent. A similar reasoning applies to the matrix $N_3$ in the (4, 4) block. Indeed, there are only finitely many walks that do not entirely take place within $B$ and that are prolongable to return to this set. For large enough values of $r$, all walks will have returned to $B$ and thus $N_3^r = 0$, and the matrix is nilpotent. We now consider the off-diagonal blocks. The (2, 1) block is the zero matrix, since its entries record whether it is possible to concatenate walks that cannot be indefinitely prolonged with walks that can be indefinitely prolonged. The (1,4) and (2,4) blocks cannot have non-zero entries, as they would correspond to paths that take place in $B$ entirely and are prolonged via paths that are not entirely on $B$ but can return to this set. However this would imply the existence of a cycle outside $B$.
Blocks (3,1), (3,2) and (3,4) correspond to walks that are not indefinitely prolongable and therefore cannot be connected, in the graph corresponding to $P_k$, to any of the paths that are either taking place entirely within $B$ or that can be prolonged to return to it.

Remark 5.10. The non-k-cycling matrix associated with the subgraph of $G$ spanned by $B$, which we denote by $P_k(B)$, is then
$$P_k(B) = \begin{bmatrix} Q_k(B) & * \\ 0 & N_1 \end{bmatrix}. \qquad (5.6)$$

The following theorem is an immediate consequence of the structure of the matrix $P_k$.
Theorem 5.11. Let G = (V, E) be a simple connected graph and let k be a given cycle length. Let V = B ∪ C be partitioned as described at the beginning of this section and suppose that the edges are labelled as described above. Then, the spectrum of P k , the non-k-cycling matrix corresponding to G, coincides with that of Q k (B) in (5.6) up to the multiplicity of 0.
Theorem 5.11 shows that, similarly to the NBT case k = 2 considered in [19,24], for k ≥ 3 the network dimension may be lowered by pruning in order to reduce the computational cost of finding the spectral radius of P k . The reciprocal of this spectral radius is a strict upper bound for the range of suitable t values in the non-k-cycling centrality measures.
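For the NBT case $k = 2$, the effect of pruning can be checked numerically. The sketch below attaches a single leaf to the complete graph on four nodes and compares the Hashimoto spectral radius before and after removing it; it also inspects the right Perron vector on the two directed edges incident to the leaf. The graph, the helper names and the normalization of the eigenvector are all illustrative assumptions, and the example covers only $k = 2$, not the general statement for $k \ge 3$.

```python
import numpy as np

def adjacency(n, edges):
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A

def hashimoto(A):
    """Return the Hashimoto matrix together with the edge list (src, dst)."""
    src, dst = np.nonzero(A)
    follows = dst[:, None] == src[None, :]
    backtracks = src[:, None] == dst[None, :]
    return (follows & ~backtracks).astype(int), src, dst

def perron(B):
    """Eigenvalue of largest real part and its eigenvector, scaled so that
    the entry of largest magnitude equals one."""
    vals, vecs = np.linalg.eig(B)
    top = int(np.argmax(vals.real))
    w = vecs[:, top].real
    return float(vals[top].real), w / w[np.argmax(np.abs(w))]

k4_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
A_pruned = adjacency(4, k4_edges)                 # K4
A_full = adjacency(5, k4_edges + [(0, 4)])        # K4 plus a leaf at node 0

B_full, src, dst = hashimoto(A_full)
rho_full, w = perron(B_full)
rho_pruned, _ = perron(hashimoto(A_pruned)[0])

into_leaf = int(np.flatnonzero((src == 0) & (dst == 4))[0])    # edge 0 -> 4
out_of_leaf = int(np.flatnonzero((src == 4) & (dst == 0))[0])  # edge 4 -> 0
print(rho_full, rho_pruned, w[into_leaf], w[out_of_leaf])
```

The edge pointing into the leaf cannot be prolonged without backtracking and receives a zero Perron entry, whereas the edge leaving the leaf can be prolonged indefinitely and receives a positive one.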
Remark 5.12. According to whether there are more than one, precisely one, or no cycles of length > k, the matrix Q k (B) above is, respectively, irreducible, permutation similar to a block diagonal matrix with two identical irreducible blocks, or empty. Therefore, the study of the spectral radius of P k can be without loss of generality reduced to the case where the latter matrix is irreducible.
Lemma 5.13. Let $P_k$ be partitioned as in (5.5). Then, for its right Perron eigenvector $w = \rho(P_k)^{-1} P_k w$, we have the coherent partition, with $u, v > 0$,
$$w = \begin{bmatrix} u^T & 0 & 0 & v^T \end{bmatrix}^T.$$
That is, $w_i = 0$ if and only if $i$ is a path of length $(k - 1)$ that cannot be indefinitely prolonged.
Proof. We partition the nodes of the graph of the nonnegative matrix P k into four categories, as described before theorem 5.11. Then, by remark 5.12, we can take w i > 0 if i is an indefinitely prolongable path that takes place entirely on B, i.e. if i belongs to category (i). From the fact that for large enough r (in particular, for r ≥ R where R is the maximum of nilpotency indices of N 1 , N 2 and N 3 in (5.5)), we have that it is clear that w i = 0 if i cannot be indefinitely prolonged (categories (ii) and (iii)). Finally, let i be a path of category (iv). Then, by remark 5.12, there exists a threshold R such that for r ≥ R and by the eigenequation defining w, we have where the summation is taken over all paths j of length (k − 1) within G. By (5.7), if j is a path of type (iv), then (P r k ) ij = 0. Moreover, if j is a path of either type (ii) or (iii), then w j = 0. Hence, the summation in (5.8) can be taken over all paths j of category (i) that can be connected to i in the graph associated with P k via a path of length r. Suppose w i = 0: then, there is no such path j of category (i), for no value of r ≥ R. This contradicts the fact that i can be indefinitely prolonged, and hence, w i > 0.
We are now in a position to prove the converse of item (ii) in theorem 5.8.

Proof. We may assume without loss of generality that there are at least two cycles of length $> k$, otherwise the statement is a trivial corollary of items (iii), (iv) and (v) in theorem 5.8. Recall that by our construction there exist $L_{k-1}$, $R_{k-1}$, $\Delta_k$ such that $P_{k-1} = L_{k-1}^T R_{k-1}$ and $P_k = R_{k-1} L_{k-1}^T - \Delta_k$, and the absence of cycles of length $k$ is tantamount to $\Delta_k = 0$.
From the definition of $L_{k-1}$ and $\Delta_k$ it follows that $(L_{k-1}^T \Delta_k)_{ij} = 1$ if the $(k - 2)$-path $i$ in $G$ is part of a $k$-cycle and can be prolonged within this cycle to form the $(k - 1)$-path $j$, while $(L_{k-1}^T \Delta_k)_{ij} = 0$ otherwise. Suppose that $\rho(P_k) = \rho(P_{k-1}) = \rho > 1$. Then for a left Perron eigenvector $a$ of $P_{k-1}$ and a right Perron eigenvector $w$ of $P_k$ the following equations hold:
$$a^T P_{k-1} = \rho\, a^T, \qquad P_k w = \rho\, w.$$
Since $a^T L_{k-1}^T P_k w = a^T P_{k-1} L_{k-1}^T w - a^T L_{k-1}^T \Delta_k w$, combining the equations above we thus see that
$$a^T L_{k-1}^T \Delta_k w = \sum_{(i,j)} a_i w_j = 0, \qquad (5.9)$$
where the sum is taken over all pairs $(i, j)$ such that $(L_{k-1}^T \Delta_k)_{ij} \ne 0$, i.e. the $(k - 2)$-path $i$ is part of a $k$-cycle and can be prolonged within such a cycle to make the $(k - 1)$-path $j$. By remark 5.12, we can assume that $P_{k-1}$ is irreducible, and hence, $a > 0$. It is worth stressing that in this context we cannot simultaneously make the same assumption on $P_k$: we only know $w \ge 0$. We therefore conclude that either the summation in (5.9) is empty, and hence there is no cycle of length precisely $k$, i.e., $\Delta_k = 0$, or $w_j = 0$ for all open paths of length $k - 1$ that are part of a $k$-cycle. We claim that the latter is impossible: if there is a cycle of length $k$ in the original graph then there is at least one such open path, labelled $j_0$ in the graph of $P_k$, such that $w_{j_0} \ne 0$. This claim proves the statement.
To prove the claim, let us partition the nodes $V = B \cup C$ for fixed $k$ as described at the beginning of this subsection. We further partition the $(k - 1)$-paths in $G$, i.e. the nodes of the graph of $P_k$, into the same four categories described before theorem 5.11. Suppose that there exists a cycle of length $k$ in $G$ whose nodes all belong to $B$. Then there exists a $(k - 1)$-path $j_0$ that is indefinitely prolongable and belongs to this cycle. It thus belongs to category (i) and hence $w_{j_0} > 0$ by lemma 5.13. Suppose now that the cycle of length $k$ contains at least one node in $C$. From the definition of $B$ and $C$ it follows that there is at most one node in the cycle that belongs to $B$, as otherwise we would have a connected component in the graph spanned by $C$ that is connected to at least two nodes in $B$. Hence, there is at least one open path $j_0$ of length $(k - 1)$ that belongs to such a cycle and does not entirely take place within $B$, but can be indefinitely prolonged, i.e. is of category (iv). By lemma 5.13 we again have $w_{j_0} > 0$. This proves the claim and hence the theorem.

(d) Non-k-cycling centralities and universality classes
We now extend the results in theorem 4.3 to the case of non-cycling walks. In summary, we find that the limiting behaviour for subgraph centrality measures does not depend on the underlying scalar function $f(x)$. However this is not true for the case of total communicability; here, in the generic case (iii) in theorem 5.15, this quantity is seen to depend on the coefficients $c_1, c_2, \dots, c_k$. An important corollary of this result is that, unlike in the NBT case $k = 2$ studied in [19,24], there can be no universal eigenvector-based non-cycling centrality measure arising as the limit of the walk-counting version.

Theorem 5.15. Let $c_r > 0$ for all $r$, and assume that the underlying graph is simple and connected. Consider the centrality measures $x_k(t)$ and $y_k(t)$ in (5.1) for $k > 2$. Suppose that the power series converge with radii of convergence $t_k$. Then, in the limit $t \to 0$ we have
$$x_k(t \to 0) \sim d^{(\ell)}, \qquad y_k(t \to 0) \sim d,$$
where $\ell > k$ is the length of the shortest cycle that can be traversed, if any, $d^{(\ell)}$ is the vector whose $i$th entry is the number of cycles of length $\ell$ centred at node $i$, and $d$ is the vector of degrees. Moreover,
(i) if the graph does not contain any cycle of length $> k$, then $t_k = \infty$ and $x_k(t \to t_k) \sim \mathbf{1}$ and $y_k(t \to \infty) \sim p_{h+k-1;k}(A)\mathbf{1}$, where $h$ is the length of the longest path in $P_k$;
(ii) if the graph contains exactly one cycle of length $> k$, then $x_k(t \to t_k)$ only depends on the distance of each node from the cycle of length $> k$, while $y_k(t \to t_k) \sim \mathbf{1}$;
(iii) if the graph contains at least two cycles of length $> k$, then $x_k(t \to t_k)$ and $y_k(t \to t_k)$ exist and are unique. The limit vector $x_k(t \to t_k)$ depends on $k$, but not on the choice of the coefficients $c_r$. Similarly, the shifted limit vector $y_k(t \to t_k) - (c_0 \mathbf{1} + c_1 d + \cdots + c_{k-1} p_{k-1;k}(A)\mathbf{1})$ depends on $k$, but not on the choice of the coefficients $c_r$.
Proof. The limit $t \to 0$ can be analysed straightforwardly, as was the case in theorem 4.3. For the limit $t \to t_k$, there are three cases:
(i) If the graph does not contain any cycle of length $> k$, then the graph associated with $P_k$ is a cycle-less digraph, and hence $t_k = \infty$ and $x_k(t \to \infty) \sim \mathbf{1}$. Moreover, if we let $h$ be the length of the longest directed path in the graph associated with $P_k$, then $y_k(t \to \infty) \sim p_{h+k-1;k}(A)\mathbf{1}$; see theorem 5.5.
(ii) If the graph contains exactly one cycle of length $> k$, the matrix $P_k$ has 1 as its eigenvalue, with algebraic and geometric multiplicity two. Indeed, using the same partition of nodes described in §5c and the labelling of paths of length $(k - 1)$ described in theorem 5.11, it follows that $\Lambda(P_k) = \Lambda(P_k(B)) \cup \{0\}$. It remains to describe $P_k(B)$. By the remarks in §5c we can focus on studying $Q_k(B)$. It is clear that anything that touches any shortcut is not indefinitely prolongable. Hence, up to permutation similarity, $Q_k(B) = C \oplus C^T$ where $C \in \mathbb{R}^{\ell \times \ell}$ is the circulant adjacency matrix of a directed cycle. It immediately follows that 1 is an eigenvalue of $P_k$ with both algebraic and geometric multiplicity 2. The conclusion then follows using a similar reasoning to that of theorem 4.3 (ii).
(iii) Finally, suppose that the graph contains at least two cycles of length $> k$. The matrix $P_k$ is then permutation similar to (5.5), where the matrix $P_k(B) \ne 0$ in (5.6) is now nonnegative and irreducible. Therefore, it follows from the Perron-Frobenius theorem that the spectral radius of $P_k(B)$, and hence of $P_k$, is a simple positive eigenvalue. The conclusion then follows from a similar reasoning to that of theorem 4.3 (i).

Tests on real data
In this section we record the dimension (number of rows/columns), the number of nonzero elements, and density of the square matrices P 1 = A, P 2 = B, P 3 , P 4 for some example networks. Our aim is to get a feel for the growth of these quantities as a function of the initial network size, n. We first consider samples from widely used random graph models, where testing over a range of n is straightforward. We then take real, fixed, network data and work on increasingly large subgraphs.
We begin by pointing out some relevant analytical results. For any undirected graph $G$ with $n_1 := n$ nodes and $m_1 := m$ (directed) edges, the number of nonzeros in $P_2 \in \mathbb{R}^{n_2 \times n_2}$, where $n_2 = m_1$, corresponds to twice the number of undirected open paths of length two in $G$; so (e.g. [49])
$$m_2 = \sum_{i=1}^{n} d_i (d_i - 1),$$
where $d_i$ is the degree of node $i$. Similarly, the non-triangulating matrix $P_3 \in \mathbb{R}^{n_3 \times n_3}$, where $n_3 = m_2$, has a number of nonzeros that corresponds to twice the number of undirected open paths of length three in $G$, so
$$m_3 = \sum_{i \to j} (d_i - 1)(d_j - 1) - \mathrm{trace}(A^3),$$
where the summation was taken over the $m_1$ directed edges in $G$. We define the density $\delta_k$ of the matrix $P_k$ for $k = 1, 2, 3, \dots$ as $\delta_k = m_k / (n_k (n_k - 1))$.
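The two counts can be checked against brute-force enumeration on a small graph. The sketch below assumes the standard degree-based identities $m_2 = \sum_i d_i(d_i - 1)$ and $m_3 = \sum_{i \to j}(d_i - 1)(d_j - 1) - \mathrm{trace}(A^3)$ for the number of directed open paths of length two and three; the example graph (a triangle with a two-edge tail) and the function names are illustrative choices only.

```python
import numpy as np
from itertools import permutations

def count_directed_paths(A, length):
    """Brute-force count of directed open paths with `length` edges."""
    n = A.shape[0]
    return sum(
        all(A[p[i], p[i + 1]] for i in range(length))
        for p in permutations(range(n), length + 1)
    )

def m2_formula(A):
    d = A.sum(axis=1)
    return int((d * (d - 1)).sum())

def m3_formula(A):
    d = A.sum(axis=1)
    src, dst = np.nonzero(A)                       # the directed edges i -> j
    walks = ((d[src] - 1) * (d[dst] - 1)).sum()    # NBT walks of length three
    return int(walks - np.trace(A @ A @ A))        # remove closed triangles

# Triangle 0-1-2 with a tail 2-3-4.
A = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1

print(count_directed_paths(A, 2), m2_formula(A))
print(count_directed_paths(A, 3), m3_formula(A))
```

The enumeration is exponential in the path length and is meant only as a check on small inputs.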
In figure 2 we display on a semi-logarithmic scale (a) the evolution of the dimension n k of the matrices P k for k = 1, 2, 3, 4, (b) the evolution of the number of nonzeros m k , and (c) that of their densities δ k for networks of increasing size built using the smallw function from the CONTEST toolbox for Matlab [50], with default parameters. The function smallw(n) returns the adjacency matrix of an independent sample from a class of small world networks [51] with n nodes. In our tests, we selected n = 100, 200, . . . , 4900, 5000. For each of these we have computed the dimensions of the matrices P k for k = 1, 2, 3, 4, the number of nonzeros, and their densities; we ran this test 100 times and averaged the results. Error bars are also shown in the plots to indicate the standard errors. The same test was run for networks of increasing size built using a preferential attachment type of model [52]: pref(n), for the same values of n. Results are displayed in figure 3.  For these two widely used models, it can be seen that although the dimension of the matrices P 2 , P 3 and P 4 increases considerably with the size of the original network, they remain very sparse, thus allowing for fast computations.
For our tests on real world networks, we raise the dimension by constructing increasingly large, well-connected subsets of a fixed network. To do this, we first compute the Fiedler vector of the largest connected component. Since the Fiedler vector is an eigenvector of the graph Laplacian, it is defined up to scalar, nonzero multiples. We retained the sign returned when the eigenvector was computed using the MATLAB built-in function eigs and we selected the n mod 100 1 nodes corresponding to the largest positive entries in the Fiedler vector. We iterated this process by adding, at each new step, 100 more nodes to the subgraph using the ordering of the nodes induced by the Fiedler vector, until we reached the size of the largest connected component of the original graph. Since close components in the Fiedler vector are good candidates for members of the same cluster [53], this process is designed to run through well connected neighbourhoods. The dimension and density of P 1 , P 2 and P 3 are displayed in figure 4 for the largest connected component of the collaboration network CA-HEPTH (n = 8638) and in figure 5 for the largest connected component of the collaboration network ERDOS02 (n = 5534). Both networks are available at [54]. On the x-axis we display the dimension of P 1 , P 2 and P 3 and on the y-axis we display the number of nonzeros (top plots) and the density (bottom plots). The results associated with P 1 are plotted in a semi-logarithmic scale, while the results for P 2 and P 3 are displayed in log-log plots. Again, we observe that the non-k-cycling matrices are rather sparse for these real world networks.

Conclusion
Motivated by the wide application of non-backtracking walks, our aim here was a natural extension of this concept to the case of non-triangulating, non-squaring and, in general, the elimination of all cycles. From a practical perspective, we showed that recursively unfolding the Hashimoto matrix construction provides building blocks for the required generating functions and non-cycling walk centralities. We also developed a range of theoretical results that characterize the spectra of the associated matrices and the limiting behaviour of the centrality measures.
We hope that this new computational and analytical framework will initiate further study in areas where non-backtracking walks have proved attractive. In particular, for the network science setting of this work, we envisage progress in a number of directions, including: development of spectral results, such as the decay of ρ(P k ) in part (i) of theorem 5.8 as k increases, for specific graph classes, and their consequences in terms of localization of centrality measures; fast linear algebra algorithms that can exploit the structure of the matrix-based subproblems, including the evaluation of general power series in tP k ; and large-scale tests of the new network science measures on real-life complex networks of current research interest in science and technology.