A framework for second order eigenvector centralities and clustering coefficients

We propose and analyse a general tensor-based framework for incorporating second order features into network measures. This approach allows us to combine traditional pairwise links with information that records whether triples of nodes are involved in wedges or triangles. Our treatment covers classical spectral methods and recently proposed cases from the literature, but we also identify many interesting extensions. In particular, we define a mutually-reinforcing (spectral) version of the classical clustering coefficient. The underlying object of study is a constrained nonlinear eigenvalue problem associated with a cubic tensor. Using recent results from nonlinear Perron--Frobenius theory, we establish existence and uniqueness under appropriate conditions, and show that the new spectral measures can be computed efficiently with a nonlinear power method. To illustrate the added value of the new formulation, we analyse the measures on a class of synthetic networks. We also give computational results on centrality and link prediction for real-world networks.

Coefficient [47]; here we give extra weight to nodes that form triangles with nodes that are themselves involved in important triangles. We show that our general framework can be studied using recently developed tools from nonlinear Perron-Frobenius theory. As well as deriving existence and uniqueness results we show that these measures are computable via a nonlinear extension of the power method; see Theorem 4.1.
The manuscript is organized as follows. In section 2 we summarize relevant existing work on spectral measures in network science. Section 3 sets out a general framework for combining first and second order information through a tensor-based nonlinear eigenvalue problem. We also give several specific examples in order to show how standard measures can be generalized by including second order terms. In section 4 we study theoretical and practical issues. Section 5 illustrates the effect of using second order information through a theoretical analysis on a specific class of networks. In section 6 we test the new framework on real large scale networks in the context of centrality assignment and link prediction. Conclusions are provided in section 7.
2 Background and related work

Notation
A network or graph G = (V, E) is defined as a pair of sets: nodes V = {1, 2, . . . , n} and edges E ⊆ V × V among them. We assume the graph to be undirected, so that for all (i, j) ∈ E it also holds that (j, i) ∈ E, unweighted, so that all connections in the network have the same "strength", and connected, so that it is possible to reach any node in the graph from any other node by following edges. We further assume for simplicity that the graph does not contain self-loops, i.e., edges that point from a node to itself.
A graph may be represented via its adjacency matrix, A = (A ij ) ∈ R n×n , where A ij = 1 if (i, j) ∈ E and A ij = 0 otherwise. Under our assumptions, this matrix will be symmetric, binary and irreducible. We write G A to denote the graph associated with the adjacency matrix A.
We let 1 ∈ R n denote the vector with all components equal to 1 and 1 i ∈ R n denote the ith vector of the standard basis of R n .

Spectral centrality measures
A centrality measure quantifies the importance of each node by assigning to it a nonnegative value. This assignment must be invariant under graph isomorphism, meaning that relabeling the nodes does not affect the values they are assigned. We focus here on degree centrality and a family of centrality measures that can be described via an eigenproblem involving the adjacency matrix. This latter family includes as special cases eigenvector centrality and PageRank.
The degree centrality of a node is found by simply counting the number of neighbours that it possesses; so node i is assigned the value d i , where d = A1. Degree centrality treats all connections equally; it does not take account of the importance of those neighbours. By contrast eigenvector centrality is based on a recursive relationship where node i is assigned a value x i ≥ 0 such that x is proportional to Ax. We will describe this type of measure as mutually reinforcing, because it gives extra credit to nodes that have more important neighbours. Under our assumption that A is irreducible, the eigenvector centrality measure x corresponds to the Perron-Frobenius eigenvector of A. We note that this measure was popularized in the 1970s by researchers in the social sciences, [42], but can be traced back to algorithms used in the 19th century for ranking chess players, [44]. For our purposes, it is useful to consider a general class of eigenvector based measures of the form where M ∈ R n×n is defined in terms of the adjacency matrix A. For example, we may use the adjacency matrix itself, M = A, or the PageRank matrix with c ∈ (0, 1), v ≥ 0 such that v 1 = 1 and D the diagonal matrix such that D ii = d i . With this second choice, the eigenvector solution of (1) is the PageRank vector [24].

Watts-Strogatz clustering coefficient
The Watts-Strogatz clustering coefficient was used in [47] to quantify an aspect of transitivity for each node. To define this coefficient, we use (i) = (A 3 ) ii /2 to denote the number of unoriented triangles involving node i. Note that node i is involved in exactly d i (d i − 1)/2 wedges centred at i, that is, paths of the form hij where h, i, j are distinct. Hence node i can be involved in at most d i (d i − 1)/2 triangles. The local Watts-Strogatz clustering coefficient of node i is defined as the fraction of wedges that are closed into triangles: It is easy to see that c i ∈ [0, 1] with c i = 0 if node i does not participate in any triangle and c i = 1 if node i has not left any wedges unclosed. Related to this measure of transitivity for nodes there are two network-wide versions; the average clustering coefficient and the global clustering coefficient or graph transitivity [36] where |K 3 | is the number of unoriented triangles in the network and the multiplicative factor of 6 comes from the fact that each triangle closes six wedges, i.e., the six ordered pairs of edges in the triangle. This latter measure has been observed to typically take values between 0.1 and 0.5 for real world networks; see [22]. The global and average clustering coefficients have been found to capture meaningful features and have found several applications [38,40]; however, they may behave rather differently for certain classes of networks [16]. In this work we focus on the local measure defined in (3). Beyond social network analysis, this index has found application, for example, in machine learning pipelines, where nodes features are employed to detect outliers [30] or to inform role discovery [1,26], in epidemiology, where efficient vaccination strategies are needed [15], and in psychology [3], where it is desirable to identify at-risk individuals. We see from (3) that the Watts-Strogatz clustering coefficient may be viewed as a second order equivalent of degree centrality in the sense that it is not mutually reinforcing-a node is not given any extra credit for forming triangles with well-clustered nodes. In Definition 3.2 below we show how a mutually reinforcing clustering coefficient can be defined.

General eigenvector model
To incorporate second order information, given a tensor T ∈ R n×n×n and a parameter p ∈ R we define the operator T p : R n → R n that maps the vector x ∈ R n to the vector entrywise defined as where µ p (a, b) is the power (or binomial) mean Recall that the following well known properties hold for µ: i) lim p→0 µ p (a, b) = |ab| is the geometric mean; ii) µ −1 (a, b) = 2(|x| −1 + |y| −1 ) −1 is the harmonic mean; iii) lim p→+∞ µ p = max{|a|, |b|} is the maximum function; whereas lim p→−∞ µ p = min{|a|, |b|} is the minimum.
We may then define the following nonlinear network operator, and associated spectral centrality measure, which combines first and second order interactions. Definition 3.1. Let α ∈ R be such that 0 ≤ α ≤ 1, let p ∈ R and let M ∈ R n×n and T ∈ R n×n×n be an entrywise nonnegative square matrix and an entrywise nonnegative cubic tensor associated with the network, respectively. Define M : R n → R n as Then the corresponding first and second order eigenvector centrality of node i is given by x i ≥ 0, where x solves the constrained nonlinear eigenvalue problem If we set α = 1 in (6) then only first order interactions are considered, and we return to the classical eigenvector centrality measures discussed in section 2. Similarly, with α = 0 only second order interactions are relevant.
In the next subsection we discuss specific choices for M and T . We also note that in order for the measure in Definition 3.1 to be well defined, there must exist a unique solution to the problem (6). We consider this issue in section 4.

Specifying M and T
In Definition 3.1, the matrix M should encode information about the first order (edge) interactions, with the tensor T representing the triadic relationships among node triples, that is, second order interactions.
Useful choices of M are therefore the adjacency matrix or the PageRank matrix (2). Another viable choice, which we will use in some of the numerical experiments, is a rescaled version of the adjacency matrix M = AD −1 , which we will refer to as the random walk matrix.
We now consider some choices for the tensor T to represent second order interactions.
Binary triangle tensor. Perhaps the simplest choice of second order tensor is As discussed, for example, in [45], we can build T B with worst case computational complexity of O(n 3 ) or O(m 3/2 ), where n is the number of nodes in the network and m is the number of edges. Moreover, in [6] the authors construct the triangles tensor of four large real-world networks (Email EUAll, soc Epinions1, wiki Talk, twitter combined) and observe that the number of nonzero entries in T B is O(6m). Note also that this tensor is closely related to the matrix A•A 2 , where • denotes the componentwise product (also called the Hadamard or Schur product), as shown in (13). It can be easily verified that, regardless of the choice of p, Random walk triangle tensor. A "random walk" normalization of the tensor T B in (7), which will be denoted by T W ∈ R n×n×n , is entrywise defined as where (j, k) = (A • A 2 ) jk is the number of triangles involving the edge (j, k). This is reminiscent of the random walk matrix M ij = (AD −1 ) ij = δ ij∈E /d j (here δ denotes the Kronecker delta) and this is the reason behind the choice of the name.
Clustering coefficient triangle tensor. An alternative normalization in (7) gives Note that, if i, j, k form a triangle, then d i ≥ 2 and hence (T C ) ijk is well defined. This tensor incorporates information that is not used in (7) and (8)-the number of transitive relationships that each node could be potentially involved in-while also accounting for the second order structure actually present. We refer to (12) as the clustering coefficient triangle tensor because for any p we have (T C ) p (1) = c, the Watts-Strogatz clustering coefficient vector. We will return to this property in subsection 3.3.
Local closure triangle tensor. The local closure coefficient [50] of node i is defined as where is the number of paths of length two originating from node i, and N (i) is the set of neighbours of node i. We may also write w = Ad − d = A 2 1 − A1. The following result, which is an immediate consequence of the definition of w(i), shows that we may assume w(i) = 0 when dealing with real-world networks. We then define the local closure triangle tensor as It is easily checked that (T L ) p (1) = h for all p.
Next, we briefly discuss the main differences, for the purposes of this work, among these four tensorial network representations.
The binary triangle tensor (7) and random walk triangle tensor (8) provide no information concerning the wedges involving each node, and hence the consequent potential for triadic closure. Indeed, networks that have very different structures from the viewpoint of potential and actual transitive relationships are treated alike. For example, consider the two networks in row (a) of Figure 1, where solid lines are used to represent the actual edges in the network. The two networks are represented by the same tensors in the case of (7) and (8), but are not equivalent from the viewpoint of transitive relationships. Indeed, by closing wedges following the principle underlying the Watts-Strogatz clustering coefficient, in the network on the left node 1 could participate in five more triangles, whilst in the graph on the right it could participate in only two more. These are highlighted in Figure 1, row (a), using dashed lines. On the other hand, the clustering coefficient triangle tensor defined in (9) encodes in its entries the "potential" for triadic closure of node 1; indeed, for the network on the left it holds that (T C ) 123 = (T C ) 132 = 1/12, while these entries are (T C ) 123 = (T C ) 132 = 1/6 for the network on the right. These values show that there is a potential for node 1 to be involved in respectively 12 and 6 directed triangles.
The local closure triangle tensor defined in (12) encodes another type of triadic closure propertythe potential of a node to become involved in triangles by connecting to nodes that are at distance two from it. In the networks depicted in Figure 1, row (b), it is clear that no such triangles can be formed in the network on the left, while there is one that could be formed in the graph on the right (dashed edge). For the entries of the associated tensor T L , the left network in row (b) of Figure 1 has (T L ) 123 = (T L ) 132 = 1/2, and indeed node 1 is participating in both possible directed triangles that can be formed according to the principals of local closure. The network on the right has (T L ) 123 = (T L ) 132 = 1/3.

The linear cases
The map M defined in (5) becomes linear for particular choices of p and α. One case arises when α = 1, whence it reduces to a standard matrix-vector product, M(x) = M x, and (3.1) boils down to a linear eigenvector problem (1). Using the particular choices of M described in the previous subsection, it then follows that our model includes as a special case standard eigenvector centrality and PageRank centrality. Now let α ∈ [0, 1) and p = 1. Then the mapping T p : R n → R n also becomes linear; indeed, entrywise it becomes and T 1 (x) reduces to the product between the vector x and the matrix with entries 1 2 ( k T ijk + T ikj ). In particular, if the tensor T is symmetric with respect to the second and third modes, i.e. T ijk = T ikj for all j, k, it follows that Note that this is the case for all the tensors defined in subsection 3.1.
We now explicitly compute ( k T ijk ) for some of the tensors T presented in subsection 3.1. If T = T B is the binary triangle tensor in (7), it follows that and hence Overall, the map M then acts on a vector x as follows and so the solution to the constrained eigenvector problem (6) is the Perron-Frobenius eigenvector of the matrix αA This has a flavour of the work in [7], where the use of A • A 2 is advocated as a means to incorporate motif counts involving second order structure. Other choices of the tensor T yield different eigenproblems. For example, when T = T C in (9) we have and hence (5) becomes where † denotes the Moore-Penrose pseudo-inverse. If we let T = T L , as defined in (12), we obtain Note that in formula (14) it is possible to have n k=1 (T L ) ijk = 0 even in the case where d j ≥ 2. This is because, as observed in Proposition 3.1, there are cases where w(i) > 0 but i does not form any triangle and thus Using (14) we obtain . The eigenvector problem (6) then becomes

Spectral clustering coefficient: α = 0
While the choice of α = 1 yields a linear and purely first order map, the case α = 0 corresponds to a map that only accounts for second order node relations. In particular, this map allows us to define spectral, and hence mutually reinforcing, versions of the Watts-Strogatz clustering coefficient (3) and the local closure coefficient (11), where the power mean parameter p in (4) controls how the coefficients of neighbouring nodes are combined. We therefore make the following definition.
Definition 3.2. Let T ∈ R n×n×n be an entrywise nonnegative cubic tensor associated with the network. The spectral clustering coefficient of node i is the ith entry of the vector x ≥ 0 which solves the eigenvalue problem (6) with α = 0 in (5); that is, The solution for T = T C ∈ R n×n×n in (9) will be referred to as the spectral Watts-Strogatz clustering coefficient, and the solution for T = T L ∈ R n×n×n in (12) will be referred to as the spectral local closure coefficient.
We emphasize that, as for standard first order coefficients based on matrix eigenvectors, the spectral clustering coefficient (15) is invariant under node relabeling. Indeed, if T is any tensor associated with the network as in subsection 3.1 and π : V → V is a relabeling of the nodes, i.e., a permutation, then the tensor associated to the relabeled graph is T ijk = T π(i)π(j)π(k) and thus x solves (15) if and only if T p ( x) = λ x, where x π(i) = x i , for all i ∈ V . Of course, the same relabeling invariance property carries over to the general setting α = 0.
Note that if node i does not participate in any triangle, then the summation describing the corresponding entry in T p (x) is empty, and thus the spectral clustering coefficient for this node is zero, as expected. Moreover, the converse is also true, since T ≥ 0 and x ≥ 0. On the other hand, since the spectral clustering coefficient x is defined via an eigenvector equation for T p , it follows that it cannot be unique as it is defined only up to a positive scalar multiple. Indeed, we have T p (θx) = θT p (x) for any θ ≥ 0. Hence, when T = T C , unlike the standard Watts-Strogatz clustering coefficient, it is no longer true that a unit spectral clustering coefficient identifies nodes that participate in all possible triangles. However, we will see in the next section that once we have a solution x of (15) any other solution must be a positive multiple of x. More precisely, we will show that under standard connectivity assumptions on the network, the spectral clustering coefficient and, more generally, the solution to (6) is unique up to positive scalar multiples. This fosters the analogy with the linear setting (1). Therefore, it is meaningful to normalize the solution to (6) and compare the size of its components to infer information on the relative importance of nodes within the graph. The vector T p (1), which is independent of the choice of p, defines a "static" counterpart of the spectral clustering coefficient obtained as the Perron-Frobenius eigenvector x of T p . This may be viewed as a second order analogue of the dichotomy between degree centrality and eigenvector centrality, the former being defined as A1 and the latter as the Perron-Frobenius eigenvector of A. As in the first order case, even though the spectral coefficient x ∝ T p (x) carries global information on the network while the static version T p (1) is highly local, the two measures can be correlated. An example of this phenomenon is shown in Figure 2, which scatter plots T p (1) against T 0 (x), for different choices of T , on the unweighted version 1 of the neural network of C. Elegans compiled by Watts and Strogatz in [47], from original experimental data by White et al. [48]; see Table 1 for further details of this network.
We also remark that our general definition of spectral clustering coefficient in Definition 3.2 includes in the special case p → 0 the Perron H-eigenvector of the tensor T [21]. Indeed, it is easy to observe that the change of variable y 2 = x yields where T yy is the tensor-vector product (T yy) i = jk T ijk y j y k . This type of eigenvector has been used in the context of hypergraph centrality; see, e.g., [4].
The choice of the tensor T affects the way the triangle structure is incorporated in our measure, as we have previously illustrated in the small toy networks in Figure 1. Examples of the differences that one may obtain on real-world networks are shown in Figures 3 and 4. A description of the datasets used in those figures is provided in Section 6 and in Table 1. Figure 3 displays the Karate network and highlights the ten nodes which score the highest according to the spectral clustering coefficient for different choices of T . In this experiment we select p = 0, and thus we are actually computing the Perron H-eigenvector of the corresponding tensors. The size of each of the top ten nodes in Figure 3 is proportional to their clustering coefficients. In Figure 4, instead, we display how the H-eigenvectors corresponding to different triangle tensors correlate with the degree of the nodes for four real-world networks; We group nodes by logarithmic binning of their degree and plot the average degree versus the average clustering coefficient in each bin. As expected, the Watts-Strogatz spectral clustering coefficient may decrease when the degree increases, in contrast with other choices of the triangle tensor. A similar phenomenon is observed, for example, in [50].
In the next section we discuss existence and uniqueness, up to scalar multiples, of a solution to (6). We also describe a power-iteration algorithm for its computation. We begin by discussing the linear case where α = 1 or p = 1, so that the nonnegative operator M : R n → R n is an entrywise nonnegative matrix B. Here, results from Perron-Frobenius theory provide conditions on M that guarantee existence of a solution to (6) and computability of this solution via the classical power method. These conditions are typically based on structural properties of M and of the associated graph. We review below some of the best known and most useful results from this theory.
First, given the entrywise nonnegative matrix B ∈ R n×n , let G B be the adjacency graph of B, with nodes in {1, . . . , n} and such that the edge i → j exists in G B if and only if B ij > 0. Now, recall that a graph is said to be aperiodic if the greatest common divisor of the lengths of all cycles in the graph is one. Also, the matrix B is primitive if and only if there exists an integer k ≥ 1 such that B k > 0, and, moreover, B ≥ 0 is primitive if and only if G B is aperiodic.
It is well known that when G B is strongly connected, then there exists a unique (up to multiples) eigenvector of B, and such vector is entrywise positive. Moreover, this eigenvector is maximal, since the corresponding eigenvalue is the spectral radius of B and, if G B is aperiodic, the power method iteration x k+1 = Bx k / Bx k converges to it for any starting vector x 0 ∈ R n .
In the general case, we will appeal to nonlinear Perron-Frobenius theory to show that the properties of existence, uniqueness and maximality of the solution to (6) carry over to the general nonlinear setting almost unchanged, and to show that an efficient iteration can be used to compute this solution. We first note that for any α ∈ [0, 1], any p ∈ R and any θ > 0 we have thus if x ≥ 0 solves (6), then any positive multiple of x does as well. Therefore, as for the linear case, uniqueness can only be defined up to scalar multiples. We continue by introducing the graph of M.  3. If x is any nonnegative eigenvector M(x) = λx with some zero entry, then λ < ρ(M).
If moreover G M is aperiodic, then (iv) For any starting point x 0 > 0, the nonlinear power method converges to the positive eigenvector of M. Moreover, for all k = 0, 1, 2, ... it holds with both the left and the right hand side sequences converging to ρ(M) as k → ∞.
Proof. The proof combines several results from nonlinear Perron-Frobenius theory. First, note that M is homogeneous of degree one and order preserving. Indeed, if x ≥ y ≥ 0 entrywise, then it is easy to verify that It follows that M has at least one entrywise nonnegative eigenvector that corresponds to the eigenvalue λ = ρ(M) (see, e.g., [32,Theorem 5.4.1]).
Next, recall that 1 j denotes the jth vector of the canonical basis of R n . Now let y j (β) = 1 + (β − 1)1 j be the vector whose jth component is the variable β ∈ R while all the other entries are equal to one. Thus note that if A Mij = 1, then lim β→∞ M(y j (β) i ) = ∞. Since G M is strongly connected, [19,Theorem 1] implies that M has at least one entrywise positive eigenvector u > 0 such that M(u) = λu, with λ > 0.
Third, we show uniqueness and maximality. Note that for any positive vector y > 0 and any p ≥ 0 we have that if G M is strongly connected then the Jacobian matrix of M evaluated at y is irreducible. In fact Therefore, [32,Theorem 6.4.6] implies that u is the unique positive eigenvector of M. Moreover, [32,Theorem 6.1.7] implies that for any other nonnegative eigenvector x ≥ 0 with M(x) = λx we have λ < ρ(M). As there exists at least one nonnegative eigenvector corresponding to the spectral radius, that must be u and we deduce that λ = ρ(M). This proves points (i) − (iii). For point (iv), we note that if G M is aperiodic then A M is primitive and this implies that the Jacobian matrix of M evaluated at u > 0 is primitive as well. Thus Theorem 6.5.6 and Lemma 6.5.7 of [32] imply that the normalized iterates of the homogeneous and order preserving map M converge to u. Finally, [20,Theorem 7.1] proves the sequence of inequalities in (16) and the convergence of both the sequences towards the same limit; α k and β k tend to ρ(M) as k → ∞.
We emphasize that because the mapping M of Theorem 4.1 is defined for an arbitrary nonnegative matrix M and nonnegative tensor T , the graph G M in that theorem may be directed. The next lemma shows that when M and T are defined as in subsection 3.1 the graph G M coincides with the underlying network. Thus, for the undirected case and with any of the choices in subsection 3.1, Theorem 4.1 applies whenever the original graph is connected.  n k=1 (T ijk + T ikj ) > 0 and hence at least one of the two terms has to be positive; however, from the possible definitions of T it is clear that T ijk and T ikj cannot be nonzero unless (i, j) ∈ E, i.e., unless M ij > 0.

Example network with theoretical comparison
In this section we describe theoretical results on the higher order centrality measures. Our overall aim is to confirm that the incorporation of second order information can make a qualitative difference to the rankings. We work with networks of the form represented in Figure 5. These have three different types of nodes: i) node 1, the centre of the wheel, that has degree m and connects to m nodes of the second type, ii) m nodes attached to node 1 and interconnected via a cycle to each other. Each type (ii) node also connects to k nodes of the third type, and iii) mk leaf nodes attached in sets of k to the m nodes of type (ii). We will use node 2 to represent the nodes of type (ii) and node m + 2 to represent the nodes of type (iii).
The network is designed so that node 1 is connected to important nodes and is also involved in many triangles. Node 2, by contrast, is only involved in two triangles and has connections to the less important leaf nodes. If we keep m fixed and increase the number of leaf nodes, k, then eventually we would expect the centrality of node 2 to overtake that of node 1. We will show that this changeover happens for a larger value of k when we incorporate second order information. More precisely, we set p = 1 and show that node 1 is identified by the higher-order measure as being more central than node 2 for larger values of k when compared with standard eigenvector centrality.
With this labeling of the nodes, the adjacency matrix A ∈ R n×n of the network has the form The eigenvector v = [x y1 T m z1 T mk ] T associated to the leading eigenvalue λ = 1 + √ 1 + m + k of A is such that λx = m y and it can be verified that  We now move on to the higher order setting. We begin by specifying the entries of the binary triangle tensor T B = (T B ) ijk defined in (7). It is clear that (T B ) ijk = 0 for all i = m+2, . . . , mk + m + 1. Moreover, . . . , m + 1 are such that (j, k) ∈ E 0 otherwise, and for i = 2, . . . , m + 1 Using (4) it follows that where v = [x y1 T m z1 T mk ] T as before. Overall we thus have that equation (6) rewrites as    λx = (2 − α)my λy = α(x + 2y + kz) + 4(1 − α)µ p (x, y) λz = αy.
For p = 1 and α ∈ (0, 1], this system yields The areas for which x > y in the two settings (standard eigenvector centrality α = 1 and higher order centrality α = 0.2, 0.5) are shaded in Figure 6 (left). It is readily seen that even for small values of m, k needs to become very large (when compared to m) in order for the centrality of nodes i = 2, . . . , m + 1 to become larger than that of node 1 when higher-order information is taken into account. In the standard eigenvector centrality setting we observe a very different behaviour (see Figure 6, left, α = 1). In Figure 6 (right) we display the areas for which x > y for different values of α when T C ∈ R n×n×n is used in (6). Indeed, specializing the definition in (9) to this example, it is easy to see that , and therefore the solution to (6)  is the number of triangles, C is the global clustering coefficient of the network, c is the average clustering coefficient, x C is the average spectral clustering coefficient, w is the average local closure coefficient, and x L is the average spectral local closure coefficient.
After some algebraic manipulation, we obtain that x > y if and only if αm If we now let p = 1, it is easy to show that λ satisfies with c 1 = 2(1−α) (k+3)(k+2) and c 2 = 2(1−α) m−1 . Remark 5.1. Similarly, if the local closure triangle tensor T L is used in the computation, we observe that x > y if and only if αm k+2 . There seems to be no appreciable difference between the profiles for α = 0.2, 0.5, 1, and hence they are not displayed here.
6 Applications and numerical results

Centrality measures
In the previous subsection we observed that α may have a significant effect on the node rankings. Results were shown for T B and T C , p = 1 and α = 0.2, 0.5, 1. In this subsection we test the role of α for real network data. We use α = 0.5 and α = 1 (corresponding to eigenvector centrality) and p = 0 in (6), and combine the adjacency matrix A and the binary tensor T B .
Our tests were performed on four real-world networks which are often used as benchmarks in the graph clustering and community detection communities, and are publicly available at [12]. The Karate network is a social network representing friendships between the 34 members of a karate club at a US university [52]. The network C. Elegans is a neural network. We use here an undirected and unweighted version of the neural network of C. Elegans compiled by Watts and Strogatz in [47], from original experimental data by White et al. [48]. The network Adjnoun is based on common adjective and noun adjacencies in the novel "David Copperfield" by Charles Dickens [41]. Chesapeake represents the interaction network of the Chesapeake Bay ecosystem. Here, nodes represent species or suitably defined functional groups and links create the food web [2]. In Table 1 we report the number of nodes n, (undirected) edges m and triangles = trace(A 3 )/6 for the four networks. We further display the global clustering coefficient C, the average clustering coefficient c, and the average spectral clustering coefficient x C , as well as the average local closure coefficient w [50] and its spectral counterpart x L ; see Defintion 3.2. Figure 7 scatter plots the newly introduced measure against eigenvector centrality for the four different networks. The centrality vectors are normalized with the infinity norm. For the network Karate we see very poor correlation between the two measures. Stronger correlation is displayed for the other networks, but it is still to be noted that the top ranked nodes (corresponding to the nodes with largest centrality scores) differ for the two measures in all but one network, namely Adjnoun. Hence, using second order information can alter our conclusions about which nodes are the most central.

Link Prediction
Link prediction is a fundamental task in network analysis: given a network G 0 = (V, E 0 ), we must identify edges that are not in E 0 but should be there. This problem typically arises in two settings: (a) in a dynamic network where new connections appear over time, and (b) in a noisily observed network, where it is suspected that edges are missing [11,34,35]. For convenience, let us assume that E 0 is the set of edges that we observe and that E 1 with E 1 ∩ E 0 = ∅ is the set of edges that should be predicted, i.e., those that will appear in an evolving network or that are missing in a noisy graph. A standard approach for link prediction is to create a similarity matrix S, whose entries S ij quantify the probability that (i, j) ∈ E 1 . It is worth pointing out that since E 0 ∩ E 1 = ∅, then the nonzero pattern of S will be complementary to that of the adjacency matrix of G 0 . Over the years, several similarity measures have been proposed in order to quantify which nodes are most likely to link to a given node i [37]. While classical methods usually exploit the first order structure of connections around i, there is a growing interest in second order methods that take into account, for example, triangles.
In this context, we propose a new similarity measure based on M and its Perron eigenvector. This measure is a generalization of a well-known technique known as seeded (or rooted ) PageRank [23,29], which we now describe. Given a seed node ∈ V and a teleportation coefficient 0 ≤ c < 1, let x ( ) be the limit of the evolutionary process where P is the random walk matrix P = AD −1 . As 0 ≤ c < 1, it is easy to show that the limit exists and that it coincides with the solution to the linear system The seeded PageRank similarity matrix S P R is then entrywise defined by The idea behind (18) is that the sequence x k is capturing the way a unit mass centered in (the seed or root of the process), and represented in the model by 1 , propagates throughout the network following the diffusion rule described by P . This diffusion map is a first order random walk on the graph. In order to propose a new, second order, similarity measure, we replace this first order map with the second order diffusion described by M = αM + (1 − α)T p and we consider the associated diffusion process. To this end, we begin by observing that, independently of the choice of the starting point x 0 in (18), the first order diffusion process will always converge to x ( ) that satisfies yields As a consequence, the limit of the sequence (18) coincides with the limit of the normalized iterateŝ x k+1 = cP x k + (1 − c)1 , with x k+1 =x k+1 / x k+1 1 . On the other hand, when the linear process P is replaced by the nonlinear map M, the unnormalized sequence may not converge. We thus need to impose normalization of the vectors in our dynamical process defined in terms of M and seeded in the node :ŷ k+1 = cM(y k ) + (1 − c)1 k = 0, 1, 2, . . . . The plots show means and quartiles of the ratio between the fraction of correctly predicted edges using S M and the one obtained using S P R , over ten random trials for different values of p and α in (5).
Note that, for α = 1 and M = P in (5) we retrieve exactly the rooted PageRank diffusion (18). Unlike the linear case, the convergence of the second order nonlinear process (20) is not straightforward. However, ideas from the proof of Theorem 4.1 can be used to show that the convergence is guaranteed for any choice of the tensor T , of the matrix M , and of the starting point y 0 ≥ 0, provided that the graph G M is aperiodic.
Corollary 6.1. Let M : R n → R n be as in Definition 3.1 and let G M be its adjacency graph, as per Definition 4.1. If G M is aperiodic and y 0 > 0, then y k defined in (20) for a given seed converges to a unique stationary point y ( ) > 0.
Proof. Let F : R n → R n be the map F(y) = cM(y) + (1 − c) y 1 1 , where we have omitted the dependency of the map on for the sake of simplicity. Note that the limit points of (20) coincide with the fixed points of F on the unit sphere y 1 = 1. Note moreover that F is homogeneous, i.e., F(θy) = θF(y), for all θ > 0. Finally, notice that the j-th column of the Jacobian matrix of F evaluated at z is ∂ ∂y j F(z) = c ∂ ∂y j M(z) + (1 − c)1 , which shows that if the Jacobian of M is irreducible, the same holds for the Jacobian of F. With these observations, the thesis follows straightforwardly using the same arguments as in the proof of Theorem 4.1, applied to F.
As for the linear dynamical process, the stationary distributions of (20) computed for different seeds allow us to define the similarity matrix S M : (S M ) ij = (y (i) ) j + (y (j) ) i .
In Figure 8 we compare the performance of the link prediction algorithm based on the standard seeded PageRank similarity matrix S P R (18) and the newly introduced similarity matrix S M (20) induced by M with M = P and T = T W , the random walk triangle tensor. The tests were performed on the real-world networks UK faculty and Small World citation. The network UK faculty [39] represents the personal friendships network between the faculty members of a UK University. The network contains n = 81 vertices and m = 817 edges. The network Small World citation [18] represents citations among papers that directly cite Milgram's small world paper or contain the words "Small World" in the title. This network contains n = 233 nodes and m = 994 edges. We transformed both networks by neglecting edge directions and weights.
The experiments were performed as follows. We start with an initial network G = (V, E) and we randomly select a subset of its edges, which we call E 1 , of size |E 1 | ≈ |E|/10. We then define G 0 = (V, E 0 ) to be the graph obtained from G after removal of the edges in E 1 , so that E 0 = E \E 1 . Thus, working on the adjacency matrix of G 0 , we build the two similarity matrices S P R and S M . Then, for each similarity matrix S, we select from V × V \ E 0 the subset E S containing the |E 1 | edges with the largest similarity scores S ij . A better performance corresponds to a larger size of E 1 ∩ E S , since this is equivalent to detecting more of the edges that were originally in the graph. To compare the performance of the two similarity matrices, we thus computed the ratio |E S M ∩ E 1 |/|E S P R ∩ E 1 |. In Figure 8 we boxplot this quantity over 10 random runs where E 1 is sampled from the initial E with a uniform probability. Whenever the boxplot is above the threshold of 1, our method is outperforming standard seeded PageRank. The middle plots in the figure display the results for the two networks when α = 0.5 in (5) and we let p vary. On the other hand, the plots on the right display results for varying values of α and p = 0, which was observed to achieve the best performance in the previous test. Overall, the link prediction algorithm based on the similarity matrix S M typically outperforms the alternative based on S P R , especially for small values of p.

Conclusion
After associating a network with its adjacency matrix, it is a natural step to formulate eigenvalue problems that quantify nodal characteristics. In this work we showed that cubic tensors can be used to create a corresponding set of nonlinear eigenvalue problems that build in higher order effects; notably triangle-based motifs. Such spectral measures automatically incorporate the mutually reinforcing nature of eigenvector and PageRank centrality. As a special case, we specified a mutually reinforcing version of the classical Watts-Strogatz clustering coefficient.
We showed that our general framework includes a range of approaches for combining first and second order interactions, and, for all of these, we gave existence and uniqueness results along with an effective computational algorithm. Synthetic and real networks were used to illustrate the approach.
Given the recent growth in activity around higher order network features [5,4,6,10,14,27,28,31,33,46,49,50,51], there are many interesting directions in which this work could be further developed, including the design of such centrality measures for weighted, directed and dynamic networks, and the study of mechanistic network growth models that incorporate higher-order information.
Data and funding