Characterizing Computation in Artificial Neural Networks by their Diclique Covers and Forman-Ricci Curvatures

The relationship between the structural topology of artificial neural networks, their computational flow, and their performance is not well understood. Consequently, a unifying mathematical framework that describes computational performance in terms of underlying structure does not exist. This paper makes a modest contribution to understanding the structure-computational flow relationship in artificial neural networks from the perspective of the dicliques that cover the structure of an artificial neural network and the Forman-Ricci curvature of an artificial neural network's connections. Special diclique cover digraph representations of artificial neural networks useful for network analysis are introduced, and it is shown that such covers generate semigroups that provide algebraic representations of neural network connectivity.


I. INTRODUCTION
In recent years there has been an exponential growth in research focused upon machine learning and its application to those classes of real world problems that cannot be easily addressed using traditional computer algorithms. Such classes include image analysis, e.g. [1], web searches, e.g. [2], content filtering and matching, e.g. [3], and speech recognition, e.g. [4]. Due to its robustness and adaptability, the artificial neural network (ANN) has emerged as a computational tool of choice for solving many problems in these domains. The especially powerful deep neural networks and their biologically inspired learning algorithms, e.g. [5], have consistently demonstrated state of the art performance when applied to real world tasks such as these.
The relationship between the structural topology of an ANN, its computational flow, and its performance is not well understood. Although graph theory and algebraic topology have been used with some success to analyze local and global network properties, e.g. [6,7], a unifying mathematical framework that describes the computational performance of an ANN in terms of its underlying structure does not yet exist. This paper makes a modest contribution to understanding the structure-computational flow relationship in ANNs from the perspective of the dicliques, e.g. [8], that cover the structure of an ANN and the Forman-Ricci (FR) curvature, e.g. [9,10], of an ANN's connections. These diclique covers provide digraphs useful for the analysis of ANNs and are also shown to generate band semigroups [11] in which the natural order of their idempotent elements provides lower semi-lattice representations [12] related to ANN connectivity.
Each connection in an ANN has an easily calculated FR curvature that is determined by the associated node and connection weights. FR curvature provides a measure of the total amount of "computational flow" (or, more simply, flow) through a network node and quantifies the divergence of the flow emerging from a connection: the more negative the curvature, the more divergent the emergent flow and, intuitively, the more widespread the influence the connection has on the network. Thus, both curvature and flow, along with the underlying connection topology, provide a characterization of the relationship between computation and the structure of an ANN.
When an ANN's node weight, transfer function, and connection weight labels are ignored, the associated underlying connection topology of the ANN is assumed here to be that of an irreflexive binary relation S ⊂ N × N on the ANN's set of nodes N. Even though the numbers of nodes and connections in S can number in the tens of thousands, S can be algorithmically decomposed into maximally related subnetworks of the ANN called dicliques. Collections of dicliques which cover S provide a simpler system-like wiring diagram representation of the ANN's underlying structure than the digraph representation of S. An even simpler representation of the ANN's structure is its diclique digraph, which is easily obtained from its wiring diagram representation. Judicious assignments of FR curvatures and flows to the cover's dicliques and wiring-diagram connections (or to the diclique digraph's vertices and arcs) provide high level insights into the general importance of subnetwork and node contributions to an ANN computation. Furthermore, although not addressed here, time series of such diclique representations obtained as computations propagate through the layers of an ANN can provide additional insights into the relative importance of maximal subnetworks and nodes to ANN computation.

II. ARTIFICIAL NEURAL NETWORKS
An ANN is defined as an interconnected collection of nodes, i.e., artificial neurons. A connection (directed edge) between neurons, i.e., an artificial synapse, transmits a signal from the pre-synaptic neuron to the post-synaptic neuron which, in turn, aggregates and processes the incoming signals and then sends signals to the post-synaptic neurons connected to it. The neurons and synapses of an ANN which has been configured to perform a class of tasks have assigned weights determined by the algorithm and input cases used to train it. These weights increase or decrease transmitted signal strengths. A neuron also has a state value determined by the aggregate pre-synaptic signals it receives and its transfer function. Application of a threshold value to a neuron's aggregate input signal can be used to determine whether the neuron sends a transfer function valued signal to the post-synaptic neurons connected to it. Neurons are typically organized into layers, each of which can perform different transformations on its input signals. The first layer is the input layer, the last layer is the output layer, and the layers between the first and last are hidden layers. Input signals are transmitted from the first layer through the hidden layers and emerge as an output signal in the last layer. To be more precise, let the set L = {1,2,⋯,ℓ} index the layers and the set N_i = {1,2,⋯,n_i}, i ∈ L, index the neurons within the i-th layer. Thus, 1 ∈ L corresponds to the input layer, ℓ ∈ L corresponds to the output layer, and L − {1,ℓ} indexes the hidden layers. The collection of neurons in an ANN is the set N = {x_ij : i ∈ L, j ∈ N_i}, where x_ij is the j-th neuron in the i-th layer. The associated collection of synapses is the set of ordered pairs s ≡ (x_ij, x_(i+1)k) of an irreflexive binary relation S ⊂ N × N, where i ∈ L − {ℓ}, j ∈ N_i, and k ∈ N_(i+1) (here only feed forward ANNs are addressed and only consecutive layers are connected).
An ANN is fully connected if for each i ∈ L − {ℓ} there is a synapse (x_ij, x_(i+1)k) for every j ∈ N_i and k ∈ N_(i+1).
A simple ANN architecture showing (1) for several selected neurons is depicted in Fig. 1.
Here an ANN computation is initiated when signals σ_1j, j ∈ N_1, are input simultaneously at layer 1. The computation proceeds through the network one layer at a time according to (1), and the result of the computation emerges at the output layer i = ℓ, where each neuron x_ℓj manifests the output signal σ_ℓj, j ∈ N_ℓ. As previously mentioned, a threshold value can also be applied to neuron x_ij's aggregate signal in order to determine whether x_ij sends a signal to the post-synaptic neurons connected to it.
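The layer-by-layer computation just described can be sketched in code. Since equation (1) is not reproduced in this text, the aggregation rule below (weighted sum followed by a transfer function) is the standard choice and is only an assumption here; the function name `forward` and the nested-list weight layout are illustrative.

```python
import math

def forward(inputs, weights, transfer=math.tanh):
    """Propagate signals through a feed-forward ANN one layer at a time.

    inputs  : the signals entering layer 1, one per input neuron
    weights : weights[i][j][k] is the weight of the synapse from neuron j
              of layer i+1 to neuron k of layer i+2 (0-based lists)
    The weighted-sum-plus-transfer-function rule is assumed here, standing
    in for the paper's equation (1).
    """
    signals = list(inputs)
    for w in weights:  # one weight "matrix" per pair of consecutive layers
        signals = [transfer(sum(w[j][k] * signals[j]
                                for j in range(len(signals))))
                   for k in range(len(w[0]))]
    return signals  # the output-layer signals
```

With an identity transfer function and identity weights the input signals pass through unchanged, which is a convenient sanity check.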

III. DICLIQUE REPRESENTATIONS OF ARTIFICIAL NEURAL NETWORKS
Recall that the underlying network structure of an ANN is assumed here to be an irreflexive binary relation S on the ANN's set of neurons N. For realistically useful ANN architectures, S will generally consist of extremely large numbers of neurons and synapses, so that a diagrammatic representation of S is typically not easily understood. Consequently, information about the global topology of an ANN, the importance of various regions within an ANN, and relationships between various internal regions of an ANN are difficult, if not impossible, to extract from these complicated representations. However, instead of attempting to analyze diagrammatic representations of S, issues such as these can be largely alleviated by analyzing diclique cover representations of S obtained from a diclique decomposition of S.
A diclique of S is a pair (I, O) ≡ D, I ⊆ N, O ⊆ N, such that I × O ⊆ S and, whenever I ⊆ I′ and O ⊆ O′ with I′ × O′ ⊆ S, then (I′, O′) = (I, O) [8]. Within the context of ANNs, a diclique defines the set of synapses associated with a maximally connected (bipartite) sub-network of S, where the first and second diclique entries I and O are sets of pre- and post-synaptic neurons, respectively. Let 𝒟 be the set of dicliques of S. A diclique cover of S is a smallest subset C ⊆ 𝒟 which satisfies the equality

S = ⋃_{(I,O) ∈ C} I × O, (2)

in which case C is said to cover the ANN associated with S (note that in general an ANN can have many covers). The dicliques of an ANN cover provide a system level representation of an ANN that is much easier to analyze and understand than the standard diagrammatic representation of S. In such a system-level representation: (i) each element in C corresponds to a maximally connected sub-network (or sub-system) of neurons in the associated ANN (or total system); and (ii) if (I, O) ∈ C and (I′, O′) ∈ C, then for each x ∈ O ∩ I′ there is a directed connection from (I, O) to (I′, O′) which corresponds to a fundamental relationship between the sub-networks (I, O) and (I′, O′) established by the neuron x. Thus, the essence of the connectivity occurring in an ANN is given by a simplified (relative to its S diagram) system level diagram, or multi-digraph, derived from its diclique cover. The order of such a representation is |C| and its size is the number of connections between dicliques.
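For concreteness, here is a brute-force sketch of diclique enumeration: every diclique arises by taking the common out-neighbourhood O of some set of source neurons and then maximizing the input set against O. The exponential scan is only illustrative and suits toy relations; an algorithm such as that of [8] is the practical route. The function name `dicliques` is invented here.

```python
from itertools import combinations

def dicliques(relation):
    """Return all dicliques (I, O) of an irreflexive binary relation,
    given as a set of ordered pairs. Brute force: for every subset I of
    source nodes, take O as the common out-neighbourhood of I, then grow
    I to the largest set whose members each reach every node of O."""
    out = {}
    for a, b in relation:
        out.setdefault(a, set()).add(b)
    sources = sorted(out)
    found = set()
    for r in range(1, len(sources) + 1):
        for I in combinations(sources, r):
            O = set.intersection(*(out[x] for x in I))
            if O:  # maximize the input set for this output set
                I_max = frozenset(x for x in sources if O <= out[x])
                found.add((I_max, frozenset(O)))
    return found
```

On the small relation {(1,3),(1,4),(2,3),(2,4),(3,5),(4,5)} this yields exactly the two dicliques ({1,2},{3,4}) and ({3,4},{5}).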
As an example, observe from Fig. 1 that (2) is satisfied when C = {D_1, D_2, D_3, D_4, D_5}, so that C covers the ANN. Its associated system level representation is given in Fig. 2. The boxes in Fig. 2 are labelled with the corresponding dicliques of C and the directed connections between boxes are labelled with the neurons common to each diclique. Note the simplification introduced by using a diclique cover representation: whereas the S for the ANN in Fig. 1 has 12 neurons and 24 synapses, its diclique cover representation in Fig. 2 has 5 dicliques (i.e., is order 5) and 9 connections (i.e., is size 9) and effectively contains the same connectivity and direction of computational flow as S (line segments joining connections indicate that the associated neuron is shared by two dicliques). It is interesting to consider the case where the ANN in Fig. 1 is fully connected, in which case the dicliques are D_i = ({x_ij : j ∈ N_i}, {x_(i+1)k : k ∈ N_(i+1)}), i ∈ L − {ℓ}. It is clear that these dicliques uniquely cover the associated underlying relational structure and yield the greatly simplified order 3 and size 7 diclique cover representation in Fig. 3. This result can obviously be generalized for an arbitrary fully connected ANN. This generalization is stated for completeness and without proof as the following somewhat trivial theorem:

Theorem 1. The diclique cover of a fully connected ℓ layer ANN is unique and its diclique representation is order ℓ − 1 with dicliques D_i = ({x_ij : j ∈ N_i}, {x_(i+1)k : k ∈ N_(i+1)}), i ∈ L − {ℓ}, and size Σ_{i=2}^{ℓ−1} n_i.

An even more simplified diclique representation of an ANN is given by its diclique digraph, which has as its vertices the dicliques in the cover C associated with the ANN. However, unlike the diclique cover representation, which can have multiple connections between dicliques, a diclique digraph representation has at most one directed edge connecting dicliques, i.e., there is a directed edge from diclique (I, O) ∈ C to diclique (I′, O′) ∈ C only if O ∩ I′ ≠ ∅. The neurons in the set O ∩ I′ label the directed edge.
The order of a diclique digraph is |C| and its size is its number of directed connections. Thus, a diclique digraph representation significantly reduces the size of a diclique cover representation of an ANN while effectively containing the same connectivity and flow direction information as an associated diclique cover representation. The size 5 diclique digraph for the ANN in Fig. 1 is shown in Fig. 4. Note that D_1 and D_2 serve as sources and D_5 is a sink in the diclique digraph in Fig. 4. It is easy to see that the diclique digraph representation of the fully connected ANN whose diclique cover representation is given by Fig. 3 is a dipath of length 2. The generalization of this to an arbitrary fully connected ANN is obvious and stated without proof as the following theorem:

Theorem 2. The diclique digraph representation of a fully connected ℓ layer ANN is a dipath of length ℓ − 2 whose vertices are the dicliques D_i = ({x_ij : j ∈ N_i}, {x_(i+1)k : k ∈ N_(i+1)}), i ∈ L − {ℓ}, and whose directed edge connecting D_i and D_(i+1), i ∈ L − {ℓ − 1, ℓ}, is labeled with the |N_(i+1)| elements of the set {x_(i+1)k : k ∈ N_(i+1)}.
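Building the diclique digraph from a cover follows directly from the rule just stated: an arc from D to D′ exactly when O ∩ I′ ≠ ∅, labeled by the shared neurons. A minimal sketch (function and variable names are illustrative):

```python
def diclique_digraph(cover):
    """Given a cover as a list of dicliques (I, O), return the arcs of the
    diclique digraph as a dict mapping (source index, target index) to the
    set of shared neurons labeling that arc."""
    arcs = {}
    for i, (I1, O1) in enumerate(cover):
        for j, (I2, O2) in enumerate(cover):
            if i != j and O1 & I2:  # flow passes from D_i into D_j
                arcs[(i, j)] = O1 & I2
    return arcs
```

For the fully connected cover of Theorem 2 this yields a single dipath, since O_i ∩ I_j ≠ ∅ only when j = i + 1.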

IV. SYNAPTIC CURVATURE AND NEURON FLOW
Although the S, the cover, and the diclique digraph representations of an ANN provide connectivity and direction of flow information about the ANN, none of them provides a quantification of the importance and influence of neurons and synapses in ANN computation. Recently, FR curvature has been used to quantify the notions of flow and its divergence in directed networks [9,10]. Here FR curvature and flow are applied to ANN computation.
The FR curvature F(s) for a synapse s = (x_ij, x_(i+1)k) in an ANN is given by the expression

F(s) = w_(i+1)k (1 − Σ_{s′ ∈ E_(i+1)k} w_(i+1)k / √(w_s w_s′)), (3)

where w_(i+1)k is the weight of the post-synaptic neuron x_(i+1)k, w_s and w_s′ are synaptic weights, and E_(i+1)k is the set of synapses originating at x_(i+1)k. Note that here: (i) the FR curvature for s is determined using only synapses originating at x_(i+1)k; and (ii) F(s) = w_ℓk for any synapse s from layer ℓ − 1 because E_ℓk = ∅ and the summation vanishes. Since w_(i+1)k ≥ 0, then F(s) < 0 when the factor in parentheses in (3) is negative (obviously the FR curvature vanishes if w_(i+1)k = 0 or the sum in the factor in parentheses is unity). As mentioned in section I, F(s) quantifies the divergence of flow emerging from s, and the more negative F(s) is, the more divergent the flow. This increase in divergence with increasingly negative curvature is implicit in (3), where the sum effectively adds the (square roots of the) relative (to w_s) contributions of the weights of all synapses incident on x_(i+1)k. Since the terms in the sum are positive, so is the sum.
The greater the cardinality of the indexing set E_(i+1)k, i.e., the larger the number of synapses originating at x_(i+1)k, the greater the sum and, provided the sum exceeds unit value, the more divergent the flow. The following theorem and corollary illustrate this using the special cases of unit synaptic weights and full connectivity.

Theorem 3. If every neuron and synapse in an ANN has unit weight, then the FR curvature of the synapse s = (x_ij, x_(i+1)k) is F(s) = 1 − |E_(i+1)k|.

Corollary 4. If every neuron and synapse in a fully connected ANN has unit weight, then F(s) = 1 − |N_(i+2)|.

Proof. This result follows from the fact that for a fully connected ANN, |E_(i+1)k| = |N_(i+2)|. ∎

FR curvature also quantifies the total computational flow F(x_ij) through hidden layer neurons according to

F(x_ij) = F_in(x_ij) − F_out(x_ij), (4)

where F_in(x_ij), the total curvature of the synapses directed into x_ij, is the computational flow entering x_ij and F_out(x_ij), the total curvature of the synapses directed away from x_ij, is the computational flow leaving x_ij. The typical situation that occurs in highly connected ANNs is F_in(x_ij), F_out(x_ij) < 0. This has the interpretation that x_ij enhances (reduces) the divergence of flow, and hence enhances (reduces) the computational influence of the synapses directed into x_ij from layer i − 1, since F(x_ij) > 0 (< 0) when F_out(x_ij) < (>) F_in(x_ij). From the perspective of curvature, this latter case follows from the facts that: (i) the more negative the total curvature (or the greater the total divergence of computational flow) of the synapses directed away from x_ij relative to that associated with the synapses directed into (or convergent upon) x_ij, the more positive, or the greater, the total computational flow through x_ij; and (ii) the more negative the total curvature (or the greater the total divergence of computational flow) of the synapses convergent upon x_ij relative to that associated with the synapses directed away from x_ij, the more negative, or the less, the total computational flow through x_ij.
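The curvature computation can be sketched in code. Because the displayed form of (3) does not survive in this text, the formula below, F(s) = w_v(1 − Σ_{s′ ∈ E_v} w_v/√(w_s w_s′)) with v the post-synaptic neuron and E_v the synapses originating at v, is a reconstruction from the stated properties (it reduces to the node weight when E_v = ∅ and vanishes when the sum is unity); treat it, and all names below, as assumptions.

```python
import math

def fr_curvature(s, node_w, syn_w, out_edges):
    """Assumed FR curvature of synapse s = (u, v), using only the synapses
    that originate at the head neuron v:
        F(s) = w_v * (1 - sum_{s' in E_v} w_v / sqrt(w_s * w_s')).
    node_w: neuron weights; syn_w: synapse weights;
    out_edges[v]: the synapses originating at v (may be absent if none)."""
    u, v = s
    divergence = sum(node_w[v] / math.sqrt(syn_w[s] * syn_w[e])
                     for e in out_edges.get(v, ()))
    return node_w[v] * (1.0 - divergence)
```

With unit node and synapse weights this gives F(s) = 1 − |E_v|: curvature +1 when the head has no outgoing synapses, and increasingly negative curvature as the head's out-degree grows.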
The following theorem illustrates the notion of computational flow for fully connected ANNs with unit synaptic weights.
Theorem 5. The computational flow through the hidden neuron x_ij in a fully connected ANN in which every neuron and synapse has unit weight is

F(x_ij) = |N_(i−1)|(1 − |N_(i+1)|) − |N_(i+1)|(1 − |N_(i+2)|).

Proof. The result follows from the facts that each of the |N_(i−1)| synapses directed into x_ij has curvature 1 − |N_(i+1)| and each of the |N_(i+1)| synapses directed away from x_ij has curvature 1 − |N_(i+2)| (Corollary 4). ∎

Corollary 6. The computational flow through the hidden neuron x_ij in a fully connected ANN in which every neuron and synapse has unit weight is positive when |N_(i−1)|(1 − |N_(i+1)|) > |N_(i+1)|(1 − |N_(i+2)|).

Proof. Obvious. ∎

Note from this theorem that the typical case F_out(x_ij) < F_in(x_ij) < 0, for which F(x_ij) > 0, occurs when |N_(i+1)|, |N_(i+2)| > 1 and |N_(i+1)|(|N_(i+2)| − 1) > |N_(i−1)|(|N_(i+1)| − 1).
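Given per-synapse curvatures, the flow through a neuron, read here as total incoming curvature minus total outgoing curvature in the sense of (4), can be sketched as follows (the function name and dictionary layout are illustrative):

```python
def neuron_flow(x, curvature):
    """Computational flow through neuron x: the total FR curvature of the
    synapses entering x minus that of the synapses leaving x (an assumed
    reading of equation (4)). curvature maps each synapse (u, v) to F."""
    f_in = sum(c for (u, v), c in curvature.items() if v == x)
    f_out = sum(c for (u, v), c in curvature.items() if u == x)
    return f_in - f_out
```

For instance, with curvatures F((1,3)) = −2, F((2,3)) = −1, and F((3,5)) = −4, the flow through neuron 3 is (−3) − (−4) = 1 > 0: the outgoing synapse is more divergent than the incoming ones, so neuron 3 enhances the spread of computation.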

V. THE FR GRAPHS OF AN ARTIFICIAL NEURAL NETWORK
The synaptic curvatures and neuron flows of an ANN can be used to generate three general types of what will be called here FR graphs of an ANN. The first type, i.e., Type 1, is the labeled digraph representation of the underlying binary relation S associated with an ANN. Each neuron in N is labeled with the value of the flow through it and each synapse in S is labeled with the value of its FR curvature. Although Type 1 FR graphs provide very detailed quantitative curvature-flow-topology descriptions of ANNs, they are complicated to interpret and their analytical value is likely limited to the simplest ANNs.
A Type 2 FR graph of an ANN is its diclique cover representation in which each diclique is labeled with a single curvature value derived from the curvatures of the synapses contained within the diclique (discussed below) and each connection between dicliques is labeled with the value of the flow through each neuron associated with the connection. Such graphs are useful for analyzing which sub-networks (dicliques) are most effective for dispersing flow in an ANN and for identifying individual neurons which are important to the flow between subnetworks.
A Type 3 FR graph is a labeled diclique digraph representation of an ANN. Its dicliques are labeled with single curvature values obtained from the curvature values of the synapses contained in the dicliques (discussed below) and its connections between dicliques are labeled with single flow values obtained from the values of the flows through each neuron associated with the connection (discussed below). The advantages of Type 3 FR graphs are: (i) although they somewhat dilute the flow information of individual neurons connecting dicliques, they provide for easier analyses of the overall flow and divergence properties of ANNs than do Type 1 or 2 FR graphs; and (ii) they lend themselves to analyses using standard digraph algorithms (e.g., shortest dipaths, etc.).
Although a variety of approaches for determining the numerical values of diclique curvature could probably be developed, the obvious ones are the curvature sum and the mean curvature. Consider an ANN diclique (I, O) and let the sets A = {1,2,3,⋯,|I|} and B = {1,2,3,⋯,|O|} index the neurons in I and O. If the neurons in I and O are in layers i and i + 1, respectively, then the curvature sum F_Σ(I, O) of (I, O) is defined by

F_Σ(I, O) = Σ_{a ∈ A} Σ_{b ∈ B} F((x_ia, x_(i+1)b)),

and the mean curvature is F_Σ(I, O)/(|I||O|).
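In code, given a table of per-synapse curvatures, the curvature sum and mean of a diclique are direct (a sketch; names are illustrative):

```python
def diclique_curvature(diclique, curvature):
    """Curvature sum of a diclique (I, O): total FR curvature over the
    synapses in I x O. The mean curvature divides by |I||O|, the number
    of synapses in the diclique."""
    I, O = diclique
    total = sum(curvature[(a, b)] for a in I for b in O)
    return total, total / (len(I) * len(O))
```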
As an illustration of an FR graph and an interpretation of a portion of it, consider the Type 3 FR graph derived from Fig. 4 and shown in Fig. 5, where fictitious mean curvature and mean flow values are assigned to its dicliques and connections. Since F(D_2) < F(D_1), D_2 influences early computational flow more than D_1 because it "spreads" the computational flow more effectively. In addition, more of the D_2 flow is sent to D_4 than to D_3 because the connection flow values satisfy F(D_2, D_4) = +3 > +1 = F(D_2, D_3). Note that the longest dipath in the graph is D_2 → D_4 → D_5 (length = 3 + 2 = 5) and the shortest dipath is D_2 → D_3 → D_5 (length = 1 + 2 = 3). It can be concluded from this that more (less) computational flow to D_5 occurs along dipath D_2 → D_4 → D_5 (D_2 → D_3 → D_5) than along any other dipath in the ANN. By inspection of Fig. 4 (or from the eigenvalue spectrum 2.1358, 0.6622, 0, −0.6622, −2.1358 of the underlying graph of the digraph) it is easily seen that the chromatic number for this digraph is 2, i.e., it is bipartite, so that its dicliques can be partitioned into the two sets P_1 = {D_1, D_2, D_5} and P_2 = {D_3, D_4}. This means that there is no direct computational flow between the dicliques in P_1 and between the dicliques in P_2. Some sense of the resistance of the ANN to computational disruption is obtained by noting, for example, that: (i) neither D_3 nor D_4 is a cut-diclique, i.e., the removal of either one of them does not totally stop computational flow (Fig. 5); and (ii) removal of x_13, x_23, and x_32 from the ANN does not completely stop computational flow (Figs. 1, 4, and 5).
VI. THE DICLIQUE SEMIGROUP OF AN ARTIFICIAL NEURAL NETWORK
Although a diclique cover C contains no curvature information, its algebraic structure can provide some insight about the connectivity of the cover of the associated ANN. In particular, the dicliques of C generate a semigroup 〈C,•〉 under the product D • D′ = (I ∩ I′, ∪_{x ∈ I ∩ I′} S(x)), where D = (I, O), D′ = (I′, O′), and S(x) denotes the image of x under S. Every element of 〈C,•〉 is idempotent, so 〈C,•〉 is a band [11]. For example, the cover C = {D_1, ⋯, D_5} of the ANN of Fig. 1 generates the band whose Cayley table is given by Table 1; its elements are the dicliques of C, two additional dicliques D_6 and D_7, and the empty diclique ∅ ≡ (∅, ∅), which is the zero element of the band. For fully connected ANNs the situation is simpler:

Theorem 7. If C is the unique diclique cover of a fully connected ANN, then 〈C,•〉 = C ∪ {∅}.

Proof. This follows from the fact that the first entries of any two distinct dicliques in the cover of a fully connected ANN have no neurons in common. More specifically, let C be the unique cover for a fully connected ANN and D = (I, O) ∈ C, D′ = (I′, O′) ∈ C. Since I ∩ I = I and I ∩ I′ = ∅ when D ≠ D′, then D • D = (I ∩ I, ∪_{x ∈ I} S(x)) = (I, O) = D and D • D′ = (I ∩ I′, ∪_{x ∈ I ∩ I′} S(x)) = (∅, ∅) = ∅. ∎
Thus, C ∪ {∅} are the only elements in 〈C,•〉. Because 〈C,•〉 is a band, it is also a lower semi-lattice under the natural order ≤ of its idempotents [12]. In particular, if e and f are idempotents in a band, then e ≤ f means e = ef = fe, and e is said to be under f. For the band with the Cayley table given by Table 1, it is easy to see that D_6 ≤ D_1, D_6 ≤ D_2, D_7 ≤ D_3, D_7 ≤ D_4, and ∅ ≤ D_i, i = 1,2,⋯,7. This natural order provides the following lower semi-lattice for the band associated with the ANN of Fig. 1:

Fig. 6. The lower semi-lattice of dicliques for the ANN in Fig. 1.

Note that D_5, D_6, and D_7 are primitive idempotents, i.e., they satisfy the condition that the only idempotents under each D are D and ∅ (recall that ∅ is the zero element of the band), whereas D_1, D_2, D_3, D_4 are not.
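The band product and natural order can be sketched directly from these definitions. The product form D • D′ = (I ∩ I′, ∪_{x ∈ I∩I′} S(x)) is the one appearing in the proof of Theorem 7; the function names below are illustrative.

```python
def product(d1, d2, relation):
    """Semigroup product of dicliques: intersect the input sets, then take
    the union of the relation-images of the shared neurons."""
    out = {}
    for a, b in relation:
        out.setdefault(a, set()).add(b)
    I = d1[0] & d2[0]
    O = set().union(*(out.get(x, set()) for x in I))
    return (frozenset(I), frozenset(O))

def under(e, f, relation):
    """Natural order of idempotents: e <= f iff e*f = f*e = e."""
    return product(e, f, relation) == product(f, e, relation) == e
```

For a fully connected cover every product of two distinct dicliques collapses to the empty diclique, which is Theorem 7 in miniature.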
By definition, the meet of two dicliques D ∧ D′ in a diclique semi-lattice for an ANN is their semigroup product D • D′ and corresponds to a diclique which has as its first entry the neurons common to the first entries of D and D′. Thus, such meets can be useful for identifying other maximally connected subnetworks in ANNs. For example, refer to Fig. 1 to see that D_6 and D_7 are maximally connected subnetworks.

Fig. 7. The semi-lattice of dicliques for the ANN in Fig. 3.

The semi-lattice associated with the band for the fully connected ANN of Fig. 3 is given by Fig. 7. There the natural order for the idempotents is simply ∅ ≤ D_i, i = 1,2,3. Thus, D_1, D_2, and D_3 are primitive idempotents. This leads to the following result:

Theorem 8. All dicliques in the band associated with a fully connected ANN are primitive idempotents.
Proof. Since the first entries of every two distinct dicliques in the unique cover of a fully connected ANN have empty intersection, the only diclique generated by the cover that is not in the cover is the empty diclique. Consequently, the only idempotents under any diclique D ∈ 〈C,•〉 = 〈C ∪ {∅},•〉 are D and ∅. ∎
This theorem therefore identifies an algebraic signature for fully connected ANNs.

VII. CONCLUDING REMARKS
Dicliques have been shown to provide simplified graph representations of ANN connection topology. When combined with FR curvatures, such representations can provide insights into the flow of computation within ANNs. It was also shown that a band semigroup of an ANN's dicliques can be associated with the ANN and that the natural idempotent order of the dicliques in the band defines a lower semi-lattice whose meets identify other maximally connected subnetworks in the ANN. As a final result, it was shown that the dicliques in the band associated with a fully connected ANN are all primitive idempotents.
Future efforts should involve testing the theory developed above using real functioning ANNs; using the theory to analyze the flow of computations through ANNs; and investigating other algebraic properties of ANNs.