Contract number
o
507953
Workpackage 11
Dynamics of Networks
Deliverable 11.3
Report on hierarchical networks
Project funded by the European
Community under the “Information
Society Technology” Programme
1 / 26
Contract number: 507953
Pr oject acronym: DBE
Title: Digital Business Ecosystem
Deliverable N
o
: D11.3
Due date: 05/2006
Delivery date: 07/2006
Short description: The nal report on network dynamics and restruc-
turing. Note that this task has been brought forward by ve months.
Author: UBHAM
Partners contributed: STU, LSE, HWU
Made available to: Public
Versioning
Version Date Author, Organisation
1.0 28/06/2006 Jonathan E. Rowe, Boris Mitavskiy (UBHAM)
2.0 24/07/2006 Jonathan E. Rowe, Boris Mitavskiy (UBHAM)
Quality check
1
st
internal reviewer: Maria Chli (HWU)
2
nd
internal reviewer: Miguel Vidal (SUN)
This work is licensed under the Creative Commons
Attribution-NonCom mercial-ShareAlike 2.5 License. To view a copy of this license,
visit : http://creativecommons.org/licenses/by-nc-sa/2.5/ or send a letter to Creative
Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
2 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
3 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Contents
1 Introduction and relationship to the project 6
2 The network re-routing problem 8
2.1 Theproblem............................... 8
2.2 Thealgorithm.............................. 9
3 Empirical results with scale-free networks 10
3.1 Experiments............................... 10
3.2 Results.................................. 11
3.3 Possibleimprovements ......................... 11
4 Theoretical analysis 16
4.1 Overview ................................ 16
4.2 Basicmathematicalobservations.................... 17
4.3 Localalgorithmanalysis ........................ 18
4.4 Ergodicity of the Algorithm: . . .................... 22
4 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Executive Summary
This report is the nal report from UBHAM on WP11 (sub-task S2) brought forward
from month 36. It is a summary of work done at UBHAM during the period January–
April 2006.
The report is in four main sections. The rst introduces the work and describes its
relationship to other aspects of the DBE project. The second part then describes the
network re-routing problem within the context of the DBE, and ou tlines our adaptive
algorithm for addressing this problem. The third part presents the results of our further
empirical studies on the network re-routing algorithm. This algorithm allows nodes in
the DBE network that frequently communicate with each other to be brought closer
together within the n etwork, and potentially iden tifying clusters of users with common
interests. It is shown that our algorithm can offer some improvements in communica-
tion costs relatively cheaply. Some improvements are discussed that should be inves-
tigated as further work. In the fourth part of the report we present some theoretical
analysis, rstly of the algorithms behaviour with regard to improvements in the ex-
pected number of hops (path length) within the network, and secondly with regard to
the algorithms ergodicity. This latter condition ensures that it is always possible for the
network to adapt towards any target conguration, with non-zero probability.
The main “customers” of this work are SUN and HWU, who are concerned with
the network aspects of the DBE infrastructure.
5 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Chapter 1
Introduction and relationship to
the project
This deliverable is the nal report for sub-task S2 (Dynamics and clustering in hier-
archical networks), which is scheduled to run from month 19 to month 36, and builds
on the work of S1, already reported [7] and the preliminary report from S2 [6]. The
objectives of this sub-task are:
1. To extend work on information propagation in simple networks to hierarchical
ones.
2. To develop theory of structure and dynamics in such networks.
The sub-task interacts with S10 (Symbiosis and competition - HWU) which stud-
ies the effects on markets of information sharing in networks, and WP19 (SUN and
HWU) looking at the specics of the DBE peer-to-peer network behaviour (see, for
example, [5, 4]).
In the previous report [6] we introduced the network re-routing problem and pre-
sented some preliminary experimental results of our proposed algorithm. We had orig-
inally planned to continue this effort in the following ways:
1. Further empirical studies of the algorithm on scale-free networks.
2. Theoretical analysis of the algorithm’s properties.
3. Visit to SUN for large-scale empirical studies.
4. Extension to hierarchical clustering of nodes.
Unfortunately, due to personnel problems, UBHAM has had to withdraw from the
project early. We have made good progress on the rst two parts of the plan. However,
we have not been able to arrange the visit to SUN, nor to implement the extension
to hierarchical clustering. The man-months and funding for the remainder of the task
have instead been reallocated to other partners.
6 / 26
DBE Project (Contract n° 507953)
In this report, then, we focus on our further work on the network re-routing prob-
lem, which is introduced in chapter 2, along with the basic algorithm. The results of
our empirical studies are presented in chapter 3, and our theoretical results to date are
presented in chapter 4.
7 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Chapter 2
The network re-routing problem
2.1 The problem
In [6] we introduced the following problem regarding the dynamical restructuring of
the DBE network. We imagine that users join the network in a rather ad hoc manner.
It seems most likely that preferential attachment of users to already popular nodes
will create a scale-free network topology. As the DBE is used, the user will transfer
information from other nodes of the network. We would like to arrange things so that
nodes that commonly transfer information b etween themselves are close together in
the network. This is partly so that the communication time is more efcient, but also
because this would lead, in the long term, to the identication of a potential cluster of
users with similar interests and requirements. The problem is that the initial attach ment
of the user to the DBE network will not necessarily place it in the optimal position
from this point of view. We would therefore like to be able to adapt the topology of
the network so as to bring nodes which frequently communicate with each other closer
together.
Obviously, the simplest solution would be to have a complete network, in which
every node was a neighbour of every other node. However, this would then make the
network too expensive, with many hubs and many connections (each node is effectively
a hub). To try to maintain the sparsity of the network, then, we restrict the kinds of
restructuring allowed to the replacement of one connection by another. In this way, the
total number of connections in the network remains constant.
Formally, then, the problem is as follows. We are given a graph G with nodes V
and edges E. From time to time, pairs of nodes communicate with each other. We
assume that there is some (unknown) probability distribution μ over the set of pairs
of nodes describing this communication. We follow some standard algorithm (e.g.
Dijkstra’s algorithm) to establish a path between the two nodes, and record the path
length (number of hops required). We would like to minimise the average path length:
E(D)=
a,bV
μ(a, b)d(a, b)
where d(a, b) is the number o f hops from a to b. We seek to do this by incremen-
8 / 26
tally changing the network topology, replacing an existing edge with a new edge. The
problem is to nd an appropriate replacement strategy.
2.2 The algorithm
The algorithm introduced in [6] works as follows:
1. Select a, b V according to μ.
2. Find a shortest path γ from a to b.
3. For each internal node v of the path, create a shortcut with probability p.
4. Go to 1.
To create a shortcut, consider the section of the path surrounding the chosen node v.
That is, there are nodes u and w so that (u, v, w) is in the path γ. We add a new edge
(u, w) to the graph, and delete either edge (u, v) or (v, w). The deleted edge is chosen
randomly.
The idea of this algorithm is that the shortcut will reduce the path-length by one.
Of course it may create problems for other paths, but the hope is that since it will be
applied to the most frequently chosen paths, the overall effect will be benecial.
It should b e noted that the problem, as stated, has, as a special case, the Min-
imum Communication Cost Spanning Tree problem, which is already known to be
NP-complete [1]. It is not feasible, then, to propose an efcient algor ithm which will
solve the problem exactly. The best we can hope for is that our heuristic algorithm will
generate so lutions of acceptable quality. Having said that, it is known that in the case
where:
1. All pairs of nodes communicate with equal frequency, and
2. the number of allowed edges is one less than the number of nodes
that there is a polynomial-time algorithm, due to Gomory and Hu [2, 3]. However, in
our situation, we do not have advance knowledge of the frequency with which pairs of
nodes will communicate and it is very unlikely that this will be equal for all pairs of
nodes, so their algorithm is not directly applicable. We do, though, have the advantage
of having potentially mo re edges in the DBE network that th e minimal set allowed in
the Gomory-Hu algorithm.
9 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Chapter 3
Empirical results with scale-free
networks
3.1 Experiments
Following the set of experiments reported in [6], we continued our investigation of our
algorithm on scale-free networks, with different probability distributions over the pairs
of nodes selected for communication.
We assume that the DBE network will, at least initially, tend to grow following
some sort of preferential attachment, as new users link to nodes that are already well-
known. This will result in a scale-free network topology. We have therefore conducted
experiments with such networks with between 50 and 200 nodes. We seek to simulate
the situation that would occur when two nodes frequently share information, and y et
are currently relatively far apart in the network. We hope our adaptive algorithm will
recongure the network so as to bring them closer together. Consequently, for each
network, we selected a subset of 10% of the nodes with probability inversely propor-
tional to the node’s degree (i.e. the nodes which have smaller degree are more likely to
get selected). Such a subset is likely to be fairly widely spread throughout the network
and will tend not to include major hubs which are already well-connected.
The selected nodes where partitioned into two sets, A and B of equal sizes and the
probability distribution μ was dened which samples every pair from the set A × B
equally likely. That is, nodes from group A always want to communicate with nodes
from group B. The adaptive algorithm should move these two subsets closer together
in the network as it recongures the network topology.
The expected path length with respect to the probability distribution μ described
above has b een estimated by performing 50 independent samplings of pairs of nodes
with respect to μ. Afterwards, 300 independent iterations of the re-routing algorithm
were performed. A single iteration of the algorithm picks a pair of nodes (x, y) V
2
at
random with respect to the distribution μ (notice that our distribution μ is concentrated
on the pairs in A × B only so that with probability 1 we choose a pair in A × B
V
2
). Now we use Dijkstra’s algorithm to nd the shortest path between x and y in the
10 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
original network. Once a shortest path has been selected, we traverse the path from
one end to another, replacing a consecutive pair of edges by a single edge joining non-
common nodes and deleting one of the intermediate edges (see the previous section
for more details on the analysis of this algorithm) with some probability p.Three
independent experiments have been run with the values of p =0.2, p =0.5 and
p =0.8. Again, upon the completion of 300 such iterations, we estimate the average
path length by performing 50 independent samplings of pairs in V
2
with respect to
μ. The plots of the average shortest path length vs. the total number of nodes in
the network before and after the re-routing algorithm have been produced. Also the
plots of the average complexity of the Dijkstra’s computation (the number of nodes
encountered during a single iteration of the Dijkstra’s algorithm) vs. the total number
of nodes in the network have been constructed.
3.2 Results
From gures 3.1–3.3 we can see that the average path length after the algorithm has
been run is signicantly reduced. The same is true for the time complexity of running
Dijkstra’s algorithm fo r nding the shortest paths (measured by the number of nodes
visited). Moreover, the reduction is stronger for larger values of p. A possible explana-
tion for this is th at a single step of the algorithm is unlikely to case to much harm. I f a
step leads to an increase in the mean path length, according to theorem 4.3.2, the only
way this can happen is if the probability of joint occurre nce of the consecutive edges
which have been altered is smaller than the p robability of their separate occurrence.
But then, roughly speaking, the opposite will be true in the next step o f the algorithm
so that the algorithm is likely to cor rect itself in the sequel step. At the same time, the
algorithm modies the n etwork more frequently after a xed number of iterations when
p is larger. However, a deeper theoretical analysis is necessary to understand which pa-
rameters are more suitable for which networks. As the number of nodes g ets larger,
the improvement reduces regardless of the value of p. This is quite easy to understand:
For larger networks, the p ercentage of the nodes selected increases and the number
of possible pairs sampled increases quadratically. The number of iterations is usually
insufcient to sample all the possible pairs, and, in fact, samples only a very limited
number of these pairs. Completely different sets of pairs may b e sampled during the
mean path estimation after all the modications are complete and before the iterations
have been run. Unfortunately we do not possess sufciently powerful equipment to
run more more iterations for a large number of nodes, however, it seems the outcome
should be roughly similar as it is for the small number of nodes.
3.3 Possible improvements
It would seem from the empirical data that it is a good strategy to make the edge re-
placement probability as high as possib le. Taking this to the extreme, we cou ld set
p =1.0 which effectively m eans that every time two nodes communicate, we place an
edge directly between them and delete a random edge from the original path. However,
11 / 26
DBE Project (Contract n° 507953)
Figure 3.1: Improvement in expected path length, and time complexity of Dijkstra’s
algorithm, after running the adaptive re-routing algorithm for 300 iterations, with an
edge replacement probability of 0.2
12 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Figure 3.2: Improvement in expected path length, and time complexity of Dijkstra’s
algorithm, after running the adaptive re-routing algorithm for 300 iterations, with an
edge replacement probability of 0.5
13 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Figure 3.3: Improvement in expected path length, and time complexity of Dijkstra’s
algorithm, after running the adaptive re-routing algorithm for 300 iterations, with an
edge replacement probability of 0.8
14 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
this trend in the data is probably caused by the rather special probability distribution
which we used to model communications, in which all communications were restricted
to being between two small subsets of the nodes. With a rather less strict set of com-
munications, it is likely that the extreme choice p =1.0 will be more likely to disrupt
potentially useful pa thways in the network.
However, this does lead to the question of the best choice of shortcut to make.
A more sophisticated strategy would be to record, for each node in the network, the
number of communications which are routed through it. One would then prefer to
place shortcuts around nodes that are only used infrequently, since this is less likely
to be disruptive. This protocol is one we would like to have validated on large-scale
networks in collaboration with SUN.
A further improvement to the algorithm would h ave to take into account the fact
that new users may join and leave the DBE network during its lifetime. There should
not be any problem in accommodating new nodes in the network within the current
algorithm (or its proposed extension). However, if a node leaves the network, then new
routes would have to be found to accommodate this. Given the scale-free nature of the
DBE network topology, most nodes would be able to leave (or go down temporarily)
with only minor effects on the rest of the network. However, should a major hub leave
the system, then our re-routing algorithm should be able to recongure the network
efciently. This would need to be tested empirically.
It is worth emphasising that, since the problem is an NP-complete one, we d o not
expect to be able to nd optimal network congurations efciently. Our algorithm will
nd local optima (with respect to the removal and addition of single edges), which
may not necessarily be globally optimum. It is important to investigate exactly how
good such local optima are. In fact it is possible to construct articial networks (e.g.
star-shaped networks) in which it is very hard for our algorithm to nd good solu-
tions. In such cases, the algorithm performs rather badly. However, it is very unlikely
that such an articial structure will be found in the DBE. One would therefore need
to experiment with simulations of the DBE set-up to establish whether or not the al-
gorithm perfor ms well within the practical situation of the DBE. The experiments we
reported are encouraging, but more needs to be done with large-scale simulations based
on realistic data. We had planned to visit SUN to perform such experiments on their
high-performance parallel machines, but unfortunately this visit had not taken place by
the time we withdrew from the project. We urge SUN to continue with such experi-
mentation if at all possible. Similarly, it is important that our proposed algorithm be
compared to other approaches that may already be known. We have so far found the
Gomory-Hu algorithm [2, 3], although this only applies to tree-structured networks.
Again, this is work that should have been done before the completion of the project,
which we have been unable to complete before our withdrawal.
15 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Chapter 4
Theore tical analysis
4.1 Overview
In this section we present our preliminary theoretical analysis of our network re-routing
algorithm. It is rather technical, so we rst present a high-level summary.
First we have some mathematical observations in which we dene the object of
interest, namely the expected shortest distance in the n etwork. That is the average
number of hops that messages will make during transmissions between nodes in the
network. We p rove a simple result that relates this quantity to the frequency with
which links (edges) in the network occur in transmissions. If an edge occurs with high
frequency, then it can be considered to be rather important. The connection we prove
is that the expected shortest distance in the network is equal to the sum of the edge
frequencies.
Next we look at the local effects of a single move of our algorithm to try to estab-
lish conditions under which the replacement of one edge by another (a shortcut) will
result in an improvement. Clearly, the shortcut will improve matters between the two
nodes that have just communicated. The problem is the the edge that is being replaced
might itself be important. Remembering that the importance of an edge is related to
its frequency, we derive conditions on edge frequencies which ensure that the overall
average shortest length is improved by the move.
Of course, this is only a local analysis, and may well lead the algorithm to a local
optimum. In the next section, though, we at least prove that the algorithm has a chance
to escape local optima since there is always a non-zero probability of moving between
any two network congurations. technically, this means the algorithm is ergodic.In
the worst case, of course, one may end up waiting a long time for such moves to be
made. Further empirical studies are necessary to consider whether or not this is the
case in the kinds of network structure generated by the DBE.
16 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
4.2 Basic mathematical observations
Suppose we are given a network (an undirected graph) G =(V, E) and a probability
distribution μ on V
2
. For a g iven pair of nodes x and y denote by d(x, y) the distance
between x and y (i.e. the number of edges in the shortest path joining x and y). We are
interested in restructuring the network G so as to preserve the number of of edges (not
to make it bigger), but, at the same tim e, to minimize the expected shortest distance
with respect to the distribution μ, namely E(D)=
x, yV
μ(x, y)d(x, y).Wemake
the following general observation rst:
Proposition 4.2.1 Denote by Γ a chosen collection of shortest paths: one for every
pair of nodes. Then E(D)=
eE
P
Γ
(e) where P
Γ
(e) is the p robability that the
edge e has been encountered in a shortest path from Γ joining a pair selected with
respect to the distribution μ.
Proof: Consider the characteristic function X : E × V
2
→{0, 1} dened as
X (e, x, y)=
1 if e is an edge in the shortest p ath f rom Γ joining x and y
0 otherwise.
First notice that for every x, y V we can write d(x, y)=
eE
X (e, x, y) We then
obtain
E(D)=
x, yV
μ(x, y)d(x, y)=
=
x, yV
μ(x, y)
eE
X (e, x, y)=
eE
x, yV
μ(x, y)X (e, x, y).
Notice that for a xed edge e E, P
Γ
(e)=
x, yV
μ(x, y)X (e, x, y) is just the
probability that the edge e has been traversed by a path from Γ joining some pair of
nodes (x, y) which is chosen randomly with respect to μ. The desired conclusion that
E(D)=
eE
P
Γ
(e) now follows.
Notice that the choice of the collection of the shortest paths Γ is not unique in general.
As an immediate consequence of p roposition 4.2.1 we deduce
Corollary 4.2.2 GivenagraphG =(V, E) and a probability distribution μ over V
2
,
let Γ denote any collection of the shortest paths between the nodes of G containing a
unique path for every pair of nodes of G. Then it follows that
eE
P
Γ
(e) is indepen-
dent of the choice of Γ.
Similar results can be established by expressing the path length counting the nodes
rather than the edges. The proof of proposition 4.2.1 goes through just ne when we
use the characteristic function Y
Γ
: V
3
→{0, 1} dened as
Y(v, x, y)=
1 if v is a node on the path from Γ between x and y
0 otherwise
17 / 26
Del 11.3 Report on hierarchical networks
and then notice that d(x, y)=(
vV
Y(v, x, y)) 1 so that we obtain
E(D)=
x, yV
μ(x, y)d(x, y)=
x, yV
μ(x, y)((
vV
Y(v, x, y)) 1) =
=
x, yV
μ(x, y)
vV
Y(v, x, y)
x, yV
μ(x, y)=
=(
vV
x, yV
μ(x, y)Y(v, x, y)) 1.
Again, for every xed node v V we notice that
x, yV
μ(x, y)Y(v, x, y) is the
probability P
Γ
(v) that the node v has been encountered in the unique path from Γ
joining the pair of nodes (x, y) randomly chosen with respect to the distribution μ.
This produces a result analogous to proposition 4.2.1:
Proposition 4.2.3 Denote by Γ a chosen collection of shortest paths: one for every
pair of nodes. Then E(D)=(
vV
P
Γ
(v)) 1 where P
Γ
(v) is the probability that
the node v has been encountered in a shortest path from Γ joining a pair selected with
respect to the distribution μ.
4.3 Local algorithm analysis
We consider a simple online algorithm whose step is to remove an edge e
from the
network and to insert another edge e
+
into the network. We shall now express the
“improvement” or “worsening” th at the algorithm creates after a single time step. We
continue with the notation of the previous section: G =(V, E) denotes our network
and G
=(V, (E ∪{e
+
}) −{e
}) denotes the modied network. Our immediate goal
is to express the difference between the expected average distances D and D
for the
networks G and G
respectively. Recall that Γ and Γ
denote the collections of shortest
paths between the pairs of nodes of G and G
respectively containing exactly one path
for every pair of nodes (x, y). Once the set of shortest paths Γ has been chosen consider
the subset C(e
,e
+
) Γ consisting of all the paths in Γ which remain the shortest
in G
(i.e. these which do not pass through e
and also remain the shortest regardless
of removing e
and adding e
+
). Notice that Γ
can be chosen so that C(e
,e
+
) Γ
(since these paths remain the shortest in G
) and for the rest of the pairs, we choose the
new shortest path in G
. Denote this new set of shortest paths by N
(e
,e
+
) and also
let N(e
,e
+
)=Γ C(e
,e
+
). Notice also that
P
Γ
(e)=P
C(e
,e
+
)
(e)+P
N(e
,e
+
)
(e)
where P
C(e
,e
+
)
(e) denotes the p robability that the edge e occurs in the shortest path
from Γ when sampling with respect to μ and P
N(e
,e
+
)
(e) denotes the probability that
it occurs in the path from N (e
,e
+
). Likewise,
P
Γ
(e)=P
C(e
,e
+
)
(e)+P
N
(e
,e
+
)
(e)
18 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
where P
N
(e
,e
+
)
(e) denotes the proba bility that the edge e occurs in the path from
N
(e
,e
+
). Combining these observations with proposition 4.2.1 gives us
E(D) E(D
)=
eE
P
Γ
(e)
eE
P
Γ
(e)=
=
eEe
P
Γ
(e)
eEe
P
Γ
(e)+P
Γ
(e
) P
Γ
(e
+
)=
=
eEe
(P
C(e
,e
+
)
(e)+P
N(e
,e
+
)
(e))
eEe
(P
C(e
,e
+
)
(e)+P
N
(e
,e
+
)
(e))+
+P
Γ
(e
)P
Γ
(e
+
)=
eEe
P
N(e
,e
+
)
(e)
eEe
P
N
(e
,e
+
)
(e)+P
Γ
(e
)P
Γ
(e
+
).
We summarize this in the following
Proposition 4.3.1 Let G =(V, E) denote a network and suppose we remove the edge
e
from E and add a new edge e
+
to E. This gives us a new network G
=(V, (E
{e
+
})−{e
}).LetΓ denote any particular choice of the shortest paths in the network
G. Also, let Γ
denote a choice of shortest paths having the property that C(e
,e
+
)
ΓΓ
(as mentioned b efore such a choice does exist) where C(e
,e
+
) Γ consisting
of all the paths in Γ which remain the shortest in G
(i.e. these which do not pass
through e
and also remain the shortest regardless of removing e
and adding e
+
).
Let N(e
,e
+
)=Γ C(e
,e
+
) and N
(e
,e
+
)=Γ
C(e
,e
+
). We then have
E(D)E(D
)=
eEe
P
N(e
,e
+
)
(e)
eEe
P
N
(e
,e
+
)
(e)+P
Γ
(e
)P
Γ
(e
+
).
Proposition 4.3.1 tells us, in particular, that the “add/remove an edge” type of an algo-
rithm leads to an improvement if and only if
eEe
P
N(e
,e
+
)
(e)
eEe
P
N
(e
,e
+
)
(e)+
P
Γ
(e
) P
Γ
(e
+
) > 0. In fact, this quantity measures a single step improvement.
Next, we shall apply the above observations to deduce some basic theoretical prop-
erties of the algorithm described in the previous report. Lets recall how the algorithm
works: We x a small positive number 1.AteverystageWesampleapair(x, y)
of nodes randomly with respect to the probability distribution μ.Nextwend a short-
est path joining x and y in the network G. We traverse the path starting with the node
x and whenever we encounter a consecutive pair of edges (u, v) and (v, w) along the
path, with with probability we replace one of the edges (either (u, v) or (v, w)) with
the edge (u, w). The decision which one of the edges is discarded is made with even
probability
1
2
. Now suppose W.L.O.G. that e
=(u, v) and e
+
=(u, w) (recall that
e
denotes an edge that has just been removed while e
+
denotes the one that has been
added). Once we select the collection of the shortest paths Γ, notice that for every path
γ N(e
,e
+
) joining a pair of nodes, say x and y, the path γ
in the new network
obtained upon removing the edge e
and adding the edge e
+
, constructed by replac-
ing the edge e
with the consecutive pair of edges {v, w} and e
+
(or e
+
and {w, v}
depending on the order) is longer by exactly one edge. Thus, even if γ
is a shortest
path between x and y in the new network, the shortest distance between x and y has
19 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
been increased by exactly one edge. In this case, we let the new set of shortest paths Γ
contain the path γ
. Likewise, the paths which involve both edges, e
and {v, w} are
shortened by exactly one edge in the network Γ
.Again,weletΓ
contain these paths
as well (these where the edges e
and {v, w} are replaced by the single edge e
+
). The
picture below illustrates all the possible situations:
The upper pictures show the shortest paths between the corresponding pairs of purple
nodes. The case when the path remains unchanged is shown on the picture (a), the
path which shall be shortened is displayed on the picture (b) and the path which shall
become longer appears on the picture (c). The shortest paths in the modied graph
appear on the bottom picture for all the corresponding cases.
To shorten the notation, let e
= {v, w}. Now consider the possible re-routings
that occur as a result of replacing e
with e
+
. We will consider three cases:
Case 1. Paths that do not go through e
will either r emain unchanged in Γ
or may
possibly be shortened by one if e
happens to create any fortuitous (i.e. un-
planned) shortcuts.
Case 2. Paths that pass through e
but not through e
. If there is an alternative shortest
path, then this can be used and the path length remains unchanged. Otherwise,
e
in the path has to be replaced by e+,e
which increases the path length b y
one.
Case 3 . Paths that use both e
and e
are shortened by one as they now use the
shortcut e
+
.
We can get a lower bound on the size of the decrease in path length by ignoring fortu-
itous shortcuts in case 1, and alternative routes in case 2. Accordingly,
eE−{e
,e
}
P
N(e
,e
+
)
(e)
eE−{e
,e
}
P
N
(e
,e
+
)
(e)
20 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
According to proposition 4.3.1, we can now write
E(D)E(D
)=
eEe
P
N(e
,e
+
)
(e)
eEe
P
N
(e
,e
+
)
(e)+P
Γ
(e
)P
Γ
(e
+
)=
=
eE−{e
,e
}
P
N(e
,e
+
)
(e)
eE−{e
,e
}
P
N
(e
,e
+
)
(e)+
+P
N(e
,e
+
)
(e
) P
N
(e
,e
+
)
(e
)+P
Γ
(e
) P
Γ
(e
+
)
P
N(e
,e
+
)
(e
) P
N
(e
,e
+
)
(e
)+P
Γ
(e
) P
Γ
(e
+
).
In summary, we have
E(D) E(D
) P
N(e
,e
+
)
(e
) P
N
(e
,e
+
)
(e
)+P
Γ
(e
) P
Γ
(e
+
).
We now proceed to analyze the differences P
N(e
,e
+
)
(e
)P
N
(e
,e
+
)
(e
) and P
Γ
(e
)
P
Γ
(e
+
) more explicitly. First note that we can write N(e
,e
+
)=L(e
,e
+
)
S(e
,e
+
) where L(e
,e
+
) is the set of these paths which have been made longer,
and S(e
,e
+
) is the set of these paths which have b een shortened upon removal of e
and insertion o f e
+
respectively. Notice that the edge e
does not occur in a path which
has been lengthened by removal of e
(according to the previous discussion, all such
paths must pass through e
and not through e
)sothatP
L(e
,e
+
)
(e
)=0. Likewise,
P
S(e
,e
+
)
(e
) is the probability that e
is the edge of a path that has been shortened,
i.e. a path which goes through both, e
and e
. This is simply the probability that e
and e
occur jointly. We shall denote this prob ability by P (e
e
). We n ow deduce
that
P
N(e
,e
+
)
(e
)=P
L(e
,e
+
)
(e
)+P
S(e
,e
+
)
(e
)=
=0+P (e
e
)=P (e
e
).
In general we shall adopt the following notation: P (e
1
e
2
) is the probability that the
edges e
1
and e
2
are encountered jointly, while P (e
1
e
2
) shall denote the probability
that e
1
occurs and e
2
does not. Similarly, we deduce that
P
N
(e
,e
+
)
(e
)=P (e
e
+
)=P (e
e
).
(Here the reader would do well to refer back to the picture above.) Finally, observe
that whenever a path in γ Γ involves e
, the corresponding path γ
Γ
involves
e
+
(regardless of wether it is shortened or lengthened). It follows now that P
Γ
(e
)=
P
Γ
(e
+
) so that P
Γ
(e
) P
Γ
(e
+
)=0and we nally conclude that
E(D) E(D
) P
N(e
,e
+
)
(e
) P
N
(e
,e
+
)
(e
)+P
Γ
(e
) P
Γ
(e
+
)=
= P (e
e
) P (e
e
).
We summarize this nal observation below:
21 / 26
Theorem 4.3.2 Let G =(V, E) denote a network. Let u, v, w V and let e
=
{u, v}, e
interm
= {v, w}. Suppose also that e
+
= {u, w} / E. Assume we remove
the edge e
from E and add a new edge e
+
to E. This gives us a new network G
=
(V, (E ∪{e
+
}) −{e
}). Then, we have
E(D) E(D
) P (e
e
) P (e
e
)
where P (e
1
e
2
) is the probability that the edges e
1
and e
2
are encountered jointly,
while P (e
1
e
2
) denotes the probability that e
1
occurs and e
2
does not.
4.4 Ergodicity of the Algorithm:
This section is devoted to showing that given any two connected graphs G and G
on
n nodes and k edges, there is a way to obtain G
from G by performing the steps of
our algorithm. Before stating the formal theorem we want to e stablish, it is convenient
to introduce the following group action on the set G(S, k) of connected graphs on the
specied set S of n nodes and having a specied number k of edges:
Denition 4.4.1 For every triple of nodes {i, j, k}⊆S we introduce the following
permutations on the set G(S, k) of connected graphs on the specied set S of n nodes
and having a specied number k of edges:
π
(a, b)
(a, b), (b, c)(a, c)
(G)=
G
=(S, E −{(a, b)}∪{(a, c)}) if (a, b) and (b, c) E
and (a, c) / E
G otherwise
where {a, b, c} = {i, j, k} and G =(S, E) ∈G(S, k).
We shall also denote b y A the subgroup of permutations on G(S, k) generated by
the set Δ of all possible permutations of the form described above.
Applying a step of our algorithm to a graph G can then be described as selecting a per-
mutation from Δ ∪{I} with some probability (which does depend on G) and applying
it to G. Notice that such a probability distribution over Δ ∪{I} can always be chosen
so that every one of the elements in Δ ∪{I} is chosen with positive probability as long
as the distribution μ on V
2
assigns a positive probability to every p air of nodes. it fol-
lows that a graph G
can be obtained from a given graph G ∈G(S, k) upon completion
of nitely many steps of our algorithm if and only if there exists an element a ∈A(the
group generated by Δ) such that a · G = G
where the action · is the usual function
evalu a tion (which is the action induced by the generators). In o ther wards, G
can be
obtained from G upon completion of nitely many steps o f our algorithm if and only
if G
and G are in the same orbit under the action of the group A. In the language of
group actions, saying that every G
∈G(S, k) can be obtained from G upon comple-
tion of nitely many steps of our algorithm amounts to saying that the action of A on
G(S, k) is transitive (there is only one orbit under this action).
Denition 4.4.2 We shall write G G
is and only if G and G
are in the same orbit
under the action of A.
22 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
It is well known from group theory (and is very easy to show as well) that is an
equivalence relation. Our goal in the current section is to establish the following result:
Theorem 4.4.1 The group action of A on G(S, k) is transitive, or, e quivalently, G
and G
∈G(S, k) we have G G
.
Theorem 4.4.1 is nontrivial and we shall break down the proof into several simple
lemmas. First of all, the proof is based largely on the following concept:
Denition 4.4.3 Given a graph G =(S, E), we shall say that a star set of G is any
collection of nodes K S such that for every v S either v K or v is a neighbor of
a node in K. We shall also refer to k =min{|K||K is a star set of G}, the minimal
cardinality of a star set of G, as the star number of a graph G. A graph having star
number 1 we shall name a generalized star. Any node of a generalized star comprising
a singleton star set will be referred to as the star center.
Obviously, a star number of a graph G =(S, E) is always bounded above by the total
number of nodes in G since S itself is a star set of G. In fact, it is bounded above by
max{1, |S|−2} since we can consider any spanning forest of G and notice that every
tree has at least two nodes of degree 1. by removing all the nodes in the spanning tree
of degree 1 (except possibly one of them in case G consists of only 2 nodes) we obtain
astarsetofG of cardinality max{1, |S|−2}. Although the following property is not
necessary for proving theorem 4.4.1, we believe it is worth mentio ning:
Proposition 4.4.2
Let K denote a star set of a graph G =(S, E), and suppose |K| >
1.Fixv K and let p = v, x
1
, ...,x
l
,wdenote a shortest path between v and
K −{v} (of course, w K −{v}). Then p consists of at most 4 nodes. This means
that we can write either p = v, x
1
,x
2
,wor p = v, x
1
,wor p = v, w.
Proof: Let p = v, x
1
,x
2
,...,w. Note that x
2
is not a neighbor of v (otherwise the
path p = v, x
2
,...,wwould have been shorter). Since K is a star set of Gx
2
must be
a neighbor of at least some node in K.Sincev is not such a node, it must be a node
w K −{v} which implies that p = v, x
1
,x
2
,w.
The reason we introduced the notion of a star number will become clear thanks to the
following lemmas:
Lemma 4.4.3 Every graph G ∈G(S, k) having star number k>1 is equivalent (in
the sense of denition 4.4.2 ) to a g raph G
∈G(S, k) with star number k 1.
Proof: Fix a star set K of G of size k.pickv K and note that K −{v} =
(since k>1). Since G is a connected graph, there is a path in G from v to K −{v}.
Let p = v, x
1
,x
2
,...,w with w K −{v} denote a shortest such path. According
to proposition 4 .4.2, either p = v, w, p = v, x
1
,wor p = v, x
1
,x
2
,w.First,
observe that G G

where {v, w}∈E(G

). Indeed, if p = v, w then G

= G
will do. If p = v, x
1
,w,letG

= π
(v, x
1
)
(v, x
1
), (x
1
,w)(v, w)
(G) and note that {v, w}∈
E(G

) (see denition 4.4.1). Finally, if p = v, x
1
,x
2
,wthen we rst let G

=
π
(x
1
,x
2
)
(x
1
,x
2
), (x
2
,w)(x
1
,w)
(G) and then apply the argument of the previous sentence to
23 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
G

. By transitivity of it sufces to show that G

G
where K −{v} is a star
set of G
. Again, thanks to transitivity, it sufces to construct a sequence of graphs
G

= G
1
,G
2
,...,G
l+1
= G
with G
i
G
i+1
whenever 1 i l 1.To
do so, enumerate the nodes u
1
,u
2
,...,u
l
which are the neighbors of v and not the
neighbors of any node in K −{v} (if there are no such nodes we would nish right
away, since then K −{v} is a star set of G

= G
1
= G
). We now let G
1
= G

,and
let G
i+1
= π
(v, u
i
)
(w, v), (v, u
i
)(w, c)
(G
i
). Notice that the edge {w, v}∈E(G
), and, by
construction, remains an edge of every G
i
. Therefore, every G
i
contains the edges of
the form {w, u
j
} for whenever 1 j<i(at the expense of loosing the edge {v,u
j
})
so that G
= G
l+1
is our desired graph. The desired conclusion now follows.
From lemma 4.4.3 together with transitivity of and the fact that a star number o f a
nite graph is nite, the fo llowing follows immediately:
Lemma 4.4.4 Every graph G ∈G(S, k) is equivalent to a generalized star.
Thanks to lemma 4.4.4 we only h ave to prove that any two generalized stars are equiv-
alent. The reader can now appreciate the notion of a star number for our purposes.
In fact, the following lemma narrows down the types o f generalized stars we need to
establish the equivalence of:
Lemma 4.4.5 Enumerate the nodes in S as 0, 1, 2,...,|S|−1 and let G ∈G(S, k)
denote a genera lized star with star center 0. Fix any i satisfying 1 i ≤|S|−1.
Then G G
where G
is the generalized star with the node i be ing its star center (see
denition 4.4.3 ) .
Proof: If |S| =1there is nothing to prove. Otherwise W.L.O.G. assume i =1
(otherwise just renumber the nodes). Again, we exploit transitivity and construct a
sequence
G = G
1
,G
2
,...,G
|S|−1
= G
such that G
l
G
l+1
and every node j with j l and j =1is a neighbor of 1 in
G
l
.WeletG
1
= G and note that 0 is a neighbor of i since 0 is a center of G.Forl
satisfying 1 <l<|S|,letG
l
= π
(0,l)
(1, 0), (0,l)(1,l)
(G
l1
).Since0 is a star center of G
and the edge {0,l} was not deleted from any of G
j
for j<lit follows th at G
l
contains
the edge {1,l} at the expense of loosing the edge {0,l}. Notice that the intermediate
graphs G
i
may have star number 2,butthenal graph G
= G
|S|−1
is a generalized
with star center 1.
Thanks to lemmas 4.4.4 and 4.4.5, all that remains to show to establish the orem 4. 4.1
is that any two generalized stars having a common center (see denition 4.4.3) are
equivalent under the relation of denition 4 .4.2. We accomp lish this task in thr ee
steps:
Lemma 4.4.6 Suppose we are a generalized star G ∈G(S, k). Enumerate the nodes
of G as 0, 1,...,|S|−1 with 0 denoting a star center. Suppose an edge {i, j}∈E(G)
and {i, l} / E(G) for i, j and l>0.ThenG is equivalent via to the generalized
star G
∈G(S, k) determined by the set of edges E(G
)=E(G)∪{{i, l}}{{i, j}}.
24 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
In other words, G is equivalent to the generalized star obtained from G by removing
the edge {i, j} and inserting the edge {i, l}.
Proof: First let G

= π
(i, 0)
(i, 0), (0,l)(i, l)
(G) so that E(G

)=E(G) ∪{{i, l}}
{{i, 0}} (notice that G

may not be a generalized star). Now let G
= π
(i, j)
(i, j), (j, 0)(i, 0)
(G

)
and notice that E(G
)=E(G

)∪{{0,i}}− {{i, j}} =(E(G)∪{{i, l}} − {{i, 0}})
{{0,i}} {{i, j}} = E(G) ∪{{i, l}} {{i, j}} whichiswhatwewereafter.
Lemma 4.4.7 Suppose we are a generalized star G ∈G(S, k). Enumerate the nodes
of G as 0, 1,...,|S|−1 with 0 denoting a star center. Suppose an edge {i, j}∈E(G)
and {q, l} / E(G) for i, j, q and l>0.ThenG is equivalent via to the generalized
star G
∈G(S, k) determined by the set of edges E(G
)=E(G)∪{{q, l}} −{{i, j}}.
In other words, G is equivalent to the generalized star obtained from G by removing
the edge {i, j} and inserting the edge {i, l}.
Proof: If q = i the situation is exactly that of lemma 4.4.6 and so we assume that
q = i. In such a case, either one of the following mutually exclusive situations can
occur: either {i, q}∈E(G) or {i, q} / E(G).
Case 1: {i, q}∈E(G). In this case, according to lemma 4.4.6 G G

with
E(G

)=E(G) ∪{{q, l}} {{i, q}} and, again thanks to lemma 4.4.6, G

G
with E(G
)=E(G

) ∪{{i, q}} {{i, j}} =(E(G) ∪{{q, l}} {{i, q}})
{{i, q}} {{i, j}} = E(G) ∪{{q, l}} {{i, j}}
Case 2: {i, q} / E(G). In this case, according to lemma 4.4.6 G G

with
E(G

)=E(G) ∪{{i, q}} {{i, j}} and, again thanks to lemma 4.4.6, G

G
with E(G
)=E(G

) ∪{{q, l}} {{i, q}} =(E(G) ∪{{i, q}} {{i, j}})
{{q, l}} {{i, q}} = E(G) ∪{{q, l}} {{i, j}}
Thus, in any of the possible cases we deduce that G G
with E(G
)=E(G)
{{q, l}} {{i, j}} so that the desired conclusion follows at once.
Lemma 4.4.7 brings us very close to reaching our nal goal. Indeed, let 0, 1,...,|S|−1
enumerate the nodes of a generalized star graph G ∈G(S, k) with 0 denoting the
star center. Notice that G is uniquely determined by the binary sequence of length
(|S|−1)(|S|−2)
2
indexed b y all pairs (i, j) satisfying 1 i<j≤|S|−1 and having
a 1 in position (i, j) if and only if {i, j}∈E(G). Such a sequence contains exactly
k −|S| +11s and the rest are zeros. Lemma 4.4.7 tells us that transposing a one and
a zero in this sequence results in an equivalent generalized star. It is well known that
every permutation is a product of transpositions and so we d educe
Lemma 4.4.8 Any two generalized stars in G(S, k) h aving a common star center are
equivalent via .
In summary, combining lemmas 4.4.8 with lemma 4.4.5 tells us that any two general-
ized stars are equivalent. Now, given any two graphs G and G
∈G(S, k), according
to lemma 4.4.4, G G
1
and G
G
2
with G
1
and G
2
being generalized stars and, ac-
cording to the previous sentence, G
1
G
2
so that we nally have G G
1
G
2
G
so that theorem 4.4.1 now follows.
25 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)
Bibliography
[1] M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of
NP-completeness. Freeman and Company, San Francisco., 1979.
[2] R. Gomory and T. Hu. Multiterminal network ows. SIAM Journal on Applied
Mathematics, 9:551–570, 1969.
[3] T. Hu. Optimum communication spanning trees. SIAM Journal on Computing,
3(3):188–195, 1947.
[4] HWU. Derivation of network topology from combining graph-theoretic and system
requirements. DBE Deliverable 19.2.
[5] SUN. Dbe n ervous system. DBE Deliverable 19.1.
[6] UBHAM. Report on clustering in networks. DBE Deliverable 11.2.
[7] UBHAM. Report on the ow of software components in a static network. DBE
Deliverable 11.1.
26 / 26
Del 11.3 Report on hierarchical networks
DBE Project (Contract n° 507953)