The NAS research section invites MSc students to join our group, and do exciting research on the forefront of Network Science. If you are interested in one of these projects, contact us by emailing Maksim Kitsak at This email address is being protected from spambots. You need JavaScript enabled to view it..
If you already have your own project in mind, propose it to us! We will advise you on feasibility, and judge whether there’s a match.
See the instructions for the information on how to apply for a MSc thesis at the NAS Group.
Do NOT email Prof. Van Mieghem directly.
Epidemic Walks
Introduction
Consider a walk W_{p} of length k of a person p in its contact graph G_{p }, which is a subgraph of the human population contact graph,
W_{p} = {n_{1}(t_{1}), n_{2}(t_{2}), ... , n_{k}(t_{k})}
where nj (t_{j}) is the node ID of person j at time t_{j} of the encounter. Mobile apps can construct the walk W_{p} (by bluetooth interactions) and store the node ID n_{j} (t_{j}) in an encrypted and secure way (by encrypting the MACaddress of the smart phone), which is not directly traceable to the name of person j. In the ideal world, we also would like to have the corresponding infection state X_{j} of each encounterd person j, in which X_{j} = 0 if person j is healthy, else X_{j} = 1. Hence, to the walk W_{p} with nodal IDs, there corresponds an infection vector,
I_{p} = {X_{1} (t_{1}) , X_{2} (t_{2}), .... , X_{k} (t_{k})}
After a walk W_{p} of k contacts encounters, the person p is assumed to be infected (with high probability) if at least one the elements in his infection vector I_{p} equals X_{j} (t_{j}) = 1. In practice, however, the infection vector I_{p} is not known, for three basic reasons:
 if people also report their health condition in their app by specifying X_{j }, then the reported infection vector I_{p;reported} would only contain 0 bits, because infected people should stay at home;
 there is a nonnegligible fraction of asymptomatic people with X_{j} = 1 who do not know that they are infected and infectious, and thus report a (wrong) healthy state X_{j;reported} = 0;
 there are the exposed and already infectious, hence X_{j} (t_{j}) = 1 and X_{j;reported} (t_{j}) = 0, but they only experience infection after some delay T_{j} , implying that they update their state later into X_{j;reported} (t_{j }+T_{j}) = 1.
Challenges
 Given these uncertainties, what is the probability that person p is infected at time t ≥ t_{k}, after a series of k contacts?
 Vice versa, if person p figures out to be infected, how many of its encountered persons in W_{p} are then infected by him?
 Is it possible to deduce the infection tree over all walks that have met an infected person?
Measuring the Recoverability of Networks
Introduction
Network recoverability refers to the ability of a network to return to a desired performance level after suffering malicious attacks or random failures. In [1], we proposed a general topological approach and recoverability indicators to measure the network recoverability in two scenarios: 1) recovery of damaged connections and 2) any disconnected pair of nodes can be connected to each other.
Challenges
 Measure and compare the recoverability of different network types, such as random networks, scale free networks, regular networks as well as realworld networks, etc. Among each type of networks, how do the network parameters influence the recoverability of these networks? This work will give us insight into how to design networks with high recoverability.
 How to analytically measure the recovery efficiency of different recovery strategies? Can we find the best strategy for all types of networks we are investigating? How good is this strategy compared with greedy recovery strategy?
 You are encouraged to steer the master thesis topic towards your own interest and skills. For instance, you could propose analytical approximations to estimate the average failure and recovery process; you can use machine learning method to estimate the recoverability of networks and compare the results with simulation results.
Reference
[1] He, Z., Sun, P. and P. Van Mieghem, 2019, "Topological approach to measure network recoverability", best paper award, The 11th International Workshop on Resilient Networks Design and Modeling, 1416 October, Nicosia, Cyprus.
Hyperbolic Representations of Complex Networks
Introduction
Network embeddings or mappings of networks to latent geometric spaces are standard tools in the arsenal of data analysis, and are routinely used in machine learning, visualization, network science, and graph theory. In general, a procedure of network embedding is mapping network nodes to points in a suitable latent metric space, such that latent distances between connected node pairs are smaller than those between disconnected node pairs. Latentgeometric distances are often interpreted as generalized measures of similarities: similar nodes are separated by small distances.
Applications of network embeddings include prediction of missing links, network reconstruction, as well as the analysis of dynamic processes taking place on the network, e.g., routing or epidemic spreading.
Real networks are often characterized by heterogeneous distributions of number of connections per node (node degrees): while the majority of nodes in a network have only a handful connections to other nodes, there are also hubs, nodes with abnormally large number of connections to other nodes [1]. Networks with heterogeneous degree distributions are often referred to as scalefree.
It has recently been shown that scalefree networks are best represented/embedded in spaces with nonEuclidean geometries. One example, to this end, is embedding of networks into spaces with constant negative curvature, i.e., hyperbolic spaces [2, 3].
The biggest challenge/bottleneck of hyperbolic network representations is the lack of scalable embedding methods. Existing hyperbolic embedding methods are either (i) accurate and nonscalable or (ii) scalable but not very accurate.
Project
Rely on latest developments in Statistical Inference and Machine Learning to develop a SCALABLE and ACCURATE hyperbolic embedding algorithm.
Deliverables
Hyperbolic embedding algorithm.
Requirements
This is a very challenging project at the intersection of Machine Learning, Network Science and nonEuclidean geometry. Successful student will have excellent coding skills, prior experience with Euclidean network embeddings, and some knowledge of Statistical Machine Learning.
References
[1] AL. Barabási, Network Science, Cambridge University Press, (2016).
[2] D. Krioukov, F. Papadopoulos, M. Kitsak, A. Vahdat, and M. Boguná, Hyperbolic Geometry of Complex Networks, Physical Review E, 82 3 (2010).
[3] M. Boguná, F. Papadopoulos, D. Krioukov, Sustaining the internet with hyperbolic mapping, Nature communications, 1 1 (2010).
NonConsensus Opinion models with Byzantine Nodes
Introduction
Social dynamics has been studied extensively in recent years using concepts and methods based on ideas from statistical physics. An important approach is complex networks, where the nodes represent agents and the links represent the interactions between them. There is considerable current interest in the problem of how two competing opinions evolve in populations [1], [2].
Li et al. [3] proposed the socalled NonConsensus Opinion (NCO) model. The models assume a random initialization where a fraction of nodes has one opinion (let us say Black) and all other nodes have the opinion White. Then at each iteration step considers the opinions of all its neighbors, also taking into account its own opinions. The opinion of the node is changed if and only if the number of opposite opinions is larger than its own opinion. It is shown in [3] that the NCO model allows for stable coexistence of two opinions by forming clusters of agents holding the same opinion.
Project
The aim of this project is to study the properties of the NCO model, under the assumption that a fraction of the nodes in the network are socalled Byzantine nodes. Every time such nodes communicate their opinion to their neighbors, they will lie about it. So far example, a Byzantine node with opinion Black will tell its neighbors it has opinion White.
The project seeks answers to the following questions:

What is typical steadystate behavior for the ByzantineNCO model?

What is the relation between initial parameter settings and steadystate behavior?

How are the dynamics of the ByzantineNCO model on toy network models and on realworld networks?

What is a realistic application of the ByzantineNCO model?
The supervisor will be prof. dr. Robert Kooij. He already did some initial research on the topic and has some software available that can be used as starting point for the project. Duration of the project is nine months.
Deliverables
 syste

Userfriendly tool for the simulation of the ByzantineNCO model

Report containing the results of the project
References
[1] C. Castellano et al., Rev. Mod. Phys. 81, 591 (2009)
[2] S. Galam, Europhys. Lett. 70, 705 (2005).
[3] Qian Li, Lidia A Braunstein, Huijuan Wang, Jia Shao, H Eugene Stanley and Shlomo Havlin. 2013. Nonconsensus opinion models on complex networks. Journal of Statistical Physics 151, 12 (2013), 92–112.
Properties of Node Reliability Polynomials
Introduction
The most common measure of robustness of a network to random failures of components is allterminal reliability. For a (finite, undirected) graph G where nodes are always operational and edges are independently operational with probability p ∈ [0, 1], the allterminal reliability of G is the probability that all of the nodes can communicate with one another. The allterminal reliability of a graph gives rise to a polynomial in p, and algorithms for calculating and efficiently estimating this function have been a major focus in the area, see [1]. There is an analogous notion for robustness in a network with node rather than edge failures, see [2]. Given a graph G in which edges are always operational and nodes are independently operational with probability p, the node reliability of G, is the probability that at least one node is operational and that all of the operational nodes can communicate with one another. The node reliability of G also gives rise to a polynomial in p, the so called node reliability polynomial.
Project
The aim of this project is to study properties of node reliability polynomials. In particular, the project will look at node reliability from three perspectives: theoretical, computational and applied.
For the theoretical perspective we will look for a method to construct two graphs, distinct, but with the same number of nodes N and links L, such that their node reliability polynomials intersect an arbitrary number of times. This result would generalize the results of [3], where a construction is given for pairs of graphs with an arbitrary number of crossing for the allterminal reliability polynomials.
For the computational perspective we study how the exact method for the computation of allterminal reliability suggested in [4], can be adjusted, such that it can be used to compute node reliability as well.
Note that the method in [4] depends on concepts such as decomposition and pathwidth.
Finally, for the applied perspective, we want to apply the concept of node reliability to a realworld network. In particular, we want to use the algorithm obtained from the computational perspective to some critical infrastructure, cf. the gas distribution network that was studied in [5].
The supervisor will be prof. dr. Robert Kooij. In addition, prof. Jason Brown, from the Dalhousie University in Canada, will be involved, as he is an expert of (node) reliability polynomials. The involvement of prof. Brown will be mainly through regular Skype sessions and email contact. For the realization of the applied perspective, we will seek contact with owners of critical infrastructures, such as telecom networks, power grids or water distribution networks. Duration of the project is nine months.
Deliverables

Userfriendly tool for the computation of node reliability

Report containing the results of the project
References
[1] C.J. Colbourn, The combinatorics of network reliability, New York: Oxford University Press, 1987.
[2] Jason Brown, Lucas Mol, On the roots of the node reliability polynomial, On the roots of the node reliability polynomial. Networks. 68. 10.1002/net.21697, 2016.
[3] J.I. Brown, Y. Koç and R.E. Kooij, Reliability polynomials crossing more than twice, Proc. of 3rd Int. Workshop on Reliable Network Design and Modeling, Budapest, Hungary, Oct. 57, 2011, pp.135140.
[4] Willem Pino, Teresa Gomes, and Robert Kooij, A Comparison between Two AllTerminal Reliability Algorithms, Journal of Advances in Computer Networks, vol. 3, no. 3, pp. 284290, 2015.
[5] W. Pino, D. Worm, R. van der Linden, R.E. Kooij, “The Reliability of a Gas Distribution Network: A Case Study”, Proc. of 2016 Int. Conf. on System Reliability and Science, Paris, France, Nov. 1518, 2016.
Solving Differential Equations of Epidemic Models around the Epidemic Threshold
Introduction
Various epidemic models have been proposed to understand and mitigate the spread of infectious diseases, such as gonorrhoea, syphilis, or the flu. A common epidemic modelling approach is to split the total population into groups and describe the viral spread among these groups by nonlinear differential equations. Then, each group has a viral state that equals the fraction of infected individuals in the respective group, and the contact network describes how likely an infection is transmitted from one group to another group. A crucial quantity in epidemic models is the epidemic threshold: Above the epidemic threshold, the virus dies out, and below the epidemic threshold, the virus reaches an endemic state. Recently, we have solved the nonlinear NIMFA differential equations around the epidemic threshold, that have been proposed by Lajmanovich and Yorke [1] as an epidemic model for gonorrhoea.
Challenges
There are two possible directions for this master thesis:
 Extend our results to directed contact networks: In our solution for the NIMFA differential equations, we considered that the underlying contact network is undirected. However, simulations indicate that the solution also holds (approximately?) for directed contact networks, but a proof is an open challenge.
 Consider other epidemic models: We focussed on the NIMFA differential equations, whereby every group consists of individuals that are either infected or healthy. Other epidemic model consider that individuals could also be, for instance, removed from the population (by dying or gaining immunity). Can our solution approach be applied to differential equations from epidemic models other than NIMFA?
Reference
[1] Lajmanovich A, Yorke JA (1976) A deterministic model for gonorrhea in a nonhomogeneous population. Mathematical Biosciences 28(34):221236
System identification: theory and applications
Introduction
Complex networks, in general, have two properties:
 The underlying topology, defined by a graph
 The processes/functions that happen in the network, determined by the dynamic equation
To analyse the dynamics of the entire network, both underlying topology and individual systems processes should be merged.
Challenges
 Given the measurements of input and output from a (partly) unknown system, to find the systems equations with the best descriptive/predictive power
 Extraction of the underlying topology
 How to steer the dynamics on a network towards some “desired” regime/state
 To apply this methodology to a realworld case and possibly extend/specialise the methods
Reference
[1] L. Xiang, F. Chen, W. Ren and G. Chen, "Advances in Network Controllability," in IEEE Circuits and Systems Magazine, vol. 19, no. 2, pp. 832, Secondquarter 2019.
MSc Thesis Topics Currently Being Researched
Fair comparison among different network reconstruction methods
Introduction
Data based identification of complex networks has been an active area of research in many scientific disciplines including physical, biological, computer, and social sciences. The susceptibleinfectedsusceptible (SIS) model is one of the most classical epidemic models to study a variety of spreading behaviours in social and computer science. So far, many computationally efficient methods have been proposed to reconstruct networks using binary time series generated by SIS dynamics. Nevertheless, these methods have not been discussed with respect to their limitations. The goal is to have a comparative analysis of different existing network reconstruction methods. The relative results can be further applied to select the appropriate reconstruction algorithm according to the data characteristics.
Challenges
 Many different algorithms require to be studied.
 The influence of many parameters on reconstruction algorithms require to be studied.
 Both the continuoustime SIS process and the discretetime SIS process will be considered.
Reference
[1] Peixoto, Tiago P. "Network reconstruction and community detection from dynamics." arXiv preprint arXiv:1903.10833 (2019).
Network Reconstruction from Observing an Epidemic Outbreak with Mobile Individuals
Introduction
Modern epidemiology is the study of general spreading phenomena, such as the spread of opinions, trends, fake news or posts on online social media. These phenomena have two defining characteristics. First, individuals can be either infected (with the disease, trend, etc.) or healthy. Second, an infection may occur from an infected to a healthy individual if the two individuals are in contact. For instance, the contact between two individuals could be a friendship, or a follower on online social media. However, in many realworld applications, we do not know which individuals have contact with another, which puts a hard constraint on controlling an epidemic outbreak. Network reconstruction methods aim to estimate the contact network, which describes whether any pair of two individuals is in contact or not. The particular focus of this master thesis is to develop a network reconstruction method that accounts for the mobility of individuals, for instance by considering that individuals commute between their work place and their home.
Challenges
 Literature study: get acquainted with epidemic models on networks and stateoftheart network reconstruction methods. What are their advantages and drawbacks?
 Problem formulation: Define an epidemic process that accounts for the mobility of individuals, and formulate the network reconstruction problem (for instance, in the maximum likelihood sense).
 Algorithmic design: Propose a (heuristic, approximate or optimal) network reconstruction method.
 You are encouraged to steer the master thesis topic towards your personal interest and skills. For instance, you could focus on one of the two following:
 Apply the network reconstruction framework to realworld data (online social network data, or epidemic data)
 Explore theoretical aspects (such as reconstructability criteria).
Measuring the robustness of network controllability
Introduction
A system is controllable if it can be driven from any arbitrary state to any desired state in finite time under the control of the driver nodes which are attached to external inputs. The minimum number of driver nodes which gain full controllability after failures or attacks have occurred, can be chosen as the robustness metric.
Challenge
 Though the robustness of network controllability has been studied in the literature, existing approaches tend to study random failure in terms of averagecase behavior, giving no idea of how badly network controllability can degrade purely by chance. How can we get the worstcase and bestcase for the robustness metric under random failures? Can we use the machine learning method to get the approximations for the worstcase, the bestcase as well as the averagecase?
 We have robustness envelope method to measure the robustness of networks. How to apply this method to evaluating the performance of the approximations we obtained?
 There are some special attack strategies, which can be seen as the combination of random failure and targeted attack. What is the impact of these special strategies on the robustness of network controllability?
 Can we find which is the most dangerous or effective attack strategy in terms of their impact on the network controllability?
References
[1] Sun, P., R. E. Kooij, Z. He and P. Van Mieghem, 2019, "Quantifying the Robustness of Network Controllability", 4th International Conference on System Reliability and Safety (ICSRS), 2022 November, Rome, Italy.
[2] Trajanovski, S., J. MartinHernandez, W. Winterbach and P. Van Mieghem, 2013, "Robustness Envelopes of Networks" , Journal of Complex Networks, Vol. 1, pp. 4462.
Link weight tolerance in communication networks
Introduction
We consider a weighted graph G(N,L) that represents a communications network, comprising N nodes, connected through L weighted links. A data communication flow through the network causes load on each of the nodes and on each of the links. The load on a node or a link is defined as the Betweenness [1]. The betweenness is dependent on the shortest path between each node pair; the shortest paths are, in turn, dependent on the link set and on the link weights.
We define the shortest paths matrix: the shortest path between each node pair. This is needed for, among others, OSPF based communication networks. When a link weight changes, this may affect the shortest path matrix. Each link has a tolerance zone; the link weight may vary within the tolerance boundaries without affecting the shortest path matrix.
Challenges
A link weight change may affect the shortest path for one or more node pairs. So long as the link weight varies within its tolerance boundaries, the set of shortest paths is unaffected and so the betweenness (load) of the nodes and links is unaffected. We define a “tolerance vector of TG”, containing the tolerance (weight boundaries) of each link. TG forms an indication of the “criticality” of the weight of the individual links, in order to maintain a stable system.
TG is defined for a specific graph instance. When the weight of a link changes (within its tolerance boundaries), this affects the tolerance of other links. TG is hence a function of the process of changing link weights. Goal is to achieve a stable system (network). Changing the weight of one link may improve the link tolerance of other links.
The stability of a system may be expressed in a single metric, TGave, e.g. the average link weight tolerance. When a link weight changes, TGave varies.
 What relation is there between a change in link weight and TG.
 Links with a high betweenness value are generally “critical” links in a network. What relation is there between link betweenness and link tolerance.
Approach
Link weight tolerance is largely unexplored.
 The student is expected to embark on rudimentary graph theoretic study, to describe and quantify the suggested metrics.
 The student is further expected to conduct a set of experiments on a defined class of graphs, such as ErdősRényi random graph or BarabásiAlbert scale free graph.
References
[1] Freeman; A set of measures of centrality based on betweenness; Sociometry. 40 (1): 35–41 (1977)