Our research group focuses on algorithmic aspects of parallel and distributed computing, and the main goal is to solve different fundamental problems in this research area. Parallel and distributed computing is a key technology for the solution of large, computationally intensive problems in science and engineering. In parallel and distributed systems, a complex task is processed by several computing entities simultaneously. These share the workload in a similar way to an assembly line or a large company with different business areas. We distinguish between parallel computing, in which the computing performance is achieved by a single parallel machine consisting of several processors, and distributed computing, where the task is processed by a distributed network consisting of several computers placed at different locations.
We are particularly interested in the following problems:
Algorithms for Information Dissemination in Large Networks
Information dissemination in large networks has various fields of application in distributed computing. There is an enormous amount of experimental and theoretical study of (deterministic and randomized) broadcasting in various models and on different networks. We are mainly interested in the runtime and communication overhead produced by randomized broadcasting algorithms. As an example, consider the so-called push algorithm: In a graph we place at some time t a piece of information on one of the nodes. Then, in each succeeding time step, any informed vertex forwards a copy of this information to a neighbor selected independently and uniformly at random.
The advantage of randomized broadcasting is its inherent robustness against different kinds of failures and dynamical changes. In our research group we analyze and develop simple randomized broadcasting algorithms with the following properties:
- They can successfully handle restricted communication failures in the network.
- They work well if size or topology of the network change slightly while the algorithm is executed.
- The running time and number of message transmissions they produce is asymptotically minimal.
We analyzed several variants of the simple broadcasting algorithm described above on arbitrary graphs, as well as on some specific network topologies. We established new lower and upper bounds on the runtime for several graph classes. Furthermore, we considered the behavior of the push algorithm if transmissions are not always sent successfully. Also in this noisy environment, the push algorithm is still time-efficient, provided that the transmissions are successful with a constant probability. We also considered the relationship between random walks and the runtime of the push algorithm.
Another main objective is to reduce the message complexity produced by simple randomized broadcasting without increasing the runtime. To achieve this goal, the so called pull model has been introduced. In this model, each (informed or uninformed) node chooses in every step a neighbor, uniformly at random, and uninformed nodes are allowed to pull information from informed nodes called by this node in a specific step. This kind of transmission makes sense if at almost every node updates occur frequently so that almost every node places a random call in each round anyway. It was shown before that by combining push and pull it is possible to significantly reduce the number of message transmissions in a complete graph. We analyzed simple modifications of this combined model on certain graph classes and determined the behavior of randomized broadcasting in different networks.
The design of efficient algorithms for exploring unknown networks is a fundamental problem in the theory of computing. The standard approach to explore a fixed graph is to use e.g. breadth first search. However, in most real world networks this approach is not applicable. In peer-to-peer systems or mobile ad-hoc sensor network, the nodes are not necessarily distinguishable by some IDs and the structure of the graph is dynamically evolving over time. For such cases, randomized strategies should be applied.
A random walk is a simple and robust strategy to explore a network. It only uses a small amount of memory, and can easily cope with dynamical changes in the structure of the network. Although random walks perform well in practice, surprisingly little is known formally about the performance of such walks except the simple case of a single walk on a fixed network. Our goal is to improve our ability to formally predict the performance of simultaneous and interacting walks.
A measure of performance of a search procedure is the cover time; the expected number of steps to visit all nodes of the graph. The simplest randomized exploration algorithm consists of a random walk: an agent moves in each step from its current position to a neighboring node chosen uniformly at random.
However, for a large graph the search time needed by a single walk is unacceptable. Our goal is to derive new strategies to speed-up the cover time. The strategy of local exploration is particularly appropriate to computer network search. That is, at each step the walk uses local probing to find unvisited neighbors.
There have been only a few attempts to analyze the cover time of multiple random walks. The cover time of independent random walks starting from the stationary distribution was studied in connection with time-space trade-offs. In graph exploration, however, random walks are required to start from the same vertex. Some aspects of the speed-up of multiple random walks have been studied, and the understanding of this problem on general graphs has further been improved. However, there are still large gaps between lower and upper bounds for the speed-up.
One of our aims is to analyze the cover time of multiple random walks on graph classes arising from real world applications. Furthermore, we plan to devise strategies, which combine neighborhood exploration with multiple random walks. This should lead to improved cover times on many important graph classes, and open new ways for efficient network exploration.
Efficient Load Balancing
Load balancing is one of the key problems that must be addressed to efficiently use parallel and distributed computer systems. A parallel application can be visually described as a component-wise manufacturing process in industrial production. An application (=fabrication of an industrial product) is divided in several subtasks (=sub-products) and these subtasks are executed on different processors (=workers) of the parallel or distributed system. Subtasks can either be run independently from each other or, if required by the underlying application, interdependencies between them have to be obeyed. In the latter case, the processors must use communication in order to exchange intermediate results. Summarizing, the load balancing problem aims at the following goals:
- The application's total execution time should be minimized.
- All processors of the system should operate during the whole computation and idle times should be avoided.
- The total load in the system should be distributed equally and 'fair' among the processors of the system.
- The communication overhead between the processors should be minimized.
For several years, our research group has studied efficient load balancing algorithms. While we have analyzed the problem theoretically, we have also implemented the resulting algorithms and tested them in real world applications. To obtain good mappings of the tasks to the processors, several efficient methods have been developed. In our research group we have focused on analyzing local iterative load balancing algorithms. Thereby, we distinguish between diffusion and dimension exchange schemes. These two classes differ in the topology's communication abilities. Diffusion algorithms assume that a node of a network can send and receive messages to/from all of its neighbors simultaneously, whereas dimension exchange does only use pair wise communication with one neighbor after the other.
In the future, load balancing algorithms for large, distributed and dynamic networks should be developed. Here, non-cooperative networks (such that the Internet) play a very important role. In order to develop efficient load balancing strategies for such topologies, the use of sophisticated linear algebraic and game theoretic methods is required.
Structure and Spectra of Large Networks
In 1999 two different research groups observed that the degrees of the nodes in many real world networks, including the Internet, WWW, telephone call graphs, as well as several social and biological networks, follow a so called power law degree distribution. In order to study such networks analytically, new random graph models have been introduced.
All known random graph models for power law graphs belong either to the so called on-line model or to the off-line model. In the on-line model at each tick of the clock, a decision is made for adding or deleting vertices or edges. This model can be viewed as a dynamically evolving graph where in each step the graph may depend on earlier decisions. The off-line model is a generalization of the classical random graph model of Erdös and Rényi. In the Erdös-Rényi model, however, the graph is almost regular and the expected degree is the same for every vertex. Therefore, a proper off-line model consists of graphs with an arbitrary degree distribution. Such graph can then easily be used to model random power law graphs.
Several properties of a variety of graphs belonging to the above mentioned random graph models have been analyzed in the past. However, many structural properties of real world networks are not covered by the models mentioned above. Especially, the community structure occurring in social networks cannot be observed in any of these random graphs. In the future, we plan to study the relationship between optimal partitions of random power law graphs and the spectrum of their Laplacian. We will also try to derive new graph models which are able to capture the structure of the communities in social networks.