Assefaw Gebremedhin, Papers on Parallel Graph Algorithms
Title:
Balanced Coloring for Parallel Computing Applications
Authors:
H. Lu, M. Halappanavar, D. Chavarría-Miranda, A.H. Gebremedhin and
A. Kalyanaraman
Status: Proceedings of IPDPS 2015.
Abstract
Graph coloring is used to identify subsets of independent tasks in parallel scientific computing applications. Traditional coloring heuristics aim to reduce the number of colors used as that number also corresponds to the number of parallel steps in the application. However, if the color classes produced have a skew in their sizes, utilization of hardware resources becomes inefficient, especially for the smaller color classes. Equitable coloring is a theoretical formulation of coloring that guarantees a perfect balance among color classes, and its practical relaxation is referred to as balanced coloring. In this paper, we revisit the problem of balanced coloring in the context of parallel computing. The goal is to achieve a balanced coloring of an input graph without increasing the number of colors that an algorithm oblivious to balance would have used. We propose and study multiple heuristics that aim to achieve such a balanced coloring, present parallelization approaches for multi-core and manycore architectures, and cross-evaluate their effectiveness with respect to the quality of balance achieved and performance. Furthermore, we study the impact of the proposed balanced coloring heuristics on a concrete application - viz. parallel community detection, which is an example of an irregular application.
The thorough treatment of balanced coloring presented in this paper from algorithms to application
is expected to serve as a valuable resource to parallel application developers who seek to improve parallel performance of their applications using coloring.
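
To make the notion of balance concrete, the following minimal sketch contrasts a first-fit greedy coloring with a variant that assigns each vertex to the least-used permissible color class. It is an illustration only, not the heuristics evaluated in the paper; the function name `greedy_coloring`, the `balanced` flag, and the adjacency-dictionary input are assumptions made for this example. Note that the least-used choice does not by itself guarantee the same color count as first fit on every input.

```python
from collections import Counter

def greedy_coloring(adj, balanced=False):
    """Greedy coloring in dictionary order; adj maps vertex -> set of neighbours.

    Hypothetical helper for illustration. Returns (color, sizes), where
    sizes counts the vertices in each color class.
    """
    color = {}
    sizes = Counter()
    for v in adj:
        forbidden = {color[u] for u in adj[v] if u in color}
        allowed = [c for c in sizes if c not in forbidden]
        if not allowed:
            c = len(sizes)  # open a new color only when forced to
        elif balanced:
            # least-used permissible class, ties broken by smaller color
            c = min(allowed, key=lambda k: (sizes[k], k))
        else:
            c = min(allowed)  # first fit
        color[v] = c
        sizes[c] += 1
    return color, sizes
```

On a graph with one edge and four isolated vertices, first fit puts five vertices in one class and one in the other, while the least-used variant splits them three and three with the same two colors.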
Download paper in PDF
Title:
Parallel Maximum Clique Algorithms with Applications to
Network Analysis and Storage
Authors:
R.A. Rossi, D.F. Gleich, A.H. Gebremedhin, M.M.A. Patwary
Status: Proceedings of WWW 2014.
Abstract
We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. Despite clique's status as an NP-hard problem with poor approximation guarantees, our method exhibits nearly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Key to the efficiency of our algorithm are an initial heuristic procedure that finds a large clique quickly and a parallelized branch and bound strategy with aggressive pruning and ordering techniques.
We use the algorithm to compute the largest temporal strong components of temporal contact networks.
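
The two ingredients named in the abstract, a fast heuristic clique followed by branch and bound with pruning, can be sketched serially as below. This is an assumption-laden illustration, not the paper's algorithm: the names `greedy_clique` and `max_clique` are hypothetical, the pruning rule is the simplest size bound, the input is assumed non-empty and free of self-loops, and no parallelism is shown.

```python
def greedy_clique(adj):
    """Heuristic lower bound: grow a clique from a highest-degree vertex."""
    start = max(adj, key=lambda v: len(adj[v]))
    clique, cand = [start], set(adj[start])
    while cand:
        # pick the candidate with the most neighbours among the candidates
        v = max(cand, key=lambda u: len(adj[u] & cand))
        clique.append(v)
        cand &= adj[v]  # keep only common neighbours (drops v itself)
    return clique

def max_clique(adj):
    """Exact search; adj maps vertex -> set of neighbours."""
    best = greedy_clique(adj)

    def expand(clique, cand):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        for v in sorted(cand, key=lambda u: len(adj[u])):
            # prune: even taking every candidate cannot beat the incumbent
            if len(clique) + len(cand) <= len(best):
                return
            expand(clique + [v], cand & adj[v])
            cand = cand - {v}  # v has been fully explored

    expand([], set(adj))
    return best
```

A good initial clique matters because the size bound prunes more of the search tree when the incumbent is already large.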
Download paper in PDF
Title:
Graph Coloring Algorithms for Multi-core and Massively Multithreaded Architectures
Authors:
U. Catalyurek, J. Feo, A.H. Gebremedhin, M. Halappanavar, A. Pothen
Status:
Parallel Computing 38 (2012), 576-594.
Abstract
We explore the interplay between architectures and
algorithm design in the context of shared-memory platforms
and a specific graph problem of
central importance in scientific and high-performance computing,
distance-1 graph coloring.
We introduce two different kinds of multithreaded heuristic algorithms
for the stated, NP-hard, problem.
The first algorithm relies on speculation and iteration,
and is suitable for any shared-memory system.
The second algorithm uses dataflow principles, and is targeted at
the non-conventional, massively multithreaded Cray XMT system.
We study the performance of the algorithms on three different
platforms---Cray XMT, Sun Niagara 2, and Intel Nehalem---representing
varying degrees of multithreading capabilities.
As testbed, we use synthetically generated massive graphs carefully
designed to cover a wide spectrum of input types.
The results show that the algorithms have scalable runtime performance
and use nearly the same number of colors as the underlying
serial algorithm, which in turn is effective in practice.
The study provides insight into the design of high performance algorithms
for irregular problems on many-core architectures.
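
The speculation-and-iteration idea can be illustrated with a sequential simulation: in each round every pending vertex picks a color using only the colors fixed before the round began, then conflicting vertices are detected and retried. The sketch below is an assumption for illustration, not the paper's multithreaded code; `speculative_coloring` is a hypothetical name, vertex ids are assumed comparable, and the round-start snapshot is a pessimistic stand-in for what concurrent threads might observe.

```python
def speculative_coloring(adj):
    """Simulate speculative coloring; adj maps vertex -> set of neighbours.

    Returns (color, rounds).
    """
    color = {}
    pending = list(adj)
    rounds = 0
    while pending:
        rounds += 1
        snapshot = dict(color)  # the view shared by all simulated threads
        # tentative phase: each iteration is independent of the others
        for v in pending:
            forbidden = {snapshot[u] for u in adj[v] if u in snapshot}
            c = 0
            while c in forbidden:
                c += 1
            color[v] = c
        # detection phase: of two equal-colored neighbours, retry the larger id
        pending = [v for v in pending
                   if any(color.get(u) == color[v] and u < v for u in adj[v])]
        for v in pending:
            del color[v]
    return color, rounds
```

The smallest pending vertex always keeps its color, so every round makes progress and the loop terminates with a proper coloring.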
Download paper in PDF
Title:
New Multithreaded Ordering and Coloring Algorithms for Multicore Architectures
Authors:
M.M.A. Patwary, A.H. Gebremedhin and A. Pothen
Status:
In E. Jeannot, R. Namyst and J. Roman, editors, Euro-Par 2011, volume 6853 of Lecture Notes in Computer Science,
pages 250 -- 262. Springer, 2011.
Abstract
We present new multithreaded vertex ordering and distance-k graph coloring
algorithms that are well-suited for the emerging and rapidly growing multicore platforms.
The vertex ordering techniques rely on various notions of "degree", are known to be effective
in reducing the number of colors used by a greedy coloring algorithm, and are generic enough
to be applicable to contexts other than coloring.
We employ approximate degree computation in the ordering algorithms and
speculation and iteration in the coloring algorithms as our primary remedies
for breaking sequentiality and achieving effective parallelization.
The algorithms have been implemented using OpenMP, and experiments run on
Intel Nehalem and other multi-core machines using a set of carefully designed synthetic graphs
and real-world graphs attest that the algorithms provide
scalable runtime performance. The number of colors the algorithms use is nearly the same as
in the serial case, which in turn is often very close to optimal.
Download paper in PDF
Title:
Distributed-memory Parallel Algorithms for Matching and Coloring
Authors:
U. Catalyurek, F. Dobrian, A.H. Gebremedhin, M. Halappanavar,
A. Pothen
Status:
Proceedings of IEEE International Parallel and Distributed
Processing Symposium, Workshops and PhD Forums (IPDPSW),
Workshop on Parallel Computing and Optimization (PCO'11), pages 1966--1975, 2011
Abstract
We discuss the design and
implementation of new highly-scalable distributed-memory parallel algorithms
for two prototypical graph problems,
edge-weighted matching and distance-1 vertex coloring.
Graph algorithms in general have low concurrency, poor data locality, and high ratio of
data access to computation costs,
making it challenging to achieve scalability on massively parallel machines.
We overcome this challenge by employing a variety of techniques, including
speculation and iteration, optimized communication, and randomization.
We present preliminary results on weak and strong scalability studies
conducted on an IBM Blue Gene/P machine employing up to tens of
thousands of processors.
The results show that the algorithms hold strong potential for
computing at petascale.
Download paper in PDF
Title:
Distributed-memory Parallel Algorithms for Distance-2 Coloring and
Related Problems in Derivative Computation
Authors:
D. Bozdag, U. Catalyurek, A. Gebremedhin, F. Manne, E. Boman and F. Ozguner
Status:
SIAM Journal on Scientific Computing, Vol 32, Issue 4, pp 2418--2446, 2010.
Abstract
The distance-2 graph coloring problem aims at partitioning the vertex
set of a graph into the fewest sets consisting of vertices pairwise at
distance greater than two from each other. Its applications include
derivative computation in numerical optimization and channel
assignment in radio networks. We present efficient,
distributed-memory, parallel heuristic algorithms for this NP-hard
problem as well as for two related problems used in the computation of
Jacobians and Hessians. Parallel speedup is achieved through graph
partitioning, speculative (iterative) coloring, and a BSP-like
organization of parallel computation. Results from experiments
conducted on a PC cluster employing up to 96 processors and using
large-size real-world as well as synthetically generated test graphs
show that the algorithms are scalable. In terms of quality of
solution, the algorithms perform remarkably well---the number of
colors used by the parallel algorithms was observed to be very close
to the number used by the sequential counterparts, which in
turn are quite often near optimal. Moreover, the experimental results
show that the parallel distance-2 coloring algorithm compares
favorably with the alternative approach of solving the distance-2
coloring problem on a graph $G$ by first constructing the square
graph $G^2$ and then applying a parallel distance-1 coloring
algorithm on $G^2$. Implementations of the algorithms are made
available via the Zoltan load-balancing library.
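
The two approaches compared in the abstract can be sketched serially: coloring G directly by inspecting each vertex's neighbours and neighbours-of-neighbours, versus building the square graph explicitly and applying an ordinary distance-1 coloring. The code below is a minimal serial illustration under assumed names (`distance2_coloring`, `square_graph`); it is not the distributed-memory implementation and does not use Zoltan.

```python
def distance2_coloring(adj):
    """Greedy distance-2 coloring; adj maps vertex -> set of neighbours."""
    color = {}
    for v in adj:
        forbidden = set()
        for u in adj[v]:
            if u in color:
                forbidden.add(color[u])       # distance-1 neighbour
            for w in adj[u]:
                if w != v and w in color:
                    forbidden.add(color[w])   # distance-2 neighbour
        c = 0
        while c in forbidden:
            c += 1
        color[v] = c
    return color

def square_graph(adj):
    """Explicit square graph: join vertices at distance one or two in adj."""
    return {v: (set(adj[v]) | {w for u in adj[v] for w in adj[u]}) - {v}
            for v in adj}
```

The direct version never materializes the square graph, which can have far more edges than G.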
Download paper in PDF
Title:
A Framework for Scalable Greedy Coloring on Distributed Memory Parallel Computers
Authors:
D. Bozdag, A. Gebremedhin, F. Manne, E. Boman and U. Catalyurek
Status:
Journal of Parallel and Distributed Computing Vol 68, No 4, pp 515--535, 2008.
Abstract
We present a scalable framework for parallelizing greedy graph
coloring algorithms on distributed-memory computers. The framework
unifies several existing algorithms and blends a variety of techniques
for creating or facilitating concurrency. The latter techniques
include exploiting features of the initial data distribution,
the use of speculative coloring and randomization, and a BSP-style
organization of computation and communication. We experimentally
study the performance of several specialized algorithms designed using
the framework and implemented using MPI. The experiments are
conducted on two different platforms and the test cases include
large-size synthetic graphs as well as real graphs drawn from various
application areas. Computational results show that implementations
that yield good speedup while at the same time using about the same
number of colors as a sequential greedy algorithm can be achieved by
setting parameters of the framework in accordance with the size and
structure of the graph being colored. Our implementation is freely
available as part of the Zoltan parallel data management and
load-balancing library.
Download paper in PDF
Title:
A Parallel Distance-2 Graph Coloring Algorithm for Distributed
Memory Computers
Authors:
D. Bozdag, U. Catalyurek, A.H. Gebremedhin, F. Manne, E. G. Boman and F.
Ozguner
Status:
Lecture Notes in Computer Science, vol 3726,
2005, pages 796 - 806, Springer. Proc. of HPCC 2005, Sept 21 - 25, 2005,
Sorrento, Italy.
Abstract
The distance-2 graph coloring problem aims at partitioning the vertex
set of a graph into the fewest sets consisting of vertices pairwise at
distance greater than two from each other. Application examples
include numerical optimization and channel assignment. We present the
first distributed-memory heuristic algorithm for this NP-hard
problem. Parallel speedup is achieved through graph partitioning,
speculative (iterative) coloring, and a BSP-like organization of
computation. Experimental results show that the algorithm is scalable,
and compares favorably with an alternative approach---solving the
problem on a graph G by first constructing the square graph
G^2 and then applying a parallel distance-1 coloring
algorithm on G^2.
Download paper in PDF
Title:
A Scalable Parallel Graph Coloring Algorithm for Distributed Memory
Computers
Authors:
E.G. Boman, D. Bozdag, U. Catalyurek, A.H. Gebremedhin and F. Manne
Status:
Lecture Notes in Computer Science, vol 3648, 2005, pages 241 - 251, Springer.
Proc. of EuroPar 2005, 30 Aug - 2 Sept, 2005, Lisboa, Portugal.
Abstract
In large-scale parallel applications a
graph coloring is often carried out to schedule computational
tasks. In this paper, we describe a new distributed-memory algorithm
for doing the coloring itself in parallel. The algorithm operates in
an iterative fashion; in each round vertices are speculatively
colored based on limited information, and then a set of incorrectly
colored vertices, to be recolored in the next round, is identified.
Parallel speedup is achieved in part by reducing the frequency of
communication among processors. Experimental results on a PC cluster
using up to 16 processors show that the algorithm is scalable.
Download paper in PDF
Title:
Speeding up Parallel Graph Coloring
Authors:
A.H. Gebremedhin, F. Manne and T. Woods
Status:
Lecture Notes in Computer Science, vol 3732, pp 1079-1088, 2005,
Springer. Proc. of Para 2004, June 20 - 23, 2004, Lyngby, Denmark.
Abstract
This paper presents new efficient parallel
algorithms for finding approximate solutions to graph coloring
problems. We consider an existing shared memory parallel graph
coloring algorithm and suggest several enhancements both in terms of
ordering the vertices so as to minimize cache misses, and performing
vertex-to-processor assignments based on graph partitioning instead
of random allocation.
We report experimental results that demonstrate the performance of our
algorithms on an IBM Regatta supercomputer when up to 12 processors
are used. Our implementations use OpenMP for parallelization and
Metis for graph partitioning. The experiments show that we get up
to a 70% reduction in runtime compared to the previous algorithm.
Download paper in PDF
Title:
Graph Coloring on Coarse Grained Multicomputers
Authors:
A. Gebremedhin, I. Guérin Lassous, J. Gustedt and J.A. Telle
Status:
Discrete Applied Mathematics, Vol 131, No 1, pp 179--198, 2003
Abstract
We present an efficient and scalable Coarse Grained Multicomputer (CGM)
coloring algorithm that colors a graph G with at most
D + 1 colors, where D is the maximum degree in G.
The algorithm is given in two variants: a randomized one and
a deterministic one.
We show that on a p-processor CGM model
the proposed algorithms require
a parallel time of
O(|G|/p) and a total work and overall
communication cost of O(|G|).
These bounds correspond to the average
case for the randomized version and to the worst case for the
deterministic variant.
Download paper in PDF
Title:
Parallel Distance-k Coloring
Algorithms for Numerical Optimization
Authors:
A.H. Gebremedhin, F. Manne and A. Pothen
Status:
In B. Monien and R. Feldmann
(Eds.): EuroPar 2002, Lecture Notes in Computer Science 2400, pp. 912-921,
Springer-Verlag 2002.
Abstract
Matrix partitioning problems that arise in the efficient estimation of
sparse Jacobians and Hessians can be modeled using variants of graph
coloring problems. In previous work, we argued that
distance-2 coloring and distance-3/2 coloring [we now call this
star coloring]
are robust and flexible formulations of the respective matrix
estimation problems. The problem size in large-scale optimization
contexts makes the matrix estimation phase an expensive part of the
entire computation both in terms of execution time and memory
space. Hence, there is a need for both shared- and distributed-memory
parallel algorithms for the stated graph coloring problems. In the
current work, we present the first practical shared address space
parallel algorithms for these problems. The main idea in our
algorithms is to randomly partition the vertex set equally
among the available processors, let each processor
speculatively color its vertices using information about
already colored vertices, detect eventual conflicts in parallel, and
finally re-color conflicting vertices sequentially.
Randomization is also used in the coloring phases
to further reduce conflicts. Our PRAM-analysis shows that the
algorithms should give almost linear speedup for sparse graphs that
are large relative to the number of processors. Experimental results
from our OpenMP implementations on a Cray Origin2000 using various
large graphs show that the algorithms indeed yield reasonable speedup
for modest numbers of processors.
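
The scheme described above (random equal partition, speculative coloring per processor, parallel conflict detection, sequential recoloring) can be simulated for the distance-1 case as follows. This is a hedged sketch, not the authors' OpenMP code: `first_fit` and `block_speculative_coloring` are hypothetical names, processors are simulated in lock step so that vertices colored in the same step cannot see one another, and the distance-2 and star coloring variants treated in the paper are not shown.

```python
import random

def first_fit(v, adj, color):
    """Smallest color not used by an already colored neighbour of v."""
    used = {color[u] for u in adj[v] if u in color}
    c = 0
    while c in used:
        c += 1
    return c

def block_speculative_coloring(adj, p, seed=0):
    """Simulate p processors; returns (color, number_of_conflicts)."""
    verts = list(adj)
    random.Random(seed).shuffle(verts)
    blocks = [verts[i::p] for i in range(p)]  # near-equal random partition
    color = {}
    # Phase 1: step t colors the t-th vertex of every block "simultaneously"
    for t in range(max(len(b) for b in blocks)):
        step = [b[t] for b in blocks if t < len(b)]
        tentative = {v: first_fit(v, adj, color) for v in step}
        color.update(tentative)
    # Phase 2: conflict detection, independent per vertex
    conflicts = [v for v in verts
                 if any(color[u] == color[v] and u < v for u in adj[v])]
    # Phase 3: sequential recoloring of the conflicting vertices
    for v in conflicts:
        del color[v]
    for v in conflicts:
        color[v] = first_fit(v, adj, color)
    return color, len(conflicts)
```

Conflicts can only arise between adjacent vertices colored in the same step, which is why few are expected when the graph is sparse and large relative to `p`.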
Download paper in PDF
Title:
Scalable Parallel Graph Coloring Algorithms
Authors:
A. Gebremedhin and F. Manne
Status:
Concurrency: Practice and Experience,
Vol 12, pp 1131--1146, 2000.
Abstract
Finding a good graph coloring quickly is
often a crucial phase in the development of efficient, parallel
algorithms for many scientific and engineering applications. In this
paper we consider the problem of solving the graph coloring problem
itself in parallel. We present a simple and fast parallel graph
coloring heuristic that is well suited for shared memory programming
and yields an almost linear speedup on the PRAM model. We also
present a second heuristic that improves on the number of colors used.
The heuristics have been implemented using OpenMP. Experiments
conducted on an SGI Cray Origin 2000 supercomputer using very large
graphs from finite element methods and eigenvalue computations
validate the theoretical run-time analysis.
Download paper in PDF
Title:
Graph Coloring on a Coarse Grained Multiprocessor
Authors:
A.H. Gebremedhin, I.G. Lassous, J. Gustedt and J.A. Telle
Status:
In U. Brandes and D. Wagner (Eds.): WG 2000, Lecture Notes in Computer Science 1928, pp.
184-195, 2000, Springer-Verlag.
Abstract
We present the first efficient parallel coloring algorithm for
the Coarse Grain Multicomputer model. The algorithm uses at
most D+1 colors, where D is the maximum degree
in the graph.
Download paper in PDF
Title:
Parallel Graph Coloring Algorithms using
OpenMP
Authors:
A.H. Gebremedhin and F. Manne
Status:
In Proc. of EWOMP'99, First European Workshop on OpenMP,
Sept.30 - Oct. 1, 1999, Lund, Sweden.
Download Extended Abstract in PDF