Home
Assefaw Gebremedhin, Papers on Network Analysis
Title:
Performance Prediction for Graph Queries
Authors:
M.H. Namaki, K. Sasani, Y. Wu and A.H. Gebremedhin
Status: ACM SIGMOD International Conference on Management of Data Workshop on Network Data Analytics (NDA 2017)
Abstract
Query performance prediction has shown benefits to query optimization and
resource allocation for relational databases.
Emerging applications are leading to
search scenarios where workloads with
heterogeneous, structure-less analytical queries
are processed over large-scale graph and network data.
This calls for effective models to
predict the performance of graph analytical
queries, which are often more involved than
their relational counterparts.
In this paper, we study and evaluate
predictive techniques for graph query performance prediction.
We make several contributions.
(1) We propose a general learning framework
that makes use of practical and
computationally efficient statistics from
query scenarios and
employs regression models.
(2) We instantiate the framework with
two routinely issued query classes, namely, reachability and
graph pattern matching, that exhibit
different query complexity. We develop
modeling and learning algorithms for
both query classes.
(3) We show that our prediction models
readily apply to resource-bounded querying, by
providing a learning-based workload optimization strategy.
Given a query workload and a time bound, the models
select queries to be processed with
a maximized query profit and a
total cost within the bound.
Using real-world graphs,
we experimentally demonstrate the efficacy of
our framework in terms of
accuracy and the effectiveness of
workload optimization.
Download paper in PDF
Title:
Parallel Maximum Clique Algorithms with Applications to Network Analysis
Authors:
R.A. Rossi, D.F. Gleich, A.H. Gebremedhin
Status: SIAM Journal on Scientific Computing, Vol 37, Issue 5, pages C589-C618, 2015.
Abstract
We present a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. The method exhibits a roughly linear runtime scaling over real-world networks ranging from a thousand to a hundred million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. At its heart the algorithm employs a branch-and-bound strategy with novel and aggressive pruning techniques.
The pruning techniques include the combined use of core numbers of vertices along with a good initial heuristic solution to remove the vast majority of the search space. In addition, the exploration of the search tree is parallelized. During the search, processes immediately communicate changes to upper and lower bounds on the size of maximum clique. This occasionally results in a super-linear speedup because tasks with large search spaces can be pruned by other processes. We demonstrate the impact of the algorithm on applications using two different network analysis problems: computation of temporal strong components in dynamic networks and determination of compress-friendly ordering of nodes of massive networks.
Download paper in PDF
Title:
Fast Algorithms for the Maximum Clique Problem on Massive Graphs
with Applications to Overlapping Community Detection
Authors:
B. Pattabiraman, M.M.A Patwary, A.H. Gebremedhin, W.K. Liao, A. Choudhary
Status:
Internet Mathematics, Vol 11, No 4-5, pp 421-448, 2015.
Abstract
The maximum clique problem is a well known NP-Hard problem with
applications in data mining, network analysis, information retrieval and many other
areas related to the World Wide Web.
There exist several algorithms for the problem with acceptable runtimes for
certain classes of graphs, but many of them are infeasible for massive graphs.
We present a new exact algorithm that employs novel pruning techniques and
is able to find maximum cliques in very large, sparse graphs quickly.
Extensive experiments on different kinds of synthetic and
real-world graphs show that our new algorithm can be orders of magnitude
faster than existing algorithms.
We also present a heuristic that runs orders of magnitude faster than
the exact algorithm while providing optimal or near-optimal solutions.
We illustrate a simple application of the algorithms in developing methods for
detection of overlapping communities in networks.
Download paper in PDF
Title:
Fast Maximum Clique Algorithms for Large Graphs
Authors:
R.A. Rossi, D.F. Gleich, A.H. Gebremedhin, M.M.A. Patwary
Status: Proceedings of WWW2014.
Abstract
We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. Despite clique's status as an NP-hard problem with poor approximation guarantees, our method exhibits nearly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Key to the efficiency of our algorithm are an initial heuristic procedure that finds a large clique quickly and a parallelized branch and bound strategy with aggressive pruning tnd ordering echniques.
We use the algorithm to compute the largest temporal strong components of temporal contact networks.
Download paper in PDF
Title:
Fast Algorithms for the Maximum Clique Problem on Massive Sparse Graphs
Authors:
B. Pattabiraman, M.M.A Patwary, A.H. Gebremedhin, W.K. Liao, A. Choudhary
Status:
WAW 2013: 10th Workshop on Algorithms and Models for the Web Graph,
LNCS 8305, pp 156-169, 2013.
Abstract
The maximum clique problem is a well known NP-Hard problem with
applications in data mining, network analysis, information retrieval and many other
areas related to the World Wide Web.
There exist several algorithms for the problem with acceptable runtimes for
certain classes of graphs, but many of them are infeasible for massive graphs.
We present a new exact algorithm that employs novel pruning techniques and
is able to quickly find maximum cliques in large sparse graphs.
Extensive experiments on different kinds of synthetic and
real-world graphs show that our new algorithm can be orders of magnitude
faster than existing algorithms.
We also present a heuristic that runs orders of magnitude faster than
the exact algorithm while providing optimal or near-optimal solutions.
Download paper in PDF