Accepted Papers
Subexponential parameterized algorithms for graphs of polynomial growth
We show that for a number of parameterized problems for which only 2^{O(k)} n^{O(1)} time algorithms are known on general graphs, subexponential parameterized algorithms with running time 2^{O(k^{11/(1+d)} log^2 k)} n^{O(1)} are possible for graphs of polynomial growth with growth rate (degree) d, that is, if we assume that every ball of radius r contains only O(r^d) vertices. The algorithms use the technique of lowtreewidth pattern covering, introduced by Fomin et al. [FOCS 2016] for planar graphs; here we show how this strategy can be made to work for graphs of polynomial growth.
Formally, we prove that, given a graph G of polynomial growth with growth rate d and an integer k, one can in randomized polynomial time find a subset A of V(G) such that on one hand the treewidth of G[A] is O(k^{11/(1+d)} log k), and on the other hand for every set X of vertices of size at most k, the probability that X is a subset of A is 2^{O(k^{11/(1+d)} log^2 k)}. Together with standard dynamic programming techniques on graphs of bounded treewidth, this statement gives subexponential parameterized algorithms for a number of subgraph search problems, such as Long Path or Steiner Tree, in graphs of polynomial growth.
We complement the algorithm with an almost tight lower bound for Long Path: unless the Exponential Time Hypothesis fails, no parameterized algorithm with running time 2^{k^{11/depsilon}}n^{O(1)} is possible for any positive epsilon and any integer d >= 3.
Improved Bounds for 3SUM, $k$SUM, and Linear Degeneracy
In this paper, we show that the randomized 4linear decision tree complexity of 3SUM is O(n^{3/2}), and that the randomized (2k2)linear decision tree complexity of kSUM and kLDT is O(n^{k/2}), for any odd >= 3. These bounds improve (albeit randomized) the corresponding O(n^{3/2} sqrt{log n}) and O(n^{k/2} sqrt{log n}) decision tree bounds obtained by Gr{\o}nlund and Pettie. Our technique includes a specialized randomized variant of fractional cascading data structure. Additionally, we give another deterministic algorithm for 3SUM that runs in O(n^2 log log n / log n ) time. The latter bound matches a recent independent bound by Freund [Algorithmica 2017], but our algorithm is somewhat simpler, due to a better use of the wordRAM model.
Niv Online Algorithms for Maximum Cardinality Matching with Edge Arrivals
In the adversarial edge arrival model for maximum cardinality matching, edges of an unknown graph are revealed onebyone in arbitrary order, and should be irrevocably accepted or rejected. Here, the goal of an online algorithm is to maximize the number of accepted edges while maintaining a feasible matching at any point in time. For this model, the standard greedy heuristic is 1/2competitive, and on the other hand, no algorithm that outperforms this ratio is currently known, even for very simple graphs.
We present a clean MinIndex framework for devising a family of randomized algorithms, and provide a number of positive and negative results in this context. Among these results, we present a 5/9competitive algorithm when the underlying graph is a forest, and prove that this ratio is best possible within the MinIndex framework. In addition, we prove a new general upper bound of 2/(3+1/phi^2) ~ 0.5914 on the competitiveness of any algorithm in the edge arrival model. Interestingly, this bound holds even for an easier model in which vertices (along with their adjacent edges) arrive online, and when the underlying graph is a tree of maximum degree at most 3.
Online bin packing with cardinality constraints resolved
Cardinality constrained bin packing or bin packing with cardinality constraints is a basic bin packing problem. In the online version with the parameter k >= 2, items having sizes in (0,1] associated with them are presented one by one to be packed into unit capacity bins, such that the capacities of bins are not exceeded, and no bin receives more than k items. We resolve the online problem in the sense that we prove a lower bound of 2 on the overall asymptotic competitive ratio. This closes the long standing open problem of finding the value of the best possible overall asymptotic competitive ratio, since an algorithm of an absolute competitive ratio 2 for any fixed value of k is known. Additionally, we significantly improve the known lower bounds on the asymptotic competitive ratio for every specific value of k. The novelty of our constructions is based on full adaptivity that creates large gaps between item sizes. Thus, our lower bound inputs do not follow the common practice for online bin packing problems of having a known in advance input consisting of batches for which the algorithm needs to be competitive on every prefix of the input. Last, we show a lower bound strictly larger than 2 on the asymptotic competitive ratio of the online 2dimensional vector packing problem, and thus provide for the first time a lower bound larger than 2 on the asymptotic competitive ratio for the vector packing problem in any fixed dimension
Improved Algorithm for Dynamic bMatching
Recently there has been extensive work on maintaining (approximate) maximum matchings in dynamic graphs. We consider a generalisation of this problem known as the maximum bmatching: Every node v has a positive integral capacity b_v, and the goal is to maintain an (approximate) maximumcardinality subset of edges that contains at most b_v edges incident on every node v. The maximum matching problem is a special case of this problem where b_v = 1 for every node v.
Bhattacharya, Henzinger and Italiano [ICALP 2015] showed how to maintain a O(1) approximate maximum bmatching in a graph in O(log^3 n) amortised update time. Their approximation ratio was a large (double digit) constant. We significantly improve their result both in terms of approximation ratio as well as update time. Specifically, we design a randomised dynamic algorithm that maintains a (2+epsilon)approximate maximum $b$matching in expected amortised O(1/epsilon^4) update time. Thus, for every constant epsilon in (0, 1), we get expected amortised O(1) update time. Our algorithm generalises the framework of Baswana, Gupta, Sen [FOCS 2011] and Solomon [FOCS 2016] for maintaining a maximal matching in a dynamic graph.
Tight lower bounds for the complexity of multicoloring
graph G and a set of a colors, and the task is to assign a subset of b colors to each vertex of G
so that adjacent vertices receive disjoint color subsets. This natural generalization of the classic
coloring problem (the b = 1 case) is equivalent to finding a homomorphism to the Kneser graph
KGa,b, and gives relaxations approaching the fractional chromatic number.
We study the complexity of determining whether a graph has an (a:b)coloring. Our main
result is that this problem does not admit an algorithm with running time f(b)·2o(log b)·n, for any
computable f(b), unless the Exponential Time Hypothesis (ETH) fails. A (b+1)n · poly(n)time
algorithm due to Nederlof [2008] shows that this is tight. A direct corollary of our result is that
the graph homomorphism problem does not admit a 2O(n+h) algorithm unless ETH fails, even if
the target graph is required to be a Kneser graph. This refines the understanding given by the
recent lower bound of Cygan et al. [SODA 2016].
The crucial ingredient in our hardness reduction is the usage of detecting matrices of Lindström
[Canad. Math. Bull., 1965], which is a combinatorial tool that, to the best of our knowledge,
has not yet been used for proving complexity lower bounds. As a side result, we prove that the
running time of the algorithms of Abasi et al. [MFCS 2014] and of Gabizon et al. [ESA 2015] for
the rmonomial detection problem are optimal under ETH.
Maxentstress Optimization of 3D Biomolecular Models
Knowing a biomolecule’s structure is inherently linked to and a prerequisite for any detailed understanding of its function. Significant effort has gone into developing technologies for structural characterization. These technologies do not directly provide 3D structures; instead they typically yield noisy and erroneous distance information between specific entities such as atoms or residues, which have to be translated into consistent 3D models.
Here we present an approach for this translation process based on maxentstress optimization. Our new approach extends the original graph drawing method for the new application’s specifics by introducing additional constraints and confidence values as well as algorithmic components. Extensive experiments demonstrate that our approach infers structural models (i.e., sensible 3D coordinates for the molecule’s atoms) that correspond well to the distance information, can handle noisy and errorprone data, and is considerably faster than established tools. Our results promise to allow domain scientists nearlyinteractive structural modeling based on distance constraints.
Local Search Algorithms for the Maximum Carpool Matching Problem
The Maximum Carpool Matching problem is a star packing problem in directed graphs. Formally, given a directed graph G = (V, A), a capacity function c: V > N, and a weight function w: A > R^+, a carpool matching is a subset of arcs, M subseteq A, such that every v in V satisfies: (i) d^{in}M(v) cdot d^{out}_M(v) = 0, (ii) d^{in}_M(v) <= c(v), and (iii) d^{out}_M(v) <= 1. A vertex v for which d^{out}_M(v) = 1 is a passenger, and a vertex for which d^{out}_M(v) = 0 is a driver who has d^{in}_M(v) passengers. In the Maximum Carpool Matching problem the goal is to find a carpool matching M of maximum total weight. The problem arises when designing an online carpool service, such as Zimride, which tries to connect between users based on a similarity function. The problem is known to be NPhard, even in the unweighted and uncapacitated case. The Maximum Group Carpool Matching problem, is an extension of Maximum Carpool Matching where each vertex represents an unsplittable group of passengers. Formally, each vertex u in V has a size s(u) in N, and the constraint d^{in}_M(v) <= c(v) is replaced with sum{u:(u,v) in M} s(u) <= c(v).
We show that Maximum Carpool Matching can be formulated as an unconstrained submodular maximization problem, thus it admits a 1/2approximation algorithm. We show that the same formulation does not work for Maximum Group Carpool Matching, nevertheless, we present a local search (1/2 – epsilon)approximation algorithm for Maximum Group Carpool Matching. For the unweighted variant of both problems when the maximum possible capacity, c_{max}, is bounded by a constant, we provide a local search (1/2 + 1/{2c_{max}} – epsilon)approximation algorithm. We also show that the problem is APXhard, even if the maximum degree and c_{max} are at most
Cache Oblivious Algorithms for Computing the Triplet Distance between Trees
We study the problem of computing the triplet distance between two rooted unordered trees with n labeled leafs. Introduced by Dobson 1975, the triplet distance is the number of leaf triples that induce different topologies in the two trees. The current theoretically best algorithm is an O(nlogn) time algorithm by Brodal et al. [SODA 2013]. Recently Jansson et al. proposed a new algorithm that, while slower in theory, requiring O(n log^3 n) time, in practice it outperforms the theoretically faster O(n log n) algorithm. Both algorithms do not scale to external memory.
We present two cache oblivious algorithms that combine the best of both worlds. The first algorithm is for the case when the two input trees are binary trees and the second a generalized algorithm for two input trees of arbitrary degree. Analyzed in the RAM model, both algorithms require O(n log n) time, and in the cache oblivious model O(n/B log_{2}(n/M)) I/Os. Their relative simplicity and the fact that they scale to external memory makes them achieve the best practical performance. We note that these are the first algorithms that scale to external memory, both in theory and practice, for this problem.
KDominance in Multidimensional Data: Theory and Applications
We study the problem of kdominance in a set of ddimensional vectors, prove bounds on the number of maxima (skyline vectors), under both worstcase and averagecase models, perform experimental evaluation using synthetic and realworld data, and explore an application of kdominant skyline for extracting a small set of topranked vectors in high dimensions where the full skylines can be unmanageably large.
PrizeCollecting TSP with a Budget Constraint
We consider constrained versions of the prizecollecting traveling salesman and the minimum spanning tree problems. The goal is to maximize the number of vertices in the returned tour/tree subject to a bound on the tour/tree cost. We present a 2approximation algorithm for these problems based on a primaldual approach. The algorithm relies on finding a threshold value for the dual variable corresponding to the budget constraint in the primal and then carefully constructing a tour/tree that is just within budget. Thereby, we improve the bestknown guarantees from 3+epsilon and 2+epsilon for the tree and the tour version, respectively. Our analysis extends to the setting with weighted vertices, in which we want to maximize the total weight of vertices in the tour/tree subject to the same budget constraint.
Streaming Algorithms for Matching Size Estimation in Sparse Graphs
Estimating the size of the maximum matching is a canonical problem in graph analysis, and one that has attracted extensive study over a range of different computational models. We present improved streaming algorithms for approximating the size of maximum matching with sparse (bounded arboricity) graphs.
 (InsertOnly Streams) We present a onepass algorithm that takes O(alpha log n) space and approximates the size of the maximum matching in graphs with arboricity alpha within a factor of O(alpha). This improves significantly upon the stateoftheart tilde{O}(alpha n^{2/3})space streaming algorithms, and is the first polylogarithmic space algorithm for this problem.

(Dynamic Streams) Given a dynamic graph stream (i.e., inserts and deletes) of edges of an underlying alphabounded arboricity graph, we present an onepass algorithm that uses space tilde{O}(alpha^{10/3}n^{2/3}) and returns an O(alpha)estimator for the size of the maximum matching on the condition that the number edge deletions in the stream is bounded by O(alpha n). For this class of inputs, our algorithm improves the stateoftheart tilde{O}(\alpha n^{4/5})space algorithms, where the \tilde{O}(.) notation hides logarithmic in n dependencies.
In contrast to prior work, our results take more advantage of the streaming access to the input and characterize the matching size based on the ordering of the edges in the stream in addition to the degree distributions and structural properties of the sparse graphs.
Positiveinstance driven dynamic programming for treewidth
Consider a dynamic programming scheme for a decision problem in which all subproblems involved are also decision problems. An implementation of such a scheme is positiveinstance driven (PID), if it generates positive subproblem instances, but not negative ones, building each on smaller positive instances. We take the dynamic programming scheme due to Bouchitté and Todinca for treewidth computation, which is based on minimal separators and potential maximal cliques, and design a variant (for the decision version of the problem) with a natural PID implementation. The resulting algorithm performs extremely well: it solves a number of standard benchmark instances for which the optimal solutions have not previously been known. Incorporating a new heuristic algorithm for detecting safe separators, it also solves all of the 100 public instances posed by the exact treewidth track in PACE 2017, a competition on algorithm implementation. We describe the algorithm and prove its correctness. We also perform an experimental analysis counting combinatorial structures involved, which gives insights into the advantage of our approach over more conventional approaches and points to the future direction of theoretical and engineering research on treewidth computation.
Profit Sharing and Efficiency in Utility Games
We study utility games (Vetta, FOCS 2002) where a set of players join teams to produce social utility, and receive individual utility in the form of payments in return. These games have many natural applications in competitive settings such as labor markets, crowdsourcing, etc. The efficiency of such a game depends on the profit sharing mechanism – the rule that maps utility produced by the players to their individual payments. We study three natural and widely used profit sharing mechanisms – egalitarian or equal sharing, marginal gain or value addition when a player joins, and marginal loss or value depletion when a player leaves. For these settings, we give tight bounds on the price of anarchy, thereby allowing comparison between these popular mechanisms from a (worst case) social welfare perspective.
Online Submodular Maximization Problem with Vector Packing Constraint
We consider the online vector packing problem in which we have a d dimensional knapsack and items u with weight vectors w_u in R_+^d arrive online in an arbitrary order. Upon the arrival of an item, the algorithm must decide immediately whether to discard or accept the item into the knapsack. When item u is accepted, w_u(i) units of capacity on dimension i will be taken up, for each i in [d]. To satisfy the knapsack constraint, an accepted item can be later disposed of with no cost, but discarded or disposed of items cannot be recovered. The objective is to maximize the utility of the accepted items S at the end of the algorithm, which is given by f(S) for some nonnegative monotone submodular function f.
For any small constant epsilon > 0, we consider the special case that the weight of an item on every dimension is at most a (1 epsilon) fraction of the total capacity, and give a polynomialtime deterministic O(k / epsilon^2)competitive algorithm for the problem, where k is the (column) sparsity of the weight vectors. We also show several (almost) tight hardness results even when the algorithm is computationally unbounded. We first show that under the epsilonslack assumption, no deterministic algorithm can obtain any o(k) competitive ratio, and no randomized algorithm can obtain any o(k / log k) competitive ratio. We then show that for the general case (when epsilon = 0), no randomized algorithm can obtain any o(k) competitive ratio.
In contrast to the (1+delta) competitive ratio achieved in Kesselheim et al. [STOC 2014] for the problem with random arrival order of items and under large capacity assumption, we show that in the arbitrary arrival order case, even when w_u_infinity is arbitrarily small for all items u, it is impossible to achieve any o(log k / log log k) competitive ratio.
Output sensitive algorithms for approximate incidences and their applications
An epsilonapproximate incidence between a point and some geometric object (line, circle, plane, sphere) occurs when the point and the object lie at distance at most epsilon from each other. Given a set of points and a set of objects, computing the approximate incidences between them is a major step in many database and webbased applications in computer vision and graphics, including robust model fitting, approximate point pattern matching, and estimating the fundamental matrix in epipolar (stereo) geometry.
In a typical approximate incidence problem of this sort, we are given a set P of m points in two or three dimensions, a set S of n objects (lines, circles, planes, spheres), and an error parameter epsilon>0, and our goal is to report all pairs (p,s) in P times S that lie at distance at most epsilon from one another. We present efficient outputsensitive approximation algorithms for quite a few cases, including points and lines or circles in the plane, and points and planes, spheres, lines, or circles in three dimensions. Several of these cases arise in the applications mentioned above. Our algorithms report all pairs at distance <= epsilon, but may also report additional pairs, all of which are guaranteed to be at distance at most alphaepsilon, for some constant alpha>1. Our algorithms are based on simple primal and dual grid decompositions and are easy to implement. We note though that (a) the use of duality, which leads to significant improvements in the overhead cost of the algorithms, appears to be novel for this kind of problems; (b) the correct choice of duality in some of these problems is fairly intricate and requires some care; and (c) the correctness and performance analysis of the algorithms (especially in the more advanced versions) is fairly nontrivial.
We analyze our algorithms and prove guaranteed upper bounds on their running time and on the “distortion” parameter alpha. We also briefly describe the motivating applications, and show how they can effectively exploit our solutions. The superior theoretical bounds on the performance of our algorithms, and their simplicity, make them indeed ideal tools for these applications. In a series of preliminary experimentations (not included in this abstract), we substantiate this feeling, and show that our algorithms lead in practice to significant improved performance of the aforementioned applications.
Faster Approximate Diameter and Distance Oracles in Planar Graphs
We present an algorithm that computes a (1+varepsilon)approximation of the diameter of a weighted, undirected planar graph of n vertices with nonnegative edge lengths in O(nlog n(log n + (1/varepsilon)^5)) expected time, improving upon the O(n((1/varepsilon)^4 log^4(n) + 2^{O(1/varepsilon)}))time algorithm of Weimann and Yuster [ICALP 2013]. Our algorithm makes two improvements over that result: first and foremost, it replaces the exponential dependency on 1/varepsilon with a polynomial one, by adapting and specializing Cabello’s recent abstractVoronoidiagrambased technique [SODA 2017] for approximation purposes; second, it shaves off two logarithmic factors by choosing a better sequence of error parameters during recursion.
Moreover, using similar techniques, we improve the (1+varepsilon)approximate distance oracle of Gu and Xu [ISAAC 2015] by first replacing the exponential dependency on 1/varepsilon on the preprocessing time and space with a polynomial one and second removing a logarithmic factor from the preprocessing time.
Improving TSP tours using dynamic programming over tree decompositions
Given a traveling salesman problem (TSP) tour H in graph G, a kmove is an operation which removes k edges from H, and adds k edges of G so that a new tour H’ is formed. The popular kopt heuristic for TSP finds a local optimum by starting from an arbitrary tour H and then improving it by a sequence of kmoves.
Until 2016, the only known algorithm to find an improving kmove for a given tour was the naive solution in time O(n^k). At ICALP’16 de Berg, Buchin, Jansen and Woeginger showed an O(n^{floor(2/3k)+1})time algorithm.
We show an algorithm which runs in O(n^{(1/4 + epsilon_k)k}) time, where lim_{k > infinity} epsilon_k = 0. It improves over the state of the art for every k >= 5. For the most practically relevant case k=5 we provide a slightly refined algorithm running in O(n^{3.4}) time. We also show that for the k=4 case, improving over the O(n^3)time algorithm of de Berg et al. would be a major breakthrough: an O(n^{3 – epsilon})time algorithm for any epsilon > 0 would imply an O(n^{3 – delta})time algorithm for the All Pairs Shortest Paths problem, for some delta>0.
An Encoding for OrderPreserving Matching
Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of orderpreserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an orderpreserving match if the relative order of their characters is the same: e.g., (4, 1, 3, 2) and (10, 3, 7, 5) are an orderpreserving match. We show how, given a string S[1..n] over an arbitrary alphabet of size sigma and a constant c >=1, we can build an O(n log log n)bit encoding such that later, given a pattern P[1..m] with m >= log^c n, we can return the number of orderpreserving occurrences of P in S in O(m) time. Within the same time bound we can also return the starting position of some orderpreserving match for P in S (if such a match exists). We prove that our space bound is within a constant factor of optimal if log(sigma) = Omega(log log n); our query time is optimal if log(sigma) = Omega(log n). Our space bound contrasts with the Omega(n log n) bits needed in the worst case to store S itself, an index for orderpreserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to O(log^c n) neighbouring characters.
Temporal Clustering
We study the problem of clustering sequences of unlabeled point sets taken from a common metric space. Such scenarios arise naturally in applications where a system or process is observed in distinct time intervals, such as biological surveys and contagious disease surveillance. In this more general setting existing algorithms for classical (i.e. static) clustering problems are not applicable anymore.
We propose a set of optimization problems which we collectively refer to as temporal clustering. The quality of a solution to a temporal clustering instance can be quantified using three parameters: the number of clusters k, the spatial clustering cost r, and the maximum cluster displacement delta between consecutive time steps. We consider spatial clustering costs which generalize the wellstudied kcenter, discrete kmedian, and discrete kmeans objectives of classical clustering problems. We develop new algorithms that achieve tradeoffs between the three objectives k, r, and delta. Our upper bounds are complemented by inapproximability results.
On the Tree Augmentation Problem
In the Tree Augmentation problem we are given a tree T=(V,F) and a set E of edges with positive integer costs {c_e:e in E}. The goal is to augment T by a minimum cost edge set J subseteq E such that T cup J is 2edgeconnected. We obtain the following results.
Recently, Adjiashvili [SODA 17] introduced a novel LP for the problem and used it to break the 2approximation barrier for instances when the maximum cost M of an edge in E is bounded by a constant; his algorithm computes a 1.96418+epsilon approximate solution in time n^{{(M/epsilon^2)}^{O(1)}}. Using a simpler LP, we achieve ratio 12/7+epsilon in time ^{O(M/epsilon^2)}. This also gives ratio better than 2 for logarithmic costs, and not only for constant costs. In addition, we will show that (for arbitrary costs) the problem admits ratio 3/2 for trees of diameter <= 7.
One of the oldest open questions for the problem is whether for unit costs (when M=1) the standard LPrelaxation, so called CutLP, has integrality gap less than 2. We resolve this open question by proving that for unit costs the integrality gap of the CutLP is at most 28/15=22/15. In addition, we will suggest another natural LPrelaxation that is much simpler than the ones in previous work, and prove that it has integrality gap at most 7/4.
Shortest Paths in the Plane with Obstacle Violations
We study the problem of finding shortest paths in the plane among h convex obstacles, where the path is allowed to pass through (violate) up to k obstacles, for k <= h. Equivalently, the problem is to find shortest paths that become obstaclefree if k obstacles are removed from the input. Given a fixed source point s, we show how to construct a map, called a shortest kpath map, so that all destinations in the same region of the map have the same combinatorial shortest path passing through at most k obstacles. We prove a tight bound of Theta(kn) on the size of this map, and show that it can be computed in O(k^2 n log n) time, where n is the total number of obstacle vertices.
The Directed Disjoint Shortest Paths Problem
In the k disjoint shortest paths problem (kDSPP), we are given a graph and its vertex pairs (s_1, t_1), … , (s_k, t_k), and the objective is to find k pairwise disjoint paths P_1, … , P_k such that each path P_i is a shortest path from s_i to t_i, if they exist. If the length of each edge is equal to zero, then this problem amounts to the disjoint paths problem, which is one of the wellstudied problems in algorithmic graph theory and combinatorial optimization. EilamTzoreff (1998) focused on the case when the length of each edge is positive, and showed that the undirected version of 2DSPP can be solved in polynomial time. Polynomial solvability of the directed version was posed as an open problem by EilamTzoreff (1998). In this paper, we solve this problem affirmatively, that is, we give a first polynomial time algorithm for the directed version of 2DSPP when the length of each edge is positive. Note that the 2 disjoint paths problem in digraphs is NPhard, which implies that the directed 2DSPP is NPhard if the length of each edge can be zero. We extend our result to the case when the instance has two terminal pairs and the number of paths is a fixed constant greater than two. We also show that the undirected kDSPP and the vertexdisjoint version of the directed kDSPP can be solved in polynomial time if the input graph is planar and k is a fixed constant.
Tight Bounds for Online Coloring of Basic Graph Classes
We resolve a number of longstanding open problems in online graph coloring. More specifically, we develop tight lower bounds on the performance of online algorithms for fundamental graph classes. An important contribution is that our bounds also hold for randomized online algorithms, for which hardly any results were known. Technically, we construct lower bounds for chordal graphs. The constructions then allow us to derive results on the performance of randomized online algorithms for the following further graph classes: trees, planar, bipartite, inductive, boundedtreewidth and disk graphs. It shows that the best competitive ratio of both deterministic and randomized online algorithms is Theta(log n), where n is the number of vertices of a graph. Furthermore, we prove that this guarantee cannot be improved if an online algorithm has a lookahead of size O(n/log n) or access to a reordering buffer of size n^(1epsilon), for any 0 < epsilon <= 1. A consequence of our results is that, for all of the above mentioned graph classes except bipartite graphs, the natural First Fit coloring algorithm achieves an optimal performance, up to constant factors, among deterministic and randomized
online algorithms.
Combinatorics of Local Search: An Optimal 4Local Hall’s Theorem for Planar Graphs
Local search for combinatorial optimization problems is becoming a dominant algorithmic paradigm, with several papers using it to resolve longstanding open problems. In this paper, we prove the following `4local’ version of Hall’s theorem for planar graphs: given a bipartite planar graph G = (B, R, E) such that N(B’) >= B’ for all B’ <= 4, there exists a matching of size at least B/4 in G; furthermore this bound is tight. Besides immediately implying improved bounds for several problems studied in previous papers, we find this variant of Hall’s theorem to be of independent interest in graph theory.
A LinearTime Parameterized Algorithm for Node Unique Label Cover
The optimization version of the Unique Label Cover problem is at the heart of the Unique Games Conjecture which has played an important role in the proof of several tight inapproximability results. In recent years, this problem has been also studied extensively from the point of view of parameterized complexity. Chitnis et al. [FOCS 2012, SICOMP 2016] proved that this problem is fixedparameter tractable (FPT) and Wahlström [SODA 2014] gave an FPT algorithm with an improved parameter dependence. Subsequently, Iwata, Wahlström and Yoshida [SICOMP 2016] proved that the edge version of Unique Label Cover can be solved in linear FPTtime, and they left open the existence of such an algorithm for the node version of the problem. In this paper, we resolve this question by presenting the first lineartime FPT algorithm for Node Unique Label Cover.
Exploring the tractability of the capped hose model
Robust network design concerns the design of networks to support uncertain or varying traffic
patterns. An especially important case is the VPN problem, where the total traffic emanating
from any node is bounded, but there are no further constraints on the traffic pattern. Recently,
Fréchette et al. [10] studied a generalization of the VPN problem where in addition to these socalled
hose constraints, there are individual upper bounds on the demands between pairs of nodes.
They motivate their model, give some theoretical results, and propose a heuristic algorithm that
performs well on realworld instances.
Our theoretical understanding of this model is limited; it is APXhard in general, but tractable
when either the hose constraints or the individual demand bounds are redundant. In this work,
we uncover further tractable cases of this model; our main result concerns the case where each
terminal needs to communicate only with two others. Our algorithms all involve optimally
embedding a certain auxiliary graph into the network, and have a connection
Randomized Contractions for Multiobjective Minimum Cuts
We show that Karger’s randomized contraction method (SODA 93) can be adapted to multiobjective global minimum cut problems with a constant number of edge or node budget constraints to give efficient algorithms.
For global minimum cuts with a single edgebudget constraint, our extension of the randomized contraction method has running time tilde{O}(n^3) in an nnode graph improving upon the bestknown randomized algorithm with running time tilde{O}(n^4) due to Armon and Zwick (Algorithmica 2006). Our analysis also gives a new upper bound of O(n^3) for the number of optimal solutions for a single edgebudget min cut problem. For the case of (k1) edgebudget constraints, the extension of our algorithm saves a logarithmic factor from the bestknown randomized running time of O(n^{2k} log^3 n). A main feature of our algorithms is to adaptively choose, at each step, the appropriate cost function used in the random selection of edges to be contracted.
For the global min cut problem with a constant number of node budgets, we give a randomized algorithm with running time tilde{O}(n^2), improving the current best determinisitic running time of O(n^3) due to Goemans and Soto (SIAM Journal on Discrete Mathematics 2013). Our method also shows that the total number of distinct optimal solutions is bounded by O(n^2) as in the case of global mincuts. Our algorithm extends to the nodebudget constrained global min cut problem excluding a given sink with the same running time and bound on number of optimal solutions, again improving upon the bestknown running time by a factor of O(n). For nodebudget constrained problems, our improvements arise from incorporating the idea of merging any infeasible supernodes that arise during the random contraction process.
In contrast to cuts excluding a sink, we note that the nodecardinality constrained mincut problem containing a given source is strongly NPhard using a reduction from graph bisection.
Stability and Recovery for Independence Systems
Two genres of heuristics that are frequently reported to perform much better on “realworld” instances than in the worst case are greedy algorithms and local search algorithms. In this paper, we systematically study these two types of algorithms for the problem of maximizing a monotone submodular set function subject to downwardclosed feasibility constraints. We consider perturbationstable instances, in the sense of Bilu and Linial [11], and precisely identify the stability threshold beyond which these algorithms are guaranteed to recover the optimal solution. Byproducts of our work include the first definition of perturbationstability for nonadditive objective functions, and a resolution of the worstcase approximation guarantee of local search in pextendible systems.
Dispersion on Trees
In the kdispersion problem, we need to select k nodes of a given graph so as to maximize the minimum distance between any two chosen nodes. This can be seen as a generalization of the independent set problem, where the goal is to select nodes so that the minimum distance is larger than 1. We design an optimal O(n) time algorithm for the dispersion problem on trees consisting of n nodes, thus improving the previous O(n log n) time solution from 1997.
We also consider the weighted case, where the goal is to choose a set of nodes of total weight at least W. We present an O(n log^2n) algorithm improving the previous O(n log^4 n) solution. Our solution builds on the search version (where we know the minimum distance lambda between the chosen nodes) for which we present tight Theta(n log n) upper and lower bounds.
Minimizing Maximum Flow Time on Related Machines via Dynamic Posted Pricing
We consider a setting where selfish agents want to schedule jobs on related machines. The agent submitting a job picks a server that minimizes a linear combination of the server price and the resulting response time for that job on the selected server. The manager’s task is to maintain server prices to (approximately) optimize the maximum response time, which is a measure of social good. We show that the existence of a pricing scheme with certain competitiveness is equivalent to the existence of a monotone immediatedispatch algorithm. Our main result is a monotone immediatedispatch algorithm that is O(1)competitive with respect to the maximum response time.
On minimizing the makespan when some jobs cannot be assigned on the same machine
We study the classical scheduling problem of assigning jobs to machines in order to minimize the makespan. It is wellstudied and admits an EPTAS on identical machines and a (21/m)approximation algorithm on unrelated machines. In this paper we study a variation in which the input jobs are partitioned into bags and no two jobs from the same bag are allowed to be assigned on the same machine. Such a constraint can easily arise, e.g., due to system stability and redundancy considerations. Unfortunately, as we demonstrate in this paper, the techniques of the above results break down in the presence of these additional constraints.
Our first result is a PTAS for the case of identical machines. It enhances the methods from the known (E)PTASs by a finer classification of the input jobs and careful argumentations why a good schedule exists after enumerating over the large jobs. For unrelated machines, we prove that there can be no (log n)^{1/4epsilon}approximation algorithm for the problem for any epsilon > 0, assuming that NP nsubseteq ZPTIME(2^{(log n)^{O(1)}}). This holds even in the restricted assignment setting. However, we identify a special case of the latter in which we can do better: if the same set of machines we give an 8approximation algorithm. It is based on rounding the LPrelaxation of the problem in phases and adjusting the residual fractional solution after each phase to order to respect the bag constraints.
Independent Range Sampling, Revisited
In the independent range sampling (IRS) problem, given an input set P of n points in R^d, the task is to build a data structure, such that given a range R and an integer t >= 1, it returns t points that are uniformly and independently drawn from P cap R. The samples must satisfy interquery independence, that is, the samples returned by every query must be independent of the samples returned by all the previous queries. This problem was first tackled by Hu, Qiao and Tao in 2014, who proposed optimal structures for onedimensional dynamic IRS problem in internal memory and onedimensional static IRS problem in external memory.
In this paper, we study two natural extensions of the independent range sampling problem. In the first extension, we consider the static IRS problem in two and three dimensions in internal memory. We obtain data structures with optimal spacequery tradeoffs for 3D halfspace, 3D dominance, and 2D threesided queries. The second extension considers weighted IRS problem. Each point is associated with a realvalued weight, and given a query range R, a sample is drawn independently such that each point in P cap R is selected with probability proportional to its weight. Walker’s alias method is a classic solution to this problem when no query range is specified. We obtain optimal data structure for one dimensional weighted range sampling problem, thereby extending the alias method to allow range queries.
Finding axisparallel rectangles of fixed perimeter or area containing the largest number of points
Let P be a set of n points in the plane in general position, and consider the problem of finding an axisparallel rectangle with a given perimeter, or area, or diagonal, that encloses the maximum number of points of P. We present an exact algorithm that finds such a rectangle in O(n^{5/2} log n) time, and, for the case of a fixed perimeter or diagonal, we also obtain (i) an improved exact algorithm that runs in O(nk^{3/2} log k) time, and (ii) an approximation algorithm that finds, in O(n+(n/(k epsilon^5))*log^{5/2}(n/k)log((1/epsilon) log(n/k))) time, a rectangle of the given perimeter or diagonal that contains at least (1epsilon)k points of P, where k is the optimum value.
We then show how to turn this algorithm into one that finds, for a given k, an axisparallel rectangle of smallest perimeter (or area, or diagonal) that contains k points of P. We obtain the first subcubic algorithms for these problems, significantly improving the current state of the art.
Clustering in Hypergraphs to Minimize Average Edge Service Time
We study the problem of clustering the vertices of a weighted hypergraph such that on average the vertices of each edge can be covered by a small number of clusters. This problem has many applications such as for designing medical tests, clustering files on disk servers, and placing network services on servers. The edges of the hypergraph model groups of items that are likely to be needed together, and the optimization criteria which we use can be interpreted as the average delay (or cost) to serve the items of a typical edge. We describe and analyze algorithms for this problem for the case in which the clusters have to be disjoint and for the case where clusters can overlap. The analysis is often subtle and reveals interesting structure and invariants that one can utilize.
Approximate Nearest Neighbor Search Amid HigherDimensional Flats
We consider the Approximate Nearest Neighbor (ANN) problem where the input set consists of n kflats in the Euclidean Rd, for any fixed parameters k<d, and=”” where,=”” for=”” each=”” query=”” point=”” q,=”” we=”” want=”” to=”” return=”” an=”” input=”” flat=”” whose=”” distance=”” from=”” q=”” is=”” at=”” most=”” (1=”” +=”” epsilon)=”” times=”” the=”” shortest=”” such=”” distance,=”” where=”” epsilon=””> 0 is another prespecified parameter. We present an algorithm that achieves this task with n^{k+1}(log(n)/epsilon)^O(1) storage and preprocessing (where the constant of proportionality in the bigO notation depends on d), and can answer a query in O(polylog(n)) time (where the power of the logarithm depends on d and k). In particular, we need only nearquadratic storage to answer ANN queries amidst a set of n lines in any fixeddimensional Euclidean space. As a byproduct, our approach also yields an algorithm, with similar performance bounds, for answering exact nearest neighbor queries amidst kflats with respect to any polyhedral distance function. Our results are more general, in that they also
provide a tradeoff between storage and query time.</d,>
Optimal Stopping Rules for Sequential Hypothesis Testing
Suppose that we are given sample access to an unknown distribution p over n elements and an explicit distribution q over the same n elements. We would like to reject the null hypothesis “p=q” after seeing as few samples as possible, when p =/= q, while we never want to reject the null, when p=q. Wellknown results show that Theta(sqrt{n}/epsilon^2) samples are necessary and sufficient for distinguishing whether p equals q versus p is epsilonfar from q in total variation distance. However, this requires the distinguishing radius epsilon to be fixed prior to deciding how many samples to request. Our goal is instead to design sequential hypothesis testers, i.e. online algorithms that request i.i.d. samples from p and stop as soon as they can confidently reject the hypothesis p=q, without being given a lower bound on the distance between p and q, when p =/= q. In particular, we want to minimize the number of samples requested by our tests as a function of the distance between p and q, and if p=q we want the algorithm, with high probability, to never reject the null. Our work is motivated by and addresses the practical challenge of sequential A/B testing in Statistics.
We show that, when n=2, any sequential hypothesis test must see Omega(1/{d_{tv}(p,q)^2} log log 1/{d_{tv}(p,q)}) samples, with high (constant) probability, before it rejects p=q, where d_{tv}(p,q) is the – unknown to the tester – total variation distance between p and q. We match the dependence of this lower bound on d_{tv}(p,q) by proposing a sequential tester that rejects p=q from at most O({\sqrt{n}}/{d_{tv}(p,q)^2}log log 1/{d_{tv}(p,q)}) samples with high (constant) probability. The Omega(sqrt{n}) dependence on the support size n is also known to be necessary.
We similarly provide twosample sequential hypothesis testers, when sample access is given to both p and q, and discuss applications to sequential A/B testing.
Sampling Geometric Inhomogeneous Random Graphs in Linear Time
Realworld networks, like social networks or the internet infrastructure, have structural properties such as large clustering coefficients that can best be described in terms of an underlying geometry. This is why the focus of the literature on theoretical models for realworld networks shifted from classic models without geometry, such as ChungLu random graphs, to modern geometrybased models, such as hyperbolic random graphs.
With this paper we contribute to the theoretical analysis of these modern, more realistic random graph models. Instead of studying directly hyperbolic random graphs, we introduce a generalization that we call geometric inhomogeneous random graphs (GIRGs). Since we ignore constant factors in the edge probabilities, GIRGs are technically simpler (specifically, we avoid hyperbolic cosines), while preserving the qualitative behaviour of hyperbolic random graphs, and we suggest to replace hyperbolic random graphs by this new model in future theoretical studies.
We prove the following fundamental structural and algorithmic results on GIRGs. (1) As our main contribution we provide a sampling algorithm that generates a random graph from our model in expected linear time, improving the bestknown sampling algorithm for hyperbolic random graphs by a substantial factor O(n^0.5). (2) We establish that GIRGs have clustering coefficients in Omega(1), (3) we prove that GIRGs have small separators, i.e., it suffices to delete a sublinear number of edges to break the giant component into two large pieces, and (4) we show how to compress GIRGs using an expected linear number of bits.
Pathontractions, edge deletions and connectivity preservation
We study several problems related to graph modification problems under connectivity constraints from the perspective of parameterized complexity: (Weighted) Biconnectivity Deletion, where we are tasked with deleting k edges while preserving biconnectivity in an undirected graph, Vertexdeletion Preserving Strong Connectivity, where we want to maintain strong connectivity of a digraph while deleting exactly k vertices, and Pathcontraction Preserving Strong Connectivity, in which the operation of path contraction on arcs is used instead. The parameterized tractability of this last problem was posed in [BangJensen and Yeo, Discrete Applied Math 2008] as an open question and we answer it here in the negative: both variants of preserving strong connectivity are W[1]hard. Preserving biconnectivity, on the other hand, turns out to be fixed parameter tractable (FPT) and we provide an FPT algorithm that solves Weighted Biconnectivity Deletion. Further, we show that the unweighted case even admits a randomized polynomial kernel. All our results provide further interesting data points for the systematic study of connectivitypreservation constraints in the parameterized setting.
Halfintegral linkages in highly connected directed graphs
We study the halfintegral kDirected Disjoint Paths Problem (1/2 kDDPP) in highly strongly connected digraphs. The integral kDDPP is NPcomplete even when restricted to instances where k=2, and the input graph is Lstrongly connected, for any L >= 1. We show that when the integrality condition is relaxed to allow each vertex to be used in two paths, the problem becomes efficiently solvable in highly connected digraphs (even with k as part of the input).
Specifically, we show that there is an absolute constant c such that for each k >= 2 there exists L(k) such that 1/2 kDDPP is solvable in time O(V(G)^c) for a L(k)strongly connected directed graph G. As the function L(k) grows rather quickly, we also show that 1/2 kDDPP is solvable in time O(V(G)^{f(k)}) in (36k^3+2k)strongly connected directed graphs. We show that for each epsilon<1, deciding halfintegral feasibility of kDDPP instances is NPcomplete when k is given as part of the input, even when restricted to graphs with strong connectivity epsilon k.
Combinatorial nfold Integer Programming and Applications
Many fundamental NPhard problems can be formulated as integer linear programs (ILPs). A
famous algorithm by Lenstra allows to solve ILPs in time that is exponential only in the dimension
of the program. That algorithm therefore became a ubiquitous tool in the design of
fixedparameter algorithms for NPhard problems, where one wishes to isolate the hardness of
a problem by some parameter. However, it was discovered that in many cases using Lenstra’s
algorithm has two drawbacks: First, the run time of the resulting algorithms is often doublyexponential
in the parameter, and second, an ILP formulation in small dimension can not easily
express problems which involve many different costs.
Inspired by the work of Hemmecke, Onn and Romanchuk [Math. Prog. 2013], we develop
a singleexponential algorithm for socalled combinatorial nfold integer programs, which are
remarkably similar to prior ILP formulations for various problems, but unlike them, also allow
variable dimension. We then apply our algorithm to a few representative problems like Closest
String, Swap Bribery, Weighted Set Multicover, and obtain exponential speedups in
the dependence on the respective parameters, the input size, or both.
Unlike Lenstra’s algorithm, which is essentially a bounded search tree algorithm, our result
uses the technique of augmenting steps. At its heart is a deep result stating that in combinatorial
nfold IPs an existence of an augmenting step implies an existence of a “local” augmenting step,
which can be found using dynamic programming. Our results provide an important insight into
many problems by showing that they exhibit this phenomenon, and highlights the importance of
augmentation techniques.
Computing Maximum Agreement Forests without Cluster Partitioning is Folly
Computing a maximum (acyclic) agreement forest (M(A)AF) of a pair of phylogenetic trees is known to be fixedparameter tractable; the two main techniques are kernelization and depthbounded search. In theory, kernelizationbased algorithms for this problem are not competitive, but they perform remarkably well in practice. We shed light on why this is the case. Our results show that, probably unsurprisingly, the kernel is often much smaller in practice than the theoretical worst case, but not small enough to fully explain the good performance of these algorithms. The key to performance is cluster partitioning, a technique used in almost all fast M(A)AF algorithms. In theory, cluster partitioning does not help: some instances are highly clusterable, others not at all. However, our experiments show that cluster partitioning leads to substantial performance improvements for kernelizationbased M(A)AF algorithms. In contrast, kernelizing the individual clusters before solving them using exponential search yields only very modest performance improvements or even hurts performance; for the vast majority of inputs, kernelization leads to no reduction in the maximal cluster size at all. The choice of the algorithm applied to solve individual clusters also significantly impacts performance, even though our limited experiment to evaluate this produced no clear winner; depthbounded search, exponential search interleaved with kernelization, and an ILPbased algorithm all achieved competitive performance.
Benchmark Graphs for Practical Graph Isomorphism
The stateoftheart solvers for the graph isomorphism problem can readily solve generic instances
with tens of thousands of vertices. Indeed, experiments show that on inputs without particular
combinatorial structure the algorithms scale almost linearly. In fact, it is nontrivial to create
challenging instances for such solvers and the number of difficult benchmark graphs available is
quite limited.
We describe a construction to efficiently generate small instances for the graph isomorphism
problem that are difficult or even infeasible for said solvers.
Up to this point the only other available instances posing challenges for isomorphism solvers
were certain incidence structures of combinatorial objects (such as projective planes, Hadamard
matrices, Latin squares, etc.). Experiments show that starting from 1500 vertices our new instances
are several orders of magnitude more difficult on comparable input sizes. More importantly,
our method is generic and efficient in the sense that one can quickly create many
isomorphism instances on a desired number of vertices. In contrast to this, said combinatorial
objects are rare and difficult to generate and with the new construction it is possible to generate
an abundance of instances of arbitrary size.
Our construction hinges on the multipedes of Gurevich and Shelah and the CaiFürerImmerman
gadgets that realize a certain abelian automorphism group and have repeatedly played a role in
the context of graph isomorphism. Exploring limits, we also explain that there are group theoretic
obstructions to generalizing the construction with nonabelian gadgets.
Modeling and Engineering Constrained Shortest Path Algorithms for Battery Electric Vehicles
We study the problem of computing constrained shortest paths for battery electric vehicles. Since battery capacities are limited, fastest routes are often infeasible. Instead, users are interested in fast routes where the energy consumption does not exceed the battery capacity. For that, drivers can deliberately reduce speed to save energy. Hence, route planning should provide both path and speed recommendations. To tackle the resulting NPhard optimization problem, previous work trades correctness or accuracy of the underlying model for practical running times. In this work, we present a novel framework to compute optimal constrained shortest paths for electric vehicles that uses more realistic physical models, while taking speed adaptation into account. Careful algorithm engineering makes the approach practical even on large, realistic road networks: We compute optimal solutions in less than a second for typical battery capacities, matching performance of previous inexact methods. For even faster performance, the approach can easily be extended with heuristics that provide high quality solutions within milliseconds.
Social goods are goods that grant value not only to their owners but also to the owners’ surroundings, be it their families, friends or office mates. The benefit a nonowner derives from the good is affected by many factors, including the type of the good, its availability, and the social status of the nonowner. Depending on the magnitude of the benefit and on the price of the good, a potential buyer might stay away from purchasing the good, hoping to free ride on others’ purchases. A revenuemaximizing seller who sells social goods must take these considerations into account when setting prices for the good. The literature on optimal pricing has advanced considerably over the last decade, but little is known about optimal pricing schemes for selling social goods. In this paper, we conduct a systematic study of revenuemaximizing pricing schemes for social goods: we introduce a Bayesian model for this scenario, and devise nearlyoptimal pricing schemes for various types of externalities, both for simultaneous sales and for sequential sales.
Counting restricted homomorphisms via Möbius inversion over matroid lattices
We present a framework for the complexity classification of parameterized counting problems that can be formulated as the summation over the numbers of homomorphisms from small pattern graphs H_1,…,H_l to a big host graph G with the restriction that the coefficients correspond to evaluations of the Möbius function over the lattice of a graphic matroid. This generalizes the idea of Curticapean, Dell and Marx [STOC 17] who used a result of Lovász stating that the number of subgraph embeddings from a graph H to a graph G can be expressed as such a sum over the lattice of partitions of H. In the first step we introduce what we call graphically restricted homomorphisms that, inter alia, generalize subgraph embeddings as well as locally injective homomorphisms. We provide a complete parameterized complexity dichotomy for counting such homomorphisms, that is, we identify classes of patterns for which the problem is fixedparameter tractable (FPT), including an algorithm, and prove that all other pattern classes lead to #W[1]hard problems. The main ingredients of the proof are the complexity classification of linear combinations of homomorphisms due to Curticapean, Dell and Marx [STOC 17] as well as a corollary of Rota’s NBC Theorem which states that the sign of the Möbius function over a geometric lattice only depends on the rank of its arguments. We apply the general theorem to the problem of counting locally injective homomorphisms from small pattern graphs to big host graphs yielding a concrete dichotomy criterion. It turns out that – in contrast to subgraph embeddings – counting locally injective homomorphisms has “real” FPT cases, that is, cases that are fixedparameter tractable but not polynomial time solvable under standard complexity assumptions. To prove this we show in an intermediate step that the subgraph counting problem remains #Phard when both the pattern and the host graphs are restricted to be trees. We then investigate the more general problem of counting homomorphisms that are injective in the rneighborhood of every vertex. As those are graphically restricted as well, they can also easily be classified via the general theorem. Finally we show that the dichotomy for counting graphically restricted homomorphisms readily extends to socalled linear combinations.
Bounds on the Satisfiability Threshold for Power Law Distributed Random SAT
Propositional satisfiability (SAT) is one of the most fundamental problems in computer science. The worstcase hardness of SAT lies at the core of computational complexity theory. The averagecase analysis of SAT has triggered the development of sophisticated rigorous and nonrigorous techniques for analyzing random structures.
Despite a long line of research and substantial progress, nearly all theoretical work on random SAT assumes a uniform distribution on the variables. In contrast, realworld instances often exhibit large fluctuations in variable occurrence. This can be modeled by a scalefree distribution of the variables, which results in distributions closer to industrial SAT instances.
We study random kSAT on n variables, m = Theta(n) clauses, and a power law distribution on the variable occurrences with exponent beta. We observe a satisfiability threshold at beta = (2k1)/(k1). This threshold is tight in the sense that instances with beta <= (2k1)/(k1)epsilon for any constant epsilon > 0 are unsatisfiable with high probability (w.h.p.). For beta >= (2k1)/(k1)+epsilon, the picture is reminiscent of the uniform case: instances are satisfiable w.h.p. for sufficiently small constant clausevariable ratios m/n; they are unsatisfiable above a ratio m/n that depends on beta.
Dynamic clustering to minimize the sum of radii
In the kdispersion problem, we need to select k nodes of a given graph so as to maximize the
minimum distance between any two chosen nodes. This can be seen as a generalization of the
independent set problem, where the goal is to select nodes so that the minimum distance is larger
than 1. We design an optimal O(n) time algorithm for the dispersion problem on trees consisting
of n nodes, thus improving the previous O(n log n) time solution from 1997.
We also consider the weighted case, where the goal is to choose a set of nodes of total weight
at least W. We present an O(n log2 n) algorithm improving the previous O(n log4 n) solution.
Our solution builds on the search version (where we know the minimum distance between the
chosen nodes) for which we present tight (n log n) upper and lower bounds.
Realtime Streaming MultiPattern Search for Constant Alphabet
In the streaming multipattern search problem, which is also known as the streaming dictionary matching problem, a set D={P_1,P_2, . . . ,P_d} of d patterns (strings over an alphabet Sigma), called the dictionary, is given to be preprocessed. Then, a text T arrives one character at a time and the goal is to report, before the next character arrives, the longest pattern in the dictionary that is a current suffix of T. We prove that for a constant size alphabet, there exists a randomized MonteCarlo algorithm for the streaming dictionary matching problem that takes constant time per character and uses O(d log m) words of space, where m is the length of the longest pattern in the dictionary. In the case where the alphabet size is not constant, we introduce two new randomized MonteCarlo algorithms with the following complexities:
 O(log log Sigma) time per character in the worst case and O(d log m) words of space.

O(1/epsilon) time per character in the worst case and O(d \Sigma^epsilon log m/epsilon) words of space for any 0<epsilon<= 1.<br=””>
These results improve upon the algorithm of [Clifford et al., ESA’15] which uses O(d log m) words of space and takes O(log log (m+d)) time per character. </epsilon<=>
On the Impact of Singleton Strategies in Congestion Games
To what extent does the structure of the players’ strategy space influence the efficiency of decentralized solutions in congestion games? In this work, we investigate whether better performance is possible when restricting to load balancing games in which players can only choose among single resources. We consider three different solutions concepts, namely, approximate pure Nash equilibria, approximate oneround walks generated by selfish players aiming at minimizing their personal cost and approximate oneround walks generated by cooperative players aiming at minimizing the marginal increase in the sum of the players’ personal costs. The last two concepts can also be interpreted as solutions of simple greedy online algorithms for the related resource selection problem. Under fairly general latency functions on the resources, we show that, for all three types of solutions, better bounds cannot be achieved if players are either weighted or asymmetric. On the positive side, we prove that, under mild assumptions on the latency functions, improvements on the performance of approximate pure Nash equilibria are possible for load balancing games with weighted and symmetric players in the case of identical resources. We also design lower bounds on the performance of oneround walks in load balancing games with unweighted players and identical resources (in this case, solutions generated by selfish and cooperative players coincide).
Exponential lower bounds for historybased simplex pivot rules on abstract cubes
The behavior of the simplex algorithm is a widely studied subject. Specifically, the question of the existence of a polynomial pivot rule for the simplex algorithm is of major importance. Here, we give exponential lower bounds for three historybased pivot rules. Those rules decide their next step based on memory of the past steps. In particular, we study Zadeh’s least entered rule, Johnson’s leastrecently basic rule and Cunningham’s leastrecently considered (or roundrobin) rule. We give exponential lower bounds on Acyclic Unique Sink Orientations of the abstract cube, for all of these pivot rules. For Johnson’s rule our bound is the first superpolynomial one in any context; for Zadeh’s it is the first one for AUSO. Those two are our main results.
New Abilities and Limitations of Spectral Graph Bisection
Spectral based heuristics belong to wellknown commonly used methods which determines provably minimal graph bisection or outputs “fail” when the optimality cannot be certified. In this paper we focus on Boppana’s algorithm which belongs to one of the most prominent methods of this type. It is well known that the algorithm works well in the random planted bisection model – the standard class of graphs for analysis minimum bisection and relevant problems. In 2001 Feige and Kilian posed the question if Boppana’s algorithm works well in the semirandom model by Blum and Spencer. In our paper we answer this question affirmatively. We show also that the algorithm achieves similar performance on graph classes which extend the semirandom model.
Since the behavior of Boppana’s algorithm on the semirandom graphs remained unknown, Feige and Kilian proposed a new semidefinite programming (SDP) based approach and proved that it works on this model. The relationship between the performance of the SDP based algorithm and Boppana’s approach was left as an open problem. In this paper we solve the problem in a complete way by proving that the bisection algorithm of Feige and Kilian provides exactly the same results as Boppana’s algorithm. As a consequence we get that Boppana’s algorithm achieves the optimal threshold for exact cluster recovery in the stochastic block model. On the other hand we prove some limitations of Boppana’s approach: we show that if the density difference on the parameters of the planted bisection model is too small then the algorithm fails with high probability in the model.
The Online House Numbering Problem: MinMax Online List Labeling
We introduce and study the online house numbering problem, where houses are added arbitrarily along a road and must be assigned labels to maintain their ordering along the road. The online house numbering problem is related to classic online list labeling problems, except that the optimization goal here is to minimize the maximum number of times that any house is relabeled. We provide several algorithms that achieve interesting tradeoffs between upper bounds on the number of maximum relabels per element and the number of bits used by labels.
The Power of Vertex Sparsifiers in Dynamic Graph Algorithms
We introduce a new algorithmic framework for designing dynamic graph algorithms in minorfree graphs, by exploiting the structure of such graphs and a tool called vertex sparsification, which is a way to compress large graphs into small ones that well preserve relevant properties among a subset of vertices and has previously mainly been used in the design of approximation algorithms.
Using this framework, we obtain a Monte Carlo randomized fully dynamic algorithm for (1 + epsilon)approximating the energy of electrical flows in nvertex planar graphs with tilde{O}(r epsilon^{2}) worstcase update time and tilde{O}((r + n / sqrt{r}) epsilon^{2}) worstcase query time, for any r larger than some constant. For r=n^{2/3}, this gives tilde{O}(n^{2/3} epsilon^{2}) update time and tilde{O}(n^{2/3} epsilon^{2}) query time. We also extend this algorithm to work for minorfree graphs with similar approximation and running time guarantees. Furthermore, we illustrate our framework on the allpairs max flow and shortest path problems by giving corresponding dynamic algorithms in minorfree graphs with both sublinear update and query times. To the best of our knowledge, our results are the first to systematically establish such a connection between dynamic graph algorithms and vertex sparsification.
We also present both upper bound and lower bound for maintaining the energy of electrical flows in the incremental subgraph model, where updates consist of only vertex activations, which might be of independent interest.
Improved Guarantees for Vertex Sparsification in Planar Graphs
Graph Sparsification aims at compressing large graphs into smaller ones while (approximately) preserving important characteristics of the input graph. In this work we study Vertex Sparsifiers, i.e., sparsifiers whose goal is to reduce the number of vertices. Given a weighted graph G=(V,E), and a terminal set K with K=k, a qualityq vertex cut sparsifier of G is a graph H with K contained in V_H that preserves the value of minimum cuts separating any bipartition of K, up to a factor of q. We show that planar graphs with all the k terminals lying on the same face admit quality1 vertex cut sparsifier of size O(k^2) that are also planar. Our result extends to vertex flow and distance sparsifiers. It improves the previous best known bound of O(k^2 2^(2k)) for cut and flow sparsifiers by an exponential factor, and matches an Omega(k^2) lowerbound for this class of graphs.
We also study vertex reachability sparsifiers for directed graphs. Given a digraph G=(V,E) and a terminal set K, a vertex reachability sparsifier of G is a digraph H=(V_H,E_H), K contained in V_H that preserves all reachability information among terminal pairs. We introduce the notion of reachabilitypreserving minors, i.e., we require H to be a minor of G. Among others, for general planar digraphs, we construct reachabilitypreserving minors of size O(k^2 log^2 k). We complement our upperbound by showing that there exists an infinite family of acyclic planar digraphs such that any reachabilitypreserving minor must have Omega(k^2) vertices.
Triangle packing in (sparse) tournaments: approximation and kernelization.
Given a tournament T and a positive integer k, the C_3PackingT asks if there exists a least k (vertex)disjoint directed 3cycles in T. This is the dual problem in tournaments of the classical minimal feedback vertex set problem. Surprisingly C_3PackingT did not receive a lot of attention in the literature. We show that it does not admit a PTAS unless P=NP, even if we restrict the considered instances to sparse tournaments, that is tournaments with a feedback arc set (FAS) being a matching. Focusing on sparse tournaments we provide a (1+6/(c1)) approximation algorithm for sparse tournaments having a linear representation where all the backward arcs have “length” at least c. Concerning kernelization, we show that C_3PackingT admits a kernel with O(m) vertices, where m is the size of a given feedback arc set. In particular, we derive a O(k) vertices kernel for C_3PackingT when restricted to sparse instances. On the negative size, we show that C_3PackingT does not admit a kernel of (total bit) size O(k^{2epsilon}) unless NP is a subset of coNP / Poly. The existence of a kernel in O(k) vertices for C_3PackingT remains an open question.
LZEnd Parsing in Linear Time
We present a deterministic algorithm that constructs in linear time and space the LZEnd parsing (a variation of LZ77) of a given string over an integer polynomially bounded alphabet.
Distancepreserving Subgraphs of Interval Graphs
We consider the problem of finding small distancepreserving subgraphs of undirected, unweighted interval graphs that have k terminal vertices. We show that every interval graph admits a distancepreserving subgraph with O(k log k) branching vertices. We also prove a matching lower bound by exhibiting an interval graph based on bitreversal permutation matrices. In addition, we show that interval graphs admit subgraphs with O(k) branching vertices that approximate distances up to an additive term of +1.
Computing optimal homotopies over a spiked plane with polygonal boundary
Computing optimal deformations between two curves is a fundamental question with various
applications, and has recently received much attention in both computational topology and in
mathematics in the form of homotopies of disks and annular regions. In this paper, we examine
this problem in a geometric setting, where we consider the boundary of a polygonal domain with
spikes, point obstacles that can be crossed at an additive cost. We aim to continuously morph
from one part of the boundary to another, necessarily passing over all spikes, such that the most
expensive intermediate curve is minimized, where the cost of a curve is its geometric length plus
the cost of any spikes it crosses.
We first investigate the general setting where each spike may have a different cost. For the
number of inflection points in an intermediate curve, we present a lower bound that is linear in
the number of spikes, even if the domain is convex and the two boundaries for which we seek a
morph share an endpoint. We describe a 2approximation algorithm for the general case, and an
optimal algorithm for the case that the two boundaries for which we seek a morph share both
endpoints, thereby representing the entire boundary of the domain.
We then consider the setting where all spikes have the same unit cost and we describe a
polynomialtime exact algorithm. The algorithm combines structural properties of homotopies
arising from the geometry with methodology for computing Fréchet distances.
A SpaceOptimal Grammar Compression
A grammar compression is a contextfree grammar (CFG) deriving a single string deterministically. For an input string of length N over an alphabet of size sigma, the smallest CFG is O(log N)approximable in the offline setting and O(log N log^* N)approximable in the online setting. In addition, an informationtheoretic lower bound for representing a CFG in Chomsky normal form of n variables is log (n!/n^sigma) + n + o(n) bits. Although there is an online grammar compression algorithm that directly computes the succinct encoding of its output CFG with O(log N log^* N) approximation guarantee, the problem of optimizing its working space has remained open. We propose a fullyonline algorithm that requires the fewest bits of working space asymptotically equal to the lower bound in O(N log log n) compression time. In addition we propose several techniques to boost grammar compression and show their efficiency by computational experiments.
Improved Approximate Rips Filtrations with Shifted Integer Lattices
Rips complexes are important structures for analyzing topological features of metric spaces. Unfortunately, generating these complexes constitutes an expensive task because of a combinatorial explosion in the complex size. For n points in R^d, we present a scheme to construct a 4.24approximation of the multiscale filtration of the Rips complex in the Linfinity metric, which extends to a O(d^{0.25})approximation of the Rips filtration for the Euclidean case. The kskeleton of the resulting approximation has a total size of n2^{O(d log k)}. The scheme is based on the integer lattice and on the barycentric subdivision of the dcube.
Singlesink Fractionally Subadditive Network Design
We study a generalization of the Steiner tree problem, where we are given a weighted network G together with a collection of k subsets of its vertices and a root r. We wish to construct a minimum cost network such that the network supports one unit of flow to the root from every node in a subset simultaneously. The network constructed does not need to support flows from all the subsets simultaneously.
We settle an open question regarding the complexity of this problem for k=2, and give a 3/2approximation algorithm that improves over a (trivial) known 2approximation. Furthermore, we prove some structural results that prevent many wellknown techniques from doing better than the known O(log n)approximation. Despite these obstacles, we conjecture that this problem should have an O(1)approximation. We also give an approximation result for a variant of the problem where the solution is required to be a path.
Contracting a Planar Graph Efficiently
We present a data structure that can maintain a simple planar graph under edge contractions in linear total time. The data structure supports adjacency queries and provides access to neighbor lists in O(1) time. Moreover, it can report all the arising selfloops and parallel edges.
By applying the data structure, we can achieve optimal running times for decremental bridge detection, 2edge connectivity, maximal 3edge connected components, and the problem of finding a unique perfect matching for a static planar graph. Furthermore, we improve the running times of algorithms for several planar graph problems, including decremental 2vertex and 3edge connectivity, and we show that using our data structure in a blackbox manner, one obtains conceptually simple optimal algorithms for computing MST and 5coloring in planar graphs.
On the Complexity of Bounded Context Switching
Bounded context switching (BCS) is an underapproximate method for finding violations to safety properties in sharedmemory concurrent programs. Technically, BCS is a reachability problem that is known to be NPcomplete. Our contribution is a parameterized analysis of BCS.
The first result is an algorithm that solves BCS when parameterized by the number of context switches (cs) and the size of the memory (m) in O*(m^(cs)2^(cs)). This is achieved by creating instances of the easier problem Shuff which we solve via fast subset convolution. We also present a lower bound for BCS of the form m^o(cs / log(cs)), based on the exponential time hypothesis. Interestingly, the gap is closely related to a conjecture that has been open since FOCS’07. Further, we prove that BCS admits no polynomial kernel.
Next, we introduce a measure, called scheduling dimension, that captures the complexity of schedules. We study BCS parameterized by the scheduling dimension (sdim) and show that it can be solved in O*((2m)^(4sdim)4^t), where t is the number of threads. We consider variants of the problem for which we obtain (matching) upper and lower bounds.
Fast Dynamic Arrays
We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of n elements supporting access in time O(1) and insertion and deletion in time O(n^e) for e > 0 while using o(n) extra space. We consider several different implementation optimizations in C++ and compare their performance to that of vector and set from the standard library on sequences with up to 10^8 elements. Our fastest implementation uses much less space than set while providing speedups of 40x for access operations compared to set and speedups of 10.000x compared to vector for insertion and deletion operations while being competitive with both data structures for all other operations.
Permuting and Batched Geometric Lower Bounds in the I/O model
We study permuting and batched orthogonal geometric reporting problems in the External Memory Model (EM), assuming indivisibility of the input records.
Our main results are twofold. First, we prove a general simulation result that essentially shows that any permutation algorithm (resp. duplicate removal algorithm) that does alpha*N/B I/Os (resp. to remove a fraction of the existing duplicates) can be simulated with an algorithm that does alpha phases where each phase reads and writes each element once, but using a factor alpha smaller block size.
Second, we prove two lower bounds for batched rectangle stabbing and batched orthogonal range reporting queries. Assuming a short cache, we prove very high lower bounds that currently are not possible with the existing techniques under the tall cache assumption.
Dynamic Space Efficient Hashing
We consider space efficient hash tables that can grow and shrink dynamically and are always highly space efficient, i.e., their space consumption is always close to the lower bound even while growing and when taking into account storage that is only needed temporarily. None of the traditionally used hash tables have this property. We show how known approaches like linear probing and bucket cuckoo hashing can be adapted to this scenario by subdividing them into many subtables or using virtual memory overcommitting. However, these rather straightforward solutions suffer from slow amortized insertion times due to frequent reallocation in small increments.
Our main result is DySECT (Dynamic Space Efficient Cuckoo Table) which avoids these problems. DySECT consists of many subtables which grow by doubling their size. The resulting inhomogeneity in subtable sizes is equalized by the flexibility available in bucket cuckoo hashing where each element can go to several buckets each of which containing several cells. Experiments indicate that DySECT works well with load factors up to 98%. With up to 2.7 times better performance than the next best solution.
Inplace Parallel Super Scalar Samplesort (IPSSSSo)
We present a sorting algorithm that works inplace, executes in parallel, is cacheefficient, avoids branchmispredictions, and performs work O(n log n) for arbitrary inputs with high probability. The main algorithmic contributions are new ways to make distributionbased algorithms inplace: On the practical side, by using coarsegrained blockbased permutations, and on the theoretical side, we show how to eliminate the recursion stack. Extensive experiments shw that our algorithm IPSSSSo scales well on a variety of multicore machines. We outperform our closest inplace competitor by a factor of up to 3. Even as a sequential algorithm, we are up to 1.5 times faster than the closest sequential competitor, BlockQuicksort.
A quasipolynomialtime approximation scheme for vehicle routing on planar and boundedgenus graphs
The Capacitated Vehicle Routing problem is a generalization of the Traveling Salesman problem in which a set of clients must be visited by a collection of capacitated tours. Each tour can visit at most Q clients and must start and end at a specified depot. We present the first approximation scheme for Capacitated Vehicle Routing for nonEuclidean metrics. Specifically we give a quasipolynomialtime approximation scheme for Capacitated Vehicle Routing with fixed capacities on planar graphs. We also show how this result can be extended to boundedgenus graphs and polylogarithmic capacities, as well as to variations of the problem that include multiple depots and charging penalties for unvisited clients.
[contactform][contactfield label=”Name” type=”name” required=”true” /][contactfield label=”Email” type=”email” required=”true” /][contactfield label=”Website” type=”url” /][contactfield label=”Message” type=”textarea” /][/contactform]