Approximation Algorithms for Throughput Maximization, Resource Minimization for Fire Containment and Hierarchical Clustering

  • Author / Creator
    Mirmahdi Rahgoshay
  • In this thesis, we present a variety of approximation algorithms for Resource Minimization for Fire Containment, Throughput Maximization, and Hierarchical Clustering.Resource Minimization Fire Containment (RMFC) is a natural model for optimal inhibition of harmful spreading phenomena on a graph. In the RMFC problem on trees, we are given an undirected tree $G$, and a vertex $r$ where the fire starts at, called root. At each time step, firefighters can protect up to $B$ vertices of the graph while the fire spreads from burning vertices to all their neighbors that have not been protected so far. The task is to find the smallest $B$ that allows for saving all the leaves of the tree. The problem is hard to approximate up to any factor better than 2 even on trees unless P = NP [DM2010]. The main contribution to the RMFC problem is an asymptotic QPTAS for RMFC on trees. More specifically, let $\epsilon>0$, and $\mathcal{I}$ be an instance of RMFC where the optimum number of firefighters to save all the leaves is $OPT(\mathcal{I})$. We present an algorithm which uses at most $\lceil (1+\epsilon)OPT(\mathcal{I})\rceil$ many firefighters at each time step and runs in time $n^{O(\log\log n/\epsilon)}$. This suggests that the existence of an asymptotic PTAS is plausible especially since the exponent is $O(\log \log n)$. Our result combines a more refined height reduction lemma than the one in [SODA2019] with LP rounding and dynamic programming to find the solution. We also apply our height reduction lemma to the algorithm provided in [SODA2019] plus a more careful analysis to improve their 12-approximation and provide a polynomial time ($5+\epsilon$)-approximation.In Throughput Maximization we have a collection $J$ of $n$ jobs, each having a release time $rj$, deadline $dj$, and processing time $pj$. They have to be scheduled non-preemptively on $m$ identical parallel machines. The goal is to find a schedule which maximizes the number of jobs scheduled entirely in their $[rj,dj]$ window. This problem has been studied extensively (even for the case of $m=1$). Several special cases of the problem remain open. The approximability of the problem for $m=O(1)$ remains a major open question. We study the case of $m=O(1)$ and show that if there are $c$ distinct processing times, i.e. $pj$'s come from a set of size $c$, then there is a randomized $(1-\eps)$-approximation that runs in time $O(n^{mc^7\eps^{-6}}\log T)$, where $T$ is the largest deadline. Therefore, for constant $m$ and constant $c$ this yields a PTAS. Our algorithm is based on proving structural properties for a near optimum solution that allows one to use a dynamic programming with pruning.In Hierarchical Clustering, one is given a set of $n$ data points along with a notion of similarity/dis-similarity between them. More precisely for each two items $i$ and $j$ we are given a weight $w{i,j}$ denoting their similarity/dis-similarity. The goal is to build a recursive (tree like) partitioning of the data points (items) into successively smaller clusters, which is represented by a rooted tree, where the leaves correspond to the items and each internal node corresponds to a cluster of all the items in the leaves of its subtree. Typically, the goal is to have the items that are relatively similar, to separate at deeper levels of the tree (and hence stay in the same cluster as deep as possible).Dasgupta [STOC2016] defined a cost function for a tree $T$ to be $Cost(T) = \sum{i,j \in [n]} \big(w{i,j} \times |T{i,j}| \big)$ where $T{i,j}$ is the subtree rooted at the least common ancestor of $i$ and $j$ and presented the first approximation algorithm for such clustering.Later Cohen-Addad et al. [JACM2019] considered the same objective function as Dasgupta's but for dissimilarity-based metrics: $Rev(T)=\sum{i,j\in [n]} \big(w{i,j}\times |T{i,j}|\big)$,where $w{i,j}$ is the weight of dissimilarity between two nodes $i$ and $j$. In this version a good clustering should have larger $T{i,j}$ when $w{i,j}$ is relatively large. It has been shown that both random partitioning and average linkage have ratio $2/3$ which has been only slightly improved to $0.667078$ [Charikar et al. SODA2020]. Our first main result is to improve this ratio of 0.667078 for $Rev(T)$. We achieve this by building upon the earlier work and use a more delicate algorithm and careful analysis which can be refined to achieve approximation $0.71604$. We also introduce a new objective function for dissimilarity-based Hierarchical Clustering. Consider any tree $T$, we define $H{i,j}$ as the number of $i$ and $j$'s common ancestors in $T$. In other words, when we think of the process of building tree as a top-down procedure, $H{i,j}$ is the step in which $i$ and $j$ are separated into two clusters (they were stayed within the same cluster for $H{i,j}-1$ many steps). We suggest the cost of each tree $T$, which we want to minimize, to be $CostH(T) = \sum{i,j \in [n]} \big(w{i,j} \times H{i,j} \big)$. We present a $1.3977$-approximation for this objective.

  • Subjects / Keywords
  • Graduation date
    Fall 2021
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.