#5536. Improvements on approximation algorithms for clustering probabilistic data

August 2026publication date
Proposal available till 21-05-2025
4 total number of authors per manuscript0 $

The title of the journal is available only for the authors who have already paid for
Journal’s subject area:
Information Systems;
Human-Computer Interaction;
Hardware and Architecture;
Software;
Artificial Intelligence;
Places in the authors’ list:
place 1place 2place 3place 4
FreeFreeFreeFree
2350 $1200 $1050 $900 $
Contract5536.1 Contract5536.2 Contract5536.3 Contract5536.4
1 place - free (for sale)
2 place - free (for sale)
3 place - free (for sale)
4 place - free (for sale)

Abstract:
Uncertainty about data appears in many real-world applications and an important issue is how to manage, analyze and solve optimization problems over such data. An important tool for data analysis is clustering. When the data set is uncertain, we can model them as a set of probabilistic points each formalized as a probability distribution function which describes the possible locations of the points. In this paper, we study k-center problem for probabilistic points in a general metric space. First, we present a fast greedy approximation algorithm that builds k centers using a farthest-first traversal in k iterations. This algorithm improves the previous approximation factor of the unrestricted assigned k-center problem from 10 (see [1]) to 6. Next, we restrict the centers to be selected from all the probabilistic locations of the given points and we show that an optimal solution for this restricted setting is a 2-approximation factor solution for an optimal solution of the assigned k-center problem with expected distance assignment. Using this idea, we improve the approximation factor of the unrestricted assigned k-center problem to 4 by increasing the running time. The algorithm also runs in polynomial time when k is a constant. Additionally, we implement our algorithms on three real data sets.
Keywords:
Approximation algorithms; Clustering; k-center problem; k-median problem; Uncertain data

Contacts :
0