Clustering algorithms are an important tool in data analysis and machine learning. They allow us to categorize large data sets automatically, giving insight into underlying structures which are otherwise inaccessible. However, these Clustering problems are usually difficult to solve, so sophisticated algorithmic techniques are needed to find solutions. A recently proposed hierarchical clustering scheme promises good results if a tight approximation to Uniform Sparsest Cut problems can be found quickly . While this is a similarly difficult task in general, at least on certain types of Data such approximations can be computed efficiently . This implies, at least in theory, the existence of good clustering algorithms for such data.
In this thesis we aim to improve our understanding of the relationship between Clustering and Sparsest Cut problems as well as assess the viability of Sparsest-Cut-Based Hierarchical Clustering for practical applications. You should therefore have:
- An interest in graph theory and graph algorithms
- Experience programming in C++ or Python
- A willingness to engage with mathematical aspects of Computer Science