Implementing a library for compressed arrays supporting random access

Betreuer/in:            Tobias Stamm           
Dekanat/Institut:   E-11 Institute for Algorithms and Complexity           

E-Mail:   algo@tuhh.de

Compression is a pervasive technique in data handling and practical implementations of algorithms to reduce the storage and bandwith needs and thereby increase the processing speeds of applications. Usually this is either implemented using a custom format that exploits the structure of the respective problem or by relying on general purpose compression and archiving libraries. Sometimes however one would also like to be able to perform random access on the compressed data and indeed there are utilities that implement this functionality for popular archive formats. These usually work on a block basis and therefore introduce a tradeoff between compression efficiency and access speed not well suited to single word reads for example. And indeed one would not expect much hidden potential in this problem given the difficulty of implementing the functionality at all using the usual prefix-free coding approaches.
Nonetheless recently various authors have published compression codes based on clever indexing schemes that can achieve both good compression and fast random access of small elements. Unfortunalety though no public implementations of these algorithms seem to be available.
The goal of this bachelor topic would therefore be to read and understand the relevant papers and implement the most practical of the algorithms in a performance oriented way as a C++ library. The library should be able to provide an array data type whose compression is transparent and could therefore be used to effortlessly obtain compressed versions (albeit potentially read-only) of the standard array based data types, for example heaps and hashmaps.

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert