Detailseite
Projekt Druckansicht

Quantitative Aspekte Grammatik-basierter Kompression

Fachliche Zuordnung Theoretische Informatik
Förderung Förderung von 2014 bis 2022
Projektkennung Deutsche Forschungsgemeinschaft (DFG) - Projektnummer 261105198
 
Erstellungsjahr 2021

Zusammenfassung der Projektergebnisse

In the QUANT-KOMP project we made progress on several intriguing problems in grammar-based compression. For grammar-based string compression, our main achievements are the following: • We solved several open problems from the seminal paper of Charikar et al. concerning the approximation ratio of grammar-based string compressors. For LZ78 and BiSection we precisely determined the approximation ratio. For the global grammar-based string compressors RePair and Greedy we improved the existing lower bounds on the approximation ratio. For our conference paper, we received the best paper award at SPIRE 2016. The journal version was published at IEEE Transactions on Information Theory. • We solved a long standing open problem concerning the depth of string straight-line programs by showing that every string-straight line program of size n that produces a string of length N can be transformed (in linear time) into a string straight-line program of size O(n) and depth O(log N ). In fact, a more general version of this result that covers also grammarbased tree compression can be shown. The result has many applications for algorithms on grammar-based strings and trees. The major part of our results were obtained in the area of grammar-based string compression. • A classical special case of grammar-based tree compression is compression with DAGs (directed acyclic graphs). For a large variety of different classes of random tree models we proved upper and lower bounds on the average size of the DAG. These results generalize older results of Devroye, Flajolet, Gourdon, Martinez, Sipala, and Steyaert. We used these results in order to obtain upper bounds of o(n) on the average case redundancy for a large class of random tree models. • We also studied intensively grammar-based tree compression using tree/forest straight-line programs. We proved an upper bound of O(n/ log n) on the size of a smallest tree/forest straight-line program for a ranked/unranked tree. Using this bound we developed a grammarbased tree compressor that achieves worst-case redundancy o(n) for several natural random tree models. • Based on tree coverings we came up with a tree compressor that achieves all redundancy bounds that are achieved by the tree compressors from the two previous points. This tree compressor has the advantage that it allows to execute a large variety of query operations on compressed trees in constant time. • Finally, we defined a new version of tree entropy, compared it to other notions of tree entropy and proved that in terms of compression ratio this tree entropy can be achieved by grammarbased tree compressors.

Projektbezogene Publikationen (Auswahl)

 
 

Zusatzinformationen

Textvergrößerung und Kontrastanpassung