An efficient, versatile approach to suffix sorting
- 12 June 2008
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Journal of Experimental Algorithmics
- Vol. 12, 1-23
- https://doi.org/10.1145/1227161.1278374
Abstract
Sorting the suffixes of a string into lexicographical order is a fundamental task in a number of contexts, most notably lossless compression (Burrows--Wheeler transformation) and text indexing (suffix arrays). Most approaches to suffix sorting produce a sorted array of suffixes directly, continually moving suffixes into their final place in the array until the ordering is complete. In this article, we describe a novel and resource-efficient (time and memory) approach to suffix sorting, which works in a complementary way—by assigning each suffix its rank in the final ordering, before converting to a sorted array, if necessary, once all suffixes are ranked. We layer several powerful extensions on this basic idea and show experimentally that our approach is superior to other leading algorithms in a variety of real-world contexts.Keywords
This publication has 11 references indexed in Scilit:
- A taxonomy of suffix array construction algorithmsACM Computing Surveys, 2007
- The Performance of Linear Time Suffix Sorting AlgorithmsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2005
- Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String MatchingSIAM Journal on Computing, 2005
- Cache-conscious sorting of large sets of strings with dynamic triesACM Journal of Experimental Algorithmics, 2004
- Engineering a Lightweight Suffix Array Construction AlgorithmAlgorithmica, 2004
- Replacing suffix trees with enhanced suffix arraysJournal of Discrete Algorithms, 2004
- Two Space Saving Tricks for Linear Time LCP Array ComputationLecture Notes in Computer Science, 2004
- Implementing radixsortACM Journal of Experimental Algorithmics, 1998
- Engineering a sort functionSoftware: Practice and Experience, 1993
- Suffix Arrays: A New Method for On-Line String SearchesSIAM Journal on Computing, 1993