WAVELET TREES( A SUCCINCT-DATA STRUCTURE)

Table of contents
Reading Time: 3 minutes

WAVELET TREES 

INTRODUCTION-

  • The Wavelet Tree is a relatively new, but versatile data structure, offering solutions for many problem domains such as string processing, computational geometry, and data compression. Storing, in its basic form, a sequence of characters from an alphabet enables higher-order entropy compression and supports various fast queries.
  • A wavelet tree is a succinct data structure that recursively partitions a stream into two parts until we’re left with homogeneous data.
  • The name derives from an analogy with the wavelet transform for signals, which recursively decomposes a signal into low-frequency and high-frequency components. Wavelet trees can be used to answer range queries efficiently.
  • The wavelet tree supports access, rank and select queries. An access(p) query on a wavelet tree constructed on string S is the query for what character c is at position p in the string S. The rank of a character c in a string S up to position p is written as rankp(c) and is defined as the number of occurrences of c in the sub-string S[0,…,p]. The position of the oth occurrence of a character c can be found with a select c(o) query.

SUCCINCT DATA STRUCTURE-

  • In computer science, a succinct data structure is a data structure which uses an amount of space that is “close” to the information-theoretic lower bound, but (unlike other compressed representations) still allows for efficient query operations.
  • Unlike general lossless data compression algorithms, succinct data structures retain the ability to use them in-place, without decompressing them first. A related notion is that of a compressed data structure, in which the size of the data structure depends upon the particular data being represented.

CREATION OF WAVELET TREE-

Example :- 1

Figure 1: Wavelet Tree on string adsfadaadsfaads with alphabet Σ=[adfs]. Note that only the bitmaps are actually stored in the tree. the characters are annotations for ease of understanding.

Example :-2

Figure 2: Wavelet tree of elements 121334533322173276

IT’S NEED-

  • The constant increase in the volume of transmitted and stored data makes imperative the design of efficient algorithms and data structures for handling them.
  • The field of data and text compression has been traditionally involved in providing tools that could effectively face the problems emerging in large scale information retrieval applications. 
  • Moreover in the time course of the previous decade, a challenging and interesting issue has flourished: Self-Indexing data structures.

WAVELET TREE – TERMS

  • Wavelet Tree organizes a string into a hierarchy of bit vectors.
  • Rank[c](S,i) := the number of char c at or before position i in S.
  • Select[c](S,j) := the position of the jth occurrence of c in S.
  • S[i] = “access character i” .

                                      

                                         

APPLICATIONS: Inverted Indices-

APPLICATIONS: Document Retrieval-

CONCLUSION

  • In this blog we have described the wavelet tree, a versatile data structure offering
    solutions within problem domains such as data compression and information retrieval.
  • We have discuss survey applications of wavelet trees including efficient data
    compression and fast information retrieval with more detail on how the wavelet tree is
    used for some of the applications.