Glossary Term
Inverted index
Applications of Inverted Index
- The inverted index is a central component of a typical search engine indexing algorithm.
- It enables fast full-text search: each word maps to the list of documents that contain it, so a query reads those lists instead of scanning every document.
- In bioinformatics, inverted indexes are used in DNA sequence assembly, where short fragments of sequenced DNA are searched for against a reference DNA sequence.
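The word-to-documents mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function names `build_inverted_index` and `search`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        # Assumption: splitting on whitespace is a good enough tokenizer here.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Intersect the document lists of all query words (AND semantics).
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical sample collection; document ids are the list positions.
documents = [
    "it is what it is",
    "what is it",
    "it is a banana",
]
index = build_inverted_index(documents)
print(search(index, "what is it"))  # prints [0, 1]
```

A query only touches the lists for its own words, which is why lookups stay fast as the collection grows. The cost is paid when documents are added, since every word in a new document must be recorded in the index.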
Compression Techniques for Inverted Index
- Inverted list compression and bitmap compression were initially developed as separate lines of research.
- They were later recognized as related techniques that solve the same problem.
- Both are used to compress inverted indexes, which reduces storage requirements.
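As one illustration of inverted list compression, the sketch below stores the gaps between sorted document ids and encodes each gap with a variable-byte code. The entry does not name a specific scheme, so this choice is an assumption, and the function names `encode_postings` and `decode_postings` are hypothetical.

```python
def encode_postings(doc_ids):
    """Gap-encode a sorted list of non-negative ids, then variable-byte encode each gap."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous
        previous = doc_id
        # Emit 7 bits per byte, low-order bits first.
        # A set high bit means more bytes of the same gap follow.
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_postings(data):
    """Invert encode_postings: rebuild the sorted id list from the byte string."""
    doc_ids = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # gap continues in the next byte
        else:
            current += gap      # gap complete: add it to the running id
            doc_ids.append(current)
            gap = 0
            shift = 0
    return doc_ids


# Hypothetical posting list for one term.
postings = [3, 7, 130, 100000]
encoded = encode_postings(postings)
assert decode_postings(encoded) == postings
```

Small gaps fit in a single byte, so lists for frequent terms, whose document ids lie close together, shrink the most under this kind of scheme.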
Related Concepts to Inverted Index
- Index (search engine): search engine indexing, the broader context in which inverted indexes are used.
- Reverse index: another type of index, distinct from the inverted index despite the similar name.
- Vector space model: a model used in information retrieval and search engine technology.
External Resources
- NIST's Dictionary of Algorithms and Data Structures: inverted index.
- Managing Gigabytes for Java: a free full-text search engine for large document collections written in Java.
- Lucene: a full-featured text search engine library written in Java.
- Sphinx Search: an open-source high-performance text search engine library employing an inverted index.
- Example implementations on Rosetta Code.
- Caltech Large Scale Image Search Toolbox: a Matlab toolbox implementing Inverted File Bag-of-Words image search.