Glossary Term
Ranking (information retrieval)
Ranking Algorithms
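- PageRank: Developed by Larry Page and Sergey Brin in the late 1990s. The underlying idea traces back to the 1940s in economics, when Wassily Leontief valued a country's sectors by the importance of the sectors that supplied them.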
- Endorsement Ranking: Published by Charles H. Hubbell in 1965; rates individuals by the importance of the people who endorse them.
- Journal Ranking: Developed by Gabriel Pinski and Francis Narin; a journal is important if it is cited by other important journals.
- HITS Algorithm: Developed by Jon Kleinberg, treating web pages as hubs and authorities.
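The idea these methods share, that an item is important when important items point to it, can be sketched as a PageRank-style power iteration. This is a minimal illustration, not a reference implementation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions made for the example.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """graph: dict mapping each page to the list of pages it links to.

    Returns a dict of page -> score; scores sum to 1.
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                # A page splits its damped score evenly among its out-links.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling page (no out-links): spread its score over all pages.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" is linked to by both "b" and "d".
toy_graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
top_page = max(scores, key=scores.get)
```

In this toy graph, "c" ends up with the highest score because two pages link to it, while "d", which nothing links to, keeps only the baseline share.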
Ranking Models
- Boolean Model: Retrieves only documents that match the query exactly; it does not rank them.
- Vector Space Model: Handles partial matches by assigning weights to index terms in queries and documents, then scores each document by the cosine similarity between its vector and the query vector.
- Probabilistic Model: Uses probability theory to rank documents in decreasing order of their estimated probability of relevance to the query.
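To make the vector space model concrete, here is a minimal sketch that ranks documents against a query by cosine similarity. It assumes raw term counts as weights (a real system would more likely use TF-IDF or a similar scheme) and whitespace tokenization; the helper names `tf_vector`, `cosine_similarity`, and `rank`, and the sample documents, are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term-frequency vector. Assumption: raw counts as term weights."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (dicts of term -> weight)."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document) pairs sorted by decreasing similarity."""
    q = tf_vector(query)
    scored = [(cosine_similarity(q, tf_vector(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "ranking web pages with link analysis",
    "cooking pasta at home",
    "probabilistic ranking of documents",
]
results = rank("ranking documents", docs)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Unlike the Boolean model, a document that matches only one query term (the first one here) still receives a nonzero score and appears in the ranking, below the document that matches both terms.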
Evaluation Measures
- Precision: The proportion of retrieved results that are relevant; it measures the exactness of retrieval.
- Recall: The proportion of all relevant documents that are retrieved; it measures the completeness of retrieval.
- F1 Score: The harmonic mean of precision and recall.
- Precision-Recall Curves: Plot precision against recall at successive rank cutoffs to evaluate ranked retrieval results.
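The measures above can be computed from a ranked result list and a set of relevance judgments. The sketch below is illustrative only; the function names and the document IDs ("d1" through "d5") are made up for the example.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def precision_recall_points(ranking, relevant):
    """(recall, precision) after each rank cutoff k; the points of a P-R curve."""
    relevant = set(relevant)
    if not relevant:
        return []
    points, hits = [], 0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / k))
    return points


# Hypothetical judgments: 3 relevant documents, 4 retrieved, 2 in common.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}

precision, recall, f1 = precision_recall_f1(ranking, relevant)
curve = precision_recall_points(ranking, relevant)
```

Here precision is 2/4, recall is 2/3, and F1 is 4/7. The curve points show the usual trade-off: precision starts at 1.0 at the first cutoff and drops as more results are admitted to raise recall.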
HITS Algorithm
- HITS uses link analysis to assess page relevance.
- It operates on a small subgraph of the web graph rather than on the whole graph.
- It is query dependent: the subgraph is built from pages related to the query.
- Pages in the subgraph are ranked by their hub and authority weights.
- The pages with the highest ranks are fetched and displayed.
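The hub and authority weights described above can be computed iteratively. The following is a minimal sketch under stated assumptions: the query-specific subgraph is already given as an adjacency dict, scores are normalized to unit Euclidean length each round, and the loop runs for a fixed number of iterations rather than testing convergence. The function name `hits` and the toy graph are hypothetical.

```python
import math


def hits(graph, iterations=50):
    """graph: dict mapping each page to the list of pages it links to.

    Returns (hub, authority) dicts of page -> score.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to the page.
        new_auth = {n: 0.0 for n in nodes}
        for u, targets in graph.items():
            for v in targets:
                new_auth[v] += hub[u]
        norm = math.sqrt(sum(x * x for x in new_auth.values())) or 1.0
        auth = {n: x / norm for n, x in new_auth.items()}

        # Hub update: sum the authority scores of pages the page links to.
        new_hub = {n: sum(auth[v] for v in graph.get(n, [])) for n in nodes}
        norm = math.sqrt(sum(x * x for x in new_hub.values())) or 1.0
        hub = {n: x / norm for n, x in new_hub.items()}
    return hub, auth


# Hypothetical query subgraph: "a" links to "c" and "d"; "b" links to "c".
subgraph = {"a": ["c", "d"], "b": ["c"]}
hub, auth = hits(subgraph)
best_authority = max(auth, key=auth.get)
best_hub = max(hub, key=hub.get)
```

In this toy subgraph, "c" is the strongest authority because both linking pages point to it, and "a" is the strongest hub because it points to both authorities.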
Learning to Rank: Application of Machine Learning
- Learning to rank applies machine learning techniques to the ranking problem.
- A model is trained on example items with relevance judgments and then used to order new items.
- It is widely used in information retrieval to improve the accuracy of search results.
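As one illustration of how machine learning can be applied to ranking, the sketch below trains a linear scoring function from pairwise preferences using perceptron-style updates. This is only one simple approach among many, chosen for brevity. The two features (a text-match score and a link score), the training pairs, and the function names are all hypothetical.

```python
def score(weights, features):
    """Linear scoring function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def train_pairwise(pairs, n_features, epochs=20, lr=0.1):
    """pairs: list of (better_features, worse_features) for the same query.

    Learns weights so that the preferred item scores higher than the other.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - w for b, w in zip(better, worse)]
            margin = score(weights, diff)
            if margin <= 0:
                # Pair is misordered or tied: move weights toward the difference.
                weights = [wi + lr * di for wi, di in zip(weights, diff)]
    return weights


# Hypothetical features per item: [text_match, link_score].
training_pairs = [
    ([0.9, 0.2], [0.3, 0.8]),
    ([0.8, 0.5], [0.4, 0.4]),
    ([0.7, 0.1], [0.2, 0.3]),
]
weights = train_pairwise(training_pairs, n_features=2)

# Rank unseen candidates by the learned score, highest first.
candidates = {"x": [0.9, 0.1], "y": [0.2, 0.9]}
ranked = sorted(candidates, key=lambda name: score(weights, candidates[name]), reverse=True)
```

With these made-up preferences the model learns to favor the text-match feature, so candidate "x" is ranked ahead of "y". Real systems train on far more data and typically use richer models and loss functions.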