Glossary Term
Cross-language information retrieval
Introduction to Cross-language Information Retrieval
- CLIR is a subfield of information retrieval.
- CLIR deals with retrieving information written in a language different from that of the user's query.
- Synonyms for CLIR include cross-lingual information retrieval and multilingual information retrieval.
- The term multilingual information retrieval is also used more broadly, covering both retrieval from multilingual collections and the handling of material translated from one language to another.
- CLIR systems rely on translation techniques that are dictionary-based, parallel-corpora-based, comparable-corpora-based, or machine-translation-based.
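To make the dictionary-based approach concrete, here is a minimal sketch in Python. The English-to-German dictionary, the three-document collection, and the function names translate_query and rank are all hypothetical and exist only for illustration. A real system would use a full bilingual lexicon and a proper ranking model rather than raw term counts.

```python
from collections import Counter

# Hypothetical toy English->German dictionary (assumption, not a real lexicon).
EN_DE = {
    "car": ["auto", "wagen"],
    "insurance": ["versicherung"],
    "price": ["preis"],
}

# Hypothetical target-language collection (assumption).
DOCS = {
    "d1": "auto versicherung preis vergleich",
    "d2": "wetter bericht berlin",
    "d3": "wagen kaufen preis",
}


def translate_query(query, dictionary):
    """Replace each source term with all of its dictionary translations.

    Terms missing from the dictionary are kept unchanged, a common
    fallback for proper nouns.
    """
    translated = []
    for term in query.lower().split():
        translated.extend(dictionary.get(term, [term]))
    return translated


def rank(query_terms, docs):
    """Rank documents by how many query term occurrences they contain."""
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.split())
        score = sum(counts[t] for t in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


terms = translate_query("car insurance price", EN_DE)
results = rank(terms, DOCS)
print(terms)    # ['auto', 'wagen', 'versicherung', 'preis']
print(results)  # ['d1', 'd3']
```

Keeping every translation of an ambiguous term, as done here for "car", trades precision for recall. Choosing a single best translation is the alternative design.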
Improvements in CLIR Systems
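- CLIR systems have improved to the point where the most accurate ones are nearly as effective as monolingual systems.
- The benefits of CLIR technology have been found to be greater for users with poor to moderate competence in the target language than for fluent users.
- Technologies used in CLIR services include morphological analysis to handle inflection, decompounding (compound splitting) to handle compound terms, and translation mechanisms to translate a query from one language to another.
- Variation in human language creates coverage challenges: a text may treat the topic of interest yet use terms that do not match those in the user's query.
- The problem is more pronounced in the cross-language setting, where users may know the target language only to some extent.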
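The inflection handling and decompounding mentioned above can be sketched as follows. This is a deliberately crude illustration, not a description of any particular CLIR system. The toy German vocabulary, the suffix list, and the function names strip_suffix, split_compound, and normalize are assumptions. Production systems typically use a full morphological analyzer or an established stemmer instead.

```python
# Hypothetical toy German vocabulary of known word parts (assumption).
VOCAB = {"kranken", "haus", "versicherung", "auto", "bahn"}

# Crude list of inflectional endings, longest first (assumption).
SUFFIXES = ("ern", "en", "er", "es", "e", "n", "s")


def strip_suffix(word, min_stem=4):
    """Remove one inflectional ending if at least min_stem letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def split_compound(word, vocab, min_len=3):
    """Split a word into known vocabulary parts, or return it unsplit."""
    word = word.lower()
    if word in vocab:
        return [word]
    for i in range(min_len, len(word) - min_len + 1):
        head, tail = word[:i], word[i:]
        if head in vocab:
            rest = split_compound(tail, vocab, min_len)
            if all(part in vocab for part in rest):
                return [head] + rest
    return [word]


def normalize(word):
    """Strip an inflectional ending, then split the result into parts."""
    return split_compound(strip_suffix(word.lower()), VOCAB)


print(normalize("Krankenhauses"))  # ['kranken', 'haus']
print(normalize("Autobahnen"))     # ['auto', 'bahn']
```

Applying the same normalization to both queries and documents lets an inflected compound in a text match a query on any of its parts, which addresses part of the coverage problem described above.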
Related Information Access Tasks
- Related information access tasks such as media monitoring, information filtering, sentiment analysis, and information extraction require more sophisticated models than retrieval does.
- These tasks typically involve more processing and analysis of the information items of interest.
- Much of this processing needs to be aware of the specifics of the target languages in which it is deployed.
- CLIR technology can be applied to these tasks to improve their effectiveness.
- CLIR systems can handle inflection, compound terms, and translation of queries.
Workshops and Conferences Related to CLIR
- The first workshop on CLIR was held in Zürich during the SIGIR-96 conference.
- Workshops on CLIR have been held yearly since 2000 at the meetings of the Cross-Language Evaluation Forum (CLEF).
- The Text Retrieval Conference (TREC) serves as a point of reference for the CLIR subfield.
- Early CLIR experiments were conducted at TREC-6 in 1997.
- Researchers discuss their findings regarding different CLIR systems and methods at TREC.
Additional Resources and References
- EXCLAIM (EXtensible Cross-Linguistic Automatic Information Machine) is a related technology.
- CLEF (Conference and Labs of the Evaluation Forum, formerly the Cross-Language Evaluation Forum) is a forum for evaluating CLIR systems.
- References include articles on matching meaning for CLIR, introduction to CLIR approaches, and multilingual information access.
- The proceedings of the first CLIR workshop can be found in the book 'Cross-Language Information Retrieval.'
- External links include a resource page and a search engine for CLIR.