Glossary Term

Cross-language information retrieval

Introduction to Cross-language Information Retrieval
- CLIR is a subfield of information retrieval.
- CLIR deals with retrieving information written in a language different from that of the user's query.
- Synonyms for CLIR include cross-lingual information retrieval and multilingual information retrieval.
- CLIR technology is used both to retrieve from multilingual collections and to translate material from one language to another.
- CLIR systems rely on several translation techniques: dictionary-based, parallel-corpora-based, comparable-corpora-based, and machine-translation-based.

Improvements in CLIR Systems
- CLIR systems have improved significantly and are now nearly as effective as monolingual systems.
- CLIR technology mainly benefits users with poor to moderate competence in the target language, that is, users who know it only to some extent.
- CLIR services combine technologies such as morphological analysis, decompounding, and translation mechanisms.
- Coverage remains a challenge for CLIR systems because of the variation in human language.

Related Information Access Tasks
- Other information access tasks, such as media monitoring, information filtering, sentiment analysis, and information extraction, require more sophisticated models.
- These tasks typically involve more processing and analysis of the information items of interest.
- That processing needs to be aware of the specifics of the target languages.
- CLIR technology can be applied to these tasks to improve their effectiveness.
- CLIR systems can handle inflection, compound terms, and the translation of queries.

Workshops and Conferences Related to CLIR
- The first workshop on CLIR was held in Zürich during the SIGIR-96 conference.
- Workshops on CLIR have been held yearly since 2000 at the Cross Language Evaluation Forum (CLEF) meetings.
- The Text Retrieval Conference (TREC) serves as a point of reference for the CLIR subfield.
- Early CLIR experiments were conducted at TREC-6 in 1997.
- At TREC, researchers discuss their findings on different CLIR systems and methods.

Additional Resources and References
- EXCLAIM (EXtensible Cross-Linguistic Automatic Information Machine) is a related technology.
- CLEF (Conference and Labs of the Evaluation Forum, formerly the Cross-Language Evaluation Forum) is a forum for evaluating CLIR systems.
- References include articles on matching meaning for CLIR, an introduction to CLIR approaches, and multilingual information access.
- The proceedings of the first CLIR workshop can be found in the book 'Cross-Language Information Retrieval.'
- External links include a resource page and a search engine for CLIR.
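
Example: dictionary-based query translation
As an illustration of the dictionary-based translation technique listed in the introduction, the following Python sketch translates a query term by term and then counts matches against a document in the target language. It is a minimal sketch under stated assumptions, not a description of any particular CLIR system: the toy German-to-English dictionary and the function names `translate_query` and `score` are hypothetical and chosen only for this example. Terms missing from the dictionary are passed through unchanged, which also hints at the coverage problem mentioned above.

```python
from typing import Dict, List

# Hypothetical toy German-to-English dictionary (an assumption for this sketch);
# a real system would load a much larger bilingual lexicon.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "haus": ["house", "home"],
    "katze": ["cat"],
    "suche": ["search", "retrieval"],
}


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Translate each query term, keeping every listed alternative.

    Terms that are not in the dictionary are passed through unchanged,
    a simple fallback for names and other out-of-vocabulary words.
    """
    translated: List[str] = []
    for term in query.lower().split():
        translated.extend(dictionary.get(term, [term]))
    return translated


def score(document: str, query_terms: List[str]) -> int:
    """Count how many translated query terms occur in the document."""
    doc_terms = set(document.lower().split())
    return sum(1 for term in query_terms if term in doc_terms)


if __name__ == "__main__":
    terms = translate_query("Katze Haus Berlin", TOY_DICTIONARY)
    print(terms)
    print(score("The cat sat in the house", terms))
```

Keeping every translation alternative trades precision for recall; a fuller system would typically add morphological analysis and decompounding before the dictionary lookup, and weight the alternatives rather than counting them equally.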