Skip to main content
Glossary Term

Text mining

Introduction to Text Mining and Text Analytics - Text mining is the process of deriving high-quality information from text. - It involves automatically extracting information from written resources such as websites, books, emails, reviews, and articles. - High-quality information is obtained by devising patterns and trends using statistical pattern learning. - Text mining tasks include text categorization, text clustering, concept/entity extraction, sentiment analysis, and document summarization. - Text mining involves the application of natural language processing, algorithms, and analytical methods to turn text into data for analysis. - Text analytics uses linguistic, statistical, and machine learning techniques to model and structure the information content of textual sources. - It is synonymous with text mining and is used in business settings. - Text analytics is used to respond to business problems and extract knowledge from unstructured text. - 80% of business-relevant information originates in unstructured form, primarily text. - Text analytics discovers and presents facts, business rules, and relationships that are otherwise locked in textual form. Applications of Text Mining - Text mining technology is applied to government, research, and business needs. - Legal professionals use text mining for e-discovery. - Governments and military groups use text mining for national security and intelligence purposes. - Scientific researchers use text mining to organize large sets of text data and support scientific discovery. - In business, text mining is used for competitive intelligence and automated ad placement. - Text mining software packages are marketed for security applications. - They are used for monitoring and analyzing online plain text sources for national security purposes. - Text mining is also involved in the study of text encryption/decryption. - Text mining has various applications in the biomedical literature. - It assists with studies in protein docking, protein interactions, and protein-disease associations. - Text mining facilitates clinical studies and precision medicine with large patient textual datasets. - It helps in the stratification and indexing of specific clinical events in electronic health records. - Text mining algorithms analyze symptoms, side effects, and comorbidities in healthcare data. Software Applications - Text mining methods and software developed by major firms like IBM and Microsoft. - Efforts in the public sector to create software for tracking and monitoring terrorist activities. - Weka software as a popular option for study purposes. - NLTK toolkit for Python programmers. - Gensim library for advanced programmers focusing on word embedding-based text representations. Online Media Applications - Text mining used by large media companies like the Tribune Company to clarify information and improve search experiences. - Editors benefit from text mining by being able to share, associate, and package news across properties. - Increased opportunities to monetize content through text mining. - Improved site stickiness and revenue through better search experiences. - Use of text mining on the back end of online media platforms. Business and Marketing Applications - Text analytics used in business, particularly in marketing and customer relationship management. - Application of text mining to improve predictive analytics models for customer churn. - Text mining applied in stock returns prediction. - Use of text mining to gain insights into customer behavior and preferences. - Integration of text mining in marketing strategies for targeted advertising.