Glossary Term
Text mining
Introduction to Text Mining and Text Analytics
- Text mining is the process of deriving high-quality information from text.
- It involves automatically extracting information from written resources such as websites, books, emails, reviews, and articles.
- High-quality information is typically obtained by identifying patterns and trends, for example through statistical pattern learning.
- Text mining tasks include text categorization, text clustering, concept/entity extraction, sentiment analysis, and document summarization.
- Text mining involves the application of natural language processing, algorithms, and analytical methods to turn text into data for analysis.
- Text analytics uses linguistic, statistical, and machine learning techniques to model and structure the information content of textual sources.
- The term is roughly synonymous with text mining and is used more often in business settings.
- Text analytics is used to respond to business problems and extract knowledge from unstructured text.
- An often-cited estimate holds that about 80% of business-relevant information originates in unstructured form, primarily text.
- Text analytics discovers and presents facts, business rules, and relationships that are otherwise locked in textual form.
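As a minimal sketch of what "turning text into data" can look like, the example below builds a simple bag-of-words count matrix using only the Python standard library. The sample documents, the regular-expression tokenizer, and the function names are illustrative assumptions, not part of any particular text mining package.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts(documents):
    """Return one Counter of token frequencies per document."""
    return [Counter(tokenize(doc)) for doc in documents]


# Hypothetical sample documents, for illustration only.
docs = [
    "The delivery was late and the package was damaged.",
    "Great service, fast delivery!",
]

counts = term_counts(docs)

# The vocabulary is the sorted set of all tokens seen in any document.
vocabulary = sorted(set().union(*counts))

# Each row is a document; each column is the count of one vocabulary term.
matrix = [[c[term] for term in vocabulary] for c in counts]
```

A matrix like this is one common starting point for tasks such as categorization or clustering; practical systems usually add steps such as stop-word removal or term weighting.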
Applications of Text Mining
- Text mining technology is applied to government, research, and business needs.
- Legal professionals use text mining for e-discovery.
- Governments and military groups use text mining for national security and intelligence purposes.
- Scientific researchers use text mining to organize large sets of text data and support scientific discovery.
- In business, text mining is used for competitive intelligence and automated ad placement.
- Text mining software packages are marketed for security applications.
- They are used for monitoring and analyzing online plain text sources for national security purposes.
- Text mining is also involved in the study of text encryption/decryption.
- Text mining is applied in a range of ways to the biomedical literature.
- It assists with studies in protein docking, protein interactions, and protein-disease associations.
- Text mining facilitates clinical studies and precision medicine with large patient textual datasets.
- It helps in the stratification and indexing of specific clinical events in electronic health records.
- Text mining algorithms analyze symptoms, side effects, and comorbidities in healthcare data.
Software Applications
- Major firms, including IBM and Microsoft, develop text mining methods and software.
- Public-sector efforts focus on creating software for tracking and monitoring terrorist activities.
- Weka is a popular option for study purposes.
- The NLTK toolkit is available for Python programmers.
- The Gensim library is aimed at more advanced programmers and focuses on word embedding-based text representations.
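To illustrate the idea behind word embedding-based text representations, the sketch below averages per-word vectors into a document vector and compares documents by cosine similarity. It deliberately does not use Gensim or any other library: the three-dimensional vectors are invented toy values, and the function names are hypothetical. Real embeddings are learned from large corpora.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
TOY_VECTORS = {
    "refund": [0.9, 0.1, 0.0],
    "payment": [0.8, 0.2, 0.1],
    "goal": [0.0, 0.9, 0.3],
    "match": [0.1, 0.8, 0.4],
}


def document_vector(tokens, vectors):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # no token had a vector
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


billing = document_vector(["refund", "payment", "please"], TOY_VECTORS)
billing_2 = document_vector(["payment"], TOY_VECTORS)
sports = document_vector(["goal", "match"], TOY_VECTORS)
```

With these toy values, the two billing-related documents score a higher cosine similarity with each other than either does with the sports document, which is the behavior embedding-based representations are meant to capture.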
Online Media Applications
- Large media companies, such as the Tribune Company, use text mining to clarify information and improve readers' search experiences.
- Editors benefit by being able to share, associate, and package news across properties.
- This increases opportunities to monetize content.
- Better search experiences improve site stickiness and revenue.
- Text mining is used on the back end of online media platforms.
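One back-end building block for search is an inverted index, which maps each term to the articles that contain it. The sketch below is a simplified illustration under stated assumptions: the article ids, headlines, and function names are hypothetical, and it is not a description of how any named media company implements search.

```python
import re
from collections import defaultdict


def build_index(articles):
    """Map each lowercase token to the set of article ids containing it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(article_id)
    return index


def search(index, query):
    """Return the ids of articles that contain every token in the query."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical headlines, for illustration only.
articles = {
    "a1": "City council approves new transit budget",
    "a2": "Transit strike delays morning commute",
    "a3": "Council debates school budget",
}

index = build_index(articles)
```

Calling `search(index, "transit budget")` returns only the articles that mention both terms. Production search systems typically add ranking, stemming, and entity tagging on top of this basic structure.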
Business and Marketing Applications
- Text analytics is used in business, particularly in marketing and customer relationship management.
- Text mining is applied to improve predictive analytics models for customer churn.
- It has also been applied to stock returns prediction.
- Text mining helps businesses gain insight into customer behavior and preferences.
- Marketing strategies integrate text mining for targeted advertising.
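As a hedged illustration of how free-text customer feedback might be turned into numeric inputs for a model such as a churn predictor, the sketch below counts words from two tiny hand-made lexicons. The word lists, the feature names, and the sample feedback are all invented for this example; real systems generally rely on much larger lexicons or trained classifiers.

```python
import re

# Tiny hand-made lexicons, invented for illustration only.
POSITIVE = {"great", "helpful", "fast", "love"}
NEGATIVE = {"cancel", "slow", "broken", "refund", "frustrated"}


def sentiment_features(text):
    """Count lexicon hits and compute a length-normalized net score."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = len(tokens)
    return {
        "positive_count": pos,
        "negative_count": neg,
        # Net score lies in [-1, 1]; empty text scores 0.0.
        "net_score": (pos - neg) / total if total else 0.0,
    }


# Hypothetical customer message.
feedback = "Support was slow and I am frustrated. I want to cancel."
features = sentiment_features(feedback)
```

Features like these could be appended to structured customer data (for example, tenure or purchase history) before training a predictive model.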