Day 1
IST 736
1.1 Readings
1.2 Student intro
1.3 Text Representation/Vectorization
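As a rough illustration of what vectorization means here, the sketch below builds a bag-of-words count vector for each document. The function names (`build_vocabulary`, `vectorize`), the whitespace tokenizer, and the toy documents are assumptions made for this example, not material from the class.

```python
def build_vocabulary(docs):
    """Map each distinct lowercased token to a column index (alphabetical order)."""
    terms = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {term: idx for idx, term in enumerate(terms)}


def vectorize(doc, vocab):
    """Return a list of term counts, one entry per vocabulary term."""
    vec = [0] * len(vocab)
    for tok in doc.lower().split():
        if tok in vocab:  # tokens outside the vocabulary are ignored
            vec[vocab[tok]] += 1
    return vec


# Toy documents, invented for illustration only.
docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)
matrix = [vectorize(d, vocab) for d in docs]
```

Each row of `matrix` is one document and each column is one vocabulary term, which is the document-term layout that most text mining methods expect as input.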
1.4 Exploratory Text Mining
- Corpus Statistics
- Document Clustering
- Topic Modeling
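Of the exploratory tasks listed above, corpus statistics is the simplest to sketch. The example below computes a few common ones (token count, vocabulary size, average document length, term and document frequencies) using only the standard library. The regex tokenizer, the `corpus_statistics` helper, and the toy documents are assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def corpus_statistics(docs):
    """Compute basic descriptive statistics over a list of raw document strings."""
    tokenized = [tokenize(d) for d in docs]
    # Term frequency: how many times each term occurs in the whole corpus.
    term_freq = Counter(t for doc in tokenized for t in doc)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    total = sum(term_freq.values())
    return {
        "num_docs": len(docs),
        "num_tokens": total,
        "vocab_size": len(term_freq),
        "avg_doc_length": total / len(docs) if docs else 0.0,
        "term_freq": term_freq,
        "doc_freq": doc_freq,
    }


# Toy corpus, invented for illustration only.
toy_docs = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats make good pets.",
]
stats = corpus_statistics(toy_docs)
```

Note that this tokenizer does no stemming, so "cat" and "cats" count as separate vocabulary entries.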
1.5 Predictive Text Mining
- Text categorization
  - Sentiment classification
  - News topic classification
  - Genre classification
  - Algorithms used: Naive Bayes and SVM
- Regression problems
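To make the Naive Bayes approach to text categorization concrete, here is a minimal sketch of a multinomial Naive Bayes sentiment classifier with add-one (Laplace) smoothing, written in plain Python. The class name, the whitespace tokenizer, and the four training sentences are all assumptions for illustration. In practice a library implementation would be the usual choice.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over whitespace tokens (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 gives add-one smoothing

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        # Tokens never seen in training carry no information, so drop them.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / n_docs)
            for t in tokens:
                numerator = self.word_counts[label][t] + self.alpha
                score += math.log(numerator / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "boring plot awful acting",
]
train_labels = ["pos", "pos", "neg", "neg"]
clf = NaiveBayesTextClassifier().fit(train_docs, train_labels)
```

Calling `clf.predict("great wonderful movie")` returns the label whose smoothed log probability is highest. Working in log space avoids numeric underflow when documents are long.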
1.6 Difference between Text Mining and NLP
- NLP
  - Deep linguistic analysis
  - (May take a long time to analyze large collections)
- Text Mining
  - Shallow analysis (e.g. N-grams) for quick analysis of large collections
  - (Sometimes uses deep NLP features, such as PoS tags or dependencies, for feature engineering)
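The N-grams mentioned above are cheap to compute, which is part of why shallow analysis scales to large collections. A minimal sketch follows, assuming whitespace tokenization and a hypothetical helper named `ngrams`:

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Example sentence, invented for illustration only.
tokens = "text mining favors shallow features for large collections".split()
bigrams = ngrams(tokens, 2)
bigram_counts = Counter(bigrams)  # counts can serve directly as features
```

A sequence shorter than `n` produces an empty list, so no special casing is needed for very short documents.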
1.7 Class Policies
UNSCHOOLING:
Khan Academy