Portfolio
OUTLINE OVERVIEW
The overall goal of the Project Portfolio is to demonstrate to a panel of faculty experts that the student is able to:
- Describe a broad overview of the major practice areas in data science
- Collect and organize data
- Identify patterns in data via visualization, statistical analysis, and data mining
- Develop alternative strategies based on the data
- Develop a plan of action to implement the business decisions derived from the analyses
- Demonstrate communication skills regarding data and its analysis for managers, IT professionals, programmers, statisticians, and other relevant professionals in their organization
- Synthesize the ethical dimensions of data science practice (e.g., privacy)
OUTLINE
1. Broad Overview
2. Collect & Organize Data
COLLECTING DATA:
CLEANING DATA:
- How to deal with “dirty” data? HW3_B
LABELING DATA:
- Using Amazon Mechanical Turk HW5
3. Identify Patterns & Visualize Data
- HW7 – Kaggle Sentiment HW7_V2
- Data Viz – Happiness Graphs Graphs in R
4. Analyze Data
- Naive Bayes (IST736_HW4_HW6) HW4_6
- SVM (IST736_HW7_V2) HW7_V2
- Topic Modeling (IST736_HW8) HW8
- Using NLP (NLP_FinalProject) NLP_FinalProject
5. Implement Business Decisions
- Final Project
- Joker?
- Scripting for IMDB
6. Communicate Analysis (Visualizations)
7. Review Ethical Ramifications
- Final Project
- HW3_B
- Scraping TMDB
- Using ARM for IMDB (Woody Allen)
10 Projects By Class:
IST ??? – Natural Language Processing
IST ??? – Scripting for Data Analysis
- IMDB Final Project
IST 736 – Text Mining
Projects by Topic (with code)
10: IST 736 HW8 – Topic Modeling
Projects In Progress
CLASSES:
iSchool:
IST 659 – DATABASE ADMIN & MGMT
IST 687 – INTRO TO DATA SCIENCE