IST 736 WK7 Class 11.13.19
Dimensionality Reduction methods
Cannot reduce dimensions in protein folding Order matters so much Facial Recognition – Needs Advanced Linear Algebra
PCA and ICA
aren’t going to give things that the model things is important These generate the ones with the most information Common to use PCA to reduce to 3D so we can visualize
ICA
Highest independence from each other Used a lot for signal separation
Calculate variance Calculate co-variance – correlation (if high correlation, don’t need both)
TODO: Code your OWN svm in python legrangian multipliers
LABEL IS NEVER PART OF THE DATASET
SVMS
Can my dataset be broken into two groups? Can it be linearly separated? A single SVM can only classify into one of two groups If you have MULTIPLE variables, you use MULTIPLE svms First SVM is mammal and not mammal. Second SVM is fish or birds ^^ Called an ENSEMBLE of SVMs it will find THE BEST linear separater It has to define what is best Soft Margin SVM
A kernel is a function applied to… all the things
SVM ONLY works on NUMERIC data
Cannot label Titanic data as 0 and 1
Look up min max
When you build a model, you know the testing and training and you have the labels for both NORMALIZE THE DATA BEFORE SPLITTING INTO TEST AND TRAIN
Google “how to find encoding for text”