IST 736 WK7 Class 11.13.19

1 minute read

Dimensionality Reduction methods

Cannot reduce dimensions in protein folding Order matters so much Facial Recognition – Needs Advanced Linear Algebra

PCA and ICA

aren’t going to give things that the model things is important These generate the ones with the most information Common to use PCA to reduce to 3D so we can visualize

ICA

Highest independence from each other Used a lot for signal separation

Calculate variance Calculate co-variance – correlation (if high correlation, don’t need both)

TODO: Code your OWN svm in python legrangian multipliers

LABEL IS NEVER PART OF THE DATASET

SVMS

Can my dataset be broken into two groups? Can it be linearly separated? A single SVM can only classify into one of two groups If you have MULTIPLE variables, you use MULTIPLE svms First SVM is mammal and not mammal. Second SVM is fish or birds ^^ Called an ENSEMBLE of SVMs it will find THE BEST linear separater It has to define what is best Soft Margin SVM

A kernel is a function applied to… all the things

SVM ONLY works on NUMERIC data

Cannot label Titanic data as 0 and 1

Look up min max

When you build a model, you know the testing and training and you have the labels for both NORMALIZE THE DATA BEFORE SPLITTING INTO TEST AND TRAIN

Google “how to find encoding for text”

Updated: