Week 8
Types of Data for Classification
Qualitative Data
Categorical or nominal data
Ordinal or Ranked Data
Quantitative Data
Discrete Data
Continuous measurements
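The four data types above can be represented directly in pandas. The following is a minimal sketch, assuming a small made-up table; the column names (`blood_type`, `grade`, `n_children`, `height_m`) and values are hypothetical and only illustrate nominal, ordinal, discrete, and continuous columns.

```python
import pandas as pd

# Hypothetical toy table, one column per data type.
df = pd.DataFrame({
    # Nominal: categories with no natural order.
    "blood_type": pd.Categorical(["A", "O", "B", "O"]),
    # Ordinal: categories with an explicit ranking.
    "grade": pd.Categorical(
        ["low", "high", "medium", "low"],
        categories=["low", "medium", "high"],
        ordered=True,
    ),
    # Discrete: integer counts.
    "n_children": [0, 2, 1, 3],
    # Continuous: real-valued measurements.
    "height_m": [1.62, 1.80, 1.75, 1.68],
})

print(df.dtypes)

# Ordered categories support comparisons and min/max; nominal ones do not.
highest_grade = df["grade"].max()
above_low = (df["grade"] > "low").tolist()
print(highest_grade, above_low)
```

Marking a column as ordered matters later: an ordinal column can often be mapped to integer ranks, while a nominal column usually needs dummy encoding.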
Handling Categorical/Nominal Data: Dummy Encoding
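Dummy encoding replaces a nominal column with one 0/1 indicator column per category. Newer statsmodels releases may no longer ship the `categorical` helper used in this section, so here is a minimal sketch of the same idea with `pandas.get_dummies`. The DataFrame and its column names (`color`, `size_cm`) are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: "color" is nominal, "size_cm" is numeric.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size_cm": [1.5, 2.0, 0.7, 1.1],
})

# One 0/1 indicator column per category; the original "color" column is replaced.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

# drop_first=True keeps k-1 indicators for k categories, which avoids
# perfectly collinear columns in linear models.
encoded_k_minus_1 = pd.get_dummies(
    df, columns=["color"], drop_first=True, dtype=int
)

print(encoded)
print(encoded_k_minus_1)
```

Tree-based models such as Random Forest usually work fine with the full set of indicators; the `drop_first` variant is mainly relevant for linear models.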
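from statsmodels.tools import categorical
# data: array or DataFrame that holds the categorical column
cat_encod = categorical(data, dictnames=False, drop=True)
# dictnames: if True, also return a dict mapping each dummy column number to its category name
# drop: if True, drop the original categorical column from the returned array
Support Vector Machine (SVM)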
https://oceandatamining.sciencesconf.org/data/program/OBIDAM14_Canu.pdf
https://web.stanford.edu/~hastie/Papers/ESLII.pdf
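A minimal sketch of an SVM classifier with scikit-learn, assuming the built-in iris dataset as stand-in data. The kernel and the values of `C` and `gamma` are illustrative choices, not settings prescribed by these notes. Features are standardized first because SVMs are sensitive to feature scale.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Scale the features, then fit an RBF-kernel SVM.
svm_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm_clf.fit(X_train, y_train)

svm_accuracy = svm_clf.score(X_test, y_test)
print(f"SVM test accuracy: {svm_accuracy:.3f}")
```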
Naive Bayes
A family of classification algorithms based on Bayes' theorem that assume the features are conditionally independent given the class.
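A minimal sketch of one member of this family, Gaussian Naive Bayes, assuming the iris dataset as stand-in data. `GaussianNB` models each feature as normally distributed within each class; other variants (for example `MultinomialNB`) suit count data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Stand-in data (assumption): the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

nb_clf = GaussianNB().fit(X_train, y_train)

# Posterior class probabilities for the first three test samples.
nb_proba = nb_clf.predict_proba(X_test[:3])
nb_accuracy = nb_clf.score(X_test, y_test)

print(nb_proba.round(3))
print(f"Naive Bayes test accuracy: {nb_accuracy:.3f}")
```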
Random Forest
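A Random Forest is an ensemble of decision trees, each trained on a bootstrap sample of the data, whose predictions are combined by voting. The following is a minimal sketch with scikit-learn, assuming the iris dataset as stand-in data and 100 trees as an illustrative setting.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): the iris dataset bundled with scikit-learn.
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=0, stratify=data.target
)

# n_estimators sets the number of trees in the forest.
rf_clf = RandomForestClassifier(n_estimators=100, random_state=0)
rf_clf.fit(X_train, y_train)
rf_accuracy = rf_clf.score(X_test, y_test)

print(f"trees in the forest: {len(rf_clf.estimators_)}")
print(f"test accuracy: {rf_accuracy:.3f}")

# Impurity-based importance of each feature, averaged over the trees.
for name, importance in zip(data.feature_names, rf_clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```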
Classification Metrics: Confusion Matrix
The diagonal elements of the matrix contain the number of correctly identified samples for each class; the off-diagonal elements count the misclassified samples.
Important note: use a heatmap to plot the confusion matrix (CM).
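A minimal sketch of computing a confusion matrix and drawing it as a heatmap. The label lists are made up for illustration. The plot uses plain matplotlib (`imshow`); `seaborn.heatmap(cm, annot=True, fmt="d")` is a common alternative if seaborn is installed.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for a 3-class problem.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Diagonal entries are the correctly classified samples per class.
correct_per_class = np.diag(cm)
accuracy = correct_per_class.sum() / cm.sum()
print(cm)
print(f"accuracy: {accuracy:.2f}")

# Heatmap of the matrix with the count written in each cell.
fig, ax = plt.subplots()
im = ax.imshow(cm, cmap="Blues")
fig.colorbar(im, ax=ax)
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center")
ax.set_xticks(range(cm.shape[1]))
ax.set_yticks(range(cm.shape[0]))
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix")
```

In an interactive session, call `plt.show()` to display the figure, or `fig.savefig(...)` to write it to a file.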
Exercise
RF
Import the dataset from the following ‘url’ and classify it with a decision tree and a Random Forest (RF) with 5 trees, then compare the results on the testing data using confusion matrices.
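One possible starting point for the exercise, as a sketch only. The dataset URL is not given here, so the code assumes scikit-learn's built-in wine dataset as a placeholder; swap in the course data (for example with `pandas.read_csv`) once the URL is known. The split ratio and random seeds are also assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): replace with the dataset from the course URL.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest (5 trees)": RandomForestClassifier(n_estimators=5, random_state=0),
}

# Fit each model and store its confusion matrix on the test split.
confusion = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    confusion[name] = confusion_matrix(y_test, model.predict(X_test))
    print(name)
    print(confusion[name])
```

Comparing the two matrices cell by cell shows which classes each model confuses; the larger the diagonal relative to the rest, the better the classifier did on the test data.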