from statsmodels.tools import categorical
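cat_encod = categorical(data, dictnames=False, drop=True)
# dictnames: if True, also return a dict mapping column numbers to category names
# drop: if True, drop the original categorical column from the returned array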
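The statsmodels helper above may not be available in newer statsmodels releases. A comparable dummy encoding can be sketched with pandas. The toy DataFrame and its column names are assumptions made only for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
data = pd.DataFrame({
    "color": ["red", "green", "red", "blue"],
    "size": [1, 2, 3, 4],
})

# One indicator (dummy) column per category. The original 'color'
# column is replaced by the indicator columns.
encoded = pd.get_dummies(data, columns=["color"])
print(encoded.columns.tolist())
```

Passing drop_first=True to get_dummies omits one indicator column per variable, which avoids perfectly collinear columns in linear models.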
Support Vector Machine (SVM)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=f)
# f: fraction of the data held out for testing, e.g. 0.25
from sklearn.svm import SVC
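svc = SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', probability=False, tol=0.001, max_iter=-1, random_state=None)
svc.fit(x_train, y_train)
svc.predict(x_test)   # predictions on the test set
svc.predict(x_train)  # predictions on the training set
# C: regularization (penalty) parameter; must be positive
# kernel: 'linear', 'rbf', 'sigmoid', 'poly'
# degree: degree of the polynomial, used only when kernel='poly'
# tol: tolerance for the stopping criterion
# max_iter: maximum number of iterations; -1 means no limit
# random_state: random seed to use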
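A minimal end-to-end sketch of the SVM steps above. The iris dataset bundled with scikit-learn, the 0.25 test fraction, the fixed random_state and gamma='scale' are assumptions chosen only to make the example runnable.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumed stand-in data: the iris dataset shipped with scikit-learn.
x, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for testing; stratify keeps class balance.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0, stratify=y
)

svc = SVC(C=1.0, kernel='rbf', gamma='scale')
svc.fit(x_train, y_train)

# Compare accuracy on both splits; a large gap suggests overfitting.
train_acc = accuracy_score(y_train, svc.predict(x_train))
test_acc = accuracy_score(y_test, svc.predict(x_test))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Predicting on both x_train and x_test, as in the snippet above, is what makes this comparison possible.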
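Naive Bayes
A family of classification algorithms based on Bayes' theorem, all sharing the "naive" assumption that the features are conditionally independent given the class.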
from sklearn.naive_bayes import GaussianNB
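NB = GaussianNB(priors=None)
# priors: prior probabilities of the classes; None estimates them from the data
NB.fit(X, y)
NB.predict(X_dash)  # X_dash: new samples to classify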
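A short runnable sketch of the Gaussian Naive Bayes calls above. The iris data, the split parameters and the use of the held-out split as X_dash are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumed stand-in data: the iris dataset shipped with scikit-learn.
X_all, y_all = load_iris(return_X_y=True)
X, X_dash, y, y_dash = train_test_split(
    X_all, y_all, test_size=0.25, random_state=0, stratify=y_all
)

NB = GaussianNB()  # priors=None: class priors are estimated from y
NB.fit(X, y)

y_pred = NB.predict(X_dash)       # hard class labels
proba = NB.predict_proba(X_dash)  # per-class posterior probabilities
nb_acc = accuracy_score(y_dash, y_pred)
print(f"accuracy on held-out samples: {nb_acc:.3f}")
```

predict_proba returns one row per sample, with one probability per class that sums to 1 across the row.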
Import the dataset from the following ‘url’, classify it with a decision tree and with a Random Forest (RF) using 5 trees, and compare the results on the test data using confusion matrices.
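One possible way to approach this exercise is sketched below. Since the URL is not given here, the breast cancer dataset bundled with scikit-learn is used as a stand-in. The dataset, the 0.3 test fraction and the fixed random_state are assumptions, not part of the exercise.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the dataset at the exercise URL (assumption).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Single decision tree.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

# Random Forest with 5 trees, as the exercise asks.
forest = RandomForestClassifier(n_estimators=5, random_state=0)
forest.fit(X_train, y_train)

# Confusion matrices on the test data:
# rows are true classes, columns are predicted classes.
cm_tree = confusion_matrix(y_test, tree.predict(X_test))
cm_forest = confusion_matrix(y_test, forest.predict(X_test))
print("Decision tree:\n", cm_tree)
print("Random Forest (5 trees):\n", cm_forest)
```

The diagonal entries count correct predictions and the off-diagonal entries count errors, so the two classifiers can be compared cell by cell.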