Fit and predict
Machine learning in Python is deceptively easy and convenient.
For any scikit-learn model, one simply calls:
# Fit the model to the data
clf.fit(X, y)
# Use the model to predict y
clf.predict(X)
For example, a DecisionTreeClassifier model can be fit to the famous Iris flower data set as follows:
# Import the built-in Iris data set from scikit-learn
from sklearn.datasets import load_iris
# Import a classifier, in this case a decision tree
from sklearn.tree import DecisionTreeClassifier
# Load the Iris features (X) and class labels (y)
X, y = load_iris(return_X_y=True)
# Create the classifier
clf = DecisionTreeClassifier()
# Fit the classifier to the data
clf.fit(X, y)
# Predict the class of the first two examples
clf.predict(X[0:2, :])
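Because the same two calls apply to any model, swapping one classifier for another changes only the line that creates it. The following is a minimal sketch of that idea. The choice of KNeighborsClassifier as the second model, the n_neighbors and random_state values, and the `models` and `predictions` names are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Two different model families, used through the identical interface
# (the second model and all parameter values are illustrative choices)
models = [
    DecisionTreeClassifier(random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

predictions = {}
for model in models:
    # Same fit and predict calls, regardless of the model type
    model.fit(X, y)
    name = type(model).__name__
    predictions[name] = model.predict(X[0:2, :])
    print(name, predictions[name])
```

Only the constructor differs between the two models; the loop body never needs to know which one it is handling.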
This interface makes it amazingly easy to use these relatively advanced statistical models. However, it's never a good idea to just throw a model at your data without first understanding the data and the model's assumptions.
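One reason for this caution is that predicting on the same data used for fitting says little about how the model handles unseen examples. The sketch below is one possible check, not a complete evaluation procedure: it holds out part of the Iris data with train_test_split and compares the accuracy on both parts. The 30% test size, the random_state values, and the stratified split are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 30% of the examples; the model never sees them during fitting
# (test_size, random_state and stratify are illustrative choices)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# score() returns the mean accuracy on the given data and labels
train_accuracy = clf.score(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

print(f"Accuracy on training data: {train_accuracy:.3f}")
print(f"Accuracy on held-out data: {test_accuracy:.3f}")
```

An unconstrained decision tree can typically memorize its training data, so the held-out accuracy is the more informative of the two numbers.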