
Batch Gradient Descent With Scikit Learn (sklearn)

I'm playing with a one-vs-all Logistic Regression classifier using Scikit-Learn (sklearn). I have a large dataset that is too slow to train on all in one go; I would also like to study…

Solution 1:

What you want is not batch gradient descent but stochastic gradient descent; batch learning means learning on the entire training set in one go, while what you describe is properly called minibatch learning. That's implemented in sklearn.linear_model.SGDClassifier, which fits a logistic regression model if you give it the option loss="log" (renamed to loss="log_loss" in scikit-learn 1.1).

With SGDClassifier, as with LogisticRegression, there's no need to wrap the estimator in a OneVsRestClassifier -- both do one-vs-all training out of the box.

from sklearn.linear_model import SGDClassifier

# you'll have to set a few other options to get good estimates,
# in particular max_iter, but this should get you going
lr = SGDClassifier(loss="log")
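
As a quick illustration of the one-vs-all point, here's a minimal sketch on a made-up three-class toy dataset (the data and parameter values are placeholders, not recommendations); the fitted model ends up with one weight row per class:

import numpy as np
from sklearn.linear_model import SGDClassifier

# toy data: 6 samples, 2 features, 3 classes (purely illustrative)
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1],
              [1.2, 0.9], [2.0, 2.1], [2.2, 1.9]])
y = np.array(["ham", "ham", "spam", "spam", "eggs", "eggs"])

# use loss="log" on scikit-learn versions older than 1.1
clf = SGDClassifier(loss="log_loss", max_iter=1000, random_state=0)
clf.fit(X, y)

# one binary subproblem is trained per class, so coef_ has one row per class
print(clf.coef_.shape)  # (3, 2)
print(clf.classes_)     # ['eggs' 'ham' 'spam']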

Then, to train on minibatches, use the partial_fit method instead of fit. The first time around, you have to feed it a list of classes because not all classes may be present in each minibatch:

import numpy as np

classes = np.unique(["ham", "spam", "eggs"])

for xs, ys in minibatches:
    lr.partial_fit(xs, ys, classes=classes)

(Here, I'm passing classes for each minibatch, which isn't necessary but doesn't hurt either and makes the code shorter.)
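
Putting it all together, here is a self-contained sketch of the whole loop; the synthetic dataset and the batch size are stand-ins for whatever your real pipeline produces:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# synthetic stand-in for a dataset too large to fit in a single call
X, y = make_classification(n_samples=10_000, n_features=20,
                           n_informative=10, n_classes=3, random_state=0)

classes = np.unique(y)  # must cover all labels, even ones absent from a batch
lr = SGDClassifier(loss="log_loss")  # loss="log" on scikit-learn < 1.1

batch_size = 500
for start in range(0, len(X), batch_size):
    xs = X[start:start + batch_size]
    ys = y[start:start + batch_size]
    lr.partial_fit(xs, ys, classes=classes)

print(lr.score(X, y))  # training-set accuracy, just as a sanity check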
