Predict_proba For A Cross-validated Model
I would like to predict probabilities from a logistic regression model with cross-validation. I know you can get the cross-validation scores, but is it possible to return the value
Solution 1:
This has been implemented since scikit-learn version 0.18. You can pass a 'method' string parameter to the cross_val_predict function. See the cross_val_predict documentation for details.
Example:
proba = cross_val_predict(logreg, X, y, cv=cv, method='predict_proba')
Also note that this is part of the new sklearn.model_selection package so you will need this import:
from sklearn.model_selection import cross_val_predict
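For a complete picture, here is a minimal end-to-end sketch of the call above. The synthetic dataset from make_classification, the 5-fold StratifiedKFold splitter, and the variable names are assumptions chosen for illustration; they are not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic binary data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

logreg = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5)

# Each row is predicted by a model that did not see that row during fitting.
proba = cross_val_predict(logreg, X, y, cv=cv, method='predict_proba')

print(proba.shape)  # one row per sample, one column per class
```

The result keeps the row order of X, so proba[i] lines up with y[i].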
Solution 2:
An easy workaround for this is to create a wrapper class, which for your case would be
from sklearn.linear_model import LogisticRegression

class proba_logreg(LogisticRegression):
    def predict(self, X):
        return LogisticRegression.predict_proba(self, X)
and then pass an instance of it as the classifier object to cross_val_predict
# cross validation probabilities
probas = cross_val_predict(proba_logreg(), X, y, cv=cv)
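As a rough check on the wrapper idea, the sketch below compares it with the built-in method='predict_proba' route on the same folds. The class name ProbaLogReg, the synthetic data, and the 4-fold splitter are hypothetical choices made for this example only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict


class ProbaLogReg(LogisticRegression):
    """Hypothetical wrapper: predict() returns class probabilities."""

    def predict(self, X):
        return self.predict_proba(X)


# Synthetic binary data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=120, n_features=4, random_state=1)
cv = StratifiedKFold(n_splits=4)

# Wrapper route: cross_val_predict calls predict(), which now yields probabilities.
probas_wrapped = cross_val_predict(ProbaLogReg(), X, y, cv=cv)

# Built-in route on the same folds, for comparison.
probas_builtin = cross_val_predict(
    LogisticRegression(), X, y, cv=cv, method='predict_proba'
)

print(np.allclose(probas_wrapped, probas_builtin))
```

Because predict no longer returns class labels in the wrapper, anything that relies on it, such as score, will not behave as usual, so the wrapper is best kept for this one purpose.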
Solution 3:
There is a function cross_val_predict that gives you the predicted values, but there is no such function for "predict_proba" yet. Maybe we could make that an option.
Solution 4:
This is easy to implement:
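import numpy as np
from sklearn.model_selection import KFold

def my_cross_val_predict(
    m, X, y, cv=KFold(),
    predict=lambda m, x: m.predict_proba(x),
    combine=np.vstack
):
    preds = []
    # Fit on each training fold, then predict on the held-out fold.
    for train, test in cv.split(X):
        m.fit(X[train, :], y[train])
        pred = predict(m, X[test, :])
        preds.append(pred)
    return combine(preds)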
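The loop above stacks the fold predictions in the order the folds are visited, which matches the original row order only when the splitter does not shuffle. A possible variant, sketched below, writes each fold's probabilities back by test index so the output stays aligned with X under any splitter. The helper name oof_predict_proba, the synthetic data, and the shuffled StratifiedKFold are assumptions for illustration, and the sketch assumes every training fold contains all classes.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold


def oof_predict_proba(model, X, y, cv):
    """Hypothetical helper: out-of-fold probabilities in original row order."""
    n_classes = len(np.unique(y))
    out = np.zeros((len(y), n_classes))
    for train, test in cv.split(X, y):
        fitted = clone(model).fit(X[train], y[train])  # fresh copy per fold
        out[test] = fitted.predict_proba(X[test])      # write back by row index
    return out


# Synthetic binary data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=150, n_features=5, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

probas = oof_predict_proba(LogisticRegression(max_iter=1000), X, y, cv)
print(probas.shape)
```

Cloning the model for each fold also leaves the estimator that was passed in unfitted, whereas the loop above refits the same object in place.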
This version returns the predict_proba output.
If you need both predict and predict_proba, just change the predict and combine arguments:
def stack(arrs):
    # Join 1-D label arrays end to end; stack 2-D probability arrays row-wise.
    if arrs[0].ndim == 1:
        return np.hstack(arrs)
    else:
        return np.vstack(arrs)

def my_cross_val_predict(
    m, X, y, cv=KFold(),
    predict=lambda m, x: [
        m.predict(x),
        m.predict_proba(x)
    ],
    combine=lambda preds: list(map(stack, zip(*preds)))
):
    preds = []
    for train, test in cv.split(X):
        m.fit(X[train, :], y[train])
        pred = predict(m, X[test, :])
        preds.append(pred)
    return combine(preds)