What Is the Difference Between class_weight=None and 'auto' in scikit-learn's SVM?
Solution 1:
This takes place in the class_weight.py file:
elif class_weight == 'auto':
    # Find the weight of each class as present in y.
    le = LabelEncoder()
    y_ind = le.fit_transform(y)
    if not all(np.in1d(classes, le.classes_)):
        raise ValueError("classes should have valid labels that are in y")
    # inversely proportional to the number of samples in the class
    recip_freq = 1. / np.bincount(y_ind)
    weight = recip_freq[le.transform(classes)] / np.mean(recip_freq)
This means that each class you have (in classes) gets a weight equal to 1 divided by the number of times that class appears in your data (y), so classes that appear more often get lower weights. Each of these inverse frequencies is then further divided by the mean of all the inverse class frequencies.
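The weighting above can be reproduced by hand. Here is a minimal sketch with made-up toy labels (the array y below is our own example, not from the source):

```python
import numpy as np

# Toy labels: class 0 appears three times, class 1 once.
y = np.array([0, 0, 0, 1])

# Inverse class frequencies, as in the scikit-learn snippet above.
recip_freq = 1.0 / np.bincount(y)       # [1/3, 1.0]

# Normalize by the mean of the inverse frequencies.
weights = recip_freq / recip_freq.mean()

print(weights)  # the rarer class 1 receives the larger weight: [0.5, 1.5]
```

Note how the normalization keeps the weights centered around 1, so the overall scale of the loss is roughly unchanged while rare classes still count for more.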
The advantage is that you no longer have to worry about setting the class weights yourself: this should already be good for most applications.
If you look above in the source code, for None, weight is filled with ones, so each class gets equal weight.
Solution 2:
This is quite an old post, but for anyone who has just run into this problem: class_weight='auto' has been deprecated since version 0.17. Use class_weight='balanced' instead.
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
This is implemented as follows:
n_samples / (n_classes * np.bincount(y))
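You can check this formula against scikit-learn's own helper, sklearn.utils.class_weight.compute_class_weight. A small sketch, again with made-up toy labels:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: class 0 appears three times, class 1 once.
y = np.array([0, 0, 0, 1])
classes = np.unique(y)

# The 'balanced' heuristic from the docs:
manual = len(y) / (len(classes) * np.bincount(y))   # [0.667, 2.0]

# scikit-learn's helper should agree.
balanced = compute_class_weight(class_weight='balanced', classes=classes, y=y)

print(manual, balanced)
```

Unlike the old 'auto' heuristic, 'balanced' is not normalized by the mean of the inverse frequencies; it scales so that the weighted sample counts per class are equal.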
Cheers!