Keras Model For Siamese Network Not Learning And Always Predicting The Same Output
Solution 1:
Mentioning the resolution to this issue in this section (even though it is present in the Comments Section), for the benefit of the community.
Since the model works fine with other standard datasets, the solution is to use more data. The model is not learning because it has too little data for training.
Solution 2:
The model also works fine with more data, as mentioned in the comments and in the answer by Tensorflow Support. Tweaking the model a little helps as well: changing the number of filters in the 2nd and 3rd convolutional layers from 256 to 64 decreases the number of trainable parameters by a large amount, after which the model starts learning. A sketch of that tweak follows.
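For illustration (the question's full architecture isn't reproduced here, so the kernel sizes, input shape, and surrounding layers are assumptions; only the 256 -> 64 filter change is the point):
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

input_shape = (1, 105, 105)  # assumed channels-first input shape
seq = Sequential()
seq.add(Conv2D(8, (5, 5), activation='relu', input_shape=input_shape, data_format="channels_first"))
seq.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_first"))
seq.add(Conv2D(64, (5, 5), activation='relu', data_format="channels_first"))  # 2nd conv layer: was 256 filters
seq.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_first"))
seq.add(Conv2D(64, (3, 3), activation='relu', data_format="channels_first"))  # 3rd conv layer: was 256 filters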
Solution 3:
I want to mention a few things here which may be useful to others:
1) Data stratification / random sampling
When you use validation_split, Keras uses the last x percent of the data as validation data. This means that if the data is ordered by class, e.g. because "pairs" or "triplets" are made in sequence, the validation data will only come from the classes (or the class) contained in that last x percent. In this case, the validation set is of no use. It is therefore essential to shuffle the input data so that the validation set contains random samples from each class (see the sketch after the quoted docs below).
The docs for validation_split say:
Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.
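A minimal sketch of such a shuffle, assuming the paired image arrays img_1, img_2 and the label array y used in the fit() call further below:
import numpy as np

# Apply the same random permutation to both image arrays and the labels,
# so the validation split taken from the end of the data contains a
# random mix of classes.
perm = np.random.permutation(len(y))
img_1, img_2, y = img_1[perm], img_2[perm], y[perm]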
2) Choice of optimizer
In model.compile(), choosing optimizer='sgd' may not be the best approach, since sgd can get stuck in local minima etc. Adam (see docs) seems to be a good choice to start with, since it...
[...] combines the advantages of [...] AdaGrad to deal with sparse gradients, and the ability of RMSProp to deal with non-stationary objectives.
according to Kingma and Ba (2014, page 10).
from keras.optimizers import Adam
...
model.compile(loss=contrastive_loss, optimizer=Adam(lr=0.0001))
3) Early stopping / learning rate
Using early stopping and adjusting the learning rate during training may also be highly useful for achieving good results: the model can then train until there is no more improvement and stop automatically at that point.
from keras.callbacks import EarlyStopping
from keras.callbacks import ReduceLROnPlateau
...
early_stopping = EarlyStopping(monitor='val_loss', patience=50, mode='auto', restore_best_weights=True)
reduce_on_plateau = ReduceLROnPlateau(monitor="val_loss", factor=0.8, patience=15, cooldown=5, verbose=0)
...
hist = model.fit([img_1, img_2], y,
                 validation_split=.2,
                 batch_size=128,
                 verbose=1,
                 epochs=9999,
                 callbacks=[early_stopping, reduce_on_plateau])
4) Kernel initialization
Kernel initialization (with a small standard deviation) may be helpful as well.
# Layer 1
# Draw the initial kernel weights from a truncated normal distribution
# with a small standard deviation.
seq.add(Conv2D(8, (5, 5), input_shape=input_shape,
               kernel_initializer=keras.initializers.TruncatedNormal(mean=0.0, stddev=0.01, seed=None),
               data_format="channels_first"))
seq.add(Activation('relu'))
seq.add(MaxPooling2D(pool_size=(2, 2)))
seq.add(Dropout(0.1))
5) Overfitting
I noticed that instead of using dropout to fight overfitting, adding some noise can be rather helpful. In this case, simply add some GaussianNoise at the top of the network, for example:
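A minimal sketch, assuming the same input_shape and channels-first layout as above (the noise standard deviation of 0.1 is an assumption to tune for your data):
from keras.models import Sequential
from keras.layers import GaussianNoise, Conv2D, Activation

seq = Sequential()
# GaussianNoise is only active during training and is a no-op at
# inference time, so it regularizes without changing predictions.
seq.add(GaussianNoise(0.1, input_shape=input_shape))
seq.add(Conv2D(8, (5, 5), data_format="channels_first"))
seq.add(Activation('relu'))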