Neural Network With Tensorflow Doesn't Update Weights/bias

August 21, 2024

Problem: I'm trying to classify some 64x64 images as a black box exercise. The NN I have written doesn't change my weights. First time writing something like this, the same code, bu

Solution 1:

The key to the problem is that your output y and your labels y_ have only one class (a single column). When you use tf.nn.softmax_cross_entropy_with_logits for classification in TensorFlow, the labels should be one-hot encoded, with one column per class. tf.nn.softmax_cross_entropy_with_logits first computes tf.nn.softmax, and the softmax of a single value is always 1, so with one class every result is the same. For example:

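import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.log)

# In this demo, y holds the labels and y_ the logits; each has a single column.
y = tf.constant([[1], [0], [1]], dtype=tf.float32)
y_ = tf.constant([[1], [2], [3]], dtype=tf.float32)

# Softmax over a single column is always 1, so its log is always 0.
softmax_var = tf.nn.softmax(logits=y_)
cross_entropy = tf.multiply(y, tf.log(softmax_var))

# The built-in loss applies the same softmax internally.
errors = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=y_)

with tf.Session() as sess:
    print(sess.run(softmax_var))
    print(sess.run(cross_entropy))
    print(sess.run(errors))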

[[1.]
 [1.]
 [1.]]
[[0.]
 [0.]
 [0.]]
[0. 0. 0.]

This means that no matter what your network outputs, the loss is always zero. A constant loss has a zero gradient, so your weights and bias are never updated.
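To see why the gradient vanishes, here is a minimal pure-Python sketch of the same computation without TensorFlow. The helper names (softmax, cross_entropy, grad_wrt_logits) and the sample logits are made up for illustration. They are not TensorFlow functions and do not come from the original question.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(labels, logits):
    """Softmax cross-entropy for one example: -sum(label * log(softmax))."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))

def grad_wrt_logits(labels, logits):
    """Gradient of the loss above with respect to each logit."""
    probs = softmax(logits)
    label_sum = sum(labels)
    return [p * label_sum - y for y, p in zip(labels, probs)]

# One output column: softmax is always 1, so loss and gradient are 0
# for every logit value.
single_losses = [cross_entropy([1.0], [z]) for z in (-5.0, 0.0, 3.0)]
single_grads = [grad_wrt_logits([1.0], [z]) for z in (-5.0, 0.0, 3.0)]

# Two one-hot columns: the loss depends on the logits and the gradient
# is non-zero, so training has something to follow.
two_class_loss = cross_entropy([0.0, 1.0], [2.0, 0.0])
two_class_grad = grad_wrt_logits([0.0, 1.0], [2.0, 0.0])

print(single_losses)
print(single_grads)
print(two_class_loss, two_class_grad)
```

With a single column the gradient is zero for any logit, which is why the optimizer has nothing to apply. With two columns the same functions return a positive loss and a non-zero gradient.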

The solution is to give y_ and y one column per class instead of a single column.

Suppose your number of classes is n.

First approach: convert the labels to one-hot before feeding them, then use the following code.

y_ = tf.placeholder(tf.float32, [None, n])
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))
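For the first approach the labels have to be one-hot before they reach the placeholder. A minimal sketch of that conversion in plain Python follows. It assumes the labels are integers in the range 0 to n-1. The helper name to_one_hot and the sample labels are hypothetical, not part of the original post.

```python
def to_one_hot(labels, n):
    """Convert integer class labels (0 .. n-1) into one-hot rows of length n."""
    rows = []
    for label in labels:
        if not 0 <= label < n:
            raise ValueError("label %r is outside 0..%d" % (label, n - 1))
        row = [0.0] * n
        row[label] = 1.0
        rows.append(row)
    return rows

batch_labels = [2, 0, 1]  # illustrative labels, not from the original post
one_hot_batch = to_one_hot(batch_labels, 3)
print(one_hot_batch)
```

Each row has length n, so the result should match the [None, n] shape of the y_ placeholder above when passed through feed_dict.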

Second approach: convert the labels to one-hot after feeding them.

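labels = tf.placeholder(tf.int32, [None, 1])  # feed integer class ids here; tf.one_hot needs an integer dtype
# Squeeze the size-1 axis so the result has shape [None, n] rather than [None, 1, n].
y_ = tf.one_hot(tf.squeeze(labels, axis=1), n)
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))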

Solution 2:

All of your initial weights are zero. With that starting point the NN doesn't learn well. Initialize the weights with small random values instead; the biases can stay at zero.

"Why Not Set Weights to Zero? We can use the same set of weights each time we train the network; for example, you could use the values of 0.0 for all weights.

In this case, the equations of the learning algorithm would fail to make any changes to the network weights, and the model will be stuck. It is important to note that the bias weight in each neuron is set to zero by default, not a small random value. "

See https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/
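The effect described in the quote can be reproduced on a toy network. The sketch below is not the poster's model. It assumes a 2-2-1 network with tanh hidden units, no biases, and a squared-error loss, and the input, target and weight range are chosen only for illustration.

```python
import math
import random

def gradients(w_hidden, w_out, x, target):
    """Gradients of 0.5 * (output - target)**2 for a tiny tanh network without biases."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sum(w * h for w, h in zip(w_out, hidden))
    error = output - target
    grad_out = [error * h for h in hidden]
    grad_hidden = [
        [error * w_o * (1.0 - h * h) * xi for xi in x]
        for w_o, h in zip(w_out, hidden)
    ]
    return grad_hidden, grad_out

x, target = [1.0, -2.0], 1.0  # illustrative input and target

# All-zero start: every gradient is zero, so gradient descent cannot move.
zero_grads = gradients([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], x, target)

# Small random start (seeded for repeatability): gradients are non-zero.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in x] for _ in range(2)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(2)]
random_grads = gradients(w_hidden, w_out, x, target)

print(zero_grads)
print(random_grads)
```

In TensorFlow 1.x, one way to get a random start is to replace tf.zeros with something like tf.truncated_normal([4096, n], stddev=0.01) when creating W. The stddev value here is an illustrative choice, not a recommendation from the original answer.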