What Is Going Wrong With The Training And Predictions Using Tensorflow?
Solution 1:
The answer is twofold: one problem is with the dimensions/parameters, and the other is that the features are being placed in the wrong spot.
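Throughout, weight_variable and bias_variable are assumed to be the standard helpers from the TensorFlow MNIST tutorial (they aren't shown in the question); a minimal sketch:
import tensorflow as tf

def weight_variable(shape):
    # Small random initial values break symmetry between units.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # A slightly positive initial bias helps avoid dead ReLU units.
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)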
First Convolutional Layer
W_conv1 = weight_variable([1, 2, 1, 80])
b_conv1 = bias_variable([80])
Notice that the first two numbers in the weight_variable are the dimensions of the convolution patch, which here match the shape of the input. The third number is the number of input channels, and the fourth is the number of features the layer produces. The bias_variable always takes the final number in the weight_variable.
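Assuming the layer is wired up the usual way (conv2d plus ReLU, then a 1x2 max-pool, as in the TensorFlow tutorials), the shapes line up as below; x_image, h_conv1, and h_pool1 are illustrative names, not code from the question:
# x_image is assumed to be the input reshaped to [batch, 1, width, 1].
h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, W_conv1,
                                  strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 1, 2, 1],
                         strides=[1, 1, 2, 1], padding='SAME')
# h_pool1 has shape [batch, 1, width/2, 80]: one map per feature.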
Second Convolutional Layer
W_conv2 = weight_variable([1, 2, 80, 160])
b_conv2 = bias_variable([160])
Here the first two numbers are still the patch dimensions. The third number is the 80 features coming in from the first layer, and the fourth is the number of features this layer produces. In this case, we double the width of the network: 80 x 2 = 160. The bias_variable then takes the final number in the weight_variable.
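A sketch of the second layer's wiring under the same assumptions (h_conv2 and h_pool2 are again assumed names):
h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, W_conv2,
                                  strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 1, 2, 1],
                         strides=[1, 1, 2, 1], padding='SAME')
# 80 channels in, 160 channels out: the third and fourth numbers of W_conv2.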
If you were to finish the code at this point, the last number in the weight_variable would need to be a 1 in order to prevent dimension errors between the shape of the input tensor and the output tensor. But instead, for better predictions, let's add a third convolutional layer.
Third Convolutional Layer
W_conv3 = weight_variable([1, 2, 160, 1])
b_conv3 = bias_variable([1])
Once again, the first two numbers in the weight_variable are the patch dimensions. The third number corresponds to the 160 feature maps we established in the second convolutional layer. The last number in the weight_variable now becomes 1 so we don't run into any dimension errors on the output that we are predicting. In this case, the output has dimensions of 1, 2.
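Under the same assumptions, the third layer would look like this (h_conv3 and h_pool3 are assumed names; h_pool3 is what gets flattened below):
h_conv3 = tf.nn.relu(tf.nn.conv2d(h_pool2, W_conv3,
                                  strides=[1, 1, 1, 1], padding='SAME') + b_conv3)
h_pool3 = tf.nn.max_pool(h_conv3, ksize=[1, 1, 2, 1],
                         strides=[1, 1, 2, 1], padding='SAME')
# The single output channel collapses the 160 feature maps to one value per position.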
Fully Connected Layer
W_fc2 = weight_variable([80, 1024])
b_fc2 = bias_variable([1024])
Here, the number of neurons is 1024, which is completely arbitrary, but the first number in the weight_variable needs to be something the dimensions of our feature matrix are divisible by. In this case it can be any such number (2, 4, 10, 20, 40, 80). Once again, the bias_variable takes the last number in the weight_variable.
At this point, make sure that the last number in h_pool3_flat = tf.reshape(h_pool3, [-1, 80]) corresponds to the first number in the W_fc2 weight_variable.
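Put together, the flatten step and the fully connected layer look something like this (h_fc2 is an assumed name; the 80 must match on both sides of the matmul):
h_pool3_flat = tf.reshape(h_pool3, [-1, 80])  # one row of 80 values per example
h_fc2 = tf.nn.relu(tf.matmul(h_pool3_flat, W_fc2) + b_fc2)  # shape [batch, 1024]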
Now, when you run your training program, you will notice that the outcome varies and won't always guess all 1's or all 0's.
When you want to predict the probabilities, you have to feed x to the softmax variable y_conv = tf.nn.softmax(tf.matmul(h_fc2_drop, W_fc3) + b_fc3), like so:
ans = sess.run(y_conv, feed_dict={x: x_test_actual, keep_prob: 1.0})
You can alter the keep_prob variable, but keeping it at 1.0 produces the best results here, since dropout should only be active during training, not prediction. Now, if you print out ans, you'll have something that looks like this:
[[ 0.90855026  0.09144982]
 [ 0.93020624  0.06979381]
 [ 0.98385173  0.0161483 ]
 [ 0.93948185  0.06051811]
 [ 0.90705943  0.09294061]
 [ 0.95702559  0.04297439]
 [ 0.95543593  0.04456403]
 [ 0.95944828  0.0405517 ]
 [ 0.99154049  0.00845954]
 [ 0.84375167  0.1562483 ]
 [ 0.98449463  0.01550537]
 [ 0.97772813  0.02227189]
 [ 0.98341942  0.01658053]
 [ 0.93026513  0.06973486]
 [ 0.93376994  0.06623009]
 [ 0.98026556  0.01973441]
 [ 0.93210858  0.06789146]]
Notice how the probabilities vary. Your training is now working properly.
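If you want hard class labels rather than probabilities, you can take the argmax of each row (assuming NumPy is imported as np):
import numpy as np

# Each row of ans is [P(class 0), P(class 1)]; argmax picks the likelier class.
predicted_labels = np.argmax(ans, axis=1)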