Building on 《TensorFlow 实战之Softmax Regression识别手写数字(一)》, this article adds a hidden layer of 200 nodes with the ReLU activation function, raising the handwritten-digit recognition accuracy from 91.2% to 95.8%.
What changed:
A 200-node hidden layer was added, with ReLU as its activation function. Sigmoid was also tried as the hidden-layer activation, but it did not perform as well as ReLU. A detailed comparison of sigmoid and ReLU will be covered in a dedicated later chapter.
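For a quick feel for the difference, the two activations can be evaluated side by side. This is a minimal NumPy sketch of the standard definitions, separate from the TensorFlow code below:

```python
import numpy as np

def sigmoid(z):
    # squashes inputs into (0, 1); gradients shrink toward 0 for large |z|
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # passes positive inputs through unchanged, clips negatives to 0
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z))  # all values strictly between 0 and 1
print(relu(z))     # [0. 0. 0. 0.5 2.]
```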
If you run into problems while following along, refer to this blog post: http://www.sohu.com/a/125061373_465975
Code:
# coding:utf-8
'''
Handwritten-digit recognition: softmax regression with one hidden layer
'''
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
####################################################
#
# @brief Define model hyperparameters
#
####################################################
LEARNING_RATE = 0.01
BATCH_NUMBER = 1000
BATCH_SIZE = 50
HIDDEN_SIZE = 200
INPUT_SIZE = 784
CLASS_NUMBER = 10
####################################################
#
# @brief Define inputs
#
####################################################
# x: input images, None x 784
x = tf.placeholder("float", [None, INPUT_SIZE])
# y: one-hot input labels, None x 10
y = tf.placeholder("float", [None, CLASS_NUMBER])
####################################################
#
# @brief Define trainable parameters
#
####################################################
# w1: hidden-layer weights, 784 x 200
# w1 = tf.Variable(tf.zeros([784, HIDDEN_SIZE]), name="weights")
w1 = tf.Variable(tf.truncated_normal([INPUT_SIZE, HIDDEN_SIZE], stddev=0.1))
# b1: hidden-layer biases, 200-dim, initialized to 0.1
# b1 = tf.Variable(tf.zeros([HIDDEN_SIZE]), name="biases")
b1 = tf.Variable(tf.ones([HIDDEN_SIZE])/10, name="biases1")
# hidden-layer activation (trailing comments give the final test accuracy)
# h1 = tf.sigmoid(tf.matmul(x, w1) + b1) # 0.9212
h1 = tf.nn.relu(tf.matmul(x, w1) + b1) # 0.9558
# w2: output-layer weights, 200 x 10; b2: output-layer biases, 10-dim
w2 = tf.Variable(tf.truncated_normal([HIDDEN_SIZE, CLASS_NUMBER], stddev=0.1))
b2 = tf.Variable(tf.ones([CLASS_NUMBER])/10, name="biases2")
####################################################
#
# @brief Define the optimization procedure
#
####################################################
# predict_y: predicted distribution softmax(h1 * w2 + b2), None x 10
predict_y = tf.nn.softmax(tf.matmul(h1, w2) + b2)
# cross-entropy loss
cross_entropy = - tf.reduce_sum(y * tf.log(predict_y))
# gradient descent
train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)
####################################################
#
# @brief Define the evaluation op
#
####################################################
equal_op = tf.equal(tf.argmax(y, 1), tf.argmax(predict_y, 1))
accuracy_op = tf.reduce_mean(tf.cast(equal_op, "float"))
####################################################
#
# @brief Train the model
#
####################################################
# variable-initialization op (tf.initialize_all_variables is deprecated)
init_op = tf.global_variables_initializer()
# load the MNIST data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
test_xs = mnist.test.images
test_ys = mnist.test.labels
# create a saver op for the model variables (not used further below)
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init_op)
    # training loop
    for step in range(BATCH_NUMBER):
        train_xs, train_ys = mnist.train.next_batch(BATCH_SIZE)
        sess.run(train_op, feed_dict={x: train_xs, y: train_ys})
        train_accuracy = sess.run(accuracy_op, feed_dict={x: train_xs, y: train_ys})
        test_accuracy = sess.run(accuracy_op, feed_dict={x: test_xs, y: test_ys})
        print("step-%d accuracy: train-%f test-%f" %
              (step, train_accuracy, test_accuracy))

Output:
step-0 accuracy: train-0.240000 test-0.104600
step-1 accuracy: train-0.520000 test-0.312800
step-2 accuracy: train-0.300000 test-0.097400
step-3 accuracy: train-0.340000 test-0.415200
step-4 accuracy: train-0.620000 test-0.404300
step-5 accuracy: train-0.560000 test-0.486000
step-6 accuracy: train-0.720000 test-0.525000
step-7 accuracy: train-0.700000 test-0.604200
step-8 accuracy: train-0.640000 test-0.618300
step-9 accuracy: train-0.760000 test-0.607800
step-10 accuracy: train-0.740000 test-0.617200
step-11 accuracy: train-0.680000 test-0.617000
step-12 accuracy: train-0.800000 test-0.678400
step-13 accuracy: train-0.780000 test-0.656600
step-14 accuracy: train-0.920000 test-0.738200
step-15 accuracy: train-0.760000 test-0.716000
step-16 accuracy: train-0.840000 test-0.734400
step-17 accuracy: train-0.800000 test-0.763600
step-18 accuracy: train-0.920000 test-0.727200
step-19 accuracy: train-0.920000 test-0.800100
step-20 accuracy: train-0.840000 test-0.731300
......
step-980 accuracy: train-0.960000 test-0.961400
step-981 accuracy: train-1.000000 test-0.959300
step-982 accuracy: train-1.000000 test-0.963400
step-983 accuracy: train-1.000000 test-0.964900
step-984 accuracy: train-1.000000 test-0.964200
step-985 accuracy: train-1.000000 test-0.962800
step-986 accuracy: train-1.000000 test-0.964000
step-987 accuracy: train-0.980000 test-0.961000
step-988 accuracy: train-1.000000 test-0.958100
step-989 accuracy: train-1.000000 test-0.959300
step-990 accuracy: train-1.000000 test-0.959900
step-991 accuracy: train-1.000000 test-0.960400
step-992 accuracy: train-1.000000 test-0.962100
step-993 accuracy: train-0.980000 test-0.964100
step-994 accuracy: train-1.000000 test-0.965600
step-995 accuracy: train-1.000000 test-0.959000
step-996 accuracy: train-1.000000 test-0.961000
step-997 accuracy: train-1.000000 test-0.963800
step-998 accuracy: train-0.980000 test-0.963600
step-999 accuracy: train-1.000000 test-0.957700
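One caveat about the loss used above: computing `- tf.reduce_sum(y * tf.log(predict_y))` on the softmax output can produce NaN if `predict_y` contains an exact zero. TensorFlow's `tf.nn.softmax_cross_entropy_with_logits` sidesteps this by working directly from the logits with the log-sum-exp trick. A NumPy sketch of that trick (my illustration, not part of the original code):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # log-softmax via the log-sum-exp trick: subtracting the row max
    # keeps np.exp from overflowing, then normalize in log space
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # one-hot labels select the log-probability of the true class
    return -(labels * log_probs).sum(axis=1)

# extreme logits that would overflow a naive softmax-then-log
logits = np.array([[1000.0, 0.0], [0.0, 0.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
print(softmax_cross_entropy(logits, labels))  # finite: [0.0, ln 2]
```

The naive formulation also sums the loss over the batch rather than averaging it, so the effective step size scales with BATCH_SIZE; that is tolerable here but worth knowing when tuning LEARNING_RATE.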