In this chapter, we use TensorFlow to build a multilayer perceptron (MLP), train the model, evaluate its accuracy, and use the trained model to recognize MNIST handwritten digits. We then try widening and deepening the model to improve accuracy.

From the figure above:
# Training: the MNIST training data set contains 60,000 records. Data preprocessing turns them into features (the digit image features) and labels (the true value of each digit), which are then fed into the multilayer perceptron model for training. The trained model is used in the next stage, prediction.
# Predicting: a digit image is input and preprocessed into features (the digit image is converted to features). The trained multilayer perceptron model then makes the prediction and produces the prediction result.

Data Preparation

First, rename the new file and import the required modules, then read the MNIST data set:

  • import sys
    import tensorflow.compat.v1 as tf  # TensorFlow 1.x-style API (tf.placeholder, tf.Session)

    import tensorflow.examples.tutorials.mnist.input_data as input_data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    print('train images :', mnist.train.images.shape, 'labels: ', mnist.train.labels.shape)
    print('validation images: ', mnist.validation.images.shape, 'labels: ', mnist.validation.labels.shape)
    print('test images: ', mnist.test.images.shape, 'labels: ', mnist.test.labels.shape)
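
Because one_hot=True is passed above, each label is stored as a vector of 10 values rather than as a single digit. The following is a minimal NumPy sketch of that encoding; the helper name one_hot is illustrative and is not part of the MNIST loader.

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Return a vector of zeros with a single 1 at position `digit`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[digit] = 1.0
    return vec

label = one_hot(7)
print(label)             # [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
print(np.argmax(label))  # 7, the digit is recovered with argmax
```

This is also why the evaluation code later applies argmax to the labels before comparing them with the predictions.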


Keras and TensorFlow differ in how a model is built. Keras: you only need model = Sequential() to create a linear stack of layers, and then add each neural network layer to the model with the model.add() method. TensorFlow: you must define a layer function yourself (to handle the tensor operations), and then use that layer function to construct the multilayer perceptron model.

In the code that follows, we define the layer function in TensorFlow and then use it to construct the multilayer perceptron model.

First, define the layer function, following the code below:

The meaning of the model-building code is as follows:
(1) Create the input layer x. We use the tf.placeholder method to build the input layer x. A placeholder is an input to the TensorFlow "computation graph"; the digit image data is passed in through it during the subsequent training. "float": the data type is float. [None, 784]: the first dimension is set to None because many digit images are passed in during training and their number is not fixed; the second dimension is set to 784 because each input digit image has 784 pixels (28 x 28).
(2) Build the hidden layer h1. output_dim=256: the hidden layer has 256 neurons. input_dim=784: the number of neurons in x (the input layer), that is, the 784 pixels of the input digit image. inputs=x: the input is x (the input layer). activation=tf.nn.relu: the activation function is tf.nn.relu.
(3) Build the output layer y_predict. output_dim=10: the output layer has 10 neurons. input_dim=256: the number of neurons in h1 (the hidden layer), which is 256. inputs=h1: the input is h1 (the hidden layer). activation=None: no activation function is applied. The returned y_predict is the prediction result.

  • x = tf.placeholder("float", [None, 784])
    h1=layer(output_dim = 256, input_dim = 784, inputs = x, activation = tf.nn.relu)
    y_predict=layer(output_dim = 10, input_dim = 256, inputs = h1, activation = None)
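
The layer function used above combines a matrix multiplication, a bias, and an optional activation. As a rough illustration, here is a minimal NumPy sketch of that computation, assuming randomly initialized weights. The names layer_forward, relu, rng, and x_batch are illustrative and do not come from the original code, and a TensorFlow version would build graph operations rather than compute values directly.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def relu(z):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(z, 0.0)

def layer_forward(output_dim, input_dim, inputs, activation=None):
    """Compute activation(inputs @ W + b) for one fully connected layer."""
    W = rng.normal(size=(input_dim, output_dim))  # weights (assumed random init)
    b = rng.normal(size=(1, output_dim))          # biases, broadcast over the batch
    z = inputs @ W + b
    return z if activation is None else activation(z)

# A batch of 3 stand-in "images" with 784 pixel values each
x_batch = rng.random((3, 784))
h1 = layer_forward(output_dim=256, input_dim=784, inputs=x_batch, activation=relu)
y_predict = layer_forward(output_dim=10, input_dim=256, inputs=h1)

print(h1.shape)         # (3, 256)
print(y_predict.shape)  # (3, 10)
```

The shapes show why input_dim of each layer must match output_dim of the layer before it: 784 to 256, then 256 to 10.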

Define The Training Method

Keras and TensorFlow differ in how the training method is defined. Keras: you only need model.compile to set the loss function, the optimization method (optimizer), and the metrics used to evaluate the model. TensorFlow: you must define the loss function and the optimization method (optimizer) with its parameters yourself, and also define the accuracy formula used to evaluate the model.

The flow and meaning of the code are as follows:
(1) Create the placeholder for the true label values of the training data. "float": the data type is float. [None, 10]: the first dimension is set to None because many digit images are passed in during training and their number is not fixed; the second dimension is set to 10 because the true value of each digit is converted with one-hot encoding into 10 values of 0 or 1, corresponding to the digits 0 to 9.
(2) Define the loss function. loss_function = tf.reduce_mean(...): averages the cross-entropy results computed inside it. tf.nn.softmax_cross_entropy_with_logits: computes the cross-entropy from the following parameters. logits=y_predict: the logits parameter is set to the predicted values y_predict. labels=y_label: the labels parameter is set to the actual values y_label.
(3) Define the optimizer (optimization method). tf.train: the optimizer is taken from the tf.train module. .AdamOptimizer(learning_rate=0.001): uses the Adam optimizer with learning_rate=0.001. .minimize(loss_function): the optimizer uses loss_function to compute the loss (error) and updates the model weights and biases accordingly, so that the loss is minimized.

  • y_label = tf.placeholder("float", [None, 10])
    loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_predict, labels=y_label))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_function)
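
To make the loss concrete, here is a small NumPy sketch of what a softmax cross-entropy followed by a mean computes. It illustrates the formula under the assumption of one-hot labels and is not TensorFlow's own implementation; the function name softmax_cross_entropy and the three-class toy values are made up for the example.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row maximum first for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

# Two toy examples with 3 classes (the real model has 10)
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

per_example = softmax_cross_entropy(logits, labels)
loss = per_example.mean()  # plays the role of tf.reduce_mean in the text
print(per_example, loss)
```

The loss is small when the largest logit sits on the true class and grows as the prediction moves away from it, which is what the optimizer tries to reduce.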

Define How To Evaluate The Accuracy Of The Model

The flow for evaluating the accuracy of the model:
(1) Determine whether each record is predicted correctly. correct_prediction: stores the result of the following operation. tf.equal(...): compares the actual value, tf.argmax(y_label, 1), with the predicted value, tf.argmax(y_predict, 1), and returns True where they are equal and False where they are not.
(2) Calculate the fraction of correct predictions. accuracy: stores the result of the following operation. tf.reduce_mean(tf.cast(correct_prediction, "float")): tf.cast first converts correct_prediction to "float" (1 for a correct prediction, 0 otherwise), and tf.reduce_mean then averages all the values.

  • correct_prediction=tf.equal(tf.argmax(y_label, 1), tf.argmax(y_predict, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
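
The same two steps can be traced by hand on a tiny example. This is a NumPy sketch with made-up values for four records and three classes; only the argmax, compare, cast, and mean pattern mirrors the code above.

```python
import numpy as np

# One-hot true labels for 4 records (toy data, 3 classes)
y_label = np.array([[0, 0, 1],
                    [1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]], dtype=float)

# Model outputs for the same 4 records (toy data)
y_predict = np.array([[0.1, 0.2, 3.0],
                      [2.0, 0.5, 0.1],
                      [1.5, 0.3, 0.2],
                      [0.0, 0.1, 0.9]])

# Step 1: compare the true class with the predicted class, per record
correct_prediction = np.argmax(y_label, axis=1) == np.argmax(y_predict, axis=1)

# Step 2: cast True/False to 1.0/0.0 and average
accuracy = correct_prediction.astype(float).mean()

print(correct_prediction)  # [ True  True False  True]
print(accuracy)            # 0.75
```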


Keras and TensorFlow also differ in how training is run. Keras: you only need to call model.fit() to start training. TensorFlow: you must write the code that controls every step of the training process.

Now we train with TensorFlow. The training data has 55,000 records in total (read_data_sets sets aside 5,000 of the 60,000 training records as validation data), divided into batches of 100 records. Training on all the data therefore takes 550 batches (55000/100 = 550); once all the data has been used for training, one epoch (training period) is complete. We run 15 epochs to try to reduce the error and improve the accuracy. The training process is summarized in the flow chart on the right.
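
The batch arithmetic above can be checked with a few lines of plain Python. The variable names here are illustrative and are separate from the training code.

```python
num_examples = 55000  # training records
batch_size = 100      # records per batch
train_epochs = 15     # passes over the training data

total_batches = num_examples // batch_size   # batches per epoch
total_steps = train_epochs * total_batches   # optimizer updates over the whole run

print(total_batches)  # 550
print(total_steps)    # 8250
```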

Then we define the training parameters:

  • trainEpochs = 15
    batchSize = 100
    totalBatchs = int(mnist.train.num_examples/batchSize)
    epoch_list = []; loss_list = []; accuracy_list = []
    from time import time
    startTime = time()
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

The code description is as follows:
(1) trainEpochs = 15: run 15 training epochs. batchSize = 100: each batch has 100 records. totalBatchs = int(mnist.train.num_examples/batchSize): the number of batches per epoch, (550 batches) = (55,000 training records) / (100 records per batch). epoch_list = []; loss_list = []; accuracy_list = []: initialize the lists for the epoch, loss (error), and accuracy. The error and accuracy are recorded after each training epoch and displayed graphically in the next step.
(2) from time import time; startTime = time(): import the time function and record the start time. sess = tf.Session(): create the TensorFlow session. sess.run(tf.global_variables_initializer()): initialize the TensorFlow global variables.

Then we run the training:

  • for epoch in range(trainEpochs):
        for i in range(totalBatchs):
            batch_x, batch_y = mnist.train.next_batch(batchSize)
            sess.run(optimizer, feed_dict={x: batch_x, y_label: batch_y})

        loss, acc = sess.run([loss_function, accuracy],
                             feed_dict={x: mnist.validation.images,
                                        y_label: mnist.validation.labels})

        epoch_list.append(epoch); loss_list.append(loss); accuracy_list.append(acc)
        print("Train Epoch:", '%02d' % (epoch+1), "Loss=", "{:.9f}".format(loss), "Accuracy=", acc)

    duration = time() - startTime
    print("Train Finished takes:", duration)

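The code above works as follows: in each epoch, for i in range(totalBatchs) performs 550 batches of training, reading each batch with mnist.train.next_batch(batchSize). After every epoch, the loss and accuracy are computed on the validation data and printed.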

Next, plot the loss and accuracy results, as in the figure below:

Evaluate Model Accuracy

Then, type the code below:

  • print("Accuracy:", sess.run(accuracy, feed_dict={x: mnist.test.images, y_label: mnist.test.labels}))

Make Predictions

Then, type the code below:

  • prediction_result = sess.run(tf.argmax(y_predict, 1), feed_dict={x: mnist.test.images})

    import matplotlib.pyplot as plt
    import numpy as np
    def plot_images_labels_prediction(images, labels, prediction, idx, num=10):
        fig = plt.gcf()
        fig.set_size_inches(12, 14)
        if num>25: num=25
        for i in range(0, num):
            ax=plt.subplot(5,5, 1+i)
            ax.imshow(np.reshape(images[idx], (28, 28)), cmap='binary')
            title= "label=" + str(np.argmax(labels[idx]))
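            if len(prediction)>0:
                title += ", predict=" + str(prediction[idx])  # append the predicted digit
            ax.set_title(title, fontsize=10)
            ax.set_xticks([]); ax.set_yticks([])  # hide the axis ticks
            idx += 1  # move on to the next image
        plt.show()
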
    plot_images_labels_prediction(mnist.test.images, mnist.test.labels, prediction_result, 0)

The final result looks like the figure below:


See the full code below for a better understanding: