Custom Layer in TensorFlow using Keras API

Most people working in deep learning have used the TensorFlow library; it is one of the most popular and widely used deep learning frameworks. We typically use the layers provided by the tf.keras API to build deep neural networks. But there are times when the layer we need is not available in the library, and we have to build our own custom layer.

In this tutorial, we will explore and implement a custom layer in TensorFlow using the Keras high-level API. We will follow these steps:

  1. Implement a custom dense (fully-connected) layer.
  2. Use the custom layer to build a simple feed-forward neural network.
  3. Train the model on the MNIST dataset.

A Brief Overview of Custom Layers

The TensorFlow library provides a simple way to build a custom layer by subclassing the tf.keras.layers.Layer class. In the custom layer, we need to implement the following methods:

  1. __init__: Inside the __init__ method, we initialize the parameters that depend on the arguments passed to the layer.
  2. build: The build method receives the input_shape as an argument, which we use to initialize the weights required for the layer's operation.
  3. call: This is the method where the actual computation takes place.

Template for a Custom Layer

class CustomLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(CustomLayer, self).__init__()
        ## Initialize attributes that depend on the constructor arguments.

    def build(self, input_shape):
        ## Create the weights using the input shape.
        pass

    def call(self, inputs):
        ## Perform the forward computation.
        pass

Implementation

Now, we are going to implement the custom layer and train it on the MNIST dataset. The complete code is given below.

Implementing Custom Layer

We begin by importing TensorFlow and the required Keras layers.

import tensorflow as tf
from tensorflow.keras.layers import Flatten, Dense, Dropout, Activation

Complete Code for Custom Layer

class CustomDense(tf.keras.layers.Layer):
    def __init__(self, num_units, activation="relu"):
        super(CustomDense, self).__init__()

        self.num_units = num_units
        self.activation = Activation(activation)

    def build(self, input_shape):
        ## Example shapes: (32, 784) x (784, 128) + (128,)
        self.weight = self.add_weight(name="weight", shape=[input_shape[-1], self.num_units])
        self.bias = self.add_weight(name="bias", shape=[self.num_units])

    def call(self, inputs):
        y = tf.matmul(inputs, self.weight) + self.bias
        y = self.activation(y)
        return y

__init__

Next, we define the CustomDense class and write its __init__ method. The method takes two arguments:

  1. num_units: the number of neurons or nodes in this layer.
  2. activation: the activation function applied to the output. Its default value is relu.

class CustomDense(tf.keras.layers.Layer):
    def __init__(self, num_units, activation="relu"):
        super(CustomDense, self).__init__()

        self.num_units = num_units
        self.activation = Activation(activation)
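
At this point, the layer only stores its configuration; no weights exist yet, as they are created later inside build. A quick sketch to verify this, assuming the CustomDense class above:

layer = CustomDense(128, activation="relu")
print(layer.weights)    ## [] -- the weights are not created until build() runs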

build

Now, we define the build method, where we initialize the required weight and bias tensors.

    def build(self, input_shape):
        self.weight = self.add_weight(name="weight", shape=[input_shape[-1], self.num_units])
        self.bias = self.add_weight(name="bias", shape=[self.num_units])

Here, the shape of input_shape is (batch size, number of input features). For example:

  • input_shape = (32, 784)

Let's say that num_units = 128.

So, the shapes are:

  • self.weight = (784, 128)
  • self.bias = (128,)

Knowing these shapes is important, as we will need them in the call method.
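
Since build runs automatically on the first call, we can verify these shapes by calling the layer once on a dummy batch. A minimal sketch, assuming the CustomDense class above:

layer = CustomDense(128)
y = layer(tf.zeros([32, 784]))    ## the first call triggers build()
print(layer.weight.shape)         ## (784, 128)
print(layer.bias.shape)           ## (128,)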

call

Now, we are going to implement the functionality of the call method.

    def call(self, inputs):
        y = tf.matmul(inputs, self.weight) + self.bias
        y = self.activation(y)
        return y

Here, we perform a matrix multiplication between the input and the weights, and then we add the bias.

  • input = (32, 784)
  • weight = (784, 128)
  • bias = (128,)

  • y = input x weight = (32, 784) x (784, 128) = (32, 128)
  • y + bias = (32, 128) + (128,) = (32, 128)

So, the shape of the output of this method is (32, 128).

The output of this operation is passed through the activation function, which introduces non-linearity into the network.
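
Note that the bias addition relies on broadcasting: TensorFlow stretches the (128,) bias vector across all 32 rows of the (32, 128) matrix. A small sketch illustrating this:

a = tf.zeros([32, 128])    ## stands in for the matmul output
b = tf.ones([128])         ## stands in for the bias vector
print((a + b).shape)       ## (32, 128) -- the bias is broadcast over the batch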

Building Model using Custom Layer

We are building a two-layer feed-forward neural network using the CustomDense layer.

model = tf.keras.models.Sequential([
        Flatten(input_shape=(28, 28)),
        CustomDense(128, activation="relu"),
        Dropout(0.3),
        CustomDense(10, activation="softmax")
])
model.summary()

Model summary

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
custom_dense (CustomDense)   (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
custom_dense_1 (CustomDense) (None, 10)                1290      
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
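
The parameter counts match the shapes we worked out above: the first CustomDense layer has 784 x 128 weights plus 128 biases, i.e. 100,480 parameters, and the second has 128 x 10 + 10 = 1,290 parameters.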

Training the Model

if __name__ == "__main__":
    mnist = tf.keras.datasets.mnist

    ## Load the MNIST dataset and scale the pixel values to [0, 1].
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train/255.0
    x_test = x_test/255.0

    model = tf.keras.models.Sequential([
        Flatten(input_shape=(28, 28)),
        CustomDense(128, activation="relu"),
        Dropout(0.3),
        CustomDense(10, activation="softmax")
    ])

    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
        metrics=["acc"])
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    model.evaluate(x_test, y_test)

We train the custom model for 5 epochs with a batch size of 32.

Epoch 1/5
1875/1875 [==============================] - 1s 573us/step - loss: 0.3208 - acc: 0.9059
Epoch 2/5
1875/1875 [==============================] - 1s 550us/step - loss: 0.1650 - acc: 0.9509
Epoch 3/5
1875/1875 [==============================] - 1s 548us/step - loss: 0.1277 - acc: 0.9614
Epoch 4/5
1875/1875 [==============================] - 1s 547us/step - loss: 0.1083 - acc: 0.9672
Epoch 5/5
1875/1875 [==============================] - 1s 551us/step - loss: 0.0926 - acc: 0.9710
313/313 [==============================] - 0s 531us/step - loss: 0.0774 - acc: 0.9776

We have achieved an accuracy of 97.76% on the test dataset.
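
Once trained, the model can be used for prediction like any other Keras model. A minimal sketch, run after model.evaluate:

import numpy as np

preds = model.predict(x_test[:5])    ## class probabilities, shape (5, 10)
print(np.argmax(preds, axis=1))      ## predicted digit for each image
print(y_test[:5])                    ## ground-truth labels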

Summary

In this tutorial, you have learned to build a custom layer in TensorFlow using the Keras high-level API. I hope I was able to give you some new knowledge.

For any questions or queries, reach out to me:

Nikhil Tomar

I am an independent researcher in the field of Artificial Intelligence. I love to write about the technology I am working on.
