Most people interested in deep learning have used the TensorFlow library, the most popular and widely used deep learning framework. We have used the different layers provided by the tf.keras API to build many types of deep neural networks. But there are times when the layer we need is not available in the library, and we have to build our own custom layer.
In this tutorial, we will explore and implement a custom layer in TensorFlow using the Keras high-level API. We will follow these steps:
- Implement a custom dense (fully-connected) layer.
- Use the custom layer to build a simple feed-forward neural network.
- Finally, train the model on the MNIST dataset.
Understanding Custom Layers
The TensorFlow library provides a simple way to build a custom layer by subclassing the tf.keras.layers.Layer class. In the custom layer, we need to implement the following methods:
- __init__: Here we initialize the attributes that depend on the arguments passed to the layer, such as the number of units and the activation function.
- build: The build method receives the input_shape as an argument and is called automatically the first time the layer is used. This is where we create the weights required to perform the layer's computation.
- call: This is where the actual forward computation takes place.
Template for a Custom Layer
class CustomLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(CustomLayer, self).__init__()

    def build(self, input_shape):
        pass

    def call(self, inputs):
        pass
Implementation
Now, we are going to implement the custom layer and train it on the MNIST dataset.
Implementing Custom Layer
We begin by importing TensorFlow and the required Keras layers.
import tensorflow as tf
from tensorflow.keras.layers import Flatten, Dense, Dropout, Activation
Complete Code for Custom Layer
class CustomDense(tf.keras.layers.Layer):
    def __init__(self, num_units, activation="relu"):
        super(CustomDense, self).__init__()
        self.num_units = num_units
        self.activation = Activation(activation)

    def build(self, input_shape):
        ## e.g. input (32, 784) @ weight (784, 10) + bias (10,) when num_units = 10
        self.weight = self.add_weight(shape=[input_shape[-1], self.num_units])
        self.bias = self.add_weight(shape=[self.num_units])

    def call(self, inputs):
        y = tf.matmul(inputs, self.weight) + self.bias
        y = self.activation(y)
        return y
__init__
Next, we define the CustomDense class and implement its __init__ method. The method takes two arguments:
- num_units: The number of neurons or nodes in the layer.
- activation: The activation function applied to the output. Its default value is relu.
class CustomDense(tf.keras.layers.Layer):
    def __init__(self, num_units, activation="relu"):
        super(CustomDense, self).__init__()
        self.num_units = num_units
        self.activation = Activation(activation)
build
Now we are going to define the build method and initialize the required weight and bias tensors.
def build(self, input_shape):
    self.weight = self.add_weight(shape=[input_shape[-1], self.num_units])
    self.bias = self.add_weight(shape=[self.num_units])
Here, input_shape is (batch size, number of input features). For example, with a batch of flattened MNIST images:
- input_shape = (32, 784)
Let's say that num_units = 128. Then the shapes are:
- self.weight = (784, 128)
- self.bias = (128,)
Knowing all these shapes is important, as we will need them in the call method.
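To make these shapes concrete, here is a minimal sanity check, assuming the CustomDense class defined above (the batch size of 32 and the 784 input features are just illustrative values for flattened MNIST images):
## Build the layer manually with an explicit input shape and inspect the weights.
layer = CustomDense(num_units=128)
layer.build(input_shape=(32, 784))
print(layer.weight.shape)  ## (784, 128)
print(layer.bias.shape)    ## (128,)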
call
Now, we are going to implement the functionality of the call method.
def call(self, inputs):
    y = tf.matmul(inputs, self.weight) + self.bias
    y = self.activation(y)
    return y
Here, we perform a matrix multiplication between the input and the weight, then add the bias:
- input = (32, 784)
- weight = (784, 128)
- bias = (128,)
- y = input * weight = (32, 784) * (784, 128) = (32, 128)
- y + bias = (32, 128) + (128,) = (32, 128), with the bias broadcast across the batch.
So, the shape of the output of this method is (32, 128). This output is passed through the activation function, which introduces non-linearity into the network.
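As a quick illustration (a sketch assuming the CustomDense class above), calling the layer on a random batch triggers build automatically with the input's shape and then runs call:
x = tf.random.normal((32, 784))  ## a random batch of 32 flattened inputs
layer = CustomDense(128)
y = layer(x)  ## build() runs on the first call, then call() computes the output
print(y.shape)  ## (32, 128)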
Building Model using Custom Layer
We are building a two-layer feed-forward neural network using the CustomDense layer.
model = tf.keras.models.Sequential([
    Flatten(input_shape=(28, 28)),
    CustomDense(128, activation="relu"),
    Dropout(0.3),
    CustomDense(10, activation="softmax")
])
model.summary()
Model summary
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
flatten (Flatten)            (None, 784)               0
_________________________________________________________________
custom_dense (CustomDense)   (None, 128)               100480
_________________________________________________________________
dropout (Dropout)            (None, 128)               0
_________________________________________________________________
custom_dense_1 (CustomDense) (None, 10)                1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
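As a quick check of these numbers: the first CustomDense layer has 784 × 128 = 100,352 weights plus 128 biases, giving 100,480 parameters, and the second has 128 × 10 weights plus 10 biases, giving 1,290. Together that is 101,770 trainable parameters, matching the summary.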
Training the Model
if __name__ == "__main__":
    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    ## Scale the pixel values to the [0, 1] range.
    x_train = x_train / 255.0
    x_test = x_test / 255.0

    model = tf.keras.models.Sequential([
        Flatten(input_shape=(28, 28)),
        CustomDense(128, activation="relu"),
        Dropout(0.3),
        CustomDense(10, activation="softmax")
    ])

    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["acc"])

    model.fit(x_train, y_train, epochs=5, batch_size=32)
    model.evaluate(x_test, y_test)
Training the custom model for 5 epochs with a batch size of 32 gives the following output:
Epoch 1/5
1875/1875 [==============================] - 1s 573us/step - loss: 0.3208 - acc: 0.9059
Epoch 2/5
1875/1875 [==============================] - 1s 550us/step - loss: 0.1650 - acc: 0.9509
Epoch 3/5
1875/1875 [==============================] - 1s 548us/step - loss: 0.1277 - acc: 0.9614
Epoch 4/5
1875/1875 [==============================] - 1s 547us/step - loss: 0.1083 - acc: 0.9672
Epoch 5/5
1875/1875 [==============================] - 1s 551us/step - loss: 0.0926 - acc: 0.9710
313/313 [==============================] - 0s 531us/step - loss: 0.0774 - acc: 0.9776
We have achieved an accuracy of 97.76% on the test dataset.
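As a final, hypothetical sanity check (this snippet is not part of the original script; it assumes the trained model, x_test, and y_test from the block above), we can predict the digit in the first test image:
import numpy as np

## Softmax probabilities for the first test image, shape (1, 10).
probs = model.predict(x_test[:1])
print("Predicted digit:", np.argmax(probs, axis=-1)[0])
print("True label:", y_test[0])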
Summary
In this tutorial, you have learned to build a custom layer in TensorFlow using the Keras high-level API. I hope it gave you some new knowledge.
For any question or query, feel free to reach out.