Semantic segmentation, a crucial task in computer vision, plays a pivotal role in various applications such as medical image analysis, autonomous driving, and object recognition. In this tutorial, we will delve into the implementation of ResNet50 UNET using TensorFlow – a powerful combination that leverages the strengths of both the ResNet50 and UNET architectures for semantic segmentation tasks.
Importing Libraries
The necessary libraries are imported, including TensorFlow layers and models and the ResNet50 architecture from the Keras applications module.
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Conv2DTranspose, Concatenate, Input
from tensorflow.keras.models import Model
from tensorflow.keras.applications import ResNet50
ResNet50 Architecture
Before diving into the implementation, it’s crucial to understand the ResNet50 architecture. ResNet50, short for Residual Network with 50 layers, is a deep convolutional neural network. It is renowned for its skip connections or residual connections, which enable the network to learn and optimize more effectively. The ResNet50 model is pre-trained on ImageNet and employed as our implementation’s base model.
resnet50 = ResNet50(include_top=False, weights="imagenet", input_tensor=inputs)
Read More:
Convolutional Block
The conv_block
function defines a basic convolutional block comprising two convolution layers, each followed by batch normalization and ReLU activation.
def conv_block(input, num_filters):
x = Conv2D(num_filters, 3, padding="same")(input)
x = BatchNormalization()(x)
x = Activation("relu")(x)
x = Conv2D(num_filters, 3, padding="same")(x)
x = BatchNormalization()(x)
x = Activation("relu")(x)
return x
Decoder Block
The decoder_block
function implements a decoding block using Conv2DTranspose for upsampling, concatenation with skip features, and the convolutional block.
def decoder_block(input, skip_features, num_filters):
x = Conv2DTranspose(num_filters, (2, 2), strides=2, padding="same")(input)
x = Concatenate()([x, skip_features])
x = conv_block(x, num_filters)
return x
ResNet50 UNET Model
The build_resnet50_unet
the function creates the ResNet50 UNET model. It includes the ResNet50 base model, encoder blocks, a bridge, and decoder blocks, ultimately producing the segmentation output.
def build_resnet50_unet(input_shape):
""" Input """
inputs = Input(input_shape)
""" Pre-trained ResNet50 Model """
resnet50 = ResNet50(include_top=False, weights="imagenet", input_tensor=inputs)
""" Encoder """
s1 = resnet50.get_layer("input_1").output ## (512 x 512)
s2 = resnet50.get_layer("conv1_relu").output ## (256 x 256)
s3 = resnet50.get_layer("conv2_block3_out").output ## (128 x 128)
s4 = resnet50.get_layer("conv3_block4_out").output ## (64 x 64)
""" Bridge """
b1 = resnet50.get_layer("conv4_block6_out").output ## (32 x 32)
""" Decoder """
d1 = decoder_block(b1, s4, 512) ## (64 x 64)
d2 = decoder_block(d1, s3, 256) ## (128 x 128)
d3 = decoder_block(d2, s2, 128) ## (256 x 256)
d4 = decoder_block(d3, s1, 64) ## (512 x 512)
""" Output """
outputs = Conv2D(1, 1, padding="same", activation="sigmoid")(d4)
model = Model(inputs, outputs, name="ResNet50_U-Net")
return model
Model Summary
Finally, the model is built using the specified input shape, and a summary is displayed.
if __name__ == "__main__":
input_shape = (512, 512, 3)
model = build_resnet50_unet(input_shape)
model.summary()
Conclusion
By combining the ResNet50 architecture with the UNET architecture, we’ve created a powerful semantic segmentation model capable of understanding intricate details in images. This tutorial provides a comprehensive guide, explaining each block of code in detail. Now, armed with this knowledge, you can confidently dive into semantic segmentation tasks using ResNet50 UNET in TensorFlow, opening doors to a wide range of applications in the field of computer vision. Understanding the ResNet50 architecture enhances your grasp of the underlying principles, making the implementation process more insightful and empowering.
Read More: