What is U2-Net or U-square Net

U2-Net is a simple yet powerful architecture designed for salient object detection (SOD). It is a two-level nested U-shaped architecture built from the proposed ReSidual U-blocks (RSU). U2-Net does not use any pre-trained architecture and is trained from scratch. The architecture comes with two variants: U2-Net Continue Reading

Simple Object Detection with Bounding Box Regression in TensorFlow

Object detection is a fundamental task in computer vision that involves identifying and locating objects within an image or video. In this post, we will discuss a simple method for object detection using bounding box regression in TensorFlow. Bounding box regression is a technique used to predict the location Continue Reading

Human Face Landmark Detection in TensorFlow using Pre-trained MobileNetv2

In this blog post, we will learn how to train a Convolutional Neural Network (CNN) to detect human facial landmarks, such as the eyes, mouth, nose, jawline and more. We will use the pre-trained MobileNetv2 from TensorFlow to build our model and then train it on Landmark Guided Face Parsing Continue Reading

Vision Transformer – An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

In this blog post, we are going to learn about the Vision Transformer (ViT), a pure Transformer-based architecture used for image classification tasks. ViT can replace standard CNNs while achieving excellent results. ViT attains excellent results when pre-trained Continue Reading

MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition

In this work, we present a lightweight matting objective decomposition network (MODNet) for real-time portrait matting from a single input image. MODNet takes a single RGB image as input and applies explicit constraints to solve matting sub-objectives simultaneously in one stage. The research paper was accepted at the AAAI 2022 conference. Research Continue Reading

Why Deep Learning is not Artificial General Intelligence (AGI)

With recent advances, deep learning has become a frontier in solving many challenging problems in computer vision, games, self-driving cars and more. Deep learning has even achieved superhuman performance in some tasks, but it still lacks some fundamental features that are required for a Continue Reading

PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model

PP-LiteSeg is a lightweight encoder-decoder architecture designed for real-time semantic segmentation. It consists of three modules:

Encoder: Lightweight network
Aggregation: Simple Pyramid Pooling Module (SPPM)
Decoder: Flexible and Lightweight Decoder (FLD) and Unified Attention Fusion Module (UAFM)

Encoder

The STDCNet is the encoder for the proposed PP-LiteSeg for its high Continue Reading

Deep Learning based Background Removal from Images using TensorFlow and Python

In this tutorial, we are going to learn how to use deep learning to remove the background from images with TensorFlow. In short, we’ll use DeepLabV3+, a semantic-segmentation-based model, to extract the background and foreground masks from the image. We are going to use these masks to extract the Continue Reading

VGG16 UNET Implementation in TensorFlow

In this article, we are going to implement one of the most widely used image segmentation architectures, UNET. We are going to replace the UNET encoder with the VGG16 implementation from the TensorFlow library. The original UNET encoder would learn its features from scratch, while VGG16 is already trained on the Continue Reading

Squeeze and Excitation Implementation in TensorFlow and PyTorch

The Squeeze and Excitation network is a channel-wise attention mechanism that is used to improve the overall performance of the network. In today’s article, we are going to implement the Squeeze and Excitation module in TensorFlow and PyTorch.

What is Squeeze and Excitation Network?

The squeeze and excitation attention mechanism Continue Reading