Idiot Developer

2nd November 2022

Vision Transformer – An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

In this blog post, we are going to learn about the Vision Transformer (ViT), a pure Transformer-based architecture for image classification. ViT can replace standard CNNs while achieving excellent results. It attains excellent results when pre-trained...
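The "16×16 words" in the title refers to cutting an image into fixed-size patches that are then treated like word tokens. Here is a minimal sketch of that patch-splitting step in plain Python. The 224×224 single-channel input and the helper name `split_into_patches` are assumptions made for illustration; they do not come from the post.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    # Walk the image in row-major order, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Assumed example: a 224x224 single-channel image with dummy pixel values.
image = [[(r * 224 + c) % 256 for c in range(224)] for r in range(224)]
patches = split_into_patches(image, 16)
print(len(patches), len(patches[0]))  # 196 256
```

With these assumed sizes, the image becomes a sequence of 196 patches of 256 values each, which is the kind of token sequence a Transformer works on.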

Computer Vision / Deep Learning

31st October 2022

MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition

In this work, we present a lightweight matting objective decomposition network (MODNet) for real-time portrait matting from a single input image. MODNet takes a single RGB image as input and applies explicit constraints to solve matting sub-objectives simultaneously in one stage. The research paper was accepted at the AAAI 2022 conference. Research...

Computer Vision / TensorFlow

18th October 2022

VGG19 UNET Implementation in TensorFlow

In this tutorial, we are going to implement the U-Net architecture in TensorFlow, where we will replace its encoder with a pre-trained VGG19 architecture. The VGG19 is already trained on the ImageNet classification dataset. Therefore, it would have already learned the required features, which would help to boost the overall...

Deep Learning

7th October 2022

Why Deep Learning is not Artificial General Intelligence (AGI)

With recent advances, deep learning has become a frontier for solving many challenging problems in computer vision, games, self-driving cars and more. Deep learning has even achieved superhuman performance in some tasks, but it still lacks some fundamental features which are required for a...

Computer Vision

12th September 2022

PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model

PP-LiteSeg is a lightweight encoder-decoder architecture designed for real-time semantic segmentation. It consists of three modules: an encoder (a lightweight network), an aggregation module (the Simple Pyramid Pooling Module, SPPM), and a decoder (the Flexible and Lightweight Decoder, FLD, with the Unified Attention Fusion Module, UAFM). Encoder: The STDCNet is the encoder for the proposed PP-LiteSeg for its high...

Computer Vision / Deep Learning / TensorFlow

3rd January 2022

Deep Learning based Background Removal from Images using TensorFlow and Python

In this tutorial, we are going to learn how to use deep learning to remove the background from images with TensorFlow. In short, we’ll use DeepLabV3+, a semantic segmentation model, to extract the background and foreground masks from the image. We are going to use these masks to extract the...
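To make the role of the mask concrete, here is a small sketch of how a binary foreground mask can be applied to an image. It does not run DeepLabV3+; the mask is written by hand, and the helper name `apply_mask` and the black fill colour are assumptions made for illustration.

```python
def apply_mask(image, mask, background=(0, 0, 0)):
    """Keep pixels where mask is 1; replace the rest with `background`."""
    return [
        [pixel if keep else background for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Assumed example: a 2x2 RGB image and a hand-written foreground mask.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 0)],
]
mask = [
    [1, 0],
    [0, 1],
]
foreground = apply_mask(image, mask)
```

In a real pipeline the mask would come from the segmentation model's prediction, and the same element-wise selection would typically be done with array operations instead of nested lists.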

Socket Programming

5th December 2021

UDP Client-Server Implementation in C

There are two major communication protocols: TCP and UDP. These protocols are used to transport data between the client and the server. In one of my previous posts, we implemented a TCP client-server program: TCP Client-Server Implementation in C. In this tutorial, we are going to build a simple UDP client-server program in...

Deep Learning / TensorFlow

4th December 2021

Custom Layer in TensorFlow using Keras API

Most people interested in deep learning have probably used the TensorFlow library. It is one of the most popular and widely used deep learning frameworks. We have used the different layers provided by the tf.keras API to build different types of deep neural networks. But, there are many times...

Computer Vision / Deep Learning / TensorFlow

3rd December 2021

VGG16 UNET Implementation in TensorFlow

In this article, we are going to implement one of the most widely used image segmentation architectures, UNET. We are going to replace the UNET encoder with the VGG16 implementation from the TensorFlow library. The UNET encoder would learn the features from scratch, while the VGG16 is already trained on the...

Computer Vision / Deep Learning / Python / PyTorch / TensorFlow

1st December 2021

Squeeze and Excitation Implementation in TensorFlow and PyTorch

The Squeeze and Excitation network is a channel-wise attention mechanism that is used to improve the overall performance of the network. In today’s article, we are going to implement the Squeeze and Excitation module in TensorFlow and PyTorch. What is Squeeze and Excitation Network? The squeeze and excitation attention mechanism...
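As a rough illustration of channel-wise attention, here is a framework-free sketch of a squeeze-and-excitation step in plain Python. It is not the TensorFlow or PyTorch code from the post. The weights `w1` and `w2` are made-up values, biases are omitted, and the function name `squeeze_excite` is an assumption for this example.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Apply squeeze-and-excitation to a feature map.

    feature_map: list of C channels, each a 2D list (H x W).
    w1: (C/r) x C weights, w2: C x (C/r) weights (biases omitted for brevity).
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck dense layer with ReLU, then dense layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    scales = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every value in a channel by that channel's score.
    return [
        [[value * scale for value in row] for row in channel]
        for channel, scale in zip(feature_map, scales)
    ]


# Assumed example: 2 channels of size 2x2 and a bottleneck of 1 unit.
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[0.5, 0.5]]      # hypothetical weights, shape 1 x 2
w2 = [[1.0], [-1.0]]   # hypothetical weights, shape 2 x 1
output = squeeze_excite(feature_map, w1, w2)
```

With these made-up weights the first channel is scaled by sigmoid(2) and the second by sigmoid(-2), so the block boosts one channel relative to the other while leaving the spatial layout untouched.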

Palani on Simple Object Detection with Bounding Box Regression in TensorFlow (10th February 2023)
Hi Nikhil, Thanks for your dedications and explored very well. it is great helps to me. by Palani
aravind on Polyp Segmentation using UNET in TensorFlow 2.0 (19th April 2022)
ImportError: cannot import name 'load_data' from 'data' (C:\Users\Aravinda\anaconda3\lib\site-packages\data\__init__.py)
chinnu jacob on Squeeze and Excitation Implementation in TensorFlow and PyTorch (4th December 2021)
sir can you share a code of using squeeze and excitation network on custom CNN for a classification
Anand on What is Transfer Learning? – A Simple Introduction. (25th October 2021)
Hey Nikhil, Very nice, to-the-point article on transfer learning. For a small size of dataset, an image augmentation along with…
Candice Roe on cv2.resize() – Resizing Image using OpenCV Python (5th October 2021)
Truly good blog short article and also valuable.