vision transformer Archives

ViTPose: Human Pose Estimation with (ViT) Vision Transformers

Posted on 18th June 202518th June 2025 by Nikhil Tomar

Human pose estimation is one of the most critical tasks in computer vision. It aims to localize anatomical key points (like shoulders, knees, and wrists) on the human body. Traditional convolutional neural networks (CNNs) have long dominated this field, but a new horizon has emerged with the advent of transformers Continue Reading

[Paper Summary] Class-Aware Adversarial Transformers for Medical Image Segmentation

Posted on 3rd February 20233rd February 2023 by Nikhil Tomar

Transformer-based models have shown remarkable progress in the field of medical image segmentation. However, the existing methods still suffer from limitations such as loss of information and inaccurate segmentation label maps. In this article, we will discuss a new type of adversarial transformer, the Class-Aware Adversarial Transformer (CASTformer), which overcomes Continue Reading

What is MobileViT?

Posted on 15th November 202215th November 2022 by Nikhil Tomar

This article covers an overall summary of the MobileViT: Light-Weight, General-Purpose, and Mobile-Friendly Vision Transformers research paper. MobileViT is a lightweight and general-purpose vision transformer for mobile vision tasks. It combines the strength of the standard CNN (Convolutional Neural Network) and the Vision Transformers. It has outperformed several CNNs and Continue Reading

Vision Transformer – An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

Posted on 2nd November 20226th February 2023 by Nikhil Tomar

In this blog post, we are going to learn about the Vision Transformer (ViT). It is a pure Transformer based architecture used for image classification tasks. Vision Transformer (ViT) has the ability to replace the standard CNNs while achieving excellent results. The Vision Transformer (ViT) attains excellent results when pre-trained Continue Reading