Photo by Selina Bubendorfer on Unsplash

The title of the Transformer paper, "Attention Is All You Need", seems to resonate more and more these days. Is attention really all we need? For some years now, the NLP community has clearly believed so: transformers are the key component of SotA architectures for language modeling, translation, and summarization.

A few months ago, a transformer-based architecture for image classification, proposed in this paper by Google, outperformed SotA CNN-based architectures. One drawback of this architecture is the (understandably) large number of parameters and the resources required to train it.

In a fresh paper, Facebook has…


Photo by Simon Migaj on Unsplash

In this post, I will share my understanding of the Vision Transformer architecture. All the drawings in this post are original content, based on the paper and on other tutorials, which are referenced where appropriate.

Transformer architectures have been a major breakthrough in Natural Language Processing (NLP) since they were proposed in 2017. Google’s BERT and OpenAI’s GPT-2 / GPT-3 architectures have been state-of-the-art solutions for various tasks, including language modeling, text summarization, and question answering.

The aim was to prove that recurrent neural networks can be completely replaced, and solutions can be…

Andrei-Cristian Rad

Studying MSc in Artificial Intelligence 🤓
