Building an Object Detection Model with FastAI and IceVision

Sadiva Madaan
6 min read · Feb 25, 2022


· Installation
· Data Overview
· Dataset Preparation
· Image Transformations
· Training using FastAI
· Inference on a batch of images
· Inference on a Single Image
· Future Work
· Reference

I started using IceVision recently as I needed to create an object detection model for the latest Computer Vision Kaggle competition. In this blog, I will explain a few important concepts that I found particularly useful as I was getting my feet wet with the IceVision framework.

Installation

Installing IceVision is fairly easy. We can install it by running the following command in the terminal — pip install icevision[all]. For more detailed information about the installation process, you can check out the IceVision documentation, which is pretty straightforward.

Data Overview

We will be using the Humpback Whale Fluke Keypoints data, which has 1000 hand-annotated whale images. From these keypoints, we can easily get the bounding box for a whale. The following code shows how to visualise the keypoints for a particular image.

function to visualise keypoints
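Deriving a bounding box from a set of keypoints boils down to a min/max computation over their coordinates. A minimal sketch (the `pad` parameter is a hypothetical margin, not part of the dataset):

```python
def keypoints_to_bbox(keypoints, pad=0):
    """Compute an axis-aligned bounding box (xmin, ymin, xmax, ymax)
    from a list of (x, y) keypoints, with optional padding."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)

# Example: three fluke keypoints
print(keypoints_to_bbox([(10, 40), (55, 12), (90, 38)], pad=5))
# → (5, 7, 95, 45)
```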

Dataset Preparation

We will be using IceVision’s custom parser to prepare our datasets and dataloaders. To get a better understanding of how a custom parser works, we can use the following command to inspect the parser template.

IceVision Custom Parser

We will create a custom parser class which will inherit from IceVision’s Parser class. Following is the code for it —

We created the WhaleParser class, passing it the template record (i.e. the object detection record) and the data directory where all our images are present. The WhaleParser class adds bounding boxes and labels to every image. These details are supplied through a separate data frame that holds the label and bounding box information for every image.

The following image shows a single record. The whale is covered by a bounding box and is labeled as Whale.

Image Transformations

IceVision lays the foundation to easily integrate different augmentation libraries by using adapters. It implements an adapter for the popular Albumentations library. Following are the transformations we’ll apply to our images after we label them and get their bounding boxes.

  1. Image Presizing — It is an image augmentation technique that minimises data destruction while maintaining good performance. It adopts two strategies: first, resize images to dimensions significantly larger than the target training dimensions. Second, compose all of the common augmentation operations (including a resize to the final target size) into one, and perform the combined operation on the GPU only once at the end of processing.
  2. Normalization — It is important to normalize when we are using pre-trained models (in this case we will be using pre-trained YOLOv5). A pre-trained model works best on data distributed like the data it has already seen. If, for example, 0 is the minimum value in our dataset but was the average value in the pre-training data, then the two distributions will be very different.

The following lines apply these transformations to our train and validation datasets.

We can also have a look at some records after applying these transformations.

We have prepared our final dataset. I chose YOLOv5 for this particular task, but there are tons of other models available in the IceVision library that you can explore and experiment with.

Training using FastAI

One of the major reasons I like IceVision is that it allows you to train deep neural networks with easy-to-use, robust, high-performance libraries such as FastAI. To train the model using FastAI, we will first find a good learning rate using the LR finder.

A brief explanation of what the LR finder does in FastAI — it starts with a very small learning rate, uses it for one mini-batch, measures the loss afterwards, and then increases the learning rate by a certain factor (doubling every time). It keeps doing this until the loss gets worse; this is the point where we know we have gone too far. We then select a learning rate one order of magnitude less than where the minimum loss was achieved (the minimum divided by 10). After training for more than 30 epochs, we get a COCOMetric score around 0.84, and the following plot shows training and validation loss against the number of iterations.
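The training setup can be sketched as follows, assuming the datasets and parser from the previous sections; the backbone choice, batch size, and learning rate are illustrative:

```python
from icevision.all import *

model_type = models.ultralytics.yolov5
backbone = model_type.backbones.small(pretrained=True)
model = model_type.model(backbone=backbone,
                         num_classes=len(parser.class_map),
                         img_size=384)

train_dl = model_type.train_dl(train_ds, batch_size=16, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=16, num_workers=4, shuffle=False)

learn = model_type.fastai.learner(
    dls=[train_dl, valid_dl],
    model=model,
    metrics=[COCOMetric(metric_type=COCOMetricType.bbox)],
)

learn.lr_find()                              # plot loss vs. learning rate, pick ~min/10
learn.fine_tune(30, 1e-3, freeze_epochs=1)   # train for 30 epochs
```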

As we can see, the loss continues to decrease for the validation set.

Inference on a batch of images

It is very easy to get predictions on a batch of images. Below are a few predictions. We can see that our model has been pretty spot on most of the time in detecting whales and their bounding boxes.
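A minimal sketch of batch inference, assuming the trained `model` and the `valid_ds` dataset produced in the training section:

```python
from icevision.all import *

# Build an inference dataloader over the validation dataset and run
# the model over it, keeping the images so we can visualise results.
infer_dl = model_type.infer_dl(valid_ds, batch_size=8, shuffle=False)
preds = model_type.predict_from_dl(model=model, infer_dl=infer_dl, keep_images=True)

# Show a few predictions with their bounding boxes drawn on the images
show_preds(preds=preds[:4])
```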

Inference on a Single Image

IceVision makes it very easy to get a prediction on a single image. Not only does it compute the prediction, it also automatically adjusts the predicted boxes to the original image size. The output for a single-image prediction is a dictionary, which even lets us access the bounding box coordinates directly. IMO, this is super useful.

Bounding box coordinates predicted by our model for a single image
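Single-image inference can be sketched with IceVision's `end2end_detect`, assuming the trained `model`, the `valid_tfms` transforms, and the `parser` from earlier sections; the image path is a placeholder:

```python
from icevision.all import *
from PIL import Image

img = Image.open("sample_whale.jpg")

# end2end_detect applies the validation transforms, runs the model, and
# rescales the predicted boxes back to the original image size.
pred_dict = model_type.end2end_detect(
    img, valid_tfms, model,
    class_map=parser.class_map,
    detection_threshold=0.5,
)

# Each predicted BBox exposes its coordinates directly
for bbox in pred_dict["detection"]["bboxes"]:
    print(bbox.xmin, bbox.ymin, bbox.xmax, bbox.ymax)
```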

Future Work

I will be posting more tutorials and blogs related to Computer Vision and Machine Learning. I will also be exploring IceVision further for tasks related to Image Segmentation.

Reference

  1. https://airctic.com/
  2. https://www.kaggle.com/c/happy-whale-and-dolphin/
  3. https://www.kaggle.com/oewyn000/humpback-whale-fluke-keypoints
  4. https://www.kaggle.com/jprusso/whales-bounding-box

You can check out all the details and code from my github repo -

Connect with me on Linkedin —


Sadiva Madaan

I write about machine learning. (twitter — @sadiva_madaan)