Birdwatching with Convolutional Neural Networks

Leveraging fastai and the Bing Image Search API to perform image classification on birds of prey

David Dancis
7 min read · Mar 1, 2021
Photo by Mathew Schwartz on Unsplash

Introduction

I grew up in a family of avid birdwatchers, and although I haven’t been birding in years, I do have fond memories of birdwatching with my family growing up. As I’ve recently been experimenting with the fast.ai Deep Learning for Coders course, I wanted to see if I could build a deep learning model that could accurately classify different birds of prey.

Data Collection

I decided to limit my model to 3 types of birds: eagles, ospreys, and red-tailed hawks. The 3 are similar enough in nature that an inexperienced birder could confuse them, although they are distinct enough that a deep learning model should be able to differentiate each species.

To build the model, I began by importing the required libraries, including fastai for model training, as well as IPython and matplotlib to display the images.

Required libraries for image classification model
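The import cell looked roughly like the following. This is a sketch based on the fastai course setup (the `fastbook` package bundles the course utilities such as `search_images_bing`); the exact imports in my notebook may have differed slightly.

```python
# Core fastai vision API: DataBlock, cnn_learner, transforms, etc.
from fastai.vision.all import *
# Course utilities, including the Bing search helper
from fastbook import *

# For rendering images and plots inline in the notebook
from IPython.display import display
import matplotlib.pyplot as plt
```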

With the libraries imported, I then set up the Bing Image Search API to gather the required images for my training dataset. A prerequisite for this step is an existing Microsoft Azure account; however, the account is free to set up, and there is also a free tier available for the Image Search API. With this resource enabled, I used my personal key to access the API from my Jupyter notebook. To check that I could gather the required images from Bing, I performed a test search for “eagle” and set the search parameters to return 150 publicly available photos.

Setting the Bing Image Search term parameters
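For reference, the request can be sketched as below. Rather than the course's `search_images_bing` helper, this version calls the Bing Image Search v7 REST endpoint directly; the endpoint URL and parameter names come from that API, and `subscription_key` is a placeholder for the personal Azure key.

```python
def build_bing_search(query, count=150):
    """Build the URL and query parameters for a Bing Image Search v7 call."""
    search_url = "https://api.bing.microsoft.com/v7.0/images/search"
    params = {
        "q": query,           # search term, e.g. "eagle"
        "count": count,       # number of results to request (150 here)
        "license": "public",  # restrict to publicly available photos
        "imageType": "photo", # photos only, no clipart or line drawings
    }
    return search_url, params

def search_images_bing(subscription_key, query, count=150):
    """Call the API and return the list of image results (needs a valid key)."""
    import requests
    url, params = build_bing_search(query, count)
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()["value"]  # each item carries a 'contentUrl' field
```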

With the search parameters defined, I then ran a sanity test to see what types of images were returned through my “eagle” search.

Bald eagle images gathered through the Bing Image Search API

As expected, the search returned a number of images containing bald eagles. Since this code appeared to be working, the next step was to expand my search to include 150 images for each bird type in the model: eagles, ospreys, and red-tailed hawks.

In addition to expanding the search terms, I also needed to save the images to the appropriate folder for model training — here I’ve set the path to be in the “birds_of_prey” folder.

Setting search terms and saving images to the appropriate folder
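The download step can be sketched as follows. It mirrors the fastai idiom of one subfolder per search term under a root path; the folder layout is built in plain Python, with the actual network fetch (which needs fastai and a Bing key) left as the commented-out final step.

```python
from pathlib import Path

bird_types = ["eagle", "osprey", "red-tailed hawk"]
path = Path("birds_of_prey")

def plan_downloads(root, birds):
    """Map each search term to the folder its images will be saved in.
    The folder name doubles as the label later (via parent_label)."""
    return {bird: root / bird for bird in birds}

plan = plan_downloads(path, bird_types)

# The actual fetch uses fastai's download_images (network + key required):
# for bird, folder in plan.items():
#     folder.mkdir(parents=True, exist_ok=True)
#     results = search_images_bing(key, bird, count=150)
#     download_images(folder, urls=[r["contentUrl"] for r in results])
```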

Inevitably, some of the images will have bad links, so I ran the verify_images function from fastai to identify all of the failed URLs, and then removed them to enable the model to train smoothly.

Removing all the failed URLs from the dataset

Training the Model

Now that I had a full dataset of working URLs, I could begin to train the model using fastai’s DataBlock API, which requires several key arguments:

  • Blocks: First, I needed to provide the independent variable, which in this case is the ImageBlock (i.e. a picture of a bird). Next, I needed to provide the dependent variable, which in this case is the type of bird (e.g. “eagle”).
  • Get_Items: Next, I accessed the dataset using the get_image_files function which returns a list of all the images in a selected path.
  • Splitter: I divided the data into training and validation sets. The “0.2” means that 20% of the data will be used for validation, while the remaining 80% will be used to train the model. The “seed” fixes the random split, and setting it to a consistent integer makes the results reproducible each time I re-run the model.
  • Get_y: This argument tells the model how to identify the dependent variable. Since all of the images for eagles, ospreys, and red-tailed hawks are saved in folders bearing their respective names, I’m using the fastai function parent_label, which labels each image based upon the name of its parent folder.
  • Item_tfms: All of the images in the dataset are different sizes, and so I’m using item transforms to resize all the images to a consistent size of 128 pixels.
Leveraging DataBlock for data preparation
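Putting the arguments above together, the definition follows the standard fastai DataBlock pattern. This is a sketch: it assumes `path` points at the birds_of_prey folder with one subfolder per species, and the seed value 42 is an illustrative choice (any fixed integer works).

```python
from fastai.vision.all import *

birds = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # independent / dependent variables
    get_items=get_image_files,                        # collect every image under `path`
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # 80/20 train/validation split
    get_y=parent_label,                               # label = name of parent folder
    item_tfms=Resize(128),                            # resize everything to 128x128
)
dls = birds.dataloaders(path)
dls.show_batch(max_n=6)  # visual sanity check of images and labels
```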

With all of the data prepared, I could now begin to train the model using a convolutional neural network. I used fastai’s cnn_learner for this purpose and set the architecture to resnet34, meaning the neural network is 34 layers deep. I then called fine_tune because I’m leveraging a pre-trained model, so there’s no need to train from scratch; in fact, doing so could hurt performance.

After training the model for 4 epochs, I had an error rate of 19%, giving the model an accuracy of 81%. This wasn’t a bad success rate for a first run, especially since the data hadn’t been cleaned yet. To dig a bit deeper into the model’s performance, I created the confusion matrix shown below.
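The training and diagnostic steps are the standard fastai calls, sketched below; exact error rates will vary from run to run.

```python
from fastai.vision.all import *

# Transfer learning from an ImageNet-pretrained 34-layer ResNet
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(4)  # 4 epochs of fine-tuning on the bird data

# Confusion matrix to see which species are getting mixed up
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
```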

Model appears to be having difficulty classifying red-tailed hawks

The model apparently struggled to identify red-tailed hawks: it correctly identified 9 of them but missed another 6, and it also misclassified 3 ospreys and 1 bald eagle as red-tailed hawks.

To try to improve the model’s performance, I used fastai’s ImageClassifierCleaner, which provides a GUI to help in manually cleaning my dataset. I then went through both the training and validation data for all 3 bird types and either corrected or removed any images that appeared to be mislabeled.
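The cleaning widget and its follow-up bookkeeping look like this; the apply loop is the standard pattern from the fastai course.

```python
from fastai.vision.all import *
import shutil

cleaner = ImageClassifierCleaner(learn)
cleaner  # renders the GUI in the notebook; review train and valid for each class

# After marking images in the widget, apply the decisions:
for idx in cleaner.delete():
    cleaner.fns[idx].unlink()                      # drop images marked for deletion
for idx, cat in cleaner.change():
    shutil.move(str(cleaner.fns[idx]), path / cat) # re-file relabeled images
```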

Removing the Bell Boeing V-22 Osprey from my dataset

There were a handful of mislabeled images across all 3 bird species, but the osprey dataset in particular had the largest number. As you can see above, the search returned a number of images of a military aircraft, the Bell Boeing V-22 Osprey, rather than the bird, so I manually deleted all of these images in the hope of improving the model’s accuracy.

After taking another look at the confusion matrix, I could see that the osprey predictions had actually performed quite well despite the presence of the Boeing aircraft in both my training and validation datasets. This is likely because images of the aircraft look nothing like the other bird images and so were rarely confused with anything else, which skewed the osprey class toward an artificially high prediction rate rather than reducing the model’s accuracy.

After cleaning the data, I re-ran the model using the same CNN architecture for 4 epochs and received the results below.

Slight improvement in model performance

The model showed a slight improvement in performance, with a 17% error rate as opposed to the 19% rate from the initial pass. While the prediction accuracy has only shown a marginal improvement, the confusion matrix provides some cause for optimism.

Slight improvement in red-tailed hawk prediction accuracy

Perhaps the key difference from the initial iteration is the overall confidence that we can have in the results. This time around, we no longer have the osprey dataset being skewed by Boeing aircraft, so the results are much more representative of the model’s actual performance. Additionally, we’ve seen a marginal improvement in the model’s ability to accurately classify red-tailed hawks: it correctly classified 12 of them, while only missing 3 (as opposed to 6 in the first iteration) and misclassifying another 4.

Model Inference for New Images

Using fastai’s export method, I then saved the trained model and built a simple app to see how it would classify new images that the model had never seen. I manually fed the model a clear picture of a bald eagle and was pleased to see that it accurately classified the image with a probability of 100%.
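The export-and-inference round trip is short; the file names here are my own placeholders.

```python
from fastai.vision.all import *

learn.export("birds_of_prey.pkl")  # serialize the model plus its data pipeline

learn_inf = load_learner("birds_of_prey.pkl")  # reload for inference
pred, pred_idx, probs = learn_inf.predict("bald_eagle_test.jpg")
print(f"{pred} with probability {probs[pred_idx]:.0%}")
```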

Model accurately predicts clear bald eagle picture with 100% certainty

Conclusion

Although I would have liked to see an overall model accuracy a bit higher than 83%, I was heartened by the fact that the model could correctly classify a clear image of an eagle with 100% probability. As potential next steps to further improve model performance, I may need to alter the neural network architecture, or look at cleaning the data even more, as I’m sure some misclassified birds slipped past my radar. However, I’m generally pleased with the model’s functionality, as it has shown the ability to classify 3 different bird types with a reasonable degree of accuracy using only a limited dataset of 150 images for each bird.

You can find my code here on Github.

You can find me here on LinkedIn.
