Bird Species Identification using Convolutional Neural Networks
J. Martinsson
Master's Thesis, Chalmers University of Technology, Jul 2017.
An area of interest in ecology is monitoring animal populations to better understand their behavior, biodiversity, and population dynamics. Acoustically active animals can be automatically classified by their sounds, and birds are a particularly useful ecological indicator, as they respond quickly to changes in their environment. The aim of this study is to improve upon the state-of-the-art bird species classifier [1], which is implemented and used as a baseline. The questions asked are: Can deep residual neural networks learn to classify bird species based on bird song, and how well do they perform? Do multiple-width frequency-delta data augmentation or meta-data fusion further increase the accuracy of the model? The questions are answered by training a deep residual neural network on one of the largest bird song data sets in the world, with and without the use of multiple-width frequency-delta data augmentation and meta-data fusion, and by comparing the results with the baseline. The study shows that deep residual neural networks can learn to classify bird species based on bird song and that the mean average precision of the classifier nearly matches the state of the art. We further develop a proof of concept for meta-data fusion, which indicates that fusion of elevation data can be used to increase the accuracy of the model, and in particular decrease its coverage error. Possible ways forward are to tune the hyperparameters of the deep residual neural network, to fuse time of recording and geographical location data into the model, or to move towards the more realistic, but less studied, open set problem of continuous classification rather than the N-class problem studied in this thesis.
A reprint is available as PDF.