

Leveraging Sound Collections for Animal Species Classification with Weakly Supervised Learning

Ilira Troshani; Thiago Gouvea; Daniel Sonntag
In: 3rd Annual AAAI Workshop on AI to Accelerate Science and Engineering. AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE-2024), located at AAAI, February 26, Vancouver, BC, Canada, 2024.


The utilization of Passive Acoustic Monitoring (PAM) for wildlife monitoring remains hindered by the challenge of data analysis. While numerous supervised ML algorithms exist, their application is constrained by the scarcity of annotated data. Expert-curated sound collections are valuable knowledge sources that could bridge this gap. However, their utilization is hindered by the sporadic sounds to be identified in these recordings. In this study, we propose a weakly supervised approach to tackle this challenge and assess its performance using the AnuraSet dataset. We employ TALNet, a Convolutional Recurrent Neural Network (CRNN) model, and train it on 60-second sound recordings labeled for the presence of 42 different anuran species. We conduct the evaluation on 1-second segments, enabling precise sound event localization. Furthermore, we investigate the impact of varying the length of the training input and explore the effects of different pooling functions on TALNet's performance on AnuraSet. Our findings demonstrate the effectiveness of TALNet in harnessing weakly annotated sound collections for wildlife monitoring.
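To make the role of the pooling function concrete, here is a minimal sketch of how per-segment probabilities for one species can be aggregated into a single recording-level probability, which is the quantity a weak (recording-level) label can supervise. The abstract does not name the pooling functions that were compared, so the four below are assumptions: they are common choices in multiple instance learning for sound event detection. The function names and the toy probabilities are hypothetical and not taken from the paper.

```python
import math
from typing import Callable, Dict, List


def max_pool(frame_probs: List[float]) -> float:
    """Recording-level probability is the largest segment probability."""
    return max(frame_probs)


def average_pool(frame_probs: List[float]) -> float:
    """Recording-level probability is the unweighted mean over segments."""
    return sum(frame_probs) / len(frame_probs)


def linear_softmax_pool(frame_probs: List[float]) -> float:
    """Weighted mean in which each segment is weighted by its own probability."""
    total = sum(frame_probs)
    if total == 0.0:
        return 0.0
    return sum(p * p for p in frame_probs) / total


def exp_softmax_pool(frame_probs: List[float]) -> float:
    """Weighted mean in which each segment is weighted by exp(probability)."""
    weights = [math.exp(p) for p in frame_probs]
    return sum(p * w for p, w in zip(frame_probs, weights)) / sum(weights)


# Hypothetical example: 60 one-second segments of a 60-second recording,
# with a short call of one species in seconds 10 to 12.
frame_probs = [0.05] * 60
frame_probs[10:13] = [0.9, 0.95, 0.9]

POOLING: Dict[str, Callable[[List[float]], float]] = {
    "max": max_pool,
    "average": average_pool,
    "linear_softmax": linear_softmax_pool,
    "exp_softmax": exp_softmax_pool,
}

# One recording-level probability per pooling function.
clip_probs = {name: fn(frame_probs) for name, fn in POOLING.items()}

for name, prob in clip_probs.items():
    print(f"{name}: {prob:.3f}")
```

In this toy case a brief call barely moves the average, while max and linear softmax pooling stay sensitive to it. This illustrates why the choice of pooling function can matter when the target sounds occur only sporadically in long recordings.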
