We propose an approach for classifying audio concepts in sports videos using deep belief networks (DBNs), probabilistic neural networks with several hidden layers. A comparison with support vector machine (SVM) classifiers shows that our preliminary results are comparable to the state of the art.
Dataset description
1284 .wav files: mono (1 channel), 16 kHz, 2-second clips
[filename].label files: label for [filename].wav clip
label values:
- 0: SILENCE
- 1: SPEECH_ONLY
- 2: SPEECH_OVER_CROWD
- 3: CROWD_ONLY
- 4: EXCITED
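
The layout above could be read with a short Python sketch like the following. It uses only the standard library. Two points are assumptions, since the description does not state them: that each .label file holds a single integer in plain text, and that every .label file sits in the same directory as its .wav clip. The names LABEL_NAMES, read_label, read_clip and load_dataset are hypothetical helpers, not part of the dataset.

```python
import wave
from pathlib import Path

# Label values as listed in the dataset description.
LABEL_NAMES = {
    0: "SILENCE",
    1: "SPEECH_ONLY",
    2: "SPEECH_OVER_CROWD",
    3: "CROWD_ONLY",
    4: "EXCITED",
}


def read_label(label_path):
    """Read one .label file.

    ASSUMPTION: the file holds a single integer in plain text (e.g. "3").
    The dataset description does not specify the on-disk format.
    """
    value = int(Path(label_path).read_text().strip())
    if value not in LABEL_NAMES:
        raise ValueError(f"unexpected label {value} in {label_path}")
    return value


def read_clip(wav_path):
    """Return (sample_rate, n_channels, raw_pcm_bytes) for one clip."""
    with wave.open(str(wav_path), "rb") as wav:
        return (
            wav.getframerate(),
            wav.getnchannels(),
            wav.readframes(wav.getnframes()),
        )


def load_dataset(root):
    """Pair every [filename].wav under root with its [filename].label.

    ASSUMPTION: both files sit in the same directory.
    """
    samples = []
    for wav_path in sorted(Path(root).glob("*.wav")):
        label = read_label(wav_path.with_suffix(".label"))
        samples.append((wav_path.name, read_clip(wav_path), label))
    return samples
```

Calling load_dataset on the directory holding the clips would return one (file name, clip, label) tuple per .wav file. If the .label files turn out to use a different format, only read_label needs to change.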
If you use the dataset, please cite our paper as follows:
@inproceedings {icme2009,
  Acceptance = {Oral Acceptance Rate 22\%},
  Address = {New York, NY, USA},
  Author = {Ballan, Lamberto and Bazzica, Alessio and Bertini, Marco and Del Bimbo, Alberto and Serra, Giuseppe},
  Booktitle = {Proc. of {IEEE} International Conference on Multimedia & Expo (ICME)},
  Date-Added = {2009-07-09 09:28:17 +0200},
  Date-Modified = {2010-09-13 11:21:06 +0200},
  Doi = {http://dx.doi.org/10.1109/ICME.2009.5202537},
  Keywords = {Audio analysis, Deep Belief Networks},
  Month = {July},
  Pages = {474--477},
  Title = {Deep Networks for Audio Event Classification in Soccer Videos},
  Url = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5202537},
  Year = {2009},
  Bdsk-Url-1 = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5202537},
  Bdsk-Url-2 = {http://dx.doi.org/10.1109/ICME.2009.5202537}
}