[1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the Advances in Neural Information Processing Systems (NIPS), Nevada, USA, 2012:1097-1105. [2] Piczak K J. Environmental sound classification with convolutional neural networks[C]//Proceedings of the 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), Boston, MA, 2015:1-6. [3] Zhang X, Zou Y, Shi W. Dilated convolution neural network with LeakyReLU for environmental sound classification[C]//Proceedings of the 22nd International Conference on Digital Signal Processing, London, UK, 2017:1-5 [4] Zhrer M, Pernkopf F. Gated recurrent networks applied to acoustic scene classification and acoustic event detection[C]//European Signal Processing Conference, 2016. [5] Sitzmann V, Martel J, Bergman A, et al. Implicit neural representations with periodic activation functions[C]//The 34th Conference on Neural Information Processing Systems, 2020. [6] Tokozume Y, Harada T. Learning environmental sounds with end-to-end convolutional neural network[C]//Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, 2017:2721-2725. [7] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780. [8] Chung J, Gulcehre C, Cho K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling, 2014[2020-08-15]. https://arxiv.org/abs/1412.3555. [9] Chan W, Jaitly N, Le Q V, et al. Listen, attend and spell:a neural network for large vocabulary conversational speech recognition[C]//Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Shanghai, 2016:4960-4964. [10] Salamon J, Jacoby C, Bello J P. A dataset and taxonomy for urban sound research[C]//Proceedings of the 22nd ACM International Conference on Multimedia, Xiamen, 2014:1041-1044. [11] Mcfee B, Raffel C, Liang D, et al. Librosa:audio and music signal analysis in Python[C]//Proceedings of the 14th Python in Science Conference, Austin, Texas, 2015:18-25. [12] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017:2980-2988. [13] Chen Y, Guo Q, Liang X, et al. Environmental sound classification with dilated convolutions[J]. Applied Acoustics, 2019, 148:123-132. [14] 张科, 苏雨, 王靖宇, 等. 基于融合特征以及卷积神经网络的环境声音分类系统研究[J]. 西北工业大学学报, 2020, 38(1):162-169. Zhang K, Su Y, Wang J Y, et al. Environment sound classification system based on hybrid feature and convolutional neural network[J]. Journal of Northwestern Polytechnical University, 2020, 38(1):162-169. (in Chinese) [15] Zhang Z, Xu S, Cao S, et al, Deep convolutional neural network with mixup for environmental sound classification[C]//Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision. Cham:Springer, 2018:356-367. [16] Lim M, Lee D, Park H, et al. Convolutional neural network based audio event classification[J]. KSII Transactions on Internet and Information Systems, 2018, 12(6):2748-2760. |