Background
The global deaf population is roughly estimated at 0.1% of the total population. Deafness has various causes, but in a considerable share of cases the inner ear, i.e., the cochlear structure, is damaged. Nowadays, however, it is possible to bypass the peripheral auditory system and stimulate the auditory nerve fibers directly by means of cochlear implants (CIs). CIs have been the subject of intensive research for over 50 years.
Even though CIs are the most successful neural prostheses to date, they can restore hearing only partially. By the end of the second year after implantation, patients achieve an average of about 80% in speech recognition tests under quiet conditions (without lip-reading) [1], but most cochlear implant recipients remain unable to enjoy music or to distinguish among complex sounds, especially in noisy environments.
Interestingly, the computational strategies of today’s CI systems are still based on simple algorithms, which can hardly mimic the complex functionality of the human auditory system. On the other hand, numerous biologically motivated models of cochlear processing (and of auditory structures beyond) have been developed during the last 20 years. These bio-motivated models have several advantages over common filterbanks: they offer high spectrotemporal resolution; they often include adaptation mechanisms that have been shown to have a positive effect on speech recognition [2]; and they mimic, to some extent, the spectral delays [3] introduced by the traveling wave on the basilar membrane. The importance of the latter is shown, e.g., in [4] and [5].
References
[1] J. Rouger, S. Lagleyre, B. Fraysse, S. Deneve, O. Deguine, and P. Barone, “Evidence that cochlear-implanted deaf patients are better multisensory integrators,” Proc. Natl. Acad. Sci. USA, vol. 104 (17), pp. 7295-7300, 2007.
[2] M. Holmberg, D. Gelbart, and W. Hemmert, “Automatic Speech Recognition with an Adaptation Model Motivated by Auditory Processing,” IEEE Trans. Audio, Speech and Lang. Process., vol. 14 (1), pp. 43-49, 2006.
[3] S. Greenberg, D. Poeppel, and T. Roberts, “A space-time theory of pitch and timbre based on cortical expansion of the cochlear traveling wave delay,” in Proc. 11th Int. Symp. on Hearing, Grantham, 1997.
[4] T. Harczos, G. Szepannek, A. Kátai, and F. Klefenz, “An auditory model based vowel classification,” in Proc. IEEE Biomed. Circuits and Systems Conf., London, UK, pp. 69-72, 2006.
[5] D. A. Taft, D. B. Grayden, and A. N. Burkitt, “Speech coding with traveling wave delays: Desynchronizing cochlear implant frequency bands with cochlea-like group delays,” Speech Communication, vol. 51 (11), pp. 1114-1123, 2009.