Assessment of the possibility of acoustic voice analysis
https://doi.org/10.21518/ms2025-029
Abstract
Introduction. Voice disorders affect approximately 30% of the country’s population. The most extensively studied voice characteristics include fundamental frequency, pitch and amplitude measures, harmonics-to-noise ratio, cepstral peak prominence, the Acoustic Voice Quality Index, maximum phonation time, variation in fundamental frequency, and the number of pauses in the speech signal.
Aim. To review the literature on the feasibility of acoustic voice analysis in patients with dysphonia.
Materials and methods. The authors searched for publications in the electronic databases PubMed, Web of Science, Google Scholar and ELibrary. The search was carried out using the following keywords: “voice acoustic analysis”, “voice disorder”, “artificial neural network”, “dysphonia”, “standard deviation of fundamental frequency”, “voice quality”, “acoustic voice analysis”.
Results and discussion. Fundamental frequency may be a more sensitive measure for objective clinical voice assessment than pitch and amplitude measures. Cepstral peak prominence is an integral part of acoustic voice analysis and helps distinguish dysphonic from normal voices. Cepstral analysis is more sensitive to subtle dysphonic changes than vowel analysis methods. Although machine learning offers high analytical accuracy and ease of use and shows promise for diagnosing dysphonia, its clinical application requires further research.
Conclusions. Acoustic voice analysis offers numerous advantages, including non-invasiveness, cost-effectiveness, and ease of use. It facilitates the acquisition of objective data for evaluating the severity of voice disorders and serves as an indispensable tool for identifying pathologies associated with phonation disturbances. According to the literature, the most informative acoustic voice analysis parameters include fundamental frequency metrics, pitch and amplitude indices, cepstral peak prominence, the voice quality index, maximum phonation time, and the relative noise level in the speech signal.
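As a rough illustration of two of the parameters named above (fundamental frequency and its standard deviation), the sketch below estimates F0 frame by frame from the autocorrelation peak and then summarizes the resulting track. It is not the method of any study cited here. The function names, the 75 to 500 Hz search range, the 8 kHz sample rate, and the synthetic sine frames standing in for recorded vowels are all assumptions made for the example.

```python
import math
import statistics


def estimate_f0(frame, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    Assumption: the frame is voiced and its pitch lies in [f_min, f_max].
    """
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def synth_tone(f0, sample_rate, n_samples):
    """Hypothetical stand-in for a recorded vowel frame: a pure sine at f0."""
    return [math.sin(2 * math.pi * f0 * n / sample_rate) for n in range(n_samples)]


SAMPLE_RATE = 8000   # Hz, assumed
FRAME_LENGTH = 800   # samples, i.e. 0.1 s per frame

# Four synthetic frames with known pitch, in Hz.
frames = [synth_tone(f0, SAMPLE_RATE, FRAME_LENGTH) for f0 in (200.0, 200.0, 160.0, 250.0)]

f0_track = [estimate_f0(frame, SAMPLE_RATE) for frame in frames]
f0_mean = statistics.mean(f0_track)   # mean fundamental frequency
f0_sd = statistics.stdev(f0_track)    # sample standard deviation of F0

print(f0_track)  # → [200.0, 200.0, 160.0, 250.0]
print(f0_mean)   # → 202.5
```

Real recordings would additionally need voicing detection, windowing, and protection against octave errors. Dedicated tools such as Praat handle these steps, so the sketch only conveys the idea behind the measurement.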
About the Authors
I. S. Timerbulatov
Russian Federation
Ilgiz S. Timerbulatov - Assistant at the Department of Otorhinolaryngology, Bashkir State Medical University.
3, Lenin St., Ufa, 45008
E. E. Savelieva
Russian Federation
Elena E. Savelieva - Dr. Sci. (Med.), Associate Professor, Head of the Department of Otorhinolaryngology with the course of Medical Education, Bashkir State Medical University.
3, Lenin St., Ufa, 45008
R. M. Pestova
Russian Federation
Rimma M. Pestova - Assistant at the Department of Otorhinolaryngology, Bashkir State Medical University.
3, Lenin St., Ufa, 45008
I. I. Zagidullina
Russian Federation
Ilzia I. Zagidullina - Assistant at the Department of Otorhinolaryngology, Bashkir State Medical University.
3, Lenin St., Ufa, 45008
R. S. Timerbulatov
Russian Federation
Rail S. Timerbulatov - Student, Bashkir State Medical University.
3, Lenin St., Ufa, 45008
Review
For citations:
Timerbulatov IS, Savelieva EE, Pestova RM, Zagidullina II, Timerbulatov RS. Assessment of the possibility of acoustic voice analysis. Meditsinskiy sovet = Medical Council. 2025;(7):185-190. (In Russ.) https://doi.org/10.21518/ms2025-029