Journal of Radio Electronics. eISSN 1684-1719. 2025. №8
Full text in Russian (pdf)
DOI: https://doi.org/10.30898/1684-1719.2025.8.15
Detection of Abrupt Intensity Changes in Speech Signals Based on the Receptive Field Concept
M.M. Gutorov, V.E. Antsiperov
Kotelnikov IRE RAS
125009, Russia, Moscow, Mokhovaya str., 11, b.7
The paper was received September 10, 2025.
Abstract. This paper examines the feasibility of automatically detecting vowel and word boundaries in speech signals given as a sampled representation produced by a neuromorphic model of the peripheral stage of the biological auditory system. An algorithm is proposed that processes the speech signal sequentially to locate the temporal boundaries of speech elements. First, an index of the sharpness of intensity changes is computed using a system of temporal receptive fields, which yields candidate segment boundaries. Onset and offset events of speech fragments are then selected, and vowel boundaries are verified against the local average spiking activity. To improve the detection of word boundaries, the method additionally applies envelope analysis with adaptive thresholding, which makes the segmentation results robust and reproducible. Performance was evaluated against manually annotated, time-aligned vowel boundaries using the root mean square error and coincidence metrics. For word boundaries, the detection error is on the order of tens of milliseconds, which confirms the practical viability of the proposed approach. The method is more sensitive at vowel offsets, where it occasionally produces false detections; this indicates a need for adaptive contextual rules. The proposed system is suitable for real-time analysis and segmentation of speech signals.
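The first and last stages of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two adjacent rectangular windows standing in for a pair of temporal receptive fields, the normalised-contrast form of the index, the fixed threshold, and the function names `sharpness_index`, `candidate_boundaries` and `evaluate` are all assumptions made for illustration. The coincidence metric is likewise assumed to be the fraction of reference boundaries that have a detection within a given tolerance.

```python
import numpy as np


def sharpness_index(intensity, half_width):
    """Signed contrast between mean intensity just after and just before each sample.

    Assumption: two adjacent rectangular windows of `half_width` samples stand in
    for a pair of temporal receptive fields; the paper's fields may differ.
    """
    x = np.asarray(intensity, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    idx = np.arange(half_width, x.size - half_width)
    left = (csum[idx] - csum[idx - half_width]) / half_width    # mean of x[i-h:i]
    right = (csum[idx + half_width] - csum[idx]) / half_width   # mean of x[i:i+h]
    total = left + right
    index = np.zeros(x.size)
    ok = total > 0                     # avoid division by zero in silent regions
    index[idx[ok]] = (right[ok] - left[ok]) / total[ok]
    return index


def candidate_boundaries(index, threshold):
    """Local extrema of the index beyond +/- threshold: (onsets, offsets)."""
    onsets, offsets = [], []
    for i in range(1, len(index) - 1):
        v = index[i]
        if v >= threshold and v >= index[i - 1] and v > index[i + 1]:
            onsets.append(i)           # intensity rises sharply: candidate onset
        elif v <= -threshold and v <= index[i - 1] and v < index[i + 1]:
            offsets.append(i)          # intensity falls sharply: candidate offset
    return onsets, offsets


def evaluate(detected, reference, tolerance):
    """RMSE over matched boundaries and the fraction of references matched.

    Assumption: a reference boundary counts as matched when its nearest
    detection lies within `tolerance` (same units as the boundaries).
    """
    detected = np.asarray(detected, dtype=float)
    errors = []
    for r in reference:
        if detected.size == 0:
            break
        e = detected[np.argmin(np.abs(detected - r))] - r
        if abs(e) <= tolerance:
            errors.append(e)
    coincidence = len(errors) / len(reference)
    rmse = float(np.sqrt(np.mean(np.square(errors)))) if errors else float("nan")
    return rmse, coincidence


# Synthetic intensity: quiet background, one loud segment, quiet background.
signal = np.concatenate((np.full(100, 0.1), np.full(100, 1.0), np.full(100, 0.1)))
onsets, offsets = candidate_boundaries(sharpness_index(signal, 20), 0.5)
rmse, coincidence = evaluate(onsets + offsets, [100, 200], tolerance=5)
```

Boundaries here are sample indices; dividing by the sampling rate converts the error to milliseconds. Normalising the window difference by the window sum makes the index depend on relative rather than absolute intensity change, which is one plausible design choice and not necessarily the one used in the paper.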
Key words: speech segmentation; vowel detection; word boundaries; neuromorphic processing; auditory perception model; receptive fields; spiking activity; event detection; speech signal processing; acoustic analysis.
Financing: This research was carried out within the state assignment (Registration No. AAAA-A19-119041590070-1) of the V.A. Kotelnikov Institute of Radio Engineering and Electronics, Russian Academy of Sciences.
Corresponding author: Gutorov Mikhail Mikhailovich, gutorov.m.m@gmail.com
References
1. Bello J.P., Daudet L., Abdallah S., Duxbury C., Davies M., Sandler M.B. A Tutorial on Onset Detection in Music Signals // IEEE Transactions on Speech and Audio Processing. 2005. Vol. 13, No. 5. P. 1035–1047. https://doi.org/10.1109/TSA.2005.851998
2. Osses A., Varnet L., Carney L.H., Dau T., Bruce I.C., Verhulst S., Majdak P. A comparative study of eight human auditory models of monaural processing // Acta Acustica. 2022. Vol. 6. P. 17. https://doi.org/10.1051/aacus/2022008
3. de Cheveigné A. Simple and efficient auditory-nerve spike generation // bioRxiv. 2023. https://doi.org/10.1101/2023.05.02.539135
4. Land E.H., McCann J.J. Lightness and Retinex Theory // Journal of the Optical Society of America. 1971. Vol. 61, No. 1. P. 1–11. https://doi.org/10.1364/JOSA.61.000001
5. Antsiperov V.E., Gutorov M.M. Signal Intensity Change Point Detection by System of Overlapped Receptive Fields Based on Modeling Perception Mechanisms of Living Sensory Systems // Proc. 25th International Conference on Digital Signal Processing (DSP 2025), Costa Navarino, Greece. 2025. (to appear).
6. Boersma P., Weenink D. Praat: doing phonetics by computer. Version 6.4.42 [Electronic resource]. Amsterdam: University of Amsterdam, 1992–. Available at: http://www.fon.hum.uva.nl/praat/ (accessed 14.09.2025).
For citation:
Gutorov M.M., Antsiperov V.E. Detection of abrupt intensity changes in speech signals based on the receptive field concept // Journal of Radio Electronics. 2025. No. 8. https://doi.org/10.30898/1684-1719.2025.8.15 (In Russian)