Machine-Based and Auditory Identification of Slavic Languages
Jacek Kudera
Trier University
https://orcid.org/0000-0003-3678-1067
Jovana Stevanović
University of Nis
Published 2024-10-02
https://doi.org/10.15388/SlavViln.2024.69(1).4

Keywords

machine identification of linguistic origin
auditory study
comparison
Slavic languages

How to Cite

Kudera, J. and Stevanović, J. (2024) “Machine-Based and Auditory Identification of Slavic Languages”, Slavistica Vilnensis, 69(1), pp. 56–66. doi:10.15388/SlavViln.2024.69(1).4.

Abstract

This paper compares auditory and machine-based identification of linguistic origin. Two studies were conducted to assess the ability of lay listeners and a state-of-the-art machine approach to identify Slavic L1 from delexicalized speech samples. The first study involved 228 native speakers of four Slavic languages (Bulgarian, Czech, Polish and Russian) who had not received any prior training in Slavic philology, phonetics, linguistics, or forensic science. Their task was to identify the linguistic origins of speakers when exposed to limited phonetic cues. The stimuli consisted of meaningless logatomes to control for lexical information. The second study employed machine-based spoken language identification based on two distinct approaches: (1) the formant structure of the phonetic signal and (2) a neural network and vector representation of speech samples. The data showed that Slavic native speakers, even when exposed to limited auditory cues, are able to identify speakers’ L1s. Interestingly, for Bulgarian, the machine-based identification method performed better than the lay listeners. The results of the experiments provide insight into the advantages of hybrid approaches in investigations related to LADO (Language Analysis for the Determination of Origin). Furthermore, the outcomes of this comparison may contribute to the debate on the involvement of native speakers in L1 identification procedures for closely related languages.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
