Search the digital library catalog

1 - 10 / 31
1.
Influence of highly inflected word forms and acoustic background on the robustness of automatic speech recognition for human–computer interaction
Andrej Žgank, 2022, original scientific article

Abstract: Automatic speech recognition is essential for establishing natural communication with a human–computer interface. Speech recognition accuracy strongly depends on the complexity of language. Highly inflected word forms are a type of unit present in some languages. The acoustic background presents an additional important degradation factor influencing speech recognition accuracy. While the acoustic background has been studied extensively, the highly inflected word forms and their combined influence still present a major research challenge. Thus, a novel type of analysis is proposed, where a dedicated speech database composed solely of highly inflected word forms is constructed and used for tests. Dedicated test sets with various acoustic backgrounds were generated and evaluated with the Slovenian UMB BN speech recognition system. The baseline word accuracies of 93.88% and 98.53% were reduced to as low as 23.58% and 15.14% for the various acoustic backgrounds. The analysis shows that the word accuracy degradation depends on and changes with the acoustic background type and level. The highly inflected word forms’ test sets without background decreased word accuracy from 93.3% to only 63.3% in the worst case. The impact of highly inflected word forms on speech recognition accuracy was reduced with the increased levels of acoustic background and was, in these cases, similar to the non-highly inflected test sets. The results indicate that alternative methods in constructing speech databases, particularly for the low-resourced Slovenian language, could be beneficial.
Keywords: human–computer interaction, automatic speech recognition, acoustic modeling, highly inflected word forms, acoustic background
Published in DKUM: 28.03.2025; Views: 0; Downloads: 2
.pdf Full text (1,12 MB)

2.
Strategies for managing time and costs in speech corpus creation : insights from the Slovenian ARTUR corpus
Darinka Verdonik, Andreja Bizjak, Andrej Žgank, Mirjam Sepesy Maučec, Mitja Trojar, Jerneja Žganec Gros, Marko Bajec, Iztok Lebar Bajec, Simon Dobrišek, 2024, original scientific article

Abstract: Parliamentary debates represent an essential part of democratic discourse and provide insights into various socio-demographic and linguistic phenomena - parliamentary corpora, which contain transcripts of parliamentary debates and extensive metadata, are an important resource for parliamentary discourse analysis and other research areas. This paper presents the Slovenian parliamentary corpus siParl, the latest version of which contains transcripts of plenary sessions and other legislative bodies of the Assembly of the Republic of Slovenia from 1990 to 2022, comprising more than 1 million speeches and 210 million words. We outline the development history of the corpus and also mention other initiatives that have been influenced by siParl (such as the Parla-CLARIN encoding and the ParlaMint corpora of European parliaments), present the corpus creation process, ranging from the initial data collection to the structural development and encoding of the corpus, and given the growing influence of the ParlaMint corpora, compare siParl with the Slovenian ParlaMint-SI corpus. Finally, we discuss updates for the next version as well as the long-term development and enrichment of the siParl corpus.
Keywords: recording speech, transcribing speech, transcription guidelines, Less-resourced language
Published in DKUM: 04.02.2025; Views: 0; Downloads: 8
.pdf Full text (1,09 MB)

3.
An end-to-end framework for extracting observable cues of depression from diary recordings
Izidor Mlakar, Umut Arioz, Urška Smrke, Nejc Plohl, Valentino Šafran, Matej Rojc, 2024, original scientific article

Abstract: Because of the prevalence of depression, its often-chronic course, relapse and associated disability, early detection and non-intrusive monitoring are crucial tools for timely diagnosis and treatment, remission of depression and prevention of relapse. In this way, its impact on quality of life and well-being can be limited. Current attempts to use artificial intelligence for the early classification of depression are mostly data-driven and thus non-transparent and lack effective means to deal with uncertainties. Therefore, in this paper, we propose an end-to-end framework for extracting observable depression cues from diary recordings. Furthermore, we also explore its feasibility for automatic detection of depression symptoms using observable behavioural cues. The proposed end-to-end framework for extracting depression cues was used to evaluate 28 video recordings from the Symptom Media dataset and 27 recordings from the DAIC-WOZ dataset. We compared the presence of the extracted features between recordings of individuals with and without a depressive disorder. We identified several cues consistent with previous studies in terms of their differentiation between individuals with and without depressive disorder across both datasets among language (i.e., use of negatively valenced words, use of first-person singular pronouns, some features of language complexity, explicit mentions of treatment for depression), speech (i.e., monotonous speech, voiced speech and pauses, speaking rate, low articulation rate), and facial cues (i.e., rotational energy of head movements). The nature/context of the discourse, the impact of other disorders and physical/psychological stress, and the quality and resolution of the recordings all play an important role in matching the digital features to the relevant background.
In this way, the work presented in this paper provides a novel approach to extracting a wide range of cues relevant to the classification of depression and opens up new opportunities for further research.
Keywords: digital biomarkers of depression, facial cues, speech cues, language cues, deep learning, end-to-end pipeline, artificial intelligence
Published in DKUM: 17.01.2025; Views: 0; Downloads: 5
.pdf Full text (2,34 MB)

4.
5.
Learning Chinese as a foreign language : an introduction
Juliane House, Dániel Z. Kádár, 2023, other scientific articles

Abstract: In this introductory paper, we first present the background of the present special issue dedicated to Willis Edmondson. We point out why Edmondson provided a ground-breaking contribution to the field of applied linguistics and why it is particularly timely to edit a special issue centering on his framework. We also argue that Edmondson's bottom-up and strictly language-anchored view on speech acts and interaction is particularly useful to examine the learning of Chinese as a foreign language, by going beyond exoticizing and overgeneralizing views of the Chinese linguaculture. Second, we briefly present what can be regarded as the heart and soul of the Edmondsonian framework, that is, a typology of speech acts and a related procedure through which the relationship between speech acts in interaction can be captured. Third, we present a research procedure that we outlined in our previous work, and which helps implement the Edmondsonian model in the pragmatic study of foreign language learning. Finally, we present the contents of the special issue.
Keywords: Chinese, interaction, speech acts, second language pragmatics
Published in DKUM: 21.02.2024; Views: 341; Downloads: 34
.pdf Full text (348,04 KB)

6.
Learning Chinese in a study abroad context : the case of ritual congratulating
Juliane House, Dániel Z. Kádár, 2023, original scientific article

Abstract: In this paper, we explore how and why the realisation of the ritual act of congratulating may turn out to be challenging for foreign learners of Chinese. Congratulating and other ceremonial ritual acts are not only used when someone participates in an actual ritual event: they are also required when they are mentioned in mundane events, like casual interactions when someone talks about a family birthday or wedding. We define congratulation as an interactional move and systematically examine its conventional realisation patterns through a typology of speech acts. We approach congratulating moves by combining a speech act-anchored analysis with the field of study abroad and interaction ritual research. The results of our analysis show that various ritual occasions may trigger different types of inappropriate uses of congratulating by foreign learners of Chinese.
Keywords: Chinese, interaction rituals, speech acts, study abroad, congratulating
Published in DKUM: 20.02.2024; Views: 445; Downloads: 29
.pdf Full text (462,61 KB)

7.
Methodology of immersive video application : the case study of a virtual tour
Jure Jazbinšek, Gorazd Hren, 2021, original scientific article

Abstract: A Virtual Tour is an interactive presentation of real places accessible directly with an Internet browser with no additional installations of apps or plugins. Once 360° photos are recorded and processed (stitched into spherical panoramas), editing of a Virtual Tour (walk) enables connection of spherical panoramic photos (or videos) into interactive presentations. For an enhanced experience and stand-alone presenting ability, features are added, like natural-sounding voice for text-to-speech descriptions and embedded videos. During multiple virtual tour presentations, users, viewers and presenters reported exceptional usability and an immersive experience. Virtual Tours have great potential to reshape the future education process and establish a new benchmark for presentation. The Virtual Tours application is expected to be used in education, tourism and future building sites or industry, as a key component for workforce briefings, and for “as built” documentation of the various stages of a build, with the possibility of integration into Building Information Modelling (BIM) models.
Keywords: virtual tour, 360 camera, RICOH THETA Z1, 3dVista, Text to Speech
Published in DKUM: 13.11.2023; Views: 443; Downloads: 5
.pdf Full text (2,21 MB)

8.
Rhetorical Figures Between Traditional Poetry and Rap
Rok Klemenčič, 2020, master's thesis

Abstract: This Master’s thesis looks into figures of speech used in classical poetry as compared to rap. With the help of Heinrich F. Plett’s analytic scheme, we have established a model for future linguistic comparison which could be modified and applied to various fields and works. The results of the analysis have shown similarities and differences between poetry and rap from the point of view of figures as well as other rhetorical devices: rhythm, structure, motifs, etc. The figures are identified and their potential influence on the perception of the poem/rap song is analysed. Stylistic analysis serves as a basis for the comparison, and the thematically connected pairs allow us a deeper insight into the selected works. Some similarities beyond the mere use of figures are established, i.e., motifs, themes, rhythm and meter. This analysis allows us to compare the genres, while the results show tendencies and characteristics typical of either traditional poetry or rap, i.e., use of certain rhymes, orthography, and punctuation.
Keywords: Figures of speech, Stylistics, Poetry, Rap, Comparative approach
Published in DKUM: 25.01.2021; Views: 977; Downloads: 117
.pdf Full text (1,11 MB)

9.
A monosemic account of modality in speech act theory
Niko Šetar, 2020, master's thesis

Abstract: The connection between modality in the English language and pragmatics is a matter of extensive debate, as it often seems there is no concrete way of establishing a sensible correlation between the modality that an utterance contains and its pragmatic function, which is due to numerous issues pertaining to different accounts of both modality and speech act theory. The traditional view of modality splits modal verbs into three categories: epistemic, deontic and dynamic (also known as simple root modality). The problem with this view is that there is no way of determining whether a certain modal verb is used in an epistemic, deontic, or dynamic sense, as most modals can serve any of the three functions; therefore, explaining modality within this framework is highly ambiguous even when relying on the broader context of the utterance containing a certain modal. The traditional view of speech acts, on the other hand, divides them into locutionary, illocutionary and perlocutionary speech acts. Yet it would seem that all modalities pertain only to illocutionary speech acts, as they are the ones that express the speaker's intentions, which are most heavily influenced by modality. The connection between traditionalist accounts is therefore quite impossible. A more contemporary view splits speech acts into assertive, commissive, constative, directive and imperative speech acts, while we may consider locution, illocution and perlocution to be aspects of these speech acts, rather than separate categories. In this case, different modalities may be connected to different speech acts, but the ambiguity that the traditional view of modality contains persists into any attempt to draw the connection between modality and speech act. Therefore, an alternative account of modality is required. Two well-known such accounts are polysemic and monosemic views of modality.
Polysemic views claim that every lexeme (in our case, a modal verb) may possess several semantic meanings, while monosemic views maintain that every lexeme can be defined in the sense of a single meaning. Reviewing polysemic accounts shows that their reliance on multiple meanings and definitions for every lexeme leads to similar ambiguities as the traditional view of modality, and can therefore not be used in our efforts. Monosemic views, however, differ greatly from one another. While some accounts have been shown to be inadequate, Groefsema’s 1995 account serves the required purpose. The author defines each modal verb in the sense of the proposition expressed by the modal and an additional minimal set of propositions that supports the use of that particular modal. Kissine (2013) similarly defines speech acts, thus a correlation between modal verbs and speech acts may be established. Finally, we attempt to show that each modal verb with a particular minimal set of supporting propositions can only feature in one type of speech act, thus also defining the speech act within which it is contained.
Keywords: Pragmatics, modality, speech acts, epistemic, deontic, dynamic, assertives, commissives, constatives, directives, imperatives, polysemy, monosemy
Published in DKUM: 16.09.2020; Views: 1231; Downloads: 124
.pdf Full text (355,48 KB)

10.
Gender-Based Conversational Styles in Reality TV
Vesna Videmšek, 2019, master's thesis

Abstract: This master’s thesis deals with differences in conversational styles between men and women in the American reality TV show Big Brother. The main purpose of this study is to investigate the conversations in order to find out whether there exist any differences between male and female speakers in the use of minimal responses, hedges, questions, swear words, overlapping speech and topics in mixed-gender and same-gender conversations. Nine conversations between ten different contestants of the reality TV show Big Brother were randomly selected, transcribed and analyzed. The results of this master’s thesis reveal that women use more hedges and minimal responses than men. There are, however, no gender differences in the placement of minimal responses. Both men and women produce the majority of minimal responses at non-transition relevant places. Furthermore, women swear less than men and use milder swear words. Men, on the other hand, are more inclined to use competitive overlapping speech. They also ask more expressive style questions than women, whereas women ask more relational questions than men. Lastly, there are no significant differences in topic choice between men and women.
Keywords: gender, conversational style, reality TV, hedges, minimal responses, swear words, questions, overlapping speech, conversational topics
Published in DKUM: 19.07.2019; Views: 1266; Downloads: 136
.pdf Full text (2,27 MB)
