1 - 6 / 6
1.
Machine translation of independent nominal phrases in technical texts
Simon Zupan, Zmago Pavličič, Melanija Fabčič, 2025, original scientific article

Abstract: This paper deals with machine translations of independent noun phrases in technical texts, which are not part of any sentence structure but function on their own, typically in tables and illustrations. Such nominal structures are common in technical texts because they allow technical writers to increase lexical density and precision of expression. On the other hand, these phrases pose a challenge for machine translation engines, as their meaning depends on the context. The paper considers independent noun phrases from a service manual that were translated from English into Slovene by two different machine translation engines (DeepL and Google Translate). A comparison of the translations with the original revealed some limitations of machine translation engines in translating noun phrases, since approximately half of the phrases showed a noticeable change in meaning.
Keywords: technical texts, machine translation, nominal phrases, translation shifts, technical translation
Published in DKUM: 08.07.2025; Views: 0; Downloads: 7
.pdf Full text (1,14 MB)

2.
Weakly-supervised multilingual medical NER for symptom extraction for low-resource languages
Rigon Sallauka, Umut Arioz, Matej Rojc, Izidor Mlakar, 2025, original scientific article

Abstract: Patient-reported health data, especially patient-reported outcome measures, are vital for improving clinical care but are often limited by memory bias, cognitive load, and inflexible questionnaires. Patients prefer conversational symptom reporting, highlighting the need for robust methods in symptom extraction and conversational intelligence. This study presents a weakly-supervised pipeline for training and evaluating medical Named Entity Recognition (NER) models across eight languages, with a focus on low-resource settings. A merged English medical corpus, annotated using the Stanza i2b2 model, was translated into German, Greek, Spanish, Italian, Portuguese, Polish, and Slovenian, preserving the entity annotations for medical problems, diagnostic tests, and treatments. Data augmentation addressed the class imbalance, and the fine-tuned BERT-based models consistently outperformed the baselines. The English model achieved the highest F1 score (80.07%), followed by German (78.70%), Spanish (77.61%), Portuguese (77.21%), Slovenian (75.72%), Italian (75.60%), Polish (75.56%), and Greek (69.10%). Compared to the existing baselines, our models demonstrated notable performance gains, particularly in English, Spanish, and Italian. This research underscores the feasibility and effectiveness of weakly-supervised multilingual approaches for medical entity extraction, contributing to improved information access in clinical narratives, especially in under-resourced languages.
Keywords: low-resource languages, machine translation, medical entity extraction, NER, NLP, patient-reported outcomes, weakly-supervised learning
Published in DKUM: 19.05.2025; Views: 0; Downloads: 4
.pdf Full text (338,94 KB)
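
Note on the F1 scores quoted in the abstract above: the listing does not say whether they are computed at the entity or the token level. The following is a minimal sketch of an entity-level F1, assuming exact span-and-label matching. The `Entity` tuple layout, the function name `entity_f1`, and the example spans are hypothetical and are not taken from the article.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token, label).
Entity = Tuple[int, int, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Entity-level F1 under exact matching (an assumption, not the paper's stated metric)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; the labels mirror the three entity types in the abstract.
gold_entities = {(0, 2, "problem"), (5, 6, "test"), (9, 11, "treatment")}
predicted_entities = {(0, 2, "problem"), (5, 6, "treatment")}
score = entity_f1(gold_entities, predicted_entities)
```

Here one of the two predictions matches one of the three gold entities, so precision is 1/2 and recall is 1/3.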

3.
On the use of morpho-syntactic description tags in neural machine translation with small and large training corpora
Gregor Donaj, Mirjam Sepesy Maučec, 2022, original scientific article

Abstract: With the transition to neural architectures, machine translation achieves very good quality for several resource-rich languages. However, the results are still much worse for languages with complex morphology, especially if they are low-resource languages. This paper reports the results of a systematic analysis of adding morphological information into neural machine translation system training. Translation systems presented and compared in this research exploit morphological information from corpora in different formats. Some formats join semantic and grammatical information and others separate these two types of information. Semantic information is modeled using lemmas and grammatical information using Morpho-Syntactic Description (MSD) tags. Experiments were performed on corpora of different sizes for the English–Slovene language pair. The conclusions were drawn for a domain-specific translation system and for a translation system for the general domain. With MSD tags, we improved the performance by up to 1.40 and 1.68 BLEU points in the two translation directions. We found that systems with training corpora in different formats improve the performance differently depending on the translation direction and corpora size.
Keywords: neural machine translation, POS tags, MSD tags, inflected language, data sparsity, corpora size
Published in DKUM: 28.03.2025; Views: 0; Downloads: 11
.pdf Full text (448,16 KB)
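
Note on the corpus formats mentioned in the abstract above: the listing does not give the exact formats used. The sketch below shows one possible reading, assuming a "joined" format that fuses lemma and MSD tag into a single token and a "separated" format that emits them as two tokens. The `Token` type, the function names, the separator, and the example tags are illustrative assumptions.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str   # surface form
    lemma: str  # carries the semantic information
    msd: str    # Morpho-Syntactic Description tag, carries the grammatical information


def joined_format(tokens: List[Token], sep: str = "|") -> str:
    """Hypothetical joined format: lemma and MSD tag fused into one token."""
    return " ".join(f"{t.lemma}{sep}{t.msd}" for t in tokens)


def separated_format(tokens: List[Token]) -> str:
    """Hypothetical separated format: lemma and MSD tag as two consecutive tokens."""
    parts: List[str] = []
    for t in tokens:
        parts.extend([t.lemma, f"<{t.msd}>"])
    return " ".join(parts)


# Illustrative Slovene example; the tags are placeholders in the MSD style.
sentence = [Token("hiše", "hiša", "Ncfpn"), Token("stojijo", "stati", "Vmpr3p")]
joined = joined_format(sentence)
separated = separated_format(sentence)
```

The joined variant keeps the sequence length unchanged but enlarges the vocabulary, while the separated variant doubles the sequence length and keeps lemmas and tags as shared vocabulary items.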

4.
Reduction of Neural Machine Translation Failures by Incorporating Statistical Machine Translation
Jani Dugonik, Mirjam Sepesy Maučec, Domen Verber, Janez Brest, 2023, original scientific article

Abstract: This paper proposes a hybrid machine translation (HMT) system that improves the quality of neural machine translation (NMT) by incorporating statistical machine translation (SMT). To this end, two NMT systems and two SMT systems were built for the Slovenian-English language pair, each for translation in one direction. We used a multilingual language model to embed the source sentence and translations into the same vector space. From each vector, we extracted features based on the distances and similarities calculated between the source sentence and the NMT translation, and between the source sentence and the SMT translation. To select the best possible translation, we used several well-known classifiers to predict which translation system generated a better translation of the source sentence. The proposed method of combining SMT and NMT in the hybrid system is novel. Our framework is language-independent and can be applied to other languages supported by the multilingual language model. Our experiment involved empirical applications. We compared the performance of the classifiers, and the results demonstrate that our proposed HMT system achieved notable improvements in the BLEU score, with increases of 1.5 points and 10.9 points in the two translation directions, respectively.
Keywords: neural machine translation, statistical machine translation, sentence embedding, similarity, classification, hybrid machine translation
Published in DKUM: 20.02.2024; Views: 322; Downloads: 40
.pdf Full text (400,40 KB)
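
Note on the selection step described in the abstract above: the following is a minimal sketch of the idea, assuming sentence embeddings are already available as plain lists of floats. The feature set (cosine similarity and Euclidean distance), the function names, and the `similarity_rule` stand-in are assumptions; the article uses trained classifiers and a multilingual language model that are not reproduced here.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_features(src: Vector, nmt: Vector, smt: Vector) -> List[float]:
    """Features comparing the source embedding with each candidate translation."""
    return [
        cosine_similarity(src, nmt),
        euclidean_distance(src, nmt),
        cosine_similarity(src, smt),
        euclidean_distance(src, smt),
    ]


def similarity_rule(features: List[float]) -> str:
    """Stand-in for a trained classifier: prefer the candidate closer to the source."""
    return "nmt" if features[0] >= features[2] else "smt"


def select_translation(src: Vector, nmt: Vector, smt: Vector,
                       nmt_text: str, smt_text: str,
                       classifier: Callable[[List[float]], str]) -> str:
    """Return the translation whose system the classifier predicts to be better."""
    label = classifier(pair_features(src, nmt, smt))
    return nmt_text if label == "nmt" else smt_text


# Toy embeddings for illustration only.
chosen = select_translation([1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
                            "NMT output", "SMT output", similarity_rule)
```

In practice the `classifier` argument would wrap a model trained on sentences labelled with the better-scoring system; the rule above only keeps the sketch self-contained.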

5.
Applicability and challenges of using machine translation in translator training
Melita Koletnik, 2011, professional article

Abstract: During the last decade, translation as well as translator training have experienced a significant change. This change has been significantly influenced by the development of the Internet and the successive availability of web-based translation resources, such as Google Translate. Their introduction into the translation didactic process and training is no longer a matter of a teacher’s personal preference and IT skills, but a necessity imposed by the ever-swifter advancement of technology. This article presents the experimental results of an ongoing broader research study focusing on the modes and frequency of use of the Internet, Google Translate and Google Translator Toolkit among translation students at the undergraduate level. The preliminary results, presented in this article, are based on a questionnaire which was prepared in relation to the use of Google Translate while considering the latest professional findings. The article concludes with the author’s observations as to the applicability of these resources in translator training and the challenges thereof.
Keywords: machine translation, teaching methodology, Internet, Google Translate, machine translation systems, translator training, translation didactics
Published in DKUM: 12.05.2017; Views: 2030; Downloads: 268
.pdf Full text (269,02 KB)

6.
SUMAT : data collection and parallel corpus compilation for machine translation of subtitles
Volha Petukhova, Mirjam Sepesy Maučec, 2012, published scientific conference contribution

Abstract: This paper describes the data collection and parallel corpus compilation activities carried out in the FP7 EU-funded SUMAT project. The project aims to develop an online subtitle translation service for nine European languages combined into 14 different language pairs. The collected data provide bilingual and monolingual training material for statistical machine translation engines, which will semi-automate the subtitle translation processes of subtitling companies on a large scale.
Keywords: parallel multilingual corpora, statistical machine translation, subtitle translation service
Published in DKUM: 10.07.2015; Views: 3057; Downloads: 64
URL Link to full text
