Title:Weakly-supervised multilingual medical NER for symptom extraction for low-resource languages
Authors:Sallauka, Rigon (Author)
Arioz, Umut (Author)
Rojc, Matej (Author)
Mlakar, Izidor (Author)
Files:applsci-15-05585-v2.pdf (338.94 KB)
MD5: 9E3606C205F09FCCA4B26DDF5C379DCF
Language:English
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FERI - Faculty of Electrical Engineering and Computer Science
Abstract:Patient-reported health data, especially patient-reported outcome measures, are vital for improving clinical care but are often limited by memory bias, cognitive load, and inflexible questionnaires. Patients prefer conversational symptom reporting, highlighting the need for robust methods in symptom extraction and conversational intelligence. This study presents a weakly-supervised pipeline for training and evaluating medical Named Entity Recognition (NER) models across eight languages, with a focus on low-resource settings. A merged English medical corpus, annotated using the Stanza i2b2 model, was translated into German, Greek, Spanish, Italian, Portuguese, Polish, and Slovenian, preserving the entity annotations for medical problems, diagnostic tests, and treatments. Data augmentation addressed the class imbalance, and the fine-tuned BERT-based models consistently outperformed the baselines. The English model achieved the highest F1 score (80.07%), followed by German (78.70%), Spanish (77.61%), Portuguese (77.21%), Slovenian (75.72%), Italian (75.60%), Polish (75.56%), and Greek (69.10%). Compared to the existing baselines, our models demonstrated notable performance gains, particularly in English, Spanish, and Italian. This research underscores the feasibility and effectiveness of weakly-supervised multilingual approaches for medical entity extraction, contributing to improved information access in clinical narratives, especially in under-resourced languages.
Keywords:low-resource languages, machine translation, medical entity extraction, NER, NLP, patient-reported outcomes, weakly-supervised learning
Publication status:Published
Publication version:Version of Record
Submitted for review:01.05.2025
Article acceptance date:13.05.2025
Publication date:16.05.2025
Publisher:MDPI
Year of publishing:2025
Number of pages:18 pp.
Numbering:Vol. 15, iss. 10, [article no.] 5585
PID:20.500.12556/DKUM-92857
UDC:004.8:61
ISSN on article:2076-3417
COBISS.SI-ID:236281347
DOI:10.3390/app15105585
Copyright:© 2025 by the authors
Publication date in DKUM:19.05.2025
Categories:Misc.

Record is a part of a journal

Title:Applied sciences
Shortened title:Appl. sci.
Publisher:MDPI
ISSN:2076-3417
COBISS.SI-ID:522979353

Document is financed by a project

Funder:Other - Other funder or multiple funders
Funding programme:European Union’s Horizon Europe Research and Innovation Program
Project number:101080923
Acronym:Project SMILE

Funder:Other - Other funder or multiple funders
Funding programme:Marie Skłodowska-Curie Doctoral Networks Actions
Acronym:HORIZON-MSCA-2021-DN-01-01

Funder:Other - Other funder or multiple funders
Project number:101073222
Acronym:BosomShiel

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.

Secondary language

Language:Slovenian
Keywords:strojno prevajanje, medicinska entiteta, rezultati poročil bolnikov, slabo nadzorovano učenje
