1. LLM in the loop: a framework for contextualizing counterfactual segment perturbations in point clouds
Veljka Kočić, Niko Lukač, Dzemail Rozajac, Stefan Schweng, Christoph Gollob, Arne Nothdurft, Karl Stampfer, Javier Del Ser, Andreas Holzinger, 2025, original scientific article
Description: Point Cloud Data analysis has seen a major leap forward with the introduction of PointNet algorithms, revolutionizing how we process 3D environments. Yet, despite these advancements, key challenges remain, particularly in optimizing segment perturbations to influence model outcomes in a controlled and meaningful way. Traditional methods struggle to generate realistic and contextually appropriate perturbations, limiting their effectiveness in critical applications like autonomous systems and urban planning. This paper takes a bold step by integrating Large Language Models into the counterfactual reasoning process, unlocking a new level of automation and intelligence in segment perturbation. Our approach begins with semantic segmentation, after which LLMs intelligently select optimal replacement segments based on features such as class label, color, area, and height. By leveraging the reasoning capabilities of LLMs, we generate perturbations that are not only computationally efficient but also semantically meaningful. The proposed framework undergoes rigorous evaluation, combining human inspection of LLM-generated suggestions with quantitative analysis of semantic classification model performance across different LLM variants. By bridging the gap between geometric transformations and high-level semantic reasoning, this research redefines how we approach perturbation generation in Point Cloud Data analysis. The results pave the way for more interpretable, adaptable, and intelligent AI-driven solutions, bringing us closer to real-world applications where explainability and robustness are paramount.
Keywords: explainable AI, point cloud data, counterfactual reasoning, LiDAR, 3D point cloud data, interpretability, human-centered AI, large language models, K-nearest neighbors
Published in DKUM: 19.05.2025; Views: 0; Downloads: 0
Full text (7,24 MB)
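The pipeline the abstract describes (semantic segmentation, then LLM-guided selection of a replacement segment by class label, color, area, and height) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Segment` fields, the prompt wording, and the KNN-style fallback ranking are all assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    label: str            # semantic class from the segmentation step
    color: tuple          # mean RGB of the segment's points
    area: float           # footprint area, in square meters (assumed unit)
    height: float         # segment height, in meters (assumed unit)

def feature_distance(a: Segment, b: Segment) -> float:
    # Illustrative distance over the features the abstract names;
    # the weighting (color normalized to [0, ~1.7]) is an assumption.
    color_d = math.dist(a.color, b.color) / 255.0
    return color_d + abs(a.area - b.area) + abs(a.height - b.height)

def build_prompt(target: Segment, candidates: list) -> str:
    # Hypothetical prompt handed to the LLM to pick a replacement segment.
    lines = [f"Target segment: label={target.label}, "
             f"area={target.area}, height={target.height}."]
    lines.append("Candidate replacements:")
    for i, c in enumerate(candidates):
        lines.append(f"{i}: label={c.label}, area={c.area}, height={c.height}")
    lines.append("Reply with the index of the most contextually "
                 "plausible replacement.")
    return "\n".join(lines)

def fallback_select(target: Segment, candidates: list) -> int:
    # KNN-style fallback (K-nearest neighbors appears in the keywords):
    # nearest candidate in feature space when no LLM answer is available.
    return min(range(len(candidates)),
               key=lambda i: feature_distance(target, candidates[i]))
```

In a real run the prompt would go to the chosen LLM variant and its answer would be parsed back to an index; the feature-distance fallback only stands in for that call here.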
2. A brief review on benchmarking for large language models evaluation in healthcare
Leona Cilar Budler, Hongyu Chen, Aokun Chen, Maxim Topaz, Wilson Tam, Jiang Bian, Gregor Štiglic, 2025, review article
Description: This paper reviews benchmarking methods for evaluating large language models (LLMs) in healthcare settings. It highlights the importance of rigorous benchmarking to ensure LLMs' safety, accuracy, and effectiveness in clinical applications. The review also discusses the challenges of developing standardized benchmarks and metrics tailored to healthcare-specific tasks such as medical text generation, disease diagnosis, and patient management. Ethical considerations, including privacy, data security, and bias, are also addressed, underscoring the need for multidisciplinary collaboration to establish robust benchmarking frameworks that facilitate LLMs' reliable and ethical use in healthcare. Evaluation of LLMs remains challenging due to the lack of standardized healthcare-specific benchmarks and comprehensive datasets. Key concerns include patient safety, data privacy, model bias, and better explainability, all of which impact the overall trustworthiness of LLMs in clinical settings.
Keywords: artificial intelligence, benchmarking, chatbots, healthcare, large language models, natural language processing
Published in DKUM: 12.05.2025; Views: 0; Downloads: 0
Link to file
3. New approach for automated explanation of material phenomena (AA6082) using artificial neural networks and ChatGPT
Tomaž Goričan, Milan Terčelj, Iztok Peruš, 2024, original scientific article
Description: Artificial intelligence methods, especially artificial neural networks (ANNs), have increasingly been utilized for the mathematical description of physical phenomena in (metallic) material processing. Traditional methods often fall short in explaining the complex, real-world data observed in production. While ANN models, typically functioning as “black boxes”, improve production efficiency, a deeper understanding of the phenomena, akin to that provided by explicit mathematical formulas, could enhance this efficiency further. This article proposes a general framework that leverages ANNs (i.e., the Conditional Average Estimator, CAE) to explain predicted results alongside their graphical presentation, marking a significant improvement over previous approaches and those relying on expert assessments. Unlike existing Explainable AI (XAI) methods, the proposed framework mimics the standard scientific methodology, utilizing minimal parameters for the mathematical representation of physical phenomena and their derivatives. Additionally, it analyzes the reliability and accuracy of the predictions using well-known statistical metrics, transitioning from deterministic to probabilistic descriptions for better handling of real-world phenomena. The proposed approach addresses both aleatory and epistemic uncertainties inherent in the data. The concept is demonstrated through the hot extrusion of aluminum alloy 6082, where the CAE ANN models and predicts key parameters, and ChatGPT explains the results, enabling researchers and engineers to better understand the phenomena and outcomes obtained by ANNs.
Keywords: artificial neural networks, automatic explanation, hot extrusion, aluminum alloy, large language models, ChatGPT
Published in DKUM: 27.02.2025; Views: 0; Downloads: 5
Full text (3,18 MB) | This item has multiple files. More...
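The evaluation step this abstract mentions, assessing ANN predictions with well-known statistical metrics before an LLM verbalizes them, could look roughly like the sketch below. The metric choice (RMSE and R²), the parameter name, and the prompt text are illustrative assumptions, not the paper's actual pipeline.

```python
import math

def prediction_stats(y_true: list, y_pred: list) -> tuple:
    """RMSE and R^2 for a set of ANN predictions against measured values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    rmse = math.sqrt(sse / n)
    r2 = 1.0 - sse / sst
    return rmse, r2

def explanation_prompt(param_name: str, rmse: float, r2: float) -> str:
    # Hypothetical prompt handed to an LLM (e.g. ChatGPT) so it can
    # verbalize the model's reliability for the process engineer.
    return (f"The ANN predicts {param_name} with RMSE={rmse:.3f} and "
            f"R^2={r2:.3f}. Explain what this reliability implies for "
            "the hot-extrusion process.")
```

The point of the sketch is the division of labor: numeric metrics are computed deterministically, and only their summary is passed to the LLM for explanation.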
4. Exploring the feasibility of generative AI in persona research : a comparative analysis of large language model-generated and human-crafted personas in obesity research
Urška Smrke, Ana Rehberger, Nejc Plohl, Izidor Mlakar, 2025, original scientific article
Description: This study investigates the perceptions of Persona descriptions generated using three different large language models (LLMs) and Personas developed qualitatively by an expert panel involved in obesity research. Six different Personas were defined, three from the clinical domain and three from the educational domain. The descriptions of Personas were generated using qualitative methods and the LLMs (i.e., Bard, Llama, and ChatGPT). The perception of the developed Personas was evaluated by experts in the respective fields. The results show that, in general, the perception of Personas did not significantly differ between those generated using LLMs and those qualitatively developed by human experts. This indicates that LLMs have the potential to generate a consistent and valid representation of human stakeholders. The LLM-generated Personas were perceived as believable, relatable, and informative. However, post-hoc comparisons revealed some differences, with several descriptions generated using the Bard model evaluated most favorably in terms of empathy, likability, and clarity. This study contributes to the understanding of the potential and challenges of LLM-generated Personas. Although the study focuses on obesity research, it highlights the importance of considering the specific context and the potential issues that researchers should be aware of when using generative AI for generating Personas.
Keywords: user personas, obesity, large language models, value sensitive design, digital health interventions
Published in DKUM: 14.02.2025; Views: 0; Downloads: 5
Full text (812,18 KB)
5. Computer science education in ChatGPT Era: experiences from an experiment in a programming course for novice programmers
Tomaž Kosar, Dragana Ostojić, Yu David Liu, Marjan Mernik, 2024, original scientific article
Description: The use of large language models with chatbots like ChatGPT has become increasingly popular among students, especially in Computer Science education. However, significant debates exist in the education community on the role of ChatGPT in learning. Therefore, it is critical to understand the potential impact of ChatGPT on the learning, engagement, and overall success of students in classrooms. In this empirical study, we report on a controlled experiment with 182 participants in a first-year undergraduate course on object-oriented programming. Our differential study divided students into two groups, one using ChatGPT and the other not using it for practical programming assignments. The study results showed that the students' performance is not influenced by ChatGPT usage (no statistical significance between groups with a p-value of 0.730), nor are the grading results of practical assignments (p-value 0.760) and midterm exams (p-value 0.856). Our findings from the controlled experiment suggest that it is safe for novice programmers to use ChatGPT if specific measures and adjustments are adopted in the education process.
Keywords: large language models, ChatGPT, artificial intelligence, controlled experiment, object-oriented programming, software engineering education
Published in DKUM: 12.08.2024; Views: 59; Downloads: 7
Full text (492,37 KB)
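The group comparison behind the reported p-values (ChatGPT group vs. no-ChatGPT group) rests on a two-sample test. A minimal stdlib sketch of the Welch t statistic is below; the abstract does not specify which test the authors used, and the sample scores in the usage note are invented, so this is only an assumption-laden illustration of the idea.

```python
import math

def welch_t(a: list, b: list) -> float:
    """Welch's t statistic for two independent samples with unequal variances."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    # Unbiased sample variances (divide by n - 1).
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
```

A p-value would then come from the t distribution with Welch-Satterthwaite degrees of freedom (in practice via `scipy.stats.ttest_ind(a, b, equal_var=False)`); a t statistic near zero corresponds to large, non-significant p-values like the 0.730 the study reports.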