1. Improving personalized meal planning with large language models: identifying and decomposing compound ingredients
Leon Kopitar, Leon Bedrač, Larissa Jane Strath, Jiang Bian, Gregor Štiglic, 2025, original scientific article
Abstract: Background/Objectives: Identifying and decomposing compound ingredients within meal plans is a challenge for meal customization and nutritional analysis. It is essential for accurately identifying and replacing problematic ingredients linked to allergies or intolerances and for supporting nutritional evaluation.
Methods: This study explored the effectiveness of three large language models (LLMs), namely GPT-4o, Llama-3 (70B), and Mixtral (8x7B), in decomposing compound ingredients into basic ingredients within meal plans. GPT-4o was used to generate 15 structured meal plans, each containing compound ingredients. Each LLM then identified and decomposed these compound items into basic ingredients. The decomposed ingredients were matched to entries in a subset of the USDA FoodData Central repository using API-based search and mapping techniques. Nutritional values were retrieved and aggregated to evaluate the accuracy of the decomposition. Performance was assessed through manual review by nutritionists and quantified using accuracy and F1-score. Statistical significance was tested using paired t-tests or Wilcoxon signed-rank tests, depending on normality.
Results: The larger models, Llama-3 (70B) and GPT-4o, outperformed Mixtral (8x7B), achieving average F1-scores of 0.894 (95% CI: 0.84–0.95) and 0.842 (95% CI: 0.79–0.89), respectively, compared to an F1-score of 0.690 (95% CI: 0.62–0.76) for Mixtral (8x7B).
Conclusions: The open-source Llama-3 (70B) model achieved the best performance, outperforming the commercial GPT-4o model. This shows its superior ability to consistently break down compound ingredients into precise quantities within meal plans and illustrates its potential to enhance meal customization and nutritional analysis. These findings underscore the potential role of advanced LLMs in precision nutrition and their application in promoting healthier dietary practices tailored to individual preferences and needs.
Keywords: artificial intelligence, food analysis, LLM, Llama, GPT, Mixtral, ingredient identification, ingredient decomposition, personalized nutrition, meal customization, nutritional analysis, dietary planning
Published in DKUM: 08.05.2025; Views: 0; Downloads: 1
Full text (684,70 KB)
2. Using generative AI to improve the performance and interpretability of rule-based diagnosis of Type 2 diabetes mellitus
Leon Kopitar, Iztok Fister, Gregor Štiglic, 2024, original scientific article
Abstract: Introduction: Type 2 diabetes mellitus (T2DM) is a major global health concern, but interpreting machine learning models for diagnosis remains challenging. This study investigates combining association rule mining with advanced natural language processing to improve both diagnostic accuracy and interpretability. This approach, which applies pretrained transformers to diabetes classification on tabular data, has not been explored before.
Methods: The study used the Pima Indians Diabetes dataset to investigate Type 2 diabetes mellitus. Python and Jupyter Notebook were employed for analysis, with the NiaARM framework for association rule mining. LightGBM and the dalex package were used for performance comparison and feature importance analysis, respectively. SHAP was used for local interpretability. OpenAI GPT version 3.5 was used for outcome prediction and interpretation. The source code is available on GitHub.
Results: NiaARM generated 350 rules to predict diabetes. LightGBM performed better than the GPT-based model. A comparison of GPT and NiaARM rules showed disparities, prompting a similarity score analysis. LightGBM's decision-making leaned heavily on glucose, age, and BMI, as highlighted in the feature importance rankings. Beeswarm plots demonstrated how feature values correlate with their influence on diagnosis outcomes.
Discussion: Combining association rule mining with GPT for Type 2 diabetes mellitus classification yields limited effectiveness. Enhancements such as preprocessing and hyperparameter tuning are required. Interpretation challenges and GPT's dependency on the provided rules indicate the need for prompt engineering and similarity score methods. Variations in feature importance rankings underscore the complexity of T2DM. Concerns regarding GPT's reliability emphasize the importance of iterative approaches for improving prediction accuracy.
Keywords: GPT, association rule mining, classification, interpretability, diagnostics
Published in DKUM: 26.11.2024; Views: 0; Downloads: 250
Full text (1,29 MB)
3. Hybrid visualization-based framework for depressive state detection and characterization of atypical patients
Leon Kopitar, Peter Kokol, Gregor Štiglic, 2023, original scientific article
Keywords: hybrid visualization, interpretation, explainable, Shapley, feature importance, depression
Published in DKUM: 12.06.2024; Views: 101; Downloads: 21
Full text (1,83 MB)
4. |
5. Correlation between hypertension and biological age and analysis of their impact on mortality
Urša Deban, 2023, master's thesis
Abstract: Introduction: Elevated blood pressure, or hypertension, is an important risk factor for cardiovascular and kidney disease. Biological age can be calculated from clinically measurable parameters.
Methods: To analyze the correlation between hypertension and biological age and their association with mortality, we analyzed data from the NHANES database, which contains data on the health status of the US population. Mortality data were obtained from the NCHS database. We calculated biological age and analyzed the statistical significance of differences in blood pressure and biological aging between demographic groups. Using a logistic regression model, we compared the predictive power of blood pressure and aging for mortality. We divided patients into three groups according to hypertensive status and compared statistical parameters between them.
Results: We found a low correlation between blood pressure and both chronological and biological age, statistically significant differences in biological aging, and a relationship between blood pressure and sex. We found statistically significant differences between some, but not all, races. In the analysis of the hypertensive patient group, some differences between demographic groups fade. Of all the variables tested, the estimate of biological age based on blood measurements was most strongly associated with mortality.
Discussion and conclusion: The results highlight the importance of biological age in the development of hypertension, indicate differences in blood pressure between demographic groups, and point to the importance of biological age in assessing mortality risk.
Keywords: hypertension, blood pressure, biological aging, mortality, logistic regression
Published in DKUM: 22.05.2023; Views: 561; Downloads: 99
Full text (1,32 MB)
6. Building predictive models using structured and unstructured data sources
Leon Kopitar, 2017, master's thesis
Abstract: Theoretical background: Type 2 diabetes (T2D) is the most common form of diabetes, particularly in developed countries. More and more people develop T2D because of an unhealthy lifestyle, chiefly too little physical activity and poor diet. Although most people regard T2D as a matter-of-course disease that may appear late in life, many are unaware of its seriousness. T2D is a main cause of stroke and heart disease. It can also lead to blindness, kidney disease and, in extreme cases, death. The risk of T2D understandably increases with age, but to a large extent we ourselves can influence that increase. Patients with T2D who were hospitalized in an intensive care unit are the most susceptible to a fatal outcome. The main aim of the master's thesis was to examine how the most frequently recurring word stems from patient treatment records affect the accuracy of a model predicting the survival of patients with T2D.
Research methodology: We performed the analyses on a filtered MIMIC-III database containing a total of 4236 records of patients with T2D. The analyses were carried out in the R programming language using the following classifiers: Random Forest, Single C5.0 Ruleset, Glmnet (Lasso regression), XGBoost, and GBM. The results were evaluated with the bootstrap method, repeated 100 times.
Results: All predictive models built on the male sample were statistically significantly more successful at predicting the mortality of patients with T2D than models built on the female sample (ΔAUC = +0.049, p < 0.001). With the use of bigrams, predictive performance does not differ statistically (p > 0.001). Regardless of sex, prediction results improve when the SAPS criterion is included, compared with prediction without it (ΔAUC for women = +0.0756, ΔAUC for men = +0.082).
Conclusion: XGBoost is the most suitable model for predicting the mortality of patients with T2D. The presence of words relating to stimulation, age, movement, unresponsiveness, and a diagnosis of intracerebral hemorrhage has the greatest influence on successfully predicting the mortality of patients with T2D. Including bigrams does not significantly improve the performance of the predictive models. The widely used SAPS criterion, which is based on physiological data, remains the primary guide for predicting the mortality of patients with T2D.
Keywords: type 2 diabetes, predictive models, nursing notes
Published in DKUM: 10.10.2017; Views: 1763; Downloads: 347
Full text (1,03 MB)