Title: | Using generative AI to improve the performance and interpretability of rule-based diagnosis of Type 2 diabetes mellitus |
---|
Authors: | Kopitar, Leon (Author); Fister, Iztok (Author); Štiglic, Gregor (Author) |
Files: | information-15-00162.pdf (1.29 MB) MD5: 59EC8DBE817D570B9CA5890078BFB7D0
https://www.mdpi.com/2078-2489/15/3/162
|
---|
Language: | English |
---|
Work type: | Scientific work |
---|
Typology: | 1.01 - Original Scientific Article |
---|
Organization: | FZV - Faculty of Health Sciences; FERI - Faculty of Electrical Engineering and Computer Science
|
---|
Abstract: | Introduction: Type 2 diabetes mellitus (T2DM) is a major global health concern, but interpreting machine learning models for its diagnosis remains challenging. This study investigates combining association rule mining with advanced natural language processing to improve both diagnostic accuracy and interpretability. The use of pretrained transformers for diabetes classification on tabular data in this way has not been explored before. Methods: The study used the Pima Indians Diabetes dataset to investigate T2DM. Analysis was carried out in Python and Jupyter Notebook, with the NiaARM framework used for association rule mining. LightGBM was used for performance comparison and the dalex package for feature importance analysis. SHAP was used for local interpretability. OpenAI GPT version 3.5 was used for outcome prediction and interpretation. The source code is available on GitHub. Results: NiaARM generated 350 rules to predict diabetes. LightGBM performed better than the GPT-based model. A comparison of the GPT and NiaARM rules showed disparities, prompting a similarity score analysis. LightGBM’s decision making leaned heavily on glucose, age, and BMI, as highlighted in the feature importance rankings. Beeswarm plots demonstrated how feature values correlate with their influence on diagnosis outcomes. Discussion: Combining association rule mining with GPT for T2DM classification yields limited effectiveness. Enhancements such as preprocessing and hyperparameter tuning are required. Interpretation challenges and GPT’s dependency on the provided rules indicate the need for prompt engineering and similarity score methods. Variations in feature importance rankings underscore the complexity of T2DM. Concerns regarding GPT’s reliability emphasize the importance of iterative approaches for improving prediction accuracy.
|
---|
Keywords: | GPT, association rule mining, classification, interpretability, diagnostics |
---|
Publication status: | Published |
---|
Publication version: | Version of Record |
---|
Submitted for review: | 09.02.2024 |
---|
Article acceptance date: | 05.03.2024 |
---|
Publication date: | 12.03.2024 |
---|
Year of publishing: | 2024 |
---|
Number of pages: | pp. 1-17 |
---|
Numbering: | Vol. 15, No. 3, Article No. 162 |
---|
PID: | 20.500.12556/DKUM-91187  |
---|
UDC: | 004.8:616.379-008.64 |
---|
ISSN on article: | 2078-2489 |
---|
COBISS.SI-ID: | 189955587  |
---|
DOI: | 10.3390/info15030162  |
---|
Publication date in DKUM: | 26.11.2024 |
---|
Categories: | Misc.
|
---|