Title:CRISP-DM procesni model za podatkovno rudarjenje (The CRISP-DM process model for data mining)
Authors:ID Roškarič, Tadej (Author)
ID Perko, Igor (Mentor)
Files:.pdf UN_Roskaric_Tadej_2022.pdf (2,24 MB)
MD5: 329AAE8F00039953AA767F86B58A9F9A
 
.zip UN_Roskaric_Tadej_2022.zip (1,16 MB)
MD5: 729DF15033B20AF6591D754BFB9F6AE5
 
Language:Slovenian
Work type:Bachelor thesis/paper
Typology:2.11 - Undergraduate Thesis
Organization:EPF - Faculty of Business and Economics
Abstract:As technology advances, ever more capacity is available for storing and analysing data, while databases are becoming increasingly complex; for this reason we need standardised procedures for analytical processing. The Cross-Industry Standard Process for Data Mining (CRISP-DM) is an example of such a standard and, since its creation in 1996, has remained one of the main process models for data mining across all sectors of the economy. In this thesis we define its individual phases and steps and describe them in detail. Because data mining is gaining importance owing to its business value, more and more alternatives are appearing in the field, so we compare CRISP-DM with the SEMMA (Sample, Explore, Modify, Model, Assess) and ASUM-DM (Analytics Solutions Unified Method for Data Mining) models and conclude that the latter two are not flexible enough for the status of a general standard. We reviewed the relevant literature and carried out a case study in which we optimised a marketing campaign for banking services using data from a Portuguese financial institution. After analysing the literature and completing the practical example, we weighed the influence of the individual phases on the quality of the results and found that the deployment phase is the least covered in academia, even though it is indispensable in practice. We also highlighted some key shortcomings that the original CRISP-DM process model does not resolve. To this end we proposed additional steps, such as a data collection procedure, an extension of the model deployment process, and a new data ethics phase. Based on these proposals, we find that there is a need to extend the original CRISP-DM model.
Keywords:CRISP-DM, data mining, Python, process model, machine learning
Place of publishing:[Maribor]
Publisher:T. Roškarič
Year of publishing:2022
PID:20.500.12556/DKUM-81909
UDC:004.6
COBISS.SI-ID:125510915
Publication date in DKUM:13.10.2022
Views:387
Downloads:33
Categories:EPF

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.
Licensing start date:16.06.2022

Secondary language

Language:English
Title:The CRISP-DM process model for data mining
Abstract:With recent advances in technology, the capacity to store and analyse data is expanding and databases are growing more complex. We therefore need a standardised model for analytics processes. The Cross-Industry Standard Process for Data Mining (CRISP-DM) is an example of such a standard, which since its creation in 1996 has remained one of the main process models for data mining in all sectors of the economy. In this thesis, we define and describe its individual phases and steps in detail. Data mining is gaining recognition because of its potential to add business value, and new process models are therefore emerging. We compare CRISP-DM with the process models SEMMA (Sample, Explore, Modify, Model, Assess) and ASUM-DM (Analytics Solutions Unified Method for Data Mining), and conclude that the latter two are not flexible enough to deserve the status of a general standard. We examine the relevant literature on CRISP-DM and carry out a case study that optimises a marketing campaign using data collected by a Portuguese financial institution. After examining the literature and completing the case study, we assess each phase of CRISP-DM for its importance to the quality of the results. We conclude that in academia the deployment phase receives the least attention, even though it is crucial in practice. We also identify some important drawbacks of CRISP-DM. We propose additional steps, such as technical data acquisition, an expansion of the model deployment process, and a data mining ethics phase, which may add value to the standard. These proposals point to the need to expand the original CRISP-DM model.
Keywords:CRISP-DM, data mining, Python, process model, machine learning

