1 - 7 / 7
1.
An algorithm for protecting knowledge discovery data
Boštjan Brumen, Izidor Golob, Tatjana Welzer-Družovec, Ivan Rozman, Marjan Družovec, Hannu Jaakkola, 2003, original scientific article

Abstract: In this paper, we present an algorithm that can be applied to protect data before a data mining process takes place. Data mining, a part of the knowledge discovery process, is mainly about building models from data. We address the following question: can we protect the data and still allow the data modelling process to take place? We consider the case where the distributions of original data values are preserved while the values themselves change, so that the resulting model is equivalent to the one built with the original data. The presented formal approach is especially useful when the knowledge discovery process is outsourced. The application of the algorithm is demonstrated through an example.
Keywords: data protection algorithm, classification algorithm, disclosure control, data mining, knowledge discovery, data security
Published: 01.06.2012; Views: 1181; Downloads: 22

2.
Contrasting temporal trend discovery for large healthcare databases
Goran Hrovat, Gregor Štiglic, Peter Kokol, Milan Ojsteršek, original scientific article

Abstract: With the increased acceptance of electronic health records, we can observe the increasing interest in the application of data mining approaches within this field. This study introduces a novel approach for exploring and comparing temporal trends within different in-patient subgroups, which is based on association rule mining using the Apriori algorithm and linear model-based recursive partitioning. The Nationwide Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality was used to evaluate the proposed approach. This study presents a novel approach where visual analytics on big data is used for trend discovery in the form of a regression tree with scatter plots in the leaves of the tree. The trend lines are used for directly comparing linear trends within a specified time frame. Our results demonstrate the existence of opposite trends in relation to age and sex based subgroups that would be impossible to discover using traditional trend-tracking techniques. Such an approach can be employed regarding decision support applications for policy makers when organizing campaigns or by hospital management for observing trends that cannot be directly discovered using traditional analytical techniques.
Keywords: data mining, decision support, trend discovery
Published: 27.11.2014; Views: 892; Downloads: 207
.pdf Full text (1013,97 KB)

3.
SSD - Subspace Subgroup Discovery
Gregor Štiglic, 2012, software

Keywords: knowledge discovery, subgroup discovery, data mining
Published: 10.07.2015; Views: 966; Downloads: 19

4.
5.
Algorithms for association rule learning
Renata Akhmetshakirova, 2017, undergraduate thesis

Abstract: One of the most popular methods of knowledge discovery in databases is the extraction of association rules. There are many different algorithms for association rule learning, which differ in space and time complexity. To perform a comparative analysis, we have implemented the Apriori, Eclat and FP-growth algorithms and compared their time and memory consumption using synthetic and real databases. The analysis has shown that the FP-growth algorithm is the most efficient in the majority of cases.
Keywords: association rules, data mining, Apriori, Eclat, FP-growth
Published: 24.02.2017; Views: 982; Downloads: 68
.pdf Full text (1,17 MB)

6.
Analyzing information seeking and drug-safety alert response by health care professionals as new methods for surveillance
Alison Callahan, Igor Pernek, Gregor Štiglic, Jurij Leskovec, Howard Strasberg, Nigam Haresh Shah, 2015, original scientific article

Abstract: Background: Patterns in general consumer online search logs have been used to monitor health conditions and to predict health-related activities, but the multiple contexts within which consumers perform online searches make significant associations difficult to interpret. Physician information-seeking behavior has typically been analyzed through survey-based approaches and literature reviews. Activity logs from health care professionals using online medical information resources are thus a valuable yet relatively untapped resource for large-scale medical surveillance. Objective: To analyze health care professionals' information-seeking behavior and assess the feasibility of measuring drug-safety alert response from the usage logs of an online medical information resource. Methods: Using two years (2011-2012) of usage logs from UpToDate, we measured the volume of searches related to medical conditions with significant burden in the United States, as well as the seasonal distribution of those searches. We quantified the relationship between searches and resulting page views. Using a large collection of online mainstream media articles and Web log posts we also characterized the uptake of a Food and Drug Administration (FDA) alert via changes in UpToDate search activity compared with general online media activity related to the subject of the alert. Results: Diseases and symptoms dominate UpToDate searches. Some searches result in page views of only short duration, while others consistently result in longer-than-average page views. The response to an FDA alert for Celexa, characterized by a change in UpToDate search activity, differed considerably from general online media activity. Changes in search activity appeared later and persisted longer in UpToDate logs. The volume of searches and page view durations related to Celexa before the alert also differed from those after the alert. Conclusions: Understanding the information-seeking behavior associated with online evidence sources can offer insight into the information needs of health professionals and enable large-scale medical surveillance. Our Web log mining approach has the potential to monitor responses to FDA alerts at a national level. Our findings can also inform the design and content of evidence-based medical information resources such as UpToDate.
Keywords: internet log analysis, data mining, physicians, information-seeking behavior, drug safety surveillance
Published: 02.08.2017; Views: 473; Downloads: 66
.pdf Full text (4,18 MB)

7.
Link prediction on Twitter
Sanda Martinčić-Ipšić, Edvin Močibob, Matjaž Perc, 2017, original scientific article

Abstract: With over 300 million active users, Twitter is among the largest online news and social networking services in existence today. Open access to information on Twitter makes it a valuable source of data for research on social interactions, sentiment analysis, content diffusion, link prediction, and the dynamics behind human collective behaviour in general. Here we use Twitter data to construct co-occurrence language networks based on hashtags and based on all the words in tweets, and we use these networks to study link prediction by means of different methods and evaluation metrics. In addition to using five known methods, we propose two effective weighted similarity measures, and we compare the obtained outcomes in dependence on the selected semantic context of topics on Twitter. We find that hashtag networks yield to a large degree equal results as all-word networks, thus supporting the claim that hashtags alone robustly capture the semantic context of tweets, and as such are useful and suitable for studying the content and categorization. We also introduce ranking diagrams as an efficient tool for the comparison of the performance of different link prediction algorithms across multiple datasets. Our research indicates that successful link prediction algorithms work well in correctly foretelling highly probable links even if the information about a network structure is incomplete, and they do so even if the semantic context is rationalized to hashtags.
Keywords: link prediction, data mining, Twitter, network analysis
Published: 15.09.2017; Views: 434; Downloads: 53
.pdf Full text (6,98 MB)
