1. Analysis of gene ranking algorithms with extraction of relevant biomedical concepts from Pubmed publications
Simon Kocbek, Rune Saetre, Gregor Štiglic, Jin-Dong Kim, Igor Pernek, Yoshimasa Tsuruoka, Peter Kokol, Sophia Ananiadou, Jun-ichi Tsujii, 2011, published scientific conference contribution abstract
Keywords: DNA microarray, gene ranking, algorithms, Pubmed publications
Published in DKUM: 05.06.2012; Views: 1883; Downloads: 42
2. Stability of ranked gene lists in large microarray analysis studies
Gregor Štiglic, Peter Kokol, 2010, original scientific article
Abstract: This paper presents an empirical study that aims to explain the relationship between the number of samples and stability of different gene selection techniques for microarray datasets. Unlike other similar studies where number of genes in a ranked gene list is variable, this study uses an alternative approach where stability is observed at different number of samples that are used for gene selection. Three different metrics of stability, including a novel metric in bioinformatics, were used to estimate the stability of the ranked gene lists. Results of this study demonstrate that the univariate selection methods produce significantly more stable ranked gene lists than the multivariate selection methods used in this study. More specifically, thousands of samples are needed for these multivariate selection methods to achieve the same level of stability any given univariate selection method can achieve with only hundreds.
Keywords: gene selection techniques, microarray, analysis studies
Published in DKUM: 05.06.2012; Views: 2062; Downloads: 354
3. Gene set enrichment meta-learning analysis: next-generation sequencing versus microarrays
Gregor Štiglic, Mateja Bajgot, Peter Kokol, 2010, original scientific article
Abstract: Background: Reproducibility of results can have a significant impact on the acceptance of new technologies in gene expression analysis. With the recent introduction of the so-called next-generation sequencing (NGS) technology and established microarrays, one is able to choose between two completely different platforms for gene expression measurements. This study introduces a novel methodology for gene-ranking stability analysis that is applied to the evaluation of gene-ranking reproducibility on NGS and microarray data. Results: The same data used in a well-known MicroArray Quality Control (MAQC) study was also used in this study to compare ranked lists of genes from MAQC samples A and B, obtained from Affymetrix HG-U133 Plus 2.0 and Roche 454 Genome Sequencer FLX platforms. An initial evaluation, where the percentage of overlapping genes was observed, demonstrates higher reproducibility on microarray data in 10 out of 11 gene-ranking methods. A gene set enrichment analysis shows similar enrichment of top gene sets when NGS is compared with microarrays on a pathway level. Our novel approach demonstrates high accuracy of decision trees when used for knowledge extraction from multiple bootstrapped gene set enrichment analysis runs. A comparison of the two approaches in sample preparation for high-throughput sequencing shows that alternating decision trees represent the optimal knowledge representation method in comparison with classical decision trees. Conclusions: Usual reproducibility measurements are mostly based on statistical techniques that offer very limited biological insights into the studied gene expression data sets.
This paper introduces the meta-learning-based gene set enrichment analysis that can be used to complement the analysis of gene-ranking stability estimation techniques such as percentage of overlapping genes or classic gene set enrichment analysis. It is useful and practical when reproducibility of gene ranking results or different gene selection techniques is observed. The proposed method reveals very accurate descriptive models that capture the co-enrichment of gene sets which are differently enriched in the compared data sets.
Keywords: meta-learning, microarray, gene expression analysis
Published in DKUM: 05.06.2012; Views: 4018; Downloads: 335