Title:Primerjava učinkovitosti izvedbe in ponovljivosti rezultatov bioinformatskih analiz RNA sekvenciranja med različnimi posodobitvami programskega okolja R
Authors:Dolšak, Veronika (Author)
Gorenjak, Mario (Mentor)
Potočnik, Uroš (Co-mentor)
Files:MAG_Dolsak_Veronika_2023.pdf (2.09 MB)
MD5: F6B71D9A26FAA20189E6F2D6D4C7E641
 
Language:Slovenian
Work type:Master's thesis/paper
Typology:2.09 - Master's Thesis
Organization:FZV - Faculty of Health Sciences
Abstract:Background: The development of next-generation sequencing technology has greatly accelerated the generation of large volumes of sequencing data that require further bioinformatics analysis, and the number of software tools for processing these data has grown rapidly as a result. Bioconductor software packages, designed for use in the R software environment, are a common choice for RNA sequencing (RNA-seq) data analysis because they provide a complete workflow for identifying differentially expressed genes and pathways. R is upgraded frequently, and differences in performance between versions are observed in practice, which may affect the comparability of RNA-seq results produced with different versions of R. Methods: We analyzed raw RNA-seq data with the Bioconductor tools Rsubread, edgeR, and limma in several versions of the R software environment: R 3.5, R 3.6, R 4.0, R 4.1, and R 4.2. Results: Comparisons of alignment efficiency with Rsubread show statistically significant differences between R 4.2 and the other versions of R. Statistically significant differences also appear in the results of differential gene expression analysis obtained with the same command pipeline, both between R 4.2 and the other R versions and between R 3.5 and the other R versions. Discussion: The results indicate that RNA-seq data should be analyzed with the latest version of the R software environment and the latest versions of the Bioconductor tools, which is particularly important when performing a meta-analysis of RNA-seq data from different independent studies.
Keywords:RNA sequencing, differential gene expression, R, bioinformatics
Place of publishing:Maribor
Publisher:[V. Dolšak]
Year of publishing:2023
PID:20.500.12556/DKUM-84344
UDC:575.112(043.2)
COBISS.SI-ID:158635011
Publication date in DKUM:13.07.2023
Views:373
Downloads:58
Categories:FZV

Licences

License:CC BY-NC-ND 4.0, Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Link:http://creativecommons.org/licenses/by-nc-nd/4.0/
Description:The most restrictive Creative Commons license. It allows others only to download the work and share it unmodified, with attribution, for non-commercial purposes.
Licensing start date:25.05.2023

Secondary language

Language:English
Title:Comparison of performance efficiency and reproducibility of RNA-seq bioinformatics analyses between different upgrades of R software environment
Abstract:Background: The development of next-generation sequencing technology has greatly accelerated the acquisition of large amounts of sequencing data that need further bioinformatics analysis. Consequently, the number of software tools for processing these data has also grown rapidly. A common choice for analyzing RNA sequencing (RNA-seq) data to identify differentially expressed genes and pathways is the set of Bioconductor software packages, which provide a complete analysis and are designed to work in the R programming environment. The R programming environment is upgraded frequently, and the resulting differences in performance observed in practice may affect the comparability of RNA-seq results obtained with different versions of R. Methods: We analyzed raw RNA-seq data using the Bioconductor software tools (Rsubread, edgeR, and limma) in different versions of the R programming environment: R 3.5, R 3.6, R 4.0, R 4.1, and R 4.2. Results: Comparisons of alignment efficiency with the Rsubread software tool show statistically significant differences between R 4.2 and the other versions of R. There are also statistically significant differences in the results of differential gene expression analysis obtained with the same pipeline of commands between R 4.2 and the other versions of R, as well as between R 3.5 and the other versions of R. Discussion: Based on the results, we conclude that RNA-seq data should be analyzed with the latest version of the R programming environment and the latest versions of the Bioconductor software tools, which is of particular importance when performing a meta-analysis of RNA-seq data from different independent studies.
Keywords:RNA sequencing, differential gene expression, R, bioinformatics

