1. Commit-level software change intent classification using a pre-trained transformer-based code model
Tjaša Heričko, Boštjan Šumak, Sašo Karakatič, 2024, original scientific article
Abstract: Software evolution is driven by changes made during software development and maintenance. While source control systems effectively manage these changes at the commit level, the intent behind them is often inadequately documented, which makes their rationale hard to understand. Existing commit intent classification approaches, largely reliant on commit messages, only partially capture the underlying intent, predominantly because the messages’ content is inadequate and the semantic nuances of code changes are neglected. This paper presents a novel method for extracting semantic features from commits based on modifications in the source code, where each commit is represented by one or more fine-grained conjoint code changes, e.g., file-level or hunk-level changes. To address the unstructured nature of code, the method leverages a pre-trained transformer-based code model, further trained through task-adaptive pre-training and fine-tuning on the downstream task of intent classification. This fine-tuned task-adapted pre-trained code model is then used to embed the fine-grained conjoint changes in a commit, which are aggregated into a unified commit-level vector representation. The proposed method was evaluated using two BERT-based code models, i.e., CodeBERT and GraphCodeBERT, and various aggregation techniques on data from open-source Java software projects. The results show that the proposed method can be used to effectively extract commit embeddings as features for commit intent classification and outperform current state-of-the-art methods of code commit representation for intent categorization in terms of software maintenance activities undertaken by commits.
Keywords: software maintenance, code commit, mining software repositories, adaptive pre-training, fine-tuning, semantic code embedding, CodeBERT, GraphCodeBERT, classification, code intelligence
Published in DKUM: 14.08.2024; Views: 81; Downloads: 5
Full text (1,65 MB)
2. The Impact of Code Bloat on Genetic Program Comprehension: Replication of a Controlled Experiment on Semantic Inference
Tomaž Kosar, Željko Kovačević, Marjan Mernik, Boštjan Slivnik, 2023, original scientific article
Keywords: genetic programming, controlled experiment, program comprehension, replication, semantic inference, attribute grammars
Published in DKUM: 22.05.2024; Views: 168; Downloads: 9
Full text (389,51 KB)
3. A VAN-Based Multi-Scale Cross-Attention Mechanism for Skin Lesion Segmentation Network
Shuang Liu, Zeng Zhuang, Yanfeng Zheng, Simon Kolmanič, 2023, original scientific article
Abstract: With the rise of deep learning technology, the field of medical image segmentation has undergone rapid development. In recent years, convolutional neural networks (CNNs) have brought many achievements and become the consensus choice for medical image segmentation tasks. Although many neural networks based on U-shaped structures and on methods such as skip connections have achieved excellent results in medical image segmentation tasks, the properties of convolutional operations limit their ability to effectively learn local and global features. To address this problem, the Transformer from the field of natural language processing (NLP) was introduced to the image segmentation field. Various Transformer-based networks have shown significant performance advantages over mainstream neural networks in different visual tasks, demonstrating the huge potential of Transformers in the field of image segmentation. However, Transformers were originally designed for NLP and ignore the multidimensional nature of images. In the process of operation, they may destroy the 2D structure of the image and cannot effectively capture low-level features. Therefore, we propose a new multi-scale cross-attention method called M-VAN Unet, which is designed based on the Visual Attention Network (VAN) and can effectively learn local and global features. We propose two attention mechanisms, namely MSC-Attention and LKA-Cross-Attention, for capturing low-level features and promoting global information interaction. MSC-Attention is designed for multi-scale channel attention, while LKA-Cross-Attention is a cross-attention mechanism based on large kernel attention (LKA).
Extensive experiments show that our method outperforms current mainstream methods in evaluation metrics such as Dice coefficient and Hausdorff 95 coefficient.
Keywords: CNNs, deep learning, medical image processing, NLP, semantic segmentation
Published in DKUM: 14.03.2024; Views: 495; Downloads: 306
Full text (1,46 MB)
4. Gender of job titles in advertisements in American newspapers
Maja Petek, 2018, master's thesis
Abstract: This master’s thesis focuses on gender-specific and gender-neutral expressions or words for job titles in English. When choosing the right expression for a job title, we often neglect one gender or put one gender in the forefront. In the theoretical part of the thesis we explain the difference between biological sex (sex) and semantic sex (gender). We also write about gender categories, sexism and feminism. We discuss the important topic of politically correct language and focus on gender-marked words.
In the practical part we analyse job advertisements in old and new newspapers. We want to determine whether and how frequently gender-marked job titles are used. We then analyse these job titles using two corpora: COHA (Corpus of Historical American English) and COCA (Corpus of Contemporary American English).
We try to determine whether job titles in the past were gender marked, whether the words we use nowadays are gender marked, and how frequently gender-marked job titles are used nowadays compared with the past.
Keywords: semantic gender, linguistic sexism, gender-neutral language, feminism, job titles in English
Published in DKUM: 05.05.2023; Views: 480; Downloads: 13
Full text (1,81 MB)
5. Automatic compiler/interpreter generation from programs for domain-specific languages using semantic inference : doctoral dissertation
Željko Kovačević, 2022, doctoral dissertation
Abstract: This doctoral dissertation describes research on Semantic Inference, which can be regarded as an extension of Grammar Inference. The main task of Grammar Inference is to induce a grammatical structure from a set of positive samples (programs), which can sometimes also be accompanied by a set of negative samples. Successfully applying Grammar Inference identifies only the correct syntax of a language. When valid syntactical structures are additionally constrained by context-sensitive information, Grammar Inference needs to be extended to Semantic Inference. Semantic Inference takes a further step, namely towards inducing language semantics. This doctoral dissertation shows that a complete compiler/interpreter for small Domain-Specific Languages (DSLs) can be generated automatically, solely from given programs and their associated meanings, using Semantic Inference. For the purpose of this research, the tool LISA.SI has been developed on top of the compiler/interpreter generator tool LISA; it uses Evolutionary Computation to explore and exploit the enormous search space that appears in Semantic Inference. A wide class of Attribute Grammars has been learned. Using a Genetic Programming approach, S-attributed and L-attributed grammars have been inferred successfully, while inferring Absolutely Non-Circular Attribute Grammars (ANC-AG) with complex dependencies among attributes has been achieved by integrating a Memetic Algorithm (MA) into the LISA.SI tool.
Keywords: Grammatical Inference, Semantic Inference, Genetic Programming, Attribute Grammars, Memetic Algorithm, Domain-Specific Languages
Published in DKUM: 17.02.2022; Views: 1273; Downloads: 125
Full text (3,59 MB)
6. EOSC interoperability framework : Report from the EOSC Executive Board Working Groups FAIR and Architecture
Oscar Corcho, Magnus Eriksson, Krzysztof Kurowski, Milan Ojsteršek, Christine Choirat, Mark van de Sanden, Frederik Coppens, 2021, scientific monograph
Abstract: This document has been developed by the Interoperability Task Force of the EOSC Executive Board FAIR Working Group, with participation from the Architecture WG. Achieving interoperability within EOSC is essential in order for the federation of services that will compose EOSC to provide added value for service users. In the context of the FAIR principles, interoperability is discussed in relation to the fact that “research data usually need to be integrated with other data; in addition, the data need to interoperate with applications or workflows for analysis, storage, and processing”. Our view on interoperability does not only consider data but also the many other research artefacts that may be used in the context of research activity, such as software code, scientific workflows, laboratory protocols, open hardware designs, etc. It also considers the need to make services and e-infrastructures as interoperable as possible. This document identifies the general principles that should drive the creation of the EOSC Interoperability Framework (EOSC IF), and organises them into the four layers that are commonly considered in other interoperability frameworks (e.g., the European Interoperability Framework - EIF): technical, semantic, organisational and legal interoperability. For each of these layers, a catalogue of problems and needs, as well as challenges and high-level recommendations have been proposed, which should be considered in the further development and implementation of the EOSC IF components.
Such requirements and recommendations have been developed after an extensive review of related literature as well as by running interviews with stakeholders from ERICs (European Research Infrastructure Consortia), ESFRI (European Strategy Forum on Research Infrastructures) projects, service providers and research communities. Some examples of such requirements are: “every semantic artefact that is being maintained in EOSC must have sufficient associated documentation, with clear examples of usage and conceptual diagrams”, or “Coarse-grained and fine-grained dataset (and other research object) search tools need to be made available”, etc. The document finally contains a proposal for the management of FAIR Digital Objects in the context of EOSC and a reference architecture for the EOSC Interoperability Framework that is inspired by and extends the European Interoperability Reference Architecture (EIRA), identifying the main building blocks required.
Keywords: technical interoperability, semantic interoperability, organizational interoperability, legal interoperability, EOSC, metadata crosswalk, reference architecture
Published in DKUM: 21.09.2021; Views: 1041; Downloads: 53
Full text (1,06 MB)
7. Comparison and evaluation of CSS frameworks : undergraduate thesis [Primerjava in ovrednotenje ogrodij CSS : diplomsko delo]
Žiga Požun, 2019, undergraduate thesis
Abstract: In this thesis we compare and evaluate CSS frameworks, namely Bulma CSS, Foundation and Semantic UI. We first explain what we mean by a CSS framework and list the advantages, disadvantages and uses of such frameworks. We then examine each framework in turn, showing its various features and how it is used. Finally, we compare and evaluate the frameworks against one another according to various criteria and the support that their developers provide.
Keywords: CSS frameworks, Bulma CSS, Foundation, Semantic UI
Published in DKUM: 22.11.2019; Views: 1000; Downloads: 105
Full text (768,39 KB)
8. Analysis and comparison of modern software tools for developing responsive interactive web interfaces [Analiza in primerjava sodobnih programskih orodij za razvoj prilagodljivih interaktivnih spletnih vmesnikov]
Urška Arzenšek, 2018, undergraduate thesis
Abstract: Web frameworks are of great help when building web solutions: they save development time and greatly ease the work because they already contain ready-made code. This thesis presents the frameworks Semantic UI, UIkit and Skeleton in detail. Its purpose was to analyse and compare web frameworks by building a website. Using these frameworks, we built visually similar websites and analysed and compared the following criteria: basic elements (navigation, forms, buttons), documentation, community, support and active development, learning time, and static code analysis; we also rated the frameworks on the basis of these criteria. We found that Semantic UI is the most suitable framework for building websites, above all in terms of user support and documentation. We also confirmed this with static code analysis, since Semantic UI contains the most pre-built CSS and JavaScript files, which provide many styles and functions for designing web pages.
Keywords: web framework, Semantic UI, UIkit, Skeleton
Published in DKUM: 14.11.2018; Views: 1655; Downloads: 121
Full text (2,02 MB)
9. The employee as the unknown actor? : a discourse analysis of the employee share ownership debate with special emphasis on the Central and Eastern Europe
Olaf Kranz, Thomas Steger, Ronald Hartz, 2016, original scientific article
Abstract: Background and purpose: Although employee share ownership (ESO) has a long tradition, we still know little about employees’ perspectives on ESO. The lack of knowledge about the employees’ attitudes towards ESO is discursively filled in the ESO debate. This paper challenges that deficit by carrying out a semantic analysis of the literature with the aim of identifying the various actor constructions used implicitly in the ESO discourse.
Design/Methodology/Approach: We conduct a semantic analysis of the ESO discourse. To unfold the order of this discourse we draw on the distinction between surface and underlying structure of communication in the sense of Michel Foucault. We interpret some semantic lead differences, a term coined by Niklas Luhmann, to constitute the underlying structure of communication.
Results: We can identify six different streams on the ESO discourse’s surface level each defined by the ends pursued. The discourse’s underlying structure is made up of the distinctions production-consumption, capital-labour, and ownership-control that also determine the actor models implicitly in use.
Conclusion: We can identify five different actor models implicit in the ESO discourse. The CEE discourse differs on the surface level insofar as it is more concerned with questions of political legitimation of the privatisation process than with questions of economic efficiency, thus introducing political distinctions into the discourse that are largely missing in the West. It nevertheless shares the underlying semantic lead differences with the Western discourse, as well as the actor models anchored in those differences.
Keywords: Employee Share Ownership, discourse analysis, semantic lead distinctions, actor constructions, CEE countries
Published in DKUM: 22.01.2018; Views: 1129; Downloads: 110
Full text (657,20 KB)
10. A question-based design pattern advisement approach
Luka Pavlič, Vili Podgorelec, Marjan Heričko, 2014, original scientific article
Abstract: Design patterns are a proven way to build flexible software architectures. But the selection of an appropriate design pattern is a difficult task in practice, particularly for less experienced developers. In this paper, a question-based design pattern advisement approach is proposed. This approach primarily assists developers in identifying and selecting the most suitable design pattern for a given problem. We also propose certain extensions to the existing Object-Oriented Design Ontology (ODOL). In addition to the advisement procedure, a new design pattern advisement ontology is defined. We have also developed a tool that supports the proposed ontology and question-based advisement (OQBA) approach. A controlled experiment and two surveys have shown that the proposed approach is beneficial to all software developers, especially to those who have less experience with design patterns.
Keywords: design patterns, pattern selection, ontology, semantic web, selection algorithm
Published in DKUM: 06.07.2017; Views: 1712; Downloads: 420
Full text (621,06 KB)