A question answering system on domain specific knowledge with semantic web support
Borut Gorenjak, Marko Ferme, Milan Ojsteršek, 2011, original scientific article

Abstract: In today's world the majority of information is accessible via the World Wide Web. A common way to access this information is through information retrieval applications such as web search engines. Web search engines, however, flood their users with an enormous amount of data, from which users cannot easily pick out the essential and most important information. These disadvantages can be reduced with question answering systems. The basic idea of a question answering system is to provide a specific answer to a question written in natural language. This paper presents the architecture of our ontology-driven system, which uses semantic descriptions of processes, databases and web services to support a question answering system in the Slovenian language.
Keywords: ontology, natural language processing, question answering system, semantic web, web ontology language
Published in DKUM: 01.06.2012; Views: 2614; Downloads: 51
.pdf Full text (807,82 KB)

TextProc - a natural language processing framework and its use as plagiarism detection system
Janez Brezovnik, Milan Ojsteršek, 2011, original scientific article

Abstract: A natural language processing framework called TextProc is described in this paper. First the framework's software architecture is described. The architecture is made of several parts and all of them are described in detail. Natural language processing capabilities are implemented as software plug-ins. Plug-ins can be put together into processes that perform a practical natural language processing function. Several practical TextProc processes are briefly described, such as part-of-speech tagging, named entity tagging and others. One of them is capable of performing plagiarism detection on texts in the Slovenian language, which is explained in detail. This process is used in the digital library of the University of Maribor. The integration of the digital library with TextProc is also briefly described. At the end of the paper some ideas for future development are given.
Keywords: natural language processing, text processing, text mining, Slovenian language, plagiarism detection
Published in DKUM: 01.06.2012; Views: 2456; Downloads: 81
.pdf Full text (438,30 KB)

Advanced features of Digital library of University of Maribor
Janez Brezovnik, Milan Ojsteršek, 2011, original scientific article

Abstract: Advanced features of the digital library of the University of Maribor are described in this paper. A short introduction describes some basic facts about the digital library and mentions its main purpose, but the main part of this paper is about features that are mostly not found in other digital libraries. These features include integration with other information systems, plagiarism detection, informative and useful statistics about mentors, and specific content extraction from documents served by the digital library. We present existing functionality and describe some ideas for future development. A natural language processing framework called TextProc is also briefly mentioned, since it is used to perform plagiarism detection.
Keywords: digital library, natural language, plagiarism detection, Slovenian language, text processing
Published in DKUM: 01.06.2012; Views: 2609; Downloads: 56
URL Link to full text

Local search engine with global content based on domain specific knowledge
Sandi Pohorec, Mateja Verlič, Milan Zorman, 2009, original scientific article

Abstract: With the growing need for information we have come to rely on search engines. The use of large scale search engines, such as Google, is as common as surfing the World Wide Web. We are impressed with the capabilities of these search engines, but there is still a need for improvement. A common problem with searching is the ambiguity of words. Their meaning often depends on the context in which they are used or varies across specific domains. To resolve this we propose a domain specific search engine that is globally oriented. We intend to provide content classification according to the target domain concepts, access to privileged information, personalization and custom ranking functions. Domain specific concepts have been formalized in the form of an ontology. The paper describes our approach to a centralized search service for domain specific content. The approach uses automated indexing for various content sources that can be found in the form of a relational database, web service, web portal or page, various document formats and other structured or unstructured data. The gathered data is tagged with various approaches and classified against the domain classification. The indexed data is accessible through a highly optimized and personalized search service.
Keywords: information search, personalization, indexes, crawling, domain specific crawling, natural language processing, content tagging, distributed data sources, ranking functions
Published in DKUM: 31.05.2012; Views: 1741; Downloads: 37
URL Link to full text

