1. Modelling highly inflected languages. Mirjam Sepesy Maučec, Zdravko Kačič, Bogomir Horvat, 2004, original scientific article. Description: Statistical language models encapsulate varied information, both grammatical and semantic, present in a language. This paper investigates various techniques for overcoming the difficulties in modelling highly inflected languages. The main problem is the large set of different words. We propose to model the grammatical and semantic information of words separately by splitting them into stems and endings. All the information is handled within a data-driven formalism. Grammatical information is well modelled by using short-term dependencies. This article is primarily concerned with the modelling of semantic information diffused through the entire text. It is presumed that the language being modelled is homogeneous in topic. The training corpus, which is topically very heterogeneous, is divided into three semantic levels based on topic similarity with the target environment text. The text on each semantic level is used as training text for one component of a mixture model. A document is defined as the basic, semantically homogeneous unit of the training corpus. The similarity of topic between a document and a collection of target environment texts is determined by the cosine vector similarity function and the TFIDF weighting heuristic. The crucial question in the case of highly inflected languages is how to define terms. Terms are defined as clusters of words. Clustering is based on approximate string matching. We experimented with the Levenshtein distance and the Ratcliff/Obershelp similarity measure, both in combination with ending-stripping. Experiments on the Slovenian language were performed on a corpus of VEČER newswire text. The results show a significant reduction in OOV rate and perplexity. Published in DKUM: 01.06.2012; Views: 1373; Downloads: 53
Link to full text
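The term definition and topic similarity described in the abstract of entry 1 can be illustrated with a minimal sketch. It is not the authors' implementation: the ending list `ENDINGS`, the greedy clustering rule, the distance threshold `max_dist`, and the smoothed IDF formula are all assumptions chosen for illustration, and the Ratcliff/Obershelp variant is not shown.

```python
import math
from collections import Counter

# Hypothetical ending list, for illustration only (not taken from the paper).
ENDINGS = ("ega", "emu", "ih", "om", "a", "e", "i", "o", "u")


def strip_ending(word, min_stem=3):
    """Strip the longest matching ending, keeping at least min_stem characters."""
    for ending in sorted(ENDINGS, key=len, reverse=True):
        if word.endswith(ending) and len(word) - len(ending) >= min_stem:
            return word[: -len(ending)]
    return word


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def cluster_terms(words, max_dist=1):
    """Greedy clustering (an assumed rule): map each word to the first
    cluster representative whose stem lies within max_dist edits."""
    representatives = []
    term_of = {}
    for word in sorted(set(words)):
        stem = strip_ending(word)
        for rep in representatives:
            if levenshtein(stem, rep) <= max_dist:
                term_of[word] = rep
                break
        else:
            representatives.append(stem)
            term_of[word] = stem
    return term_of


def tfidf_vectors(docs, term_of):
    """TF-IDF over word clusters; the smoothed IDF is an assumed variant."""
    term_docs = [[term_of[w] for w in doc] for doc in docs]
    n = len(term_docs)
    df = Counter(t for doc in term_docs for t in set(doc))
    vectors = []
    for doc in term_docs:
        tf = Counter(doc)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy documents: inflected forms of the same word should share one term.
docs = [["hiša", "mesto"], ["hiše", "mesta"], ["avto"]]
term_of = cluster_terms(w for doc in docs for w in doc)
vectors = tfidf_vectors(docs, term_of)
```

In this toy setting the first two documents differ only in word endings, so they map to the same terms, while the third shares none. A collection of target environment texts could be handled the same way by treating it as one concatenated pseudo-document, which is again an assumption rather than something the abstract specifies. With short stems a threshold of one edit can merge unrelated words, so the threshold would need tuning on real data.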
2. Traffic analysis and use of tools in networks with a Windows 2000 server: diploma thesis of the professional higher education study programme (orig. Analiza prometa in uporaba orodij v omrežjih z Windows 2000 strežnikom). Srečko Vrečko, 2003, diploma thesis. Keywords: protocols, byte, frame, link throughput, redundant bits, Active Directory, DNS server, e-mail, data transfer, World Wide Web, network reader, remote access. Published in DKUM: 23.04.2008; Views: 3230; Downloads: 210
Full text (1.79 MB)
4. Concepts of radio-frequency converters for building single-channel networks for terrestrial and mobile digital television: master's thesis (orig. Koncepti radiofrekvenčnih pretvornikov za gradnjo enokanalnih omrežij za zemeljsko in mobilno digitalno televizijo). Daniel Copot, 2005, master's thesis. Keywords: broadcasting, DVB-T, DVB-H, transmitters, converters, single-channel network, antennas, COFDM modulation, IPDC, single-channel converters. Published in DKUM: 15.04.2008; Views: 3900; Downloads: 297
Full text (7.39 MB)
5. Digital sound: diploma thesis of the university study programme (orig. Digitalni zvok). Boštjan Imperl, 2003, diploma thesis. Keywords: digital sound, sound, hearing, A/D conversion, sampling frequency, quantization, digital audio editing, Dolby. Published in DKUM: 15.04.2008; Views: 5354; Downloads: 917
Full text (2.59 MB)