
Title:Izdelava oblikoslovnega označevalnika za slovenski jezik in primerjava z drugimi rešitvami
Authors:Hrovat, Goran (Author)
Ojsteršek, Milan (Mentor)
Files:UNI_Hrovat_Goran_2010.pdf (1,68 MB)
 
Language:Slovenian
Work type:Undergraduate thesis (m5)
Organization:FERI - Faculty of Electrical Engineering and Computer Science
Abstract:In this thesis we dealt with computer-assisted part-of-speech tagging of texts in Slovenian. We first described the most common problems that arise. We then described how two open-source part-of-speech taggers work: Stanford POS Tagger and TreeTagger. In the practical part, we built our own part-of-speech tagger and adapted the two open-source taggers to process Slovenian text. The FidaPlus corpus served as the training set. We compared the tagging results of all three taggers with one another.
Keywords:part-of-speech tagging, natural language processing, NLP, natural language, Slovenian language, lemmatization
Year of publishing:2010
Publisher:[G. Hrovat]
Source:Maribor
UDC:004.93:811.163.6(043.2)
COBISS_ID:14305302
NUK URN:URN:SI:UM:DK:YSIMUXFV
Views:2278
Downloads:166
Categories:KTFMB - FERI

Secondary language

Language:English
Title:DEVELOPING A POS TAGGER FOR SLOVENIAN LANGUAGE AND COMPARING TO OTHER SOLUTIONS
Abstract:In this diploma thesis, we addressed automatic POS tagging of Slovenian text. We first presented common problems and described two open-source POS taggers: Stanford POS Tagger and TreeTagger. In the practical part, we developed our own POS tagger and adapted the two open-source taggers to work on Slovenian text. We used the FidaPlus corpus as the training set. Finally, we compared the results of all three POS taggers.
Keywords:POS tagging, part of speech, NLP, natural language, natural language processing, lemmatisation, Slovenian language

