Title: VREDNOTENJE KAKOVOSTI VEČMODALNIH STORITEV V SODOBNIH TELEKOMUNIKACIJSKIH SISTEMIH
Authors: Lovrenčič, Tomaž (Author)
Žgank, Andrej (Mentor)
Files: DR_Lovrencic_Tomaz_i2014.pdf (8.05 MB)
MD5: FC0CA6C7052A69A33575971BD8FEAECA

Language: Slovenian
Work type: Doctoral dissertation
Typology: 2.08 - Doctoral Dissertation
Organization: FERI - Faculty of Electrical Engineering and Computer Science
Description: This doctoral dissertation addresses the problem of assessing the quality of multimodal services in contemporary telecommunication systems. We highlight the degradations that affect user-perceived quality and, according to their origin, divide them into source and network degradations. Their impact can be measured with subjective or objective methods. Because multimodal services can be bidirectional systems, degradations must be monitored on both the input and the output modalities of the system. A cross-modal effect arises here as a consequence of the characteristics of human perception. The user's focus on regions of interest (ROI) gives degradations in these regions greater impact, which can be exploited for differentiated assessment. The aim of the dissertation is to propose a model for assessing the quality of multimodal services and to build a prototype evaluator that takes these facts into account. To achieve this aim, the work was divided into three areas: in the first, we determined the impact of degradations on the input modality; in the second, we built a suitable multimodal database of HD recordings and carried out subjective and objective assessment of the output modality; in the third, we proposed a new model of multimodal differentiated quality assessment. For the input modality, we analyzed an IVR service with speech recognition, where we evaluated the impact of transcoding and packet loss (PL) based on measurements of the objective mean opinion score (objMOS) on the SpeechDat(II) speech database. The speech codecs showed considerable differences, even between different configurations of the same codec. Speech loss degraded the signal to the extent that a more robust modality, DTMF dialing, was required. Based on the analysis, we proposed an input-modality classifier based on Gaussian mixture models (GMM). In the training phase, we analyzed different classifier configurations. The test phase showed that the classifier successfully selects the input modality in various packet loss scenarios. To study the impact of degradations on the output modality, we created a multimodal database of recordings with four types of content. The database contained recordings with audio (A, AAC codec, 48 kbps), video (V, H.264/AVC codec, 1920x1080) and audio-video (AV) under various packet loss scenarios. We carried out subjective testing with 20 subjects on 240 recordings, obtaining subjective mean opinion scores (subMOS) that served as the reference for objective assessment. Objective assessment of audio used the PESQ standard, while for the video modality we selected, from a set of 26 image metrics, the one with the best correlation with the subjective score: the NQM image metric. Based on the results, we proposed a model for assessing the quality of a multimodal service that takes into account the type of modality, the type of scene, the amount of degradation and the unimodal objMOS scores. The correlation on the test set was 0.892. To analyze the user's focus on ROI and the possibility of differentiated assessment, we used a face detector based on the Viola-Jones algorithm with cascade classifiers built on weak Haar-like features, which we modified to achieve the best possible face detection. The analysis led to an approach for differentiated assessment of visual information, with simple assessment of the background (non-ROI) using the PSNR metric and more complex assessment of the face (ROI) using the NQM metric. The importance of differentiated quality assessment of services was confirmed with subjective tests.
Keywords: quality of service, multimodal content, video quality, speech quality, image processing and analysis, audio analysis, classification
Place of publishing: Maribor
Publisher: [T. Lovrenčič]
Year of publishing: 2014
PID: 20.500.12556/DKUM-46840
UDC: 621.391:004.932:004.934(043.3)
COBISS.SI-ID: 277694720
NUK URN: URN:SI:UM:DK:6FHQBP4R
Publication date in DKUM: 28.01.2015
Views: 2320
Downloads: 218
Categories: KTFMB - FERI
LOVRENČIČ, Tomaž, 2014, VREDNOTENJE KAKOVOSTI VEČMODALNIH STORITEV V SODOBNIH TELEKOMUNIKACIJSKIH SISTEMIH [online]. Doctoral dissertation. Maribor: T. Lovrenčič. [Accessed 10 April 2025]. Retrieved from: https://dk.um.si/IzpisGradiva.php?lang=slv&id=46840
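
The MD5 checksum listed in this record can be used to check that a downloaded copy of the PDF is intact. The following is a minimal Python sketch, assuming the file has been saved locally; the helper names md5_of_file and matches_record are illustrative and are not part of the repository.

```python
import hashlib

# Checksum as listed in this record for DR_Lovrencic_Tomaz_i2014.pdf.
EXPECTED_MD5 = "FC0CA6C7052A69A33575971BD8FEAECA"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large file is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected one, ignoring letter case."""
    return md5_of_file(path).lower() == expected.lower()


# Example call (the local path is an assumption):
# matches_record("DR_Lovrencic_Tomaz_i2014.pdf")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a safeguard against deliberate tampering.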

Secondary language

Language: English
Title: QUALITY ASSESSMENT OF MULTIMODAL SERVICES IN CONTEMPORARY TELECOMMUNICATION SYSTEMS
Description: This thesis focuses on quality assessment of multimodal services in contemporary telecommunication systems. It addresses quality degradations that affect user experience. Depending on their origin, they can be categorized as source or network impairments. Their impact can be measured with subjective or objective methods. Since multimodal services can be bidirectional systems, degradations must be monitored on both the input and output modalities of the system. Intermodal influences between the modalities arise here as a consequence of human perception. Furthermore, the user's focus on regions of interest (ROI) gives degradations in those regions greater impact on the overall quality, which can be exploited for differentiated quality assessment. The aim of this thesis is to propose a model for quality assessment of multimodal services and to develop a concept of a quality evaluator that takes these facts into account. The thesis is therefore divided into three sections. In the first section, the impact of quality degradations on the input modality is determined. In the second, a suitable multimodal database comprising HD recordings is established. This section also presents subjective and objective assessment of the output modality, in which subjective mean opinion scores (subMOS) and objective mean opinion scores (objMOS) were obtained. Based on the results, a new model of multimodal quality assessment is proposed. The last section addresses differentiated quality evaluation based on ROI. To evaluate the effect of quality degradations on the input modality, a voice-driven IVR service with a built-in automatic speech recognition (ASR) module is analyzed. The assessment begins by measuring objMOS values of samples from the SpeechDat(II) database. The samples were degraded by transcoding and packet loss. There were substantial differences between the speech codecs used, even between different configurations of the same codec. Generally, deterioration was greater for codecs with lower bandwidth. The voice signal degraded to such an extent that it was necessary to use a more robust modality, i.e. DTMF dialing. After an analysis of the results, an input-modality classifier based on Gaussian mixture models (GMM) was proposed. When training the classifier, different classifier configurations were analyzed. The test phase confirmed that the classifier successfully selects the input modality in various packet loss scenarios. To assess the impact of degradations on the quality of the output modality, a specifically designed multimodal database was established. It comprised audio (AAC at 48 kbps), video (H.264/AVC at a resolution of 1920x1080 pixels) and combined audio and video clips, for a total of 240 samples under various packet loss scenarios. Subjective tests with 20 subjects were then conducted, which provided reference data for objective quality assessment. Objective quality was measured separately for the audio and video modalities. The audio modality was assessed with the standardized PESQ speech quality metric, and the video modality with the NQM metric. Then, using regression, a linear model for evaluating the quality of multimodal services was proposed, which takes into account the type of modality, the type of scene, the amount of degradation and the unimodal objMOS scores. The correlation on the test set was 0.892. The differentiated quality evaluation consists of two stages. First, a face detector was used to locate the ROI, based on the Viola-Jones object detection algorithm with cascade classifiers built on weak Haar-like features. Then, building on the detection results, the optimization possibilities offered by differentiated quality assessment of the visual modality are analyzed. This investigation proposed evaluating the quality of ROI regions with a more complex algorithm (NQM), since those regions receive more visual attention, and using a simpler quality metric (PSNR) for the background, i.e. non-ROI regions. The importance of differentiated quality assessment was confirmed with subjective tests.
Keywords: quality of service, multimodal content, video quality, speech quality, image processing and analysis, audio analysis, classification
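
The ROI-based split described in the abstract (a simple metric for the background, a more complex one for the face region) can be illustrated with a small Python sketch. This is not the thesis implementation: the function names, the rectangle layout (top, left, height, width) and the grayscale list-of-lists image format are assumptions made for illustration. NQM is not implemented here; PSNR stands in as the default ROI metric, and a real evaluator would pass an NQM implementation through the roi_metric parameter.

```python
import math


def psnr(reference, degraded, pixels, peak=255.0):
    """PSNR in dB computed only over the given (row, col) coordinates."""
    if not pixels:
        raise ValueError("region contains no pixels")
    mse = sum((reference[r][c] - degraded[r][c]) ** 2 for r, c in pixels) / len(pixels)
    if mse == 0:
        return math.inf  # identical regions
    return 10.0 * math.log10(peak ** 2 / mse)


def split_by_roi(height, width, roi):
    """Split pixel coordinates into ROI and background.

    `roi` is a (top, left, height, width) rectangle, e.g. a box reported
    by a face detector (hypothetical layout).
    """
    top, left, roi_h, roi_w = roi
    inside, outside = [], []
    for r in range(height):
        for c in range(width):
            in_roi = top <= r < top + roi_h and left <= c < left + roi_w
            (inside if in_roi else outside).append((r, c))
    return inside, outside


def differentiated_quality(reference, degraded, roi, roi_metric=psnr):
    """Score ROI and background separately; the background always uses PSNR."""
    inside, outside = split_by_roi(len(reference), len(reference[0]), roi)
    return {
        "roi": roi_metric(reference, degraded, inside),
        "background": psnr(reference, degraded, outside),
    }
```

For example, with a uniform 4x4 reference frame, a 2x2 ROI at (1, 1) and a single ROI pixel changed by 10 gray levels, the background score is infinite (no error) while the ROI score is roughly 34.15 dB, so the degradation is attributed entirely to the region the viewer is assumed to focus on.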

