Digital library catalog search

1.
Online speech/music segmentation based on the variance mean of filter bank energy
Marko Kos, Matej Grašič, Zdravko Kačič, 2009, original scientific article

Description: This paper presents a novel feature for online speech/music segmentation based on the variance mean of filter bank energy (VMFBE). The idea behind the feature is the variation of energy in a narrow frequency sub-band: the energy varies more rapidly, and to a greater extent, for speech than for music. The energy variance in such a sub-band is therefore greater for speech than for music. The radio broadcast database and the BNSI broadcast news database were used to evaluate the feature's discrimination and segmentation ability. The calculation procedure of the VMFBE feature has 4 out of 6 steps in common with the MFCC feature calculation procedure. It is therefore a very convenient speech/music discriminator for real-time automatic speech recognition systems based on MFCC features, because valuable processing time can be saved and the computational load increases only slightly. Analysis of the feature's speech/music discriminative ability shows an average error rate below 10% for radio broadcast material, and it outperforms the other features used for comparison by more than 8%. As a stand-alone speech/music discriminator in a segmentation system, the proposed feature achieves an overall accuracy of over 94% on radio broadcast material.
Keywords: online speech segmentation, algorithm, speech techniques
Published in DKUM: 26.06.2017; Views: 1337; Downloads: 443
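The idea summarized in the description above, that sub-band energy varies more for speech than for music, can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact calculation steps, so the function below is only an assumption about what a "variance mean" could look like: the variance of each band's energy across a window of frames, averaged over bands. The function name `variance_mean` and the toy data are hypothetical and not taken from the paper.

```python
from statistics import fmean, pvariance


def variance_mean(energies):
    """Mean over bands of each band's energy variance across frames.

    `energies` is a list of frames; each frame is a list of per-band
    filter bank energies. This is an illustrative assumption, not the
    published VMFBE procedure.
    """
    if not energies:
        raise ValueError("need at least one frame")
    # Transpose frames x bands into one energy sequence per band.
    bands = list(zip(*energies))
    return fmean(pvariance(band) for band in bands)


# Made-up toy data: energies that fluctuate strongly vs. stay steady.
speech_like = [[1.0, 5.0], [6.0, 1.0], [1.5, 5.5], [6.5, 0.5]]
music_like = [[3.0, 3.2], [3.1, 3.0], [2.9, 3.1], [3.0, 3.1]]

print(variance_mean(speech_like) > variance_mean(music_like))
```

In a real segmentation system the value would be compared against a trained threshold or passed to a classifier; the toy comparison here only shows the direction of the effect the description states.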

2.
A comprehensive noise robust speech parameterization algorithm using wavelet packet decomposition-based denoising and speech feature representation techniques
Bojan Kotnik, Zdravko Kačič, 2007, original scientific article

Description: This paper concerns the problem of automatic speech recognition in noise-intense and adverse environments. The main goal of the proposed work is the definition, implementation, and evaluation of a novel noise robust speech signal parameterization algorithm. The proposed procedure is based on time-frequency speech signal representation using wavelet packet decomposition. A new modified soft thresholding algorithm based on time-frequency adaptive threshold determination was developed to efficiently reduce the level of additive noise in the input noisy speech signal. A two-stage Gaussian mixture model (GMM)-based classifier was developed to perform speech/nonspeech as well as voiced/unvoiced classification. An adaptive topology of the wavelet packet decomposition tree, based on voiced/unvoiced detection, was introduced to analyze voiced and unvoiced segments of the speech signal separately. The main feature vector consists of a combination of log-root compressed wavelet packet parameters and autoregressive parameters. The final output feature vector is produced using a two-stage feature vector postprocessing procedure. In the experimental framework, the noisy speech databases Aurora 2 and Aurora 3 were used together with the corresponding standardized acoustical model training/testing procedures. The automatic speech recognition performance achieved using the proposed noise robust speech parameterization procedure was compared to the standardized mel-frequency cepstral coefficient (MFCC) feature extraction procedures ETSI ES 201 108 and ETSI ES 202 050.
Keywords: speech parametrization, algorithm, speech techniques
Published in DKUM: 26.06.2017; Views: 1707; Downloads: 397
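For readers unfamiliar with the denoising step mentioned in the description, the sketch below shows the standard, textbook soft thresholding rule applied to a list of transform coefficients. It is not the modified, time-frequency adaptive algorithm the paper proposes, whose details the abstract does not give; the fixed `threshold` argument and the function name `soft_threshold` are assumptions made for illustration.

```python
def soft_threshold(coeffs, threshold):
    """Standard soft thresholding: shrink each coefficient toward zero.

    Coefficients with magnitude at or below `threshold` become 0.0;
    larger ones keep their sign and lose `threshold` in magnitude.
    A fixed threshold is an assumption here; the paper describes an
    adaptive one.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk


# Made-up coefficients: small values are treated as noise and zeroed.
print(soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0))
```

In wavelet-based denoising, a rule like this is typically applied to the decomposition coefficients before the signal is reconstructed or features are computed.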
