Top 52 papers published in the topic of Pitch detection algorithm in 2014

Patent•

Packet loss concealment for speech coding

Yang Gao¹•Institutions (1)

7 Feb 2014

TL;DR: A speech coding method reduces error propagation due to voice packet loss by limiting or reducing the pitch gain only for the first subframe or the first two subframes within a speech frame.

Abstract: A speech coding method reduces error propagation due to voice packet loss by limiting or reducing the pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for the voiced speech class. The pitch cycle length is compared to the subframe size to decide whether to reduce the pitch gain for the first subframe or for the first two subframes within the frame. A frame is classified as strongly voiced by checking whether the pitch lags are stable and the pitch gains are high enough within the frame; for a strongly voiced frame, the pitch lags and the pitch gains can be encoded more efficiently than for other speech classes.

19 citations

Proceedings Article•10.1109/ACSSC.2014.7094691•

Pitch estimation for non-stationary speech

Mads Græsbøll Christensen¹, Jesper Jensen¹•Institutions (1)

Aalborg University¹

1 Nov 2014

TL;DR: Experimental results show that the new model and the estimator lead to both improved pitch estimates and reconstruction quality, but also that the improvements in pitch are usually quite small, typically in the order of a few Hertz.

Abstract: Recently, parametric methods have proven capable of overcoming the problems of correlation-based methods for pitch estimation. However, the argument against such methods is that the underlying model is wrong, particularly for non-stationary signals, like speech. To investigate whether this is true, we propose a new, non-stationary harmonic chirp model for pitch estimation, and we derive an estimator for determining its parameters. Experimental results show that the new model and the estimator lead to both improved pitch estimates and reconstruction quality, but also that the improvements in pitch are usually quite small, typically in the order of a few Hertz.
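
As a reading aid, one common way to write a harmonic chirp signal model is sketched below. The symbols (number of harmonics $L$, amplitudes $A_l$, phases $\phi_l$, fundamental frequency $\omega_0$, chirp rate $\beta$, noise $e(n)$) are assumptions chosen for illustration and may not match the parametrization used in the paper.

```latex
x(n) = \sum_{l=1}^{L} A_l \cos\!\left( l \left( \omega_0 n + \tfrac{1}{2} \beta n^2 \right) + \phi_l \right) + e(n),
\qquad
\omega_l(n) = l \left( \omega_0 + \beta n \right)
```

In this form, setting $\beta = 0$ recovers the usual stationary harmonic model, so the chirp rate is the only additional parameter per analysis frame.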

15 citations

Proceedings Article•10.1109/ICACCCT.2014.7019296•

A new pitch detection scheme based on ACF and AMDF

Sandeep Kumar¹, Sumantra Bhattacharya¹, Premanand Patel¹•Institutions (1)

Indian Institute of Technology Dhanbad¹

8 May 2014

TL;DR: Results for different speech signals show that this new method of pitch detection is the best in terms of speech quality and computational complexity.

Abstract: A new pitch detection scheme has been proposed based on the short-time autocorrelation function (ACF) and average magnitude difference function (AMDF). The performance of the proposed scheme has been evaluated, through simulation, in a complete speech analysis-synthesis system. For detection of pitch, local maxima of ACF and local minima of AMDF values are computed. To reduce computational complexity, the original speech signal is converted into a three level signal before computing ACF and AMDF. Synthesized speech quality, computational complexity and time taken during simulation are the parameters that have been considered while comparing this system with the analysis-synthesis systems that use autocorrelation, cepstrum and wavelet based pitch detection methods. Results for different speech signals show that this new method of pitch detection is the best in terms of speech quality and computational complexity.
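
The ingredients named in this abstract (three-level clipping, the ACF maximum, the AMDF minimum) can be sketched roughly as follows. This is a minimal illustration, not the authors' scheme: the clipping ratio, the search range, and the function names are assumptions, and the abstract does not say how the two estimates are combined, so the sketch simply returns both.

```python
import math

def three_level_clip(frame, ratio=0.3):
    """Quantize samples to -1, 0 or +1 around a threshold of ratio * peak (ratio is assumed)."""
    threshold = ratio * max(abs(s) for s in frame)
    return [1 if s > threshold else -1 if s < -threshold else 0 for s in frame]

def acf(x, lag):
    """Short-time autocorrelation at one lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

def amdf(x, lag):
    """Average magnitude difference at one lag."""
    count = len(x) - lag
    return sum(abs(x[n] - x[n + lag]) for n in range(count)) / count

def pitch_candidates(frame, fs, fmin=60.0, fmax=400.0):
    """Return (ACF-based, AMDF-based) pitch estimates in Hz for one frame."""
    clipped = three_level_clip(frame)
    lags = range(int(fs / fmax), int(fs / fmin) + 1)
    lag_acf = max(lags, key=lambda lag: acf(clipped, lag))    # ACF peak
    lag_amdf = min(lags, key=lambda lag: amdf(clipped, lag))  # AMDF valley
    return fs / lag_acf, fs / lag_amdf

fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
print(pitch_candidates(frame, fs))
```

Because the clipped signal only takes the values -1, 0 and +1, both functions reduce to counting operations, which is the usual motivation for clipping before correlation.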

15 citations

Polyphonic pitch detection by iterative analysis of the autocorrelation function

Sebastian Kraft¹, Udo Zölzer•Institutions (1)

Helmut Schmidt University¹

1 Jan 2014

TL;DR: A polyphonic pitch detection approach is presented, which is based on the iterative analysis of the autocorrelation function, and yields good results in the range of state of the art systems.

Abstract: In this paper, a polyphonic pitch detection approach is presented, which is based on the iterative analysis of the autocorrelation function. The idea of a two-channel front-end with periodicity estimation by using the autocorrelation is inspired by an algorithm from Tolonen and Karjalainen. However, the analysis of the periodicity in the summary autocorrelation function is enhanced with a more advanced iterative peak picking and pruning procedure. The proposed algorithm is compared to other systems in an evaluation with common data sets and yields good results in the range of state of the art systems.
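
For context, the baseline pruning step usually associated with Tolonen and Karjalainen (clip the autocorrelation to positive values, subtract a copy stretched by a factor of two in lag, clip again) can be sketched as below. This is not the iterative peak picking and pruning proposed in the paper; it is a single-channel illustration, and the function names and search range are assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Biased autocorrelation for lags 0..max_lag."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def prune_double_period(acf_values):
    """Clip to positive values, subtract a lag-stretched (x2) copy, clip again."""
    clipped = [max(v, 0.0) for v in acf_values]
    pruned = []
    for lag, value in enumerate(clipped):
        half = lag / 2.0
        lo = int(half)
        hi = min(lo + 1, len(clipped) - 1)
        frac = half - lo
        # Linear interpolation of the clipped curve at lag / 2.
        stretched = (1.0 - frac) * clipped[lo] + frac * clipped[hi]
        pruned.append(max(value - stretched, 0.0))
    return pruned

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pick the strongest remaining peak inside the assumed pitch range."""
    max_lag = int(fs / fmin)
    pruned = prune_double_period(autocorrelation(frame, max_lag))
    best = max(range(int(fs / fmax), max_lag + 1), key=lambda lag: pruned[lag])
    return fs / best
```

The subtraction removes the peak at twice the true period, which is a frequent source of octave errors in plain autocorrelation.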

10 citations

Patent•

Pitch detection method

Zhang Tianqi, Xin Xu, Zhang Gang, Shi Sui, Zhang Yajuan, +1 more

10 Dec 2014

TL;DR: A pitch detection method for content-based music retrieval is presented, comprising the following steps: converting a music signal to the frequency domain by Fourier transform; carrying out a first pitch detection pass with a harmonic peak method to find five low-frequency harmonic peaks; sorting the peaks in ascending order of frequency; calculating the ratios among the frequencies; determining a group of pitch candidate sequences according to experimentally measured data; then carrying out pitch detection on the original music signal by a c...

Abstract: The invention discloses a pitch detection method that addresses the poor performance of pitch detection in content-based music retrieval. The method comprises the following steps: converting a music signal to the frequency domain by Fourier transform; carrying out a first pitch detection pass with a harmonic peak method to find five low-frequency harmonic peaks; sorting the peaks in ascending order of frequency and calculating the ratios among the frequencies; determining a group of pitch candidate sequences according to experimentally measured data; carrying out pitch detection on the original music signal by a cepstrum method; combining the pitch sequences obtained by the two methods into a new pitch candidate sequence; and finally, using a confidence measure and the Viterbi algorithm, finding the pitch with the lowest cost, which is the pitch reported by the method. The disclosed method is robust and performs well in noise.
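
The cepstrum stage mentioned in this abstract can be illustrated with a short sketch. Only the cepstral pitch pick is shown; the harmonic peak stage, the candidate merging, and the Viterbi search are not reproduced. The naive DFT, the Hamming window, and the search range are assumptions made to keep the example self-contained.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT with a precomputed twiddle table; adequate for one short frame."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    twiddle = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def cepstrum_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch in Hz from the largest real-cepstrum peak inside the assumed pitch range."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    # Small constant avoids log(0) at spectral nulls.
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(windowed)]
    cepstrum = [v.real for v in dft(log_mag, inverse=True)]
    lo, hi = int(fs / fmax), int(fs / fmin)
    best = max(range(lo, hi + 1), key=lambda q: cepstrum[q])
    return fs / best
```

A harmonic-rich frame produces a log spectrum that ripples at the spacing of the fundamental, and that ripple shows up as a cepstral peak at the pitch period.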

9 citations

Journal Article•10.1186/1687-6180-2014-92•

A hybrid algorithm for blind source separation of a convolutive mixture of three speech sources

Shahab Faiz Minhas¹, Patrick Gaydecki¹•Institutions (1)

University of Manchester¹

17 Jun 2014-EURASIP Journal on Advances in Signal Processing

TL;DR: A novel hybrid algorithm for blind source separation of three speech signals in a real room environment that exploits an information-theoretic approach, based on higher order statistics, to achieve source separation and is well suited for real-time implementation due to its fast adaptive methodology.

Abstract: In this paper we present a novel hybrid algorithm for blind source separation of three speech signals in a real room environment. The algorithm in addition to using second-order statistics also exploits an information-theoretic approach, based on higher order statistics, to achieve source separation and is well suited for real-time implementation due to its fast adaptive methodology. It does not require any prior information or parameter estimation. The algorithm also uses a novel post-separation speech harmonic alignment that results in an improved performance. Experimental results in simulated and real environments verify the effectiveness of the proposed method, and analysis demonstrates that the algorithm is computationally efficient.

8 citations

Proceedings Article•10.1145/2663204.2666276•

Enhanced Autocorrelation in Real World Emotion Recognition

Sascha Meudt¹, Friedhelm Schwenker¹•Institutions (1)

University of Ulm¹

12 Nov 2014

TL;DR: Results of the evaluation show that the enhanced autocorrelation outperforms other state-of-the-art features on the challenge data set, whose complexity lies between that of real-world data sets showing naturalistic emotional utterances and that of the widely applied and well-understood acted emotional data sets.

Abstract: Multimodal emotion recognition in real-world environments is still a challenging task in affective computing research. Recognizing the affective or physiological state of an individual is difficult for humans as well as for computer systems, and thus finding suitable discriminative features is the most promising approach in multimodal emotion recognition. In the literature, numerous features have been developed or adapted from related signal processing tasks. Still, classifying emotional states in real-world scenarios is difficult and the performance of automatic classifiers is rather limited. This is mainly because emotional states cannot be distinguished by a well-defined set of discriminating features. In this work we present an enhanced autocorrelation feature as a multi-pitch detection feature and compare its performance to well-known, state-of-the-art features in signal and speech processing. Results of the evaluation show that the enhanced autocorrelation outperforms other state-of-the-art features on the challenge data set. The complexity of this benchmark data set lies between that of real-world data sets showing naturalistic emotional utterances and that of the widely applied and well-understood acted emotional data sets.

8 citations

Journal Article•10.14445/22315381/IJETT-V7P251•

Distinction Between EMD & EEMD Algorithm for Pitch Detection in Speech Processing

Bhawna Sharma, Sukhvinder Kaur

25 Jan 2014-International Journal of Engineering Trends and Technology

TL;DR: This paper describes the different algorithms for finding pitch markers in speech signal and it also explains how EEMD is better than EMD algorithm.

Abstract: In this paper we describe the different algorithms for finding pitch markers in a speech signal and explain how EEMD is better than the EMD algorithm. One of the major problems in the EMD algorithm is mode mixing, which the EEMD algorithm helps to solve. EEMD is a noise-assisted data analysis (NADA) method for extracting pitch information from the speech signal. In EEMD the signal is decomposed into intrinsic mode functions (IMFs). Using these IMFs, information regarding pitch markers can be evaluated. Keywords: EMD, EEMD, IMF, NADA.

7 citations

Book Chapter•10.1007/978-3-319-02732-6_9•

Spectral Analysis of Speech Signal and Pitch Estimation

Mohamed Hesham Farouk¹•Institutions (1)

Cairo University¹

1 Jan 2014

TL;DR: Wavelet transform (WT) provides a way to explore the spectral characteristics of non-stationary speech signals and the tree structure of WP analysis can be customized to match the critical bands of human hearing giving better spectral estimation for speech signal than other methods.

Abstract: Wavelet transform (WT) provides a way to explore the spectral characteristics of non-stationary speech signals. Multiresolution analysis based on the wavelet theory permits the introduction of the concepts of signal filtering with different bandwidths or frequency resolutions. As both time and frequency analysis can be conducted by WT, the tree structure of WP analysis can be customized to match the critical bands of human hearing giving better spectral estimation for speech signal than other methods. Wavelet-based pitch estimation assumes that the glottis closures are correlated with the maxima in the adjacent scales of the WT. This approach ensures more accurate estimation of pitch period.

5 citations

Proceedings Article•10.1109/ICASSP.2014.6853842•

Multi-pitch tracking using Gaussian mixture model with time varying parameters and Grating Compression Transform

M. N. Abhijith¹, Prasanta Kumar Ghosh¹, K. Rajgopal¹•Institutions (1)

Indian Institute of Science¹

4 May 2014

TL;DR: This work proposes an unsupervised method for obtaining multiple pitch tracks using time-varying means of a Gaussian mixture model (GMM), referred to as TVGMM, which achieves multi-pitch tracking and results in lower root mean squared error in pitch track estimation compared to that by Kalman filtering.

Abstract: Grating Compression Transform (GCT) is a two-dimensional analysis of speech signal which has been shown to be effective in multi-pitch tracking in speech mixtures. Multi-pitch tracking methods using GCT apply Kalman filter framework to obtain pitch tracks which requires training of the filter parameters using true pitch tracks. We propose an unsupervised method for obtaining multiple pitch tracks. In the proposed method, multiple pitch tracks are modeled using time-varying means of a Gaussian mixture model (GMM), referred to as TVGMM. The TVGMM parameters are estimated using multiple pitch values at each frame in a given utterance obtained from different patches of the spectrogram using GCT. We evaluate the performance of the proposed method on all voiced speech mixtures as well as random speech mixtures having well separated and close pitch tracks. TVGMM achieves multi-pitch tracking with 51% and 53% multi-pitch estimates having error <= 20% for random mixtures and all-voiced mixtures respectively. TVGMM also results in lower root mean squared error in pitch track estimation compared to that by Kalman filtering.

5 citations

Proceedings Article•

Melody Extraction from Polyphonic Audio of Western Opera: A Method based on Detection of the Singer's Formant.

Zheng Tang¹, Dawn A. A. Black²•Institutions (2)

University of Washington¹, Queen Mary University of London²

1 Jan 2014

TL;DR: This paper introduces a novel melody extraction algorithm based on the Fan Chirp Transform that has the best performance in voicing detection, voicing false alarm, and overall accuracy and is capable of correcting outliers in pitch detection.

Abstract: Current melody extraction approaches perform poorly on the genre of opera [1, 2]. The singer’s formant is defined as a prominent spectral-envelope peak around 3 kHz found in the singing of professional Western opera singers [3]. In this paper we introduce a novel melody extraction algorithm based on this feature for opera signals. At the front end, it automatically detects the singer’s formant according to the Long-Term Average Spectrum (LTAS). This detection function is also applied to the short-term spectrum in each frame to determine the melody. The Fan Chirp Transform (FChT) [4] is used to compute pitch salience as its high time-frequency resolution overcomes the difficulties introduced by vibrato. Subharmonic attenuation is adopted to handle octave errors which are common in opera vocals. We improve the FChT algorithm so that it is capable of correcting outliers in pitch detection. The performance of our method is compared to 5 state-of-the-art melody extraction algorithms on a newly created dataset and parts of the ADC2004 dataset. Our algorithm achieves an accuracy of 87.5% in singer’s formant detection. In the evaluation of melody extraction, it has the best performance in voicing detection (91.6%), voicing false alarm (5.3%) and overall accuracy (82.3%).

A pitch salience function derived from harmonic frequency deviations for polyphonic music analysis

Alessio Degani, Riccardo Leonardi, Pierangelo Migliorati, Geoffroy Peeters

1 Sep 2014

TL;DR: A novel approach for the computation of a pitch salience function is presented which does not rely on energy but only on frequency location and is evaluated for a task of multiple-pitch estimation using the MAPS test-set.

Abstract: In this paper, a novel approach for the computation of a pitch salience function is presented. The aim of a pitch (considered here as a synonym for fundamental frequency) salience function is to estimate the relevance of the most salient musical pitches that are present in a certain audio excerpt. Such a function is used in numerous Music Information Retrieval (MIR) tasks such as pitch estimation, multiple-pitch estimation, melody extraction and audio feature computation (such as chroma or Pitch Class Profiles). In order to compute the salience of a pitch candidate f, the classical approach uses the weighted sum of the energy of the short-time spectrum at its integer-multiple frequencies hf. In the present work, we propose a different approach which does not rely on energy but only on frequency location. For this, we first estimate the peaks of the short-time spectrum. From the frequency location of these peaks, we evaluate the likelihood that each peak is a harmonic of a given fundamental frequency. The specificity of our method is to use as likelihood the deviation of the harmonic frequency locations from the pitch locations of the equal-tempered scale. This is used to create a theoretical sequence of deviations which is then compared to an observed one. The proposed method is then evaluated for a task of multiple-pitch estimation using the MAPS test-set.
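
The classical energy-based salience that this abstract contrasts itself with (a weighted sum of spectral energy at the integer multiples hf) might be sketched like this. It illustrates the baseline only, not the frequency-deviation method proposed in the paper; the geometric harmonic weights, the number of harmonics, and the toy spectrum are all assumptions.

```python
def harmonic_sum_salience(energy, bin_hz, f0, n_harmonics=8, decay=0.8):
    """Weighted sum of spectral energy at the integer multiples h * f0.

    energy: energy per frequency bin; bin_hz: bin spacing in Hz.
    The weight decay ** (h - 1) is an assumed, commonly used choice.
    """
    total = 0.0
    for h in range(1, n_harmonics + 1):
        k = int(round(h * f0 / bin_hz))
        if k < len(energy):
            total += decay ** (h - 1) * energy[k]
    return total

# Toy short-time spectrum: a 220 Hz tone with four equal partials, 10 Hz bins.
bin_hz = 10.0
energy = [0.0] * 200
for partial in (220, 440, 660, 880):
    energy[int(partial / bin_hz)] = 1.0

candidates = range(100, 501, 10)
best = max(candidates, key=lambda f0: harmonic_sum_salience(energy, bin_hz, f0))
print(best)
```

The decaying weights are what keep the subharmonic at 110 Hz from scoring as high as the true fundamental, since its matching harmonics all carry smaller weights.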

Proceedings Article•10.1109/ICESC.2014.26•

Pitch Contour Extraction of Singing Voice in Polyphonic Recordings of Indian Classical Music

Kalyani Akant, Shyamkant Limaye

9 Jan 2014

TL;DR: In this paper, the problem of extracting the pitch contour of the singing voice from polyphonic recordings of ICM is addressed using novel algorithms developed in the Fourier of Fourier Transform domain.

Abstract: In Indian Classical Music (ICM), the singing voice is accompanied by a continuous drone and percussive instruments. At the onsets of percussion, a smooth pitch contour cannot be obtained by conventional pitch detection algorithms. In this paper, the problem of extracting the pitch contour of the singing voice from polyphonic recordings of ICM is addressed. In this method, frames were classified as monophonic / polyphonic and harmonic / inharmonic using novel algorithms developed in the Fourier of Fourier Transform domain. The pitch was estimated for those frames which were monophonic and harmonic and for those polyphonic frames where the predominant melody was the singing voice. Pitch was estimated using the Fourier of Fourier Transform with parabolic interpolation of spectral peaks. The developed method is immune to octave errors, and its pitch estimation accuracy is suitable for the microtones in ICM.
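
The parabolic interpolation step mentioned here is a standard refinement and can be sketched independently of the Fourier of Fourier Transform itself, which is not reproduced. The function name and return convention below are assumptions.

```python
def parabolic_peak(y, k):
    """Refine a discrete peak at index k using its two neighbours.

    Fits a parabola through (k-1, y[k-1]), (k, y[k]), (k+1, y[k+1]) and
    returns (fractional peak index, interpolated peak height).
    """
    a, b, c = y[k - 1], y[k], y[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:          # flat neighbourhood, nothing to refine
        return float(k), b
    offset = 0.5 * (a - c) / denom
    return k + offset, b - 0.25 * (a - c) * offset

# Samples of a parabola whose true maximum lies at x = 5.3.
samples = [-(x - 5.3) ** 2 for x in range(10)]
print(parabolic_peak(samples, 5))
```

For a spectrum of length N sampled at rate fs, the fractional index converts to Hz as index * fs / N, which is how sub-bin resolution fine enough for microtones is typically obtained.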

Journal Article•10.4028/WWW.SCIENTIFIC.NET/AMM.596.433•

Pitch Detection Method Based on Morphological Filtering

Yao Qi Wang¹, Xiao Peng Wang¹, Lv Cheng Wang¹•Institutions (1)

Lanzhou Jiaotong University¹

01 Jul 2014-Applied Mechanics and Materials

TL;DR: In this paper, a pitch detection method based on morphological filtering is proposed, which can accurately locate the moment of glottal opening and closing through tracking mutation of instantaneous energy, so that variation of pitch period can be accurately tracked.

Abstract: A new method of pitch detection based on morphological filtering is proposed. The noisy speech signal is passed through a morphological filter to remove noise and highlight the pitch, and then the HHT is employed to obtain the Hilbert-Huang spectrum and to calculate the instantaneous energy and its derivative. The moments of glottal opening and closing can be accurately located by tracking abrupt changes in instantaneous energy, so that variation of the pitch period can be accurately tracked. Compared with traditional methods of pitch detection, this method not only faithfully describes the non-stationary and non-linear characteristics of the speech signal, but is also an adaptive process for speech signal analysis. The experiments showed that the method is strongly robust to noise and can accurately detect the pitch of speech at low SNR.

Patent•

Audio classifying method and device

Zhao Weifeng

8 Oct 2014

TL;DR: In this article, pitch detection is carried out on an audio file to acquire its pitch sequence, the tonic of the audio file is searched for according to the pitch sequence, and mode detection is performed according to the tonic to determine the classification of the audio file.

Abstract: The embodiment of the invention provides an audio classifying method and device. The method comprises the following steps: pitch detection is carried out on an audio file to be classified to acquire the pitch sequence of the audio file; the tonic of the audio file is searched for according to the pitch sequence; and mode detection is carried out on the audio file according to its tonic to determine the classification of the audio file. According to the invention, the cost of classifying audio files can be reduced, the classifying efficiency is improved, and the intelligence is enhanced.

Proceedings Article•10.1109/CCECE.2014.6900972•

Pitch estimation of noisy speech using ensemble empirical mode decomposition and dominant harmonic modification

Sujan Kumar Roy¹, Wei-Ping Zhu¹•Institutions (1)

Concordia University¹

4 May 2014

TL;DR: Experimental evaluation of the proposed PEA shows that it outperforms some of the existing PEAs for a wide range of SNRs.

Abstract: This paper presents an efficient pitch estimation algorithm (PEA) using dominant harmonic modification (DHM) and ensemble empirical mode decomposition (EEMD). The noisy speech is first low-pass filtered to the range of fundamental frequencies (50-500 Hz) to obtain the pre-filtered signal (PFS). The pre-processed signal is then modified by enhancing its dominant harmonic, after which the normalized autocorrelation function (NACF) is computed. Then, an EEMD-based, data-adaptive, time-domain noise filter is applied to the NACF. Finally, partial reconstruction is performed in the EEMD domain to determine the pitch period. Experimental evaluation of the proposed PEA shows that it outperforms some of the existing PEAs for a wide range of SNRs.

Proceedings Article•10.1109/IFOST.2014.6991090•

Correlation based pitch extraction method in speech signal

Mirza A. F. M. Rashidul Hasan¹, Rubaiyat Yasmin¹, Dipankar Das¹, M. S. Rahman²•Institutions (2)

University of Rajshahi¹, Shahjalal University of Science and Technology²

18 Dec 2014

TL;DR: Computer simulations on male and female voices in white noise show that the gross pitch errors of the proposed method are lower than those of other related methods under different signal-to-noise ratio conditions.

Abstract: This paper proposes a correlation-based method using the autocorrelation function and YIN. The autocorrelation function and YIN are both popular time-domain measures for estimating pitch. The performance of these two methods, however, is affected by the position of dominant harmonics (usually the first formant) and by spurious peaks introduced in noisy conditions. Computer simulations on male and female voices in white noise show that the gross pitch errors of the proposed method are lower than those of other related methods under different signal-to-noise ratio conditions.
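
For readers unfamiliar with YIN, its core (the difference function followed by cumulative mean normalization and an absolute threshold) can be sketched as follows. This shows plain YIN without its interpolation and best-local-estimate steps, not the combined method proposed in the paper; the threshold of 0.1 and the parameter names are assumptions.

```python
import math

def yin_pitch(frame, fs, window, max_lag, threshold=0.1):
    """Pitch in Hz from the cumulative mean normalized difference function.

    Requires len(frame) >= window + max_lag.
    """
    # Step 1: squared difference function d(lag).
    diff = [0.0] * (max_lag + 1)
    for lag in range(1, max_lag + 1):
        diff[lag] = sum((frame[j] - frame[j + lag]) ** 2 for j in range(window))
    # Step 2: normalize each d(lag) by the running mean of d(1..lag).
    cmnd = [1.0] * (max_lag + 1)
    running = 0.0
    for lag in range(1, max_lag + 1):
        running += diff[lag]
        cmnd[lag] = diff[lag] * lag / running if running > 0.0 else 1.0
    # Step 3: first dip below the threshold, then slide to its local minimum.
    lag = 1
    while lag <= max_lag and cmnd[lag] >= threshold:
        lag += 1
    if lag > max_lag:
        # No dip found: fall back to the global minimum.
        lag = min(range(1, max_lag + 1), key=lambda t: cmnd[t])
    else:
        while lag + 1 <= max_lag and cmnd[lag + 1] < cmnd[lag]:
            lag += 1
    return fs / lag
```

The normalization in step 2 removes the trivial minimum at lag zero, so the search does not need a hand-tuned lower lag bound.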

Proceedings Article•10.1109/ICACCI.2014.6968303•

Efficient pitch detection algorithms for pitched musical instrument sounds: A comparative performance evaluation

Chetan Pratap Singh¹, T. Kishore Kumar¹•Institutions (1)

National Institute of Technology, Warangal¹

1 Dec 2014

TL;DR: The goal of this paper is to investigate how these algorithms should be adapted to pitched musical instrument sounds analysis and to provide a comparative performance evaluation of the most representative state-of-the-art approaches.

Abstract: Pitch detection of an audio signal is an interesting research topic in the field of speech signal processing. Pitch is one of the most important perceptual features, as it conveys much information about the audio signal. It is closely related to the physical feature of fundamental frequency f0. For musical instrument sounds, the f0 and the measured pitch can be considered equivalent. In this paper four pitch detection algorithms have been proposed for pitched musical instrument sounds. The goal of this paper is to investigate how these algorithms should be adapted to the analysis of pitched musical instrument sounds and to provide a comparative performance evaluation of the most representative state-of-the-art approaches. This study is carried out on a large database of pitched musical instrument sounds, comprising four types of pitched musical instruments: violin, trumpet, guitar and flute. The algorithmic performance is assessed according to the ability to estimate the pitch contour accurately.

Proceedings Article•10.1109/CISP.2014.7003882•

An improved pitch detection of speech combined with speech enhancement

Xin Xu¹, Tianqi Zhang¹, Shi Sui¹, Ya-juan Zhang¹•Institutions (1)

Chongqing University¹

1 Oct 2014

TL;DR: An improved pitch detection method combined with speech enhancement is proposed, in which an improved average magnitude difference function (AMDF) weighted autocorrelation function (ACF) is used for accurate pitch detection of the voiced part.

Abstract: To address the poor robustness of pitch detection for noisy speech, an improved pitch detection method combined with speech enhancement is proposed in this paper. First, to reduce background noise and obtain relatively clean speech, multi-band spectral subtraction and the masking properties of the human auditory system are applied to the noisy speech. Next, the product and quotient of the energy and the zero-crossing rate are used to identify the voiced part. Finally, an improved average magnitude difference function (AMDF) weighted autocorrelation function (ACF) is used for accurate pitch detection of the voiced part. Theoretical analysis and experimental simulations show that the method can detect pitch accurately at low SNR and that robustness is improved significantly.
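
The basic idea of an AMDF-weighted autocorrelation, dividing the ACF by the AMDF plus a small constant so that the ACF peak and the AMDF valley reinforce each other, can be sketched as below. This is the plain weighting rather than the improved variant the paper proposes, and it omits the speech enhancement and voiced-part detection stages; the constant k, the search range, and the function name are assumptions.

```python
import math

def weighted_acf_pitch(frame, fs, fmin=60.0, fmax=400.0, k=1.0):
    """Pitch in Hz from the lag that maximizes ACF(lag) / (AMDF(lag) + k)."""
    n = len(frame)

    def score(lag):
        # Biased autocorrelation (divided by the frame length).
        acf = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n
        # Average magnitude difference over the overlapping samples.
        amdf = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)
        return acf / (amdf + k)

    best = max(range(int(fs / fmax), int(fs / fmin) + 1), key=score)
    return fs / best

fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
         for n in range(400)]
print(weighted_acf_pitch(frame, fs))
```

At the true period the numerator is near its maximum while the denominator is near its minimum, which sharpens the peak relative to using either function alone.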

Proceedings Article•10.1109/ICCWAMTIP.2014.7073394•

Multipitch tracking with continuous correlation feature and hybrid DBNS/HMM model

Jie Lin¹, Gen Zhang¹, Bo Fu¹, Hao Yujie¹•Institutions (1)

University of Electronic Science and Technology of China¹

1 Dec 2014

TL;DR: In this method, a novel continuous correlation feature is employed for calculating the pitch model; it not only represents the harmonicity but also includes information on spectral continuity, thereby improving the accuracy of the multi-pitch estimate.

Abstract: This paper proposes a new approach for tracking multiple pitches within a single mixed speech signal. In this method, we employ a novel continuous correlation feature for calculating the pitch model. This feature not only represents the harmonicity but also includes information on spectral continuity, thereby improving the accuracy of the multi-pitch estimate. A hybrid DBN and HMM model is further used to construct pitch models for determining pitch states and to search for the best pitch state sequence. The new approach has been evaluated on mixed speech data and the results demonstrate its efficiency.

Book Chapter•10.4018/978-1-4666-5063-3.CH010•

Predictive Analytics in Digital Signal Processing: A Convolutive Model for Polyphonic Instrument Identification and Pitch Detection Using Combined Classification

Josh Weese¹•Institutions (1)

Kansas State University¹

1 Jan 2014

TL;DR: This thesis presents a new model for representing the spectral structure of polyphonic signals: Uniform MAx Gaussian Envelope (UMAGE), which precisely approximates the distribution of frequency parts in the spectrum while still being resilient to oscillating rapidly (noise).

Journal Article•

A Comparison of Real-Time Pitch Detection Algorithms in SuperCollider

Elliot Kermit-Canfield

08 Oct 2014-Journal of The Audio Engineering Society

TL;DR: In this article, three readily available pitch detection algorithms implemented as unit generators in the SuperCollider programming language are evaluated and compared with regard to their accuracy and latency for a variety of test signals consisting of both harmonic and non-harmonic content.

Abstract: Three readily-available pitch detection algorithms implemented as unit generators in the SuperCollider programming language are evaluated and compared with regard to their accuracy and latency for a variety of test signals consisting of both harmonic and non-harmonic content. Suggestions are made for the type of signal on which each algorithm performs well.

Automatic Laryngeal Dystonia recognition based on Pitch sustainment (Reconhecimento automático de Distonia Laríngea com base na sustentação do Pitch)

Ernesto Yuiti Saito, Gleidy Vannesa E. Rojas, Lílian Neto, Aguiar Ricz, Sylvio Barbon Junior, +1 more

1 Jan 2014

TL;DR: The results showed that the diagnosis made by the tool and specialist is equivalent and therefore the proposed use of the Pitch sustainment as a measure for the recognition of the pathology was effective.

Abstract: Objectives: Develop an automated tool, based on pitch sustainment, for recognizing segments without voice during the patient's phonation. Method: The procedures for construction and verification of the technique are voice acquisition, windowing, application of the Discrete Fourier Transform, pitch detection, and pitch verification. Results: In the analysis of 101 voices, the tool diagnosed 56 voices with laryngeal dystonia and 45 as healthy, whereas the specialist diagnosed 53 voices with laryngeal dystonia and 48 as healthy. Conclusion: The results showed that the diagnoses made by the tool and by the specialist are equivalent, and therefore the proposed use of pitch sustainment as a measure for recognizing the pathology was effective.

Proceedings Article•10.1109/ICOSP.2014.7015048•

Speech enhancement based on analysis-synthesis framework with improved pitch estimation and spectral envelope enhancement

Bin Liu¹, Fuyuan Mo¹, Jianhua Tao¹•Institutions (1)

Chinese Academy of Sciences¹

1 Oct 2014

TL;DR: An improved multi-band summary correlogram (MBSC) algorithm is proposed for pitch estimation and voiced/unvoiced (V/UV) detection and the proposed pitch detection algorithm achieves a lower pitch detection error compared with the reference algorithm.

Abstract: This paper presents a speech enhancement approach based on an analysis-synthesis framework. An improved multi-band summary correlogram (MBSC) algorithm is proposed for pitch estimation and voiced/unvoiced (V/UV) detection. The proposed pitch detection algorithm achieves a lower pitch detection error than the reference algorithm. A denoising autoencoder (DAE) is applied to enhance the line spectrum frequencies (LSFs), and the reconstruction loss is lower than with a shallow model. The proposed approach is evaluated using the perceptual evaluation of speech quality (PESQ), and the experimental results show that it improves speech enhancement performance compared with the conventional speech enhancement approach. In addition, it could be applied to parametric speech coding even at low bit rates and in low-SNR environments.

Proceedings Article•10.1109/ICNSC.2014.6819660•

Pitch detection algorithms modifications and implementations towards automated vocal analysis

Yuhong Zhang¹, Aaron C. Elkins², Jay F. Nunamaker²•Institutions (2)

Texas Southern University¹, University of Arizona²

7 Apr 2014

TL;DR: This paper focuses on voice and speech feature extraction using advanced signal processing methodology; the extracted speech features are submitted to data mining algorithms for classifying deception.

Abstract: Discriminating between deceit and truth is a significant security challenge in a variety of situations, including border crossings, job interviews, flight passenger screenings, and police interviews. Previous research indicates that some features of vocal speech, e.g., fundamental frequency, are related to human emotion and stress levels, making them applicable to deception detection. This paper focuses on voice and speech feature extraction using advanced signal processing methodology. The extracted speech features are submitted to data mining algorithms for classifying deception. The results of this paper are expected to be directly applicable to deception detection systems.

Journal Article•

Frequency Estimation Algorithm of Sinusoid Signal Based on Autocorrelation Detection and Energy Centrobaric Correction Method

Hou Pan-we¹•Institutions (1)

North University of China¹

01 Jan 2014-Science Technology and Engineering

TL;DR: Simulation results showed that the frequency estimation performance of this algorithm is relatively stable over the whole frequency band, and that its root mean square error (RMSE) is smaller than that of the Rife algorithm, the Quinn algorithm, and the energy centrobaric correction method.

Abstract: To improve the frequency estimation accuracy for a sinusoidal signal in white Gaussian noise, a comprehensive sinusoidal frequency estimation algorithm that combines autocorrelation detection with the energy centrobaric correction method is proposed. First, the weak sinusoidal signal embedded in white Gaussian noise is detected by multiple autocorrelation to improve the signal-to-noise ratio. Then the power spectrum is obtained by the Discrete Fourier Transform, and the signal frequency is roughly estimated by searching for the position of the maximum spectral line. Finally, the sinusoidal signal frequency is accurately estimated by applying the energy centrobaric correction method for discrete spectra. Simulation results showed that the frequency estimation performance of this algorithm is relatively stable over the whole frequency band, and that its root mean square error (RMSE) is smaller than that of the Rife algorithm, the Quinn algorithm, and the energy centrobaric correction method. The algorithm is easily implemented in hardware and also has some practical value for engineering.
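
The three stages in this abstract (autocorrelation, DFT peak search, energy centrobaric correction) can be illustrated with a short sketch. It is a simplified reading under stated assumptions, not the paper's algorithm: it applies a single autocorrelation pass rather than the "multiple autocorrelation" mentioned above, it uses a Hann window, and the function names and the five-bin centroid width are choices made for the example.

```python
import cmath
import math


def autocorrelate(x):
    """Unbiased autocorrelation for lags 1..len(x)//2.

    Lag 0 is skipped because the white-noise energy concentrates there.
    """
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / (n - lag)
            for lag in range(1, n // 2 + 1)]


def power_spectrum(x):
    """Power spectrum of the Hann-windowed sequence (direct DFT, bins 0..n/2-1)."""
    n = len(x)
    xw = [x[i] * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]
    return [abs(sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))) ** 2
            for k in range(n // 2)]


def estimate_frequency(x, fs, half_width=2):
    """Coarse peak search followed by an energy-centroid refinement."""
    r = autocorrelate(x)          # same frequency as x, higher SNR
    p = power_spectrum(r)
    k0 = max(range(1, len(p)), key=p.__getitem__)  # coarse estimate: peak bin
    lo = max(k0 - half_width, 0)
    hi = min(k0 + half_width, len(p) - 1)
    bins = range(lo, hi + 1)
    # Energy centroid of the main lobe gives a fractional bin position.
    centroid = sum(k * p[k] for k in bins) / sum(p[k] for k in bins)
    return centroid * fs / len(r)
```

The centroid step is what moves the estimate off the DFT grid: without it the result is limited to multiples of `fs / len(r)`.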

Journal Article•

Pitch Detection Method Based on Morphological Filtering and HHT

Wang Yao-q¹•Institutions (1)

Lanzhou Jiaotong University¹

01 Jan 2014-Journal of the China Railway Society

TL;DR: In this article, a pitch detection method based on morphological filtering and Hilbert-Huang transform (HHT) was proposed, which can accurately detect pitch of speech signals in low SNR.

Abstract: A new method of pitch detection based on morphological filtering and the Hilbert-Huang transform (HHT) is proposed. Noisy speech signals are filtered by a morphological filter to remove noise and highlight pitch, and then the HHT is employed to obtain the Hilbert-Huang spectrum and to calculate the instantaneous energy and its derivative. The moments of glottal opening and closing can be located accurately through abrupt changes in instantaneous energy, so the variation of pitch periods can be tracked accurately. Compared with traditional methods of pitch detection, the proposed method truly describes the non-stationary and non-linear characteristics of speech signals, and its analysis of voice signals is an adaptive process. The experimental results show that the method has strong noise immunity and can accurately detect the pitch of speech signals at low SNR.

Journal Article•10.4028/WWW.SCIENTIFIC.NET/AMM.543-547.2833•

The DSP Implementation of Algorithm for Voice Speed Changing and Pitch Shifting Based on TD-PSOLA

Yi Yi Zhang¹, Fei Fei Wang¹, Wei Tao Du¹•Institutions (1)

Communication University of China¹

01 Mar 2014-Applied Mechanics and Materials

TL;DR: Simulation results indicate that this algorithm can effectively improve the precision of pitch detection and convert speech into the desired output, and a real-time system implemented on a DSP can produce the desired fundamental frequency and duration.

Abstract: In this paper, we present an algorithm for voice speed changing and pitch shifting based on TD-PSOLA. MATLAB simulation results indicate that this algorithm can effectively improve the precision of pitch detection and convert speech into the desired output. After the algorithm is implemented and optimized, a real-time system is built on a DSP, which can produce the desired fundamental frequency and duration.

Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations

Lee Ngee Tan

1 Jan 2014

TL;DR: This dissertation describes robust algorithms for three applications: a multi-band summary correlogram (MBSC)-based detector for human pitch detection, a soft-mask feature enhancement scheme for automatic speech recognition, and an exemplar-based sparse representation (SR) classifier for birdsong phrase classification.

Abstract: This dissertation focuses on algorithms for robust speech and bird song processing. Many applications perform well under ideal signal conditions, e.g. noise-free, full bandwidth, sufficient training data. However, a large degradation in performance is generally observed when the input signal condition deviates from these ideal conditions. This dissertation describes robust algorithms for three applications, namely human-pitch detection, automatic speech recognition, and birdsong phrase classification. In the first application, a noise-robust, multi-band summary correlogram (MBSC)-based pitch detector is proposed. Novel signal processing schemes, which include comb-filter channel selection and subband reliability weighting, are designed to enhance the MBSC's peak at the most likely pitch period. In the second application, a feature enhancement scheme using jointly-sparse reference and estimated soft-mask representations is developed for noise-robust automatic speech recognition (ASR). Reference and estimated soft-mask exemplar-pairs are extracted from clean and noisy utterance-pairs in the training data. Using a sparsity-based dictionary learning algorithm, dictionary representations are trained from the exemplar-pairs. The sparse linear combination of estimated soft-mask dictionary representations that best approximates the test utterance's estimated soft-mask is applied to the reference soft-mask dictionary to produce an enhanced soft-mask. This enhanced soft-mask is then used to perform noise suppression on the spectrogram from which features for ASR are extracted. In the third application, a simple exemplar-based sparse representation (SR) classifier is evaluated on limited data for birdsong phrase classification and verification. Song recordings of the Cassin's Vireo are used for performance evaluation. This study of the SR classifier for bird phrase classification is inspired by a paper that proposed the SR classifier for face recognition and outlier face detection, and reported good performance with only 7 training images per subject. Algorithmic enhancements are subsequently added to the original SR classification framework to improve the classification accuracy of automatically detected and segmented phrases, and of phrases sung by bird individuals that are not found in the training set. These algorithmic enhancements include dynamic time warping (DTW) and frame-based feature normalization prior to SR classification. When the class decisions from DTW and first-pass SR classification differ, SR classification is repeated with frequency-bin-normalized spectrographic features to resolve the two conflicting decisions.

Journal Article•10.11591/TELKOMNIKA.V12I12.6482•

Pitch detection base on EMD and the second spectrum

Jingfang Wang¹•Institutions (1)

Hunan International Economics University¹

01 Dec 2014-Indonesian Journal of Electrical Engineering and Computer Science

TL;DR: In this article, a pitch detection method based on the secondary spectrum of noisy speech is designed: the noisy speech is first band-pass filtered with an elliptic filter (EF), the empirical mode decomposition (EMD) of the Hilbert-Huang transform (HHT) then decomposes the signal into a finite number of intrinsic mode functions (IMFs), the IMF components at different scales are correlated with the signal before decomposition, and the two most strongly correlated IMFs are combined into a synthetic signal for pitch detection.

Abstract: This paper presents a new pitch detection method based on the secondary spectrum. The noisy speech is first band-pass filtered with an elliptic filter (EF), and the empirical mode decomposition (EMD) of the Hilbert-Huang transform (HHT) is then used to decompose the signal into a finite number of intrinsic mode functions (IMFs). The IMF components at different scales are correlated with the signal before decomposition, and the two most strongly correlated IMFs are combined into a synthetic signal on which pitch detection is performed. Experimental results show that the method outperforms the traditional autocorrelation and cepstrum methods, especially on segments with obvious voicing, that it detects pitch well in noisy speech, and that it remains robust at low signal-to-noise ratio (SNR). http://dx.doi.org/10.11591/telkomnika.v12i12.6482
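
For context, the traditional autocorrelation method that this abstract uses as a baseline can be sketched as below. This illustrates only the baseline, not the proposed EMD-based method; the function name `autocorr_pitch` and the 60 to 400 Hz search range are assumptions made for the example.

```python
import math


def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch as fs / lag, where lag maximizes the frame's autocorrelation.

    Only lags corresponding to the assumed pitch range [fmin, fmax] are searched.
    """
    n = len(frame)
    lag_min = max(1, int(fs / fmax))
    lag_max = min(n - 1, int(fs / fmin))

    def r(lag):
        # Raw (unnormalized) autocorrelation at the given lag.
        return sum(frame[i] * frame[i + lag] for i in range(n - lag))

    best_lag = max(range(lag_min, lag_max + 1), key=r)
    return fs / best_lag


# Usage: a 200 Hz tone with a weaker second harmonic, sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * i / fs)
         + 0.5 * math.sin(2 * math.pi * 400 * i / fs)
         for i in range(400)]
pitch_hz = autocorr_pitch(frame, fs)
```

Because additive noise flattens and shifts the autocorrelation peaks, this baseline tends to degrade at low SNR, which is the condition the abstract targets.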
