Speech style transfer

Patent

- 13 Oct 2020

3

TL;DR: A speech synthesizer is trained to generate synthesized audio data corresponding to words uttered by a source speaker, rendered according to the speech characteristics of a target speaker, using time-stamped phoneme sequences, pitch contour data, and speaker identification data.

Citations

Patent

System and method for personalized speaker verification

Wang Zhiming, +2 more

- 28 Jan 2021

Patent

System and method for determining voice characteristics

Wang Zhiming, +2 more

- 20 Feb 2020

Proceedings Article, DOI: 10.21437/VCC_BC.2020-21

FastVC: Fast Voice Conversion with non-parallel data.

Oriol Barbany Mayor, +1 more

- 08 Oct 2020

- arXiv: Audio and Speech Processing

TL;DR: The proposed model, FastVC, is based on a conditional autoencoder trained on non-parallel data. It requires no annotations and outperforms the VC Challenge 2020 baselines on the cross-lingual task in terms of naturalness.

References

Patent

Speech synthesis using deep neural networks

Andrew W. Senior, +2 more

- 25 Oct 2012

TL;DR: A neural network is trained to map input phonetic transcriptions of training-time text strings into sequences of acoustic feature vectors, which yield predefined speech waveforms when processed by a signal generation module.

140

Patent

Speech analyzer detecting pitch frequency, speech analyzing method, and speech analyzing program

Shunji Mitsuyoshi, +2 more

- 02 Jun 2006

TL;DR: A speech analyzer includes a speech acquiring section, a frequency converting section, an autocorrelation section, and a pitch detection section. The pitch detection section determines the pitch frequency from the distance between two local crests or troughs of the waveform.

45

Patent

Voice identification method using long-short term memory model recurrent neural network

Xia Chunqiu

- 11 Jan 2017

TL;DR: A voice identification method using a long short-term memory (LSTM) recurrent neural network is proposed. The model was not designed for speech recognition.

43

Patent

Method and system for detecting voice spoofing attack of speakers on basis of deep learning

Qian Yanmin, +2 more

- 17 Aug 2016

TL;DR: A method and a system for detecting voice spoofing attacks on speakers on the basis of deep learning are presented. The method includes constructing audio training sets; initializing deep feed-forward neural networks and deep recurrent neural networks with multi-frame feature vectors and single-frame vector sequences of the training sets; in the test phase, feeding frame-level and sequence-level feature vectors of the audio under test into two trained linear differential analysis models, respectively; weighting the two resulting grades to obtain scores; and comparing the scores to predefined threshold values so

28

Patent

Phonotactic-Based Speech Recognition & Re-synthesis

Robert Alex Fuhrman

- 28 Aug 2016

TL;DR: A phonotactic post-processor is configured to rescore the N-best phoneme candidates output by a primary ensemble phoneme neural network using a priori phono-temporal information.

9