Cong Zhou
Dolby Laboratories
13 Papers
41 Citations
Cong Zhou is an academic researcher from Dolby Laboratories. The author has contributed to research in the topics Generative model and Computer science. The author has an h-index of 4 and has co-authored 11 publications.
Papers
High-quality Speech Coding with SampleRNN
Janusz Klejsa, Per Hedelin, Cong Zhou, Fejgin Roy M, Lars Villemoes +4 more
- 12 May 2019
TL;DR: A speech coding scheme employing a generative model based on SampleRNN that, while operating at significantly lower bitrates, matches or surpasses the perceptual quality of state-of-the-art classic wide-band codecs is provided.
47
Voice Conversion with Conditional SampleRNN.
Cong Zhou, Michael Horgan, Vivek Kumar, Cristina Michel Vasco, Dan Darcy +4 more
- 02 Sep 2018
TL;DR: A novel approach to conditioning the SampleRNN generative model for voice conversion (VC) that is capable of many-to-many voice conversion without requiring parallel data, enabling broad applications. Subjective evaluation demonstrates that this approach outperforms conventional VC methods.
25
Source Coding of Audio Signals with a Generative Model
Fejgin Roy M, Janusz Klejsa, Lars Villemoes, Cong Zhou +3 more
- 27 May 2020
TL;DR: In this paper, a waveform is first quantized, yielding a finite bitrate representation, and then reconstructed by random sampling from a model conditioned on the quantized waveform.
8
Patent
Sound and video object tracking
Cong Zhou, Timo Kunkel, Cristina Michel Vasco +2 more
- 15 Jun 2017
TL;DR: In this article, image data relating to real-world objects or persons is collected from a scene while audio data relating to the same real-world objects or persons is collected from the same scene, and a salient object is selected from among the candidate salient objects.
8
Patent
Audio Capture for Aerial Devices
Timo Kunkel, Cong Zhou, Vivek Kumar, Rémi Audfray +3 more
- 17 Oct 2017
TL;DR: In this article, a controller device carried by the subject can generate one or more signals for the UAV to follow, and these signals can be used to temporally synchronize video captured by the UAV with audio captured by the microphone.
6