Learning latent representations for style control and transfer in end-to-end speech synthesis YJ Zhang, S Pan, L He, ZH Ling ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 198 | 2019 |
Part-of-speech tagging with bidirectional long short-term memory recurrent neural network P Wang, Y Qian, FK Soong, L He, H Zhao arXiv preprint arXiv:1510.06168, 2015 | 136 | 2015 |
Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis Y Fan, Y Qian, FK Soong, L He 2015 IEEE international conference on acoustics, speech and signal …, 2015 | 135 | 2015 |
A unified tagging solution: Bidirectional LSTM recurrent neural network with word embedding P Wang, Y Qian, FK Soong, L He, H Zhao arXiv preprint arXiv:1511.00215, 2015 | 107 | 2015 |
Robust sequence-to-sequence acoustic modeling with stepwise monotonic attention for neural TTS M He, Y Deng, L He arXiv preprint arXiv:1906.00672, 2019 | 82 | 2019 |
Developing RNN-T models surpassing high-performance hybrid models with customization capability J Li, R Zhao, Z Meng, Y Liu, W Wei, S Parthasarathy, V Mazalov, Z Wang, ... arXiv preprint arXiv:2007.15188, 2020 | 71 | 2020 |
Word embedding for recurrent neural network based TTS synthesis P Wang, Y Qian, FK Soong, L He, H Zhao 2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015 | 67 | 2015 |
A new GAN-based end-to-end TTS training algorithm H Guo, FK Soong, L He, L Xie arXiv preprint arXiv:1904.04775, 2019 | 49 | 2019 |
Improving prosody with linguistic and BERT derived features in multi-speaker based Mandarin Chinese neural TTS Y Xiao, L He, H Ming, FK Soong ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 40 | 2020 |
Learning distributed word representations for bidirectional LSTM recurrent neural network P Wang, Y Qian, FK Soong, L He, H Zhao Proceedings of the 2016 Conference of the North American Chapter of the …, 2016 | 39 | 2016 |
Speaker and language factorization in DNN-based TTS synthesis Y Fan, Y Qian, FK Soong, L He 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 39 | 2016 |
Modeling F0 trajectories in hierarchically structured deep neural networks X Yin, M Lei, Y Qian, FK Soong, L He, ZH Ling, LR Dai Speech Communication 76, 82-92, 2016 | 31 | 2016 |
Exploiting syntactic features in a parsed tree to improve end-to-end TTS H Guo, FK Soong, L He, L Xie arXiv preprint arXiv:1904.04764, 2019 | 28 | 2019 |
NaturalSpeech: End-to-end text to speech synthesis with human-level quality X Tan, J Chen, H Liu, J Cong, C Zhang, Y Liu, X Wang, Y Leng, Y Yi, L He, ... arXiv preprint arXiv:2205.04421, 2022 | 26 | 2022 |
Conversational end-to-end TTS for voice agents H Guo, S Zhang, FK Soong, L He, L Xie 2021 IEEE Spoken Language Technology Workshop (SLT), 403-409, 2021 | 25 | 2021 |
Modeling multi-speaker latent space to improve neural TTS: Quick enrolling new speaker and enhancing premium voice Y Deng, L He, F Soong arXiv preprint arXiv:1812.05253, 2018 | 25 | 2018 |
Using personalized speech synthesis and neural language generator for rapid speaker adaptation Y Huang, L He, W Wei, W Gale, J Li, Y Gong ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 23 | 2020 |
Unsupervised speaker adaptation for DNN-based TTS synthesis Y Fan, Y Qian, FK Soong, L He 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 22 | 2016 |
Towards Universal Text-to-Speech. J Yang, L He Interspeech, 3171-3175, 2020 | 20 | 2020 |
Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis Y Fan, Y Qian, FK Soong, L He Sixteenth Annual Conference of the International Speech Communication …, 2015 | 16 | 2015 |