I am currently a research engineer at FAIR (Fundamental AI Research Lab), working on multimodal AI agents. Previously, I was a research scientist at ByteDance, working on large speech models and AI avatars. Our work has been widely deployed in well-known applications and services such as TikTok/Douyin (抖音), CapCut/Jianying (剪映), and Volcano Engine (火山引擎).
I received my bachelor’s degree from the Department of Computer Science, Zhejiang University (浙江大学计算机科学与技术学院) in 2020, and my master’s degree from the same department in 2023, advised by Kejun Zhang (张克俊).
My research interests include speech synthesis, music generation, avatars, and translation. I have published more than 20 papers at top international AI conferences such as NeurIPS, ICLR, ICML, ACL, and AAAI. I have served as an area chair for ACL, EMNLP, and NAACL, and as a reviewer for NeurIPS, ICLR, TASLP, TMM, CVPR, ICCV, etc.
I was previously a research intern at Tencent AI Lab and SEA AI Lab, collaborating with Shuicheng Yan (颜水成) and Yi Ren (任意).
Before that, I was a research intern at ByteDance AI Lab, advised by Bilei Zhu (朱碧磊).
I also had a one-year internship at Microsoft Research Asia, collaborating closely with Xu Tan (谭旭), Tao Qin (秦涛), and Tie-Yan Liu (刘铁岩).
I’m one of the main contributors to several popular open-source projects, including Muzic and MegaTTS3.