I am currently a senior researcher at Microsoft Research Asia (MSRA).
My research interests include 3D vision and generation, neural rendering, and embodied AI.
Prior to joining MSRA, I was a researcher at Xiaobing.AI working on realistic virtual avatar generation. Before joining Xiaobing, I received my Ph.D. from Tsinghua University in 2022 under the supervision of Prof. Harry Shum, and
worked closely with Jiaolong Yang and Xin Tong as a research intern at MSRA from 2017 to 2022.
Before that, I received my B.S. from the Department of Physics at Tsinghua University in 2017.
Publications
Structured 3D Latents for Scalable and Versatile 3D Generation
Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, Jiaolong Yang
arXiv 2024,
[PDF][Project][Code]
We propose a native 3D generative model built on a unified Structured Latent representation and Rectified Flow Transformers, enabling versatile and high-quality 3D asset creation.
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li, Yaobo Liang, Zeyu Wang, Lin Luo, Xi Chen, Mozheng Liao, Fangyun Wei, Yu Deng, Sicheng Xu, Yizhong Zhang, Xiaofan Wang, Bei Liu, Jianlong Fu, Jianmin Bao, Dong Chen, Yuanchun Shi, Jiaolong Yang, Baining Guo
arXiv 2024,
[PDF][Project][Code]
We propose a new VLA architecture for robot manipulation that leverages cognitive information extracted by powerful VLMs to guide the action prediction of a specialized action module.
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Ruicheng Wang, Sicheng Xu, Cassie Dai, Jianfeng Xiang, Yu Deng, Xin Tong, Jiaolong Yang
arXiv 2024,
[PDF][Project][Code]
We present MoGe, a powerful model for recovering 3D geometry from monocular open-domain images.
Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer
Yu Deng, Duomin Wang, Baoyuan Wang
2024 European Conference on Computer Vision, ECCV 2024,
[PDF][Project][Code]
We learn a lifelike 4D head synthesizer by creating pseudo multi-view videos from monocular ones as supervision.
Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data
Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang
2024 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2024,
[PDF][Project][Code]
We propose a one-shot 4D head synthesis approach that achieves high-fidelity 4D head avatar reconstruction while being trained on large-scale synthetic data.
Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation
Xingyu Chen*, Yu Deng*, Baoyuan Wang (* for equal contribution)
2023 IEEE International Conference on Computer Vision, ICCV 2023,
[PDF][Project][Code]
We propose a novel approach that enables a 3D-aware GAN to generate images with both state-of-the-art photorealism and strict 3D consistency.
GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds
Jianfeng Xiang, Jiaolong Yang, Yu Deng, Xin Tong
2023 IEEE International Conference on Computer Vision, ICCV 2023,
[PDF][Project][BibTeX]
We propose GRAM-HD, a 3D-aware GAN that can generate photorealistic and 3D-consistent images at 1024x1024 resolution.
Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
Yu Deng, Baoyuan Wang, Heung-Yeung Shum
2023 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2023,
[PDF][Project][BibTeX]
We propose a learning-based approach for high-fidelity and 3D-consistent novel view synthesis of monocular portrait images.
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis
Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang
2023 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2023,
[PDF][Project]
We propose a one-shot talking head synthesis approach with disentangled control over lip motion, eye gaze and blink, head pose, and emotional expression.
AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars
Yue Wu, Yu Deng, Jiaolong Yang, Fangyun Wei, Qifeng Chen, Xin Tong
2022 Conference on Neural Information Processing Systems, NeurIPS 2022,
Spotlight [PDF][Project][BibTeX]
We propose AniFaceGAN, an animatable 3D-aware GAN for multiview consistent face animation generation.
Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects
Ziyu Wang, Yu Deng, Jiaolong Yang, Jingyi Yu, Xin Tong
The 30th Pacific Graphics Conference, PG 2022,
[PDF][Project][Code][BibTeX]
We propose a generative model for synthesizing radiance fields of topology-varying objects with disentangled shape and appearance variations.
GRAM: Generative Radiance Manifolds for 3D-Aware Image Generation
Yu Deng, Jiaolong Yang, Jianfeng Xiang, Xin Tong
2022 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2022,
Oral Presentation [PDF][Project][Code][BibTeX]
We propose Generative Radiance Manifolds (GRAM), a method that can generate 3D-consistent images with explicit camera control, trained on only unstructured 2D images.
Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
Yu Deng, Jiaolong Yang, Xin Tong
2021 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021,
[PDF][Code][BibTeX]
We propose a novel Deformed Implicit Field (DIF) representation for modeling 3D shapes of a category and generating dense correspondences among shapes with structure variations.
Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning
Yu Deng, Jiaolong Yang, Dong Chen, Fang Wen, Xin Tong
2020 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2020,
Oral Presentation [PDF][Code][BibTeX]
We propose DiscoFaceGAN, an approach for face image generation of virtual people with disentangled, precisely-controllable latent representations for identity, expression, pose, and illumination.
Deep 3D Portrait from a Single Image
Sicheng Xu, Jiaolong Yang, Dong Chen, Fang Wen, Yu Deng, Yunde Jia, Xin Tong
2020 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2020,
[PDF][Code][BibTeX]
We propose a learning-based approach for recovering the 3D geometry of a human head from a single portrait image without any ground-truth 3D data.
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set
Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, Xin Tong
2019 IEEE Conference on Computer Vision and Pattern Recognition Workshop on AMFG, CVPRW 2019,
Best Paper Award [PDF][Code][BibTeX]
We propose a novel deep 3D face reconstruction approach that leverages a robust hybrid loss function and performs multi-image face reconstruction by exploiting complementary information from different images for shape aggregation.