I am a Ph.D. candidate at the School of Computer Science, Fudan University, where I work in the Vision and Learning Lab (FVL) under the supervision of Prof. Yu-Gang Jiang (IEEE Fellow) and Prof. Zuxuan Wu. Before this, I received my B.S. degree from Tianjin University.
I have published 15+ papers at top international AI conferences such as CVPR, NeurIPS, and ECCV. My current research interests include:
- 1️⃣ 🌟🌟🌟 Generative models: text-to-video generation, controllable visual generation, video editing
- 2️⃣ Representation learning: video understanding, 3D understanding, image retrieval
I am set to graduate in 2025 and am actively seeking job opportunities in both industry and academia. If you are interested in discussing potential collaborations or positions, please feel free to email me at zhenxingfd@gmail.com.
🔥 News
- [Apr’2025] Reached 500+ citations on Google Scholar and an h-index of 11.
- [Apr’2025] Released MagicMotion, which has reached 100+ GitHub stars.
- [Feb’2025] StableAnimator is accepted by CVPR 2025 and has reached 1200+ GitHub stars.
- [Sep’2024] Two papers are accepted by NeurIPS 2024.
- [Aug’2024] “A Survey on Video Diffusion Models” is accepted by ACM Computing Surveys.
- [Feb’2024] Invited talk at OpenMMLab on video generation models, [slides].
- [Feb’2024] SimDA is accepted by CVPR 2024.
- [Dec’2023] Serving as a reviewer for ICML 2024.
- [Dec’2023] Invited talk at Kunlun Research, “A Survey on Video Diffusion Models”.
- [Aug’2023] Awarded a certificate of “Star of Tomorrow” at MSRA.
- [May’2023] One paper is accepted by ACL 2023.
- [Feb’2023] Two papers are accepted by CVPR 2023.
- [Jul’2022] Three papers are accepted by ECCV 2022.
- [Mar’2022] Started my internship at Microsoft Research Asia (MSRA).
📝 Publications
A full publication list is available on [Google Scholar] [Semantic Scholar]
(*: equal contribution; †: corresponding authors.)

A Survey on Video Diffusion Models
Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
ACM Computing Surveys (CSUR, IF=23.8), 2024
[Paper][HomePage][Zhihu][机器之心][量子位]
A survey of 300+ recent papers on video generation and editing with diffusion models. The GitHub repository has 2000+ stars.

GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng, Xitong Yang, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper]

- AdaDiff: Adaptive Step Selection for Fast Diffusion
  Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang
  AAAI, 2025, [Paper]
- Advancing Dark Action Recognition via Modality Fusion and Dark-to-Light Diffusion Model
  Yuxuan Wang, Zhen Xing, Zuxuan Wu
  ICASSP, 2025, [Paper]
- Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
  Sicheng Xie, Haidong Cao, Zejia Weng, Zhen Xing, Shiwei Shen, Jiaqi Leng, Xipeng Qiu, Yanwei Fu, Zuxuan Wu, Yu-Gang Jiang
  arXiv, 2025, [Paper]
- Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
  Miaosen Zhang, Yixuan Wei, Zhen Xing, Yifei Ma, Zuxuan Wu, Ji Li, Zheng Zhang, Qi Dai, Chong Luo, Xin Geng, Baining Guo
  NeurIPS, 2024, [Paper], [HomePage]
- FDGaussian: Fast Gaussian Splatting via Geometric-aware Diffusion Model
  Qijun Feng, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
  arXiv, 2024, [Paper], [HomePage]
- TranSFormer: Slow-Fast Transformer for Machine Translation
  Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu
  ACL (Findings), 2023, [Paper]
- Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval
  Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu
  ACM MM, 2022, [Paper]
- 3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation
  Zhidan Liu, Zhen Xing, Xiangdong Zhou, Yijiang Chen, Guichun Zhou
  ICMR, 2022, [Paper]
- CaSS: A Channel-aware Self-supervised Representation Learning Framework for Multivariate Time Series Classification
  Yijiang Chen, Xiangdong Zhou, Zhen Xing, Zhidan Liu, Minyang Xu
  DASFAA, 2022, [Paper]
- From Coarse to Fine: Hierarchical Structure-aware Video Summarization
  Wenxu Li, Gang Pan, Chen Wang, Zhen Xing, Zhenjun Han
  TOMM, 2022, [Paper]
💬 Invited Talks
- 2024.04, Talk at ByteDance, SimDA: Simple Diffusion Adapter for Efficient Video Generation [slides]
- 2024.02, Tutorial at OpenMMLab, The Past and Present of Video Diffusion Models [slides]
- 2023.12, Tutorial at Kunlun Research, The Tutorial of Video Generative Models [slides]
💻 Internships
- 2022.03 - 2023.08, Microsoft Research Asia, Visual Computing Group.
🎓 Academic Service
- Conference Program Committee:
  - IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-2025
  - IEEE/CVF International Conference on Computer Vision (ICCV) 2023-2025
  - European Conference on Computer Vision (ECCV) 2022-2024
  - International Conference on Learning Representations (ICLR) 2024-2025
  - ACM SIGGRAPH Conference (SIGGRAPH) 2025
  - International Conference on Machine Learning (ICML) 2024-2025
  - Conference on Neural Information Processing Systems (NeurIPS) 2024-2025
  - AAAI Conference on Artificial Intelligence (AAAI) 2023-2025
- Journal Reviewer:
  - IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  - International Journal of Computer Vision (IJCV)
  - IEEE Transactions on Multimedia (TMM)
  - IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  - Knowledge-Based Systems (KBS)
  - Pattern Recognition (PR)
🎖 Honors and Awards
Below, I list some of the honors and awards that have inspired me.
- Outstanding Graduate of Shanghai. (Top-1%, PhD) [2025]
- Tencent academic scholarship. (Top-3%, PhD) [2024]
- Fudan University excellent academic scholarship. (Top-5%, PhD) [2023]
- “Star of Tomorrow” intern of Microsoft Research Asia. (Top-10%, PhD) [2023]
- Tencent academic scholarship. (Rank 1/130, PhD) [2022]
- Fudan University excellent academic scholarship. (Top-5%, Master) [2021]
- Outstanding Graduate of Tianjin University. (Top-5%) [2020]
- Excellent monitor of Tianjin University. (Top-10) [2018]
- Academic scholarship of Tianjin University. (Top-10%) [2017-2020]