I am a final-year Ph.D. Candidate at the School of Computer Science, Fudan University, where I work in the Vision and Learning Lab (FVL) under the supervision of Prof. Yu-Gang Jiang (IEEE Fellow) and Prof. Zuxuan Wu. Before this, I received my B.S. degree from Tianjin University.

I have published 15+ papers at top international AI conferences such as CVPR, NeurIPS, and ECCV. My current research interests include:

  • 1️⃣ 🌟🌟🌟 Generative models: text-to-video generation, controllable visual generation, video editing
  • 2️⃣ Representation learning: video understanding, 3D understanding, image retrieval

I will join Alibaba Tongyi Lab as a Research Scientist, dedicating myself to research on the video-generation model Wan. If you are interested in discussing potential collaborations, please feel free to email me at zhenxingfd@gmail.com.

🔥 News

  • [May’2025] Defended my Ph.D. thesis and was named an Outstanding Graduate of Shanghai!
  • [May’2025] Achieved 550+ citations on Google Scholar and an h-index of 12.
  • [Apr’2025] Released MagicMotion and achieved 100+ GitHub stars.
  • [Feb’2025] StableAnimator accepted to CVPR 2025 and achieved 1200+ GitHub stars.
  • [Sep’2024] Two papers accepted to NeurIPS 2024.
  • [Aug’2024] “A Survey on Video Diffusion Models” accepted to ACM Computing Surveys.
  • [Feb’2024] Invited talk at OpenMMLab on video generation models, [slides].
  • [Feb’2024] SimDA accepted to CVPR 2024.
  • [Dec’2023] Served as a reviewer for ICML 2024.
  • [Dec’2023] Invited talk at Kunlun Research, “A Survey on Video Diffusion Models”.
  • [Aug’2023] Awarded the “Star of Tomorrow” certificate at MSRA.
  • [May’2023] One paper accepted to ACL 2023.
  • [Feb’2023] Two papers accepted to CVPR 2023.
  • [Jul’2022] Three papers accepted to ECCV 2022.

📝 Publications

A full publication list is available on [Google Scholar] and [Semantic Scholar].

(*: equal contribution; †: project leader.)

Video Generation

A Survey on Video Diffusion Models
Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
ACM Computing Surveys (CSUR, IF=23.8), 2024
[Paper][HomePage][Zhihu][机器之心][量子位]
Surveys 300+ recent papers on video generation and editing with diffusion models. Achieved 2000+ GitHub stars.

Video Generation

SimDA: A Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing, Qi Dai, Han Hu, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper][HomePage]
The first parameter-efficient text-to-video generation model.

Video Generation

StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu, Zhen Xing†, Xintong Han, Zhi-Qi Cheng, Qi Dai, Chong Luo, Zuxuan Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper][Code][Homepage][机器之心]
Achieved 1300+ GitHub stars.

Video Generation

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li*, Zhen Xing*†, Rui Wang, Hui Zhang, Zuxuan Wu
Technical Report, 2025
[Paper][Code][HomePage][量子位]

Video Generation

GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng, Xitong Yang, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper]

Video Generation

AID: Adapting Image2Video Diffusion Models for Instruction-based Video Prediction
Zhen Xing, Qi Dai, Zejia Weng, Zuxuan Wu, Yu-Gang Jiang
Technical Report, 2024
[Paper][HomePage]

Video Editing

VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing, Qi Dai, Zihao Zhang, Hui Zhang, Han Hu, Zuxuan Wu, Yu-Gang Jiang
Technical Report, 2024
[Paper][HomePage][Zhihu]

Video Understanding

SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing, Qi Dai, Han Hu, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]

3D Understanding

PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling, Zhen Xing†, Manliang Cao, Xiangdong Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]

3D Generation

Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]

3D Generation

Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network
Zhen Xing, Yijiang Chen, Zhixin Ling, Xiangdong Zhou, Yu Xiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]

Image Retrieval

Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Zhixin Ling, Zhen Xing†, Jian Zhou, Xiangdong Zhou
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]

  • AdaDiff: Adaptive Step Selection for Fast Diffusion
    Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang
    AAAI, 2025, [Paper]

  • Advancing Dark Action Recognition via Modality Fusion and Dark-to-Light Diffusion Model
    Yuxuan Wang, Zhen Xing, Zuxuan Wu
    ICASSP, 2025, [Paper]

  • Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
    Sicheng Xie, Haidong Cao, Zejia Weng, Zhen Xing, Shiwei Shen, Jiaqi Leng, Xipeng Qiu, Yanwei Fu, Zuxuan Wu, Yu-Gang Jiang
    arXiv, 2025, [Paper]

  • Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
    Miaosen Zhang, Yixuan Wei, Zhen Xing, Yifei Ma, Zuxuan Wu, Ji Li, Zheng Zhang, Qi Dai, Chong Luo, Xin Geng, Baining Guo
    NeurIPS, 2024, [Paper], [HomePage]

  • FDGaussian: Fast Gaussian Splatting via Geometric-aware Diffusion Model
    Qijun Feng, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
    arXiv, 2024, [Paper], [HomePage]

  • TranSFormer: Slow-Fast Transformer for Machine Translation
    Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu
    ACL (Findings), 2023, [Paper]

  • Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval
    Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu
    ACM MM, 2022, [Paper]

  • 3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation
    Zhidan Liu, Zhen Xing, Xiangdong Zhou, Yijiang Chen, Guichun Zhou
    ICMR, 2022, [Paper]

  • CaSS: A Channel-aware Self-supervised Representation Learning Framework for Multivariate Time Series Classification
    Yijiang Chen, Xiangdong Zhou, Zhen Xing, Zhidan Liu, Minyang Xu
    DASFAA, 2022, [Paper]

  • From Coarse to Fine: Hierarchical Structure-aware Video Summarization
    Wenxu Li, Gang Pan, Chen Wang, Zhen Xing, Zhenjun Han
    TOMM, 2022, [Paper]

🎖 Honors and Awards

Below are some of the honors and awards that have inspired me.

  • Outstanding Graduate of Shanghai. (Top-1%, PhD) [2025]
  • Alibaba Star Program and Tencent Qingyun Plan. [2025]
  • Tencent academic scholarship. (Top-3%, PhD) [2024]
  • Fudan University excellent academic scholarship. (Top-5%, PhD) [2023]
  • “Star of Tomorrow” intern at Microsoft Research Asia. (Top-10%, PhD) [2023]
  • Tencent academic scholarship. (Rank 1/130, PhD) [2022]
  • Fudan University excellent academic scholarship. (Top-5%, Master) [2021]
  • Outstanding Graduate of Tianjin University. (Top-5%) [2020]
  • Excellent monitor of Tianjin University. (Top-10) [2018]
  • Academic scholarship of Tianjin University. (Top-10%) [2017-2020]

💬 Invited Talks

  • 2024.04, Talk at ByteDance, SimDA: Simple Diffusion Adapter for Efficient Video Generation [slides]
  • 2024.02, Tutorial at OpenMMLab, The Past and Present of Video Diffusion Models [slides]
  • 2023.12, Tutorial at Kunlun Research, The Tutorial of Video Generative Models [slides]

💻 Internships

  • 2022.03 - 2023.08, Microsoft Research Asia, Visual Computing Group.
    • Research topics: video diffusion models, video understanding
    • Advisors: Qi Dai and Han Hu

🎓 Academic Service

  • Conference Program Committee:
    • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-2025
    • IEEE/CVF International Conference on Computer Vision (ICCV) 2023-2025
    • European Conference on Computer Vision (ECCV) 2022-2024
    • International Conference on Learning Representations (ICLR) 2024-2025
    • ACM SIGGRAPH Conference (SIGGRAPH) 2025
    • International Conference on Machine Learning (ICML) 2024-2025
    • Conference on Neural Information Processing Systems (NeurIPS) 2024-2025
    • AAAI Conference on Artificial Intelligence (AAAI) 2023-2025
  • Journal Reviewer:
    • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    • International Journal of Computer Vision (IJCV)
    • IEEE Transactions on Multimedia (TMM)
    • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
    • Knowledge-Based Systems (KBS)
    • Pattern Recognition (PR)