I am a Ph.D. candidate at the School of Computer Science, Fudan University, where I work in the Vision and Learning Lab (FVL) under the supervision of Prof. Yu-Gang Jiang (IEEE Fellow) and Prof. Zuxuan Wu. Before this, I received my B.S. degree from Tianjin University.

I have published 15+ papers at top international AI conferences such as CVPR, NeurIPS, and ECCV. My current research interests include:

  • 1️⃣ 🌟🌟🌟 Generative models: text-to-video generation, controllable visual generation, video editing
  • 2️⃣ Representation learning: video understanding, 3D understanding, image retrieval

I am set to graduate in 2025 and am actively seeking job opportunities in both industry and academia. If you are interested in discussing potential collaborations or positions, please feel free to email me at zhenxingfd@gmail.com.

🔥 News

  • [Apr’2025] Reached 500+ citations on Google Scholar and an h-index of 11.
  • [Apr’2025] Released MagicMotion, which has reached 100+ GitHub stars.
  • [Feb’2025] StableAnimator was accepted to CVPR 2025 and has reached 1200+ GitHub stars.
  • [Sep’2024] Two papers were accepted to NeurIPS 2024.
  • [Aug’2024] “A Survey on Video Diffusion Models” was accepted to ACM Computing Surveys.
  • [Feb’2024] Gave an invited talk at OpenMMLab about Video Generation Models, [slides].
  • [Feb’2024] SimDA was accepted to CVPR 2024.
  • [Dec’2023] Serving as a reviewer for ICML 2024.
  • [Dec’2023] Gave an invited talk at Kunlun Research, “A Survey on Video Diffusion Models”.
  • [Aug’2023] Awarded the “Star of Tomorrow” certificate at MSRA.
  • [May’2023] One paper was accepted to ACL 2023.
  • [Feb’2023] Two papers were accepted to CVPR 2023.
  • [Jul’2022] Three papers were accepted to ECCV 2022.
  • [Mar’2022] Started my internship at Microsoft Research Asia (MSRA).

📝 Publications

A full publication list is available on [Google Scholar] [Semantic Scholar]

(*: equal contribution; †: corresponding authors.)

Video Generation
A Survey on Video Diffusion Models
Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
ACM Computing Surveys (CSUR, IF=23.8), 2024
[Paper][HomePage][Zhihu][机器之心][量子位]
Surveys 300+ recent papers on video generation and editing with diffusion models. 2000+ GitHub stars.
Video Generation
SimDA: A Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing, Qi Dai, Han Hu, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper][HomePage]
Video Generation
StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu, Zhen Xing, Xintong Han, Zhi-Qi Cheng, Qi Dai, Chong Luo, Zuxuan Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper][Code][Homepage][机器之心]
1200+ GitHub stars.
Video Generation
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li*, Zhen Xing*, Rui Wang, Hui Zhang, Zuxuan Wu
Technical Report, 2025
[Paper][Code][HomePage][量子位]
Video Generation
GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng, Xitong Yang, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper]
Video Generation
AID: Adapting Image2Video Diffusion Models for Instruction-based Video Prediction
Zhen Xing, Qi Dai, Zejia Weng, Zuxuan Wu, Yu-Gang Jiang
Technical Report, 2024
[Paper][HomePage]
Video Editing
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing, Qi Dai, Zihao Zhang, Hui Zhang, Han Hu, Zuxuan Wu, Yu-Gang Jiang
Technical Report, 2024
[Paper][HomePage][Zhihu]
Video Recognition
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing, Qi Dai, Han Hu, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]
3D Understanding
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling, Zhen Xing, Manliang Cao, Xiangdong Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]
3D Generation
Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]
3D Generation
Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network
Zhen Xing, Yijiang Chen, Zhixin Ling, Xiangdong Zhou, Yu Xiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]
Image Retrieval
Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Zhixin Ling, Zhen Xing, Jian Zhou, Xiangdong Zhou
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]
  • AdaDiff: Adaptive Step Selection for Fast Diffusion
    Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang
    AAAI, 2025, [Paper]

  • Advancing Dark Action Recognition via Modality Fusion and Dark-to-Light Diffusion Model
    Yuxuan Wang, Zhen Xing, Zuxuan Wu
    ICASSP, 2025, [Paper]

  • Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
    Sicheng Xie, Haidong Cao, Zejia Weng, Zhen Xing, Shiwei Shen, Jiaqi Leng, Xipeng Qiu, Yanwei Fu, Zuxuan Wu, Yu-Gang Jiang
    arXiv, 2025, [Paper]

  • Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
    Miaosen Zhang, Yixuan Wei, Zhen Xing, Yifei Ma, Zuxuan Wu, Ji Li, Zheng Zhang, Qi Dai, Chong Luo, Xin Geng, Baining Guo
    NeurIPS, 2024, [Paper], [HomePage]

  • FDGaussian: Fast Gaussian Splatting via Geometric-aware Diffusion Model
    Qijun Feng, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
    arXiv, 2024, [Paper], [HomePage]

  • TranSFormer: Slow-Fast Transformer for Machine Translation
    Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu
    ACL (Findings), 2023, [Paper]

  • Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval
    Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu
    ACM MM, 2022, [Paper]

  • 3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation
    Zhidan Liu, Zhen Xing, Xiangdong Zhou, Yijiang Chen, Guichun Zhou
    ICMR, 2022, [Paper]

  • CaSS: A Channel-aware Self-supervised Representation Learning Framework for Multivariate Time Series Classification
    Yijiang Chen, Xiangdong Zhou, Zhen Xing, Zhidan Liu, Minyang Xu
    DASFAA, 2022, [Paper]

  • From Coarse to Fine: Hierarchical Structure-aware Video Summarization
    Wenxu Li, Gang Pan, Chen Wang, Zhen Xing, Zhenjun Han
    TOMM, 2022, [Paper]

💬 Invited Talks

  • 2024.04, Talk at ByteDance, SimDA: Simple Diffusion Adapter for Efficient Video Generation [slides]
  • 2024.02, Tutorial at OpenMMLab, The Past and Present of Video Diffusion Models [slides]
  • 2023.12, Tutorial at Kunlun Research, The Tutorial of Video Generative Models [slides]

💻 Internships

  • 2022.03 - 2023.08, Microsoft Research Asia, Visual Computing Group.
    • Research topics: video diffusion models, video understanding
    • Advisors: Qi Dai and Han Hu

🎓 Academic Service

  • Reviewing
    • Conferences:
      • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-2025
      • IEEE/CVF International Conference on Computer Vision (ICCV) 2023-2025
      • European Conference on Computer Vision (ECCV) 2022-2024
      • International Conference on Learning Representations (ICLR) 2024-2025
      • ACM SIGGRAPH Conference (SIGGRAPH) 2025
      • International Conference on Machine Learning (ICML) 2024-2025
      • Conference on Neural Information Processing Systems (NeurIPS) 2024-2025
      • AAAI Conference on Artificial Intelligence (AAAI) 2023-2025
    • Journals:
      • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
      • International Journal of Computer Vision (IJCV)
      • IEEE Transactions on Multimedia (TMM)
      • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
      • Knowledge-Based Systems (KBS)
      • Pattern Recognition (PR)

🎖 Honors and Awards

Below are some of the honors and awards that have inspired me a lot.

  • Outstanding Graduate of Shanghai. (Top-1%, PhD) [2025]
  • Tencent academic scholarship. (Top-3%, PhD) [2024]
  • Fudan University excellent academic scholarship. (Top-5%, PhD) [2023]
  • “Star of Tomorrow” intern of Microsoft Research Asia. (Top-10%, PhD) [2023]
  • Tencent academic scholarship. (Rank 1/130, PhD) [2022]
  • Fudan University excellent academic scholarship. (Top-5%, Master) [2021]
  • Outstanding Graduate of Tianjin University. (Top-5%) [2020]
  • Excellent monitor of Tianjin University. (Top-10) [2018]
  • Academic scholarship of Tianjin University. (Top-10%) [2017-2020]