Biography

I am a Ph.D. candidate at the School of Computer Science, Fudan University, where I work in the Vision and Learning Lab (FVL) under the supervision of Prof. Yu-Gang Jiang (IEEE Fellow) and Prof. Zuxuan Wu. Before this, I received my B.S. degree from Tianjin University.

My research interests lie broadly in computer vision and deep learning. I mainly focus on video generation, editing, and recognition. I am also open to exploring other vision tasks, e.g., AIGC and 3D understanding. See my CV for details.

I am set to graduate in 2025 and am actively exploring job opportunities in both industry and academia. If you are interested in discussing potential collaborations or positions, please feel free to email me at zhenxingfd@gmail.com.

News

  • [Oct’2024] Reached 300+ citations on Google Scholar and an h-index of 10.
  • [Sep’2024] Two papers are accepted by NeurIPS 2024.
  • [Aug’2024] “A Survey on Video Diffusion Models” is accepted by ACM Computing Surveys.
  • [Feb’2024] Invited talk at OpenMMLab on video generation models, [slides].
  • [Feb’2024] SimDA is accepted by CVPR 2024.
  • [Dec’2023] Serving as a reviewer for ICML 2024.
  • [Dec’2023] Invited talk at Kunlun Research, “A Survey on Video Diffusion Models”.
  • [Aug’2023] Awarded the “Star of Tomorrow” certificate at MSRA.
  • [May’2023] One paper is accepted by ACL 2023.
  • [Feb’2023] Two papers are accepted by CVPR 2023.
  • [Jul’2022] Three papers are accepted by ECCV 2022.
  • [Mar’2022] Started my internship at Microsoft Research Asia (MSRA).

Selected Publications

Video Generation
A Survey on Video Diffusion Models
Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
ACM Computing Surveys (CSUR, IF=23.8), 2024
[Paper][HomePage][Zhihu]
Surveys 100+ recent papers on video generation and editing with diffusion models.
Video Generation
SimDA: A Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing, Qi Dai, Han Hu, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper][HomePage]
Video Generation
GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng, Xitong Yang, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
[Paper]
Video Generation
AID: Adapting Image2Video Diffusion Models for Instruction-based Video Prediction
Zhen Xing, Qi Dai, Zejia Weng, Zuxuan Wu, Yu-Gang Jiang
Technical Report, 2024
[Paper][HomePage]
Video Editing
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing, Qi Dai, Zihao Zhang, Hui Zhang, Han Hu, Zuxuan Wu, Yu-Gang Jiang
IJCV (Under Review)
[Paper][HomePage][Zhihu]
Video Recognition
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing, Qi Dai, Han Hu, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]
Panorama Understanding
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling, Zhen Xing, Manliang Cao, Xiangdong Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper][Code]
3D Generation
Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]
3D Generation
Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network
Zhen Xing, Yijiang Chen, Zhixin Ling, Xiangdong Zhou, Yu Xiang
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]
Image Retrieval
Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Zhixin Ling, Zhen Xing, Jian Zhou, Xiangdong Zhou
European Conference on Computer Vision (ECCV), 2022
[Paper][Code]