Cheng Shi

Cheng Shi (石骋)

PhD Student

Department of Computer Science
The University of Hong Kong

Email: shicheng2025 [at] connect [dot] hku [dot] hk

GitHub | Google Scholar

Biography

I am currently a first-year Ph.D. student in the Department of Computer Science at The University of Hong Kong, where I have the privilege of being supervised by Prof. Yizhou Yu (ACM/IEEE Fellow). Previously, I obtained my Master's degree from ShanghaiTech University in 2024, where I was advised by Prof. Sibei Yang, and my Bachelor's degree from the same university in 2022.

My research interests lie at the intersection of computer vision, natural language processing, and multimodal AI. My current research focuses on open-world visual perception and vision foundation models.

News

Sep 2025: 3 papers accepted by NeurIPS 2025.
Jul 2025: 1 paper accepted by ICCV 2025.
Feb 2025: 1 paper accepted by CVPR 2025.
Nov 2024: Awarded National Scholarship.
Jul 2024: 2 papers accepted by ECCV 2024.
Nov 2023: Awarded National Scholarship.

Selected Publications

* denotes equal contribution; † denotes corresponding author.

Vision Transformer Needs More Than Register
In submission, 2025
Cheng Shi and Sibei Yang†
 
Vision Function Layer in Multimodal LLMs
NeurIPS 2025
Cheng Shi, Yizhou Yu and Sibei Yang
 
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
NeurIPS 2025
Yulin Zhang, Cheng Shi, Yang Wang and Sibei Yang†
 
Discovering Compositional Hallucination in LVLMs
NeurIPS 2025
Ge Zheng, Jiajin Tang, Jiaye Qian, Hanzhuo Huang, Cheng Shi and Sibei Yang†
 
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
ECCV 2024
Cheng Shi*, Yulin Zhang*, Bin Yang, Jiajin Tang, Yuexin Ma and Sibei Yang†
 
Plain-DNet: A Plain Multi-Dataset Object Detector
ECCV 2024
Cheng Shi*, Yuchen Zhu* and Sibei Yang†
 
Zip-Your-CLIP: CLIP Itself is a Good Object-detector
ICLR 2024
Cheng Shi and Sibei Yang†
 
Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
NeurIPS 2023
Hanzhuo Huang*, Yufan Feng*, Cheng Shi, Lan Xu, Jingyi Yu, and Sibei Yang†
 
LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models
ICCV 2023
Cheng Shi and Sibei Yang†
 
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
ICCV 2023
Cheng Shi and Sibei Yang†
 
Contrastive Grouping with Transformer for Referring Image Segmentation
CVPR 2023
Jiajin Tang, Ge Zheng, Cheng Shi, and Sibei Yang†
 
DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
SIGGRAPH 2023
Longwen Zhang*, Qiwei Qiu*, Hongyang Lin*, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang†, Lan Xu†, Jingyi Yu†
 
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
ECCV 2022
Cheng Shi and Sibei Yang†