|
Cheng Shi (石骋)
Ph.D. Student
Department of Computer Science |
I am currently a first-year Ph.D. student in the Department of Computer Science at The University of Hong Kong, where I have the privilege of being supervised by Prof. Yizhou Yu (ACM/IEEE Fellow). Previously, I obtained my Master's degree from ShanghaiTech University in 2024, where I was advised by Prof. Sibei Yang, and my Bachelor's degree from ShanghaiTech University in 2022.
My research interests lie at the intersection of computer vision, natural language processing, and multimodal AI. My current research focuses on open-world visual perception and vision foundation models.
* denotes equal contribution; † denotes corresponding author.
|
Vision Transformer Needs More Than Register
In submission, 2025
|
Vision Function Layer in Multimodal LLMs
NeurIPS 2025
|
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
NeurIPS 2025
|
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
ECCV 2024
|
Plain-DNet: A Plain Multi-Dataset Object Detector
ECCV 2024
|
Zip-Your-CLIP: CLIP Itself is a Good Object-detector
ICLR 2024
|