Weihao Ye (叶伟豪)

I am a second-year Master's student at Xiamen University. My supervisors are Prof. Rongrong Ji and Prof. Yiyi Zhou. I obtained an Honors Bachelor's degree in Computer Science from SZU in 2023, supervised by Prof. Xu Wang.

My research interest lies in efficiency optimization for multimodal understanding and generation models. I aim to develop techniques that significantly reduce the resource requirements of multimodal models, making them more suitable for real-world deployment. Looking ahead, I am also interested in exploring reinforcement learning to guide and refine model behavior in complex, interactive environments.

Email  /  Github  /  Google Scholar

Long-Term Research Goal

My long-term research goal is to develop systems from the bottom up, aligning low-level algorithmic efficiency with the practical needs of real-world applications.


Publications

See my Google Scholar for the most up-to-date list of publications.

[1] Fit and Prune: Fast and Training-Free Visual Token Pruning for Multi-modal Large Language Models

Weihao Ye, Qiong Wu, Wenhao Lin, Yiyi Zhou. AAAI 2025 (Citations: 34)

This paper proposes a training-free visual token pruning method based on attention distribution fitting, significantly boosting inference efficiency of MLLMs.

[2] Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings

Qiong Wu, Wenhao Lin, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji. NeurIPS 2025 (under review) (Citations: 4)

This work introduces a dynamic visual token exit mechanism to accelerate MLLMs, along with extensive empirical studies.

[3] Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models

Qiong Wu, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji. IJCV (under review) (Citations: 10)

This paper proposes an efficient fine-tuning method that skips redundant attention heads using a Propagation-Information Adapter (PIA), reducing computational cost.

[4] A Learning-based Framework for Multi-View Instance Segmentation in Panorama

Weihao Ye*, Ziyang Mai*, Qiudan Zhang, Xu Wang. DSAA 2022 (CCF-C) (Citations: 1)

This work introduces a multi-view joint framework for panoramic instance segmentation, improving segmentation accuracy.

Internship

March 2025 – Present

I am a research intern at ByteDance, where I contribute to the acceleration and optimization of multimodal models used in products such as Doubao and Dreamina.

Honors and Awards
  • [2024] Selected for the Tsinghua & Tencent SDG Project Grant Program; responsible for software development. [link]
  • [2023] Outstanding Graduate, SZU
  • [2023] Third Prize (GBA Region), Huawei Software Elite Challenge – Multi-Robot Collaboration, responsible for motion and collision algorithms.
  • [2022] Gold Award of Game for Peace Developer Contest, Tencent, team leader & game programmer. [link]
  • [2022] National Third Prize of Lanqiao Programming Competition.
  • [2022] Provincial Second Prize of National College Digital Art & Design Competition, team leader & main programmer.
  • [2020–2022] University-level honors during undergraduate studies, including Star of Academics and Star of Innovation & Entrepreneurship.