About Me

I’m a final-year PhD student at the Department of Computer Science & Engineering, Hong Kong University of Science and Technology, co-supervised by Prof. Heung-Yeung Shum and Prof. Lionel M. Ni. Previously, I obtained my bachelor’s degree in Computer Science and Technology from South China University of Science and Technology.

I am currently an intern at FAIR MPK (Facebook AI Research, Menlo Park). Previously, I interned at the International Digital Economy Academy (IDEA), Shenzhen (advised by Prof. Lei Zhang) and at Microsoft Research, Redmond (advised by Dr. Jianwei Yang and Dr. Chunyuan Li).

📌 My research focuses on fine-grained visual understanding and multi-modal learning. My previous work can be categorized into three main areas.

I anticipate graduating in 2025 and am open to both academic and industrial research positions in North America and Asia. If you are interested, please feel free to contact me.

✉️ Feel free to contact me for any discussion or collaboration!

🔥 News

📝 Selected Works

Refer to my Google Scholar for the full list.

  • LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models.
    Feng Li*, Renrui Zhang*, Hao Zhang*, Yuanhan Zhang, Bo Li, Wei Li, Zejun Ma, Chunyuan Li
    arXiv 2024.
    [Paper][Blog][Code]

  • Visual In-Context Prompting.
    Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe Xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao.
    CVPR 2024.
    [Paper][Code]

  • SoM: Set-of-Mark Visual Prompting for GPT-4V.
    Jianwei Yang*, Hao Zhang*, Feng Li*, Xueyan Zou*, Chunyuan Li, Jianfeng Gao.
    arXiv 2023.
    [Paper][Code]

  • Semantic-SAM: Segment and Recognize Anything at Any Granularity.
    Feng Li*, Hao Zhang*, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang, Jianfeng Gao.
    ECCV 2024.
    [Paper][Code]

  • OpenSeeD: A Simple Framework for Open-Vocabulary Segmentation and Detection.
    Hao Zhang*, Feng Li*, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang.
    ICCV 2023.
    [Paper][Code]

  • SEEM: Segment Everything Everywhere All at Once.
    Xueyan Zou*, Jianwei Yang*, Hao Zhang*, Feng Li*, Linjie Li, Jianfeng Gao, Yong Jae Lee.
    NeurIPS 2023.
    [Paper][Code]

  • Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
    Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang.
    ECCV 2024.
    [Paper][Code]

  • Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation.
    Feng Li*, Hao Zhang*, Huaizhe Xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum.
    CVPR 2023. Ranked 9th among CVPR 2023 Most Influential Papers.
    [Paper][Code]

  • DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection.
    Hao Zhang*, Feng Li*, Shilong Liu*, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum.
    ICLR 2023. Ranked 2nd among ICLR 2023 Most Influential Papers.
    [Paper][Code]

  • DN-DETR: Accelerate DETR Training by Introducing Query DeNoising.
    Feng Li*, Hao Zhang*, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang.
    CVPR 2022 | TPAMI 2023. Oral presentation.
    [Paper][Code]

(* denotes equal contribution or core contribution.)

🎖 Selected Awards

  • Hong Kong Postgraduate Scholarship, 2021
  • Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM), National First Prize, 2019.
