I am currently a final-year Ph.D. student in the Department of Computer Science and Engineering (CSE) at the Chinese University of Hong Kong (CUHK), supervised by Prof. Jiaya Jia. Before that, I received my Bachelor's degree from Harbin Institute of Technology (HIT) in 2020.

My current research interests include multimodal LLMs, long-term reasoning for LLMs, and 3D point cloud / 2D image scene understanding, in particular:

1. Equipping multimodal LLMs with capabilities for downstream vision tasks;

2. Long-term reasoning for Large Language Models;

3. Repurposing Transformer for 3D point cloud processing.

Selected Publications [Google Scholar]

Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Xin Lai, Zhuotao Tian, Yukang Chen, Senqiao Yang, Xiangru Peng, Jiaya Jia
arXiv preprint

70.8% and 94.0% accuracy on MATH and GSM8K, respectively! Outperforms GPT-4-1106, Gemini-1.5-Pro, Claude-3-Opus!

LISA: Reasoning Segmentation via Large Language Model
Xin Lai*, Zhuotao Tian*, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia
CVPR, 2024 (Oral, 3.3% acceptance rate)

Over 1,500 GitHub Stars!

Mask-Attention-Free Transformer for 3D Instance Segmentation
Xin Lai, Yuhui Yuan, Ruihang Chu, Yukang Chen, Han Hu, Jiaya Jia
ICCV, 2023

Spherical Transformer for LiDAR-based 3D Recognition
Xin Lai, Yukang Chen, Fanbin Lu, Jianhui Liu, Jiaya Jia
CVPR, 2023

Stratified Transformer for 3D Point Cloud Segmentation
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia
CVPR, 2022

A pioneering fully Transformer-based 3D backbone network.

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation
Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
ECCV, 2022

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency
Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
CVPR, 2021

Open-source Projects

Sparse Transformer
Xin Lai, Fanbin Lu, Yukang Chen
Open-source Library

A fast, memory-efficient, and easy-to-use implementation of window-based 3D Transformers, optimized with low-level CUDA code.


Honors and Awards

Professional Services

  • Conference Services:
    International Conference on Learning Representations (ICLR’24)
    Winter Conference on Applications of Computer Vision (WACV’24)
    Conference and Workshop on Neural Information Processing Systems (NeurIPS’23)
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR’22,23)
    IEEE International Conference on Computer Vision (ICCV’21,23)
    European Conference on Computer Vision (ECCV’22)

  • Journal Reviews:
    IEEE Transactions on Image Processing (TIP)
    Pattern Recognition (PR)

  • Teaching:
    2021-2022 Fall, ENGG 1110: Problem Solving By Programming
    2020-2021 Spring, CSCI 3251: Engineering Practicum
    2020-2021 Fall, ENGG 1110: Problem Solving By Programming
