Yujia Liang, Jile Jiao, Zhicheng Wang, Xuetao Feng, Zixuan Ye, Yuan Wang, Hao Lu: IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes. CoRR abs/2506.21116 (2025)