


EMNLP (Findings) 2025: Suzhou, China
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng:

Findings of the Association for Computational Linguistics: EMNLP 2025, Suzhou, China, November 4-9, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-335-7 - Yevhen Kostiuk, Clara Seyfried, Chris Reed:

Automating Alternative Generation in Decision-Making. 1-15 - Takuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee:

Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification. 16-25 - Chenming Tang, Zhixiang Wang, Hao Sun, Yunfang Wu:

Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions. 26-48 - Yuanchi Ma, Jiamou Liu, Hui He, Libo Zhang, Haoyuan Li, Zhendong Niu:

Boundary Matters: Leveraging Structured Text Plots for Long Text Outline Generation. 49-63 - Pier Felice Balestrucci, Ondrej Dusek, Luca Anselma, Alessandro Mazzei:

Can Large Language Models Personalize Dialogues to Generational Styles? 64-77 - Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Yang Liu, Hang Li:

Toward Optimal LLM Alignments Using Two-Player Games. 78-99 - Mengting Gui, Shufeng Hao, Chongyang Shi, Qi Zhang:

Structural Patent Classification Using Label Hierarchy Optimization. 100-114 - Md Mahbubur Rahman, Shufeng Hao, Chongyang Shi, An Lao, Jinyan Liu:

Exploring Hyperbolic Hierarchical Structure for Multimodal Rumor Detection. 115-134 - Tue Le, Hoang Tran Vuong, Tung Nguyen, Linh Ngo Van, Dinh Viet Sang, Trung Le, Thien Huu Nguyen:

Multi-Surrogate-Objective Optimization for Neural Topic Models. 135-151 - Seonghyeon Lee, Heejae Chon, Joonwon Jang, Dongha Lee, Hwanjo Yu:

How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated Code. 152-167 - Rui Lv, Qi Liu, Weibo Gao, Jiatong Li, Kai Zhang, Shiwei Tong:

ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive Learning. 168-181 - Junhao Chen, Jingbo Sun, Xiang Li, Haidong Xin, Yuhao Xue, Yibin Xu, Hao Zhao:

LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts. 182-194 - Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao:

Versatile Framework for Song Generation with Prompt-based Control. 195-219 - Jiayi Shi, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Huan Ren, Yao Hu, Kan Li:

InsBank: Evolving Instruction Subset for Ongoing Alignment. 220-238 - Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du:

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use. 239-258 - Xinyi Wang, Yiping Song, Chang Liu, Tingjin Luo, Bo Liu, Zheng Xie, Minlie Huang:

DCMKC: A Dual Consistency Matching Approach for Multi-hop Question Answering in LLMs. 259-273 - Daixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Xin Zhao, Zhongzhi Luan, Bo Dai, Zhenliang Zhang:

On Domain-Adaptive Post-Training for Multimodal Large Language Models. 274-296 - Jing Ye, Rui Wang, Yuchuan Wu, Victor Ma, Feiteng Fang, Fei Huang, Yongbin Li:

CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization. 297-323 - Hao Yi, Qingyang Li, Yulan Hu, Fuzheng Zhang, Di Zhang, Yong Liu:

SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin. 324-337 - Zhangyue Yin, Yuhong Sun, Xuanjing Huang, Xipeng Qiu, Hui Zhao:

Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework. 338-365 - Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain:

sudoLLM: On Multi-role Alignment of Language Models. 366-384 - Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che:

DAC: Decomposed Automation Correction for Text-to-SQL. 385-402 - Jie Yang, Jiajun Chen, Zhangyue Yin, Shuo Chen, Yuxin Wang, Yiran Guo, Yuan Li, Yining Zheng, Xuanjing Huang, Xipeng Qiu:

VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction. 403-442 - Zhiyuan Fan, Longfei Yun, Ming Yan, Yumeng Wang, Dadi Guo, Brian Mak, James T. Kwok, Yi R. Fung:

End-to-End Optimization for Multimodal Retrieval-Augmented Generation via Reward Backpropagation. 443-466 - Cheng-Han Chiang, Xiaofei Wang, Chung-Ching Lin, Kevin Lin, Linjie Li, Radu Kopetz, Yao Qian, Zhendong Wang, Zhengyuan Yang, Hung-yi Lee, Lijuan Wang:

Audio-Aware Large Language Models as Judges for Speaking Styles. 467-480 - Xinhao Wang, Xinyu Ma, Shengyong Ding, Derek F. Wong:

Evaluation of Text-to-Image Generation from a Creativity Perspective. 481-493 - Xiang Liu, Penglei Sun, Shuyan Chen, Longhan Zhang, Peijie Dong, Huajie You, Yongqi Zhang, Chang Yan, Xiaowen Chu, Tong-yi Zhang:

Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research. 494-518 - Yi Pan, Yujia Zhang, Michael Kampffmeyer, Xiaoguang Zhao:

ProPy: Building Interactive Prompt Pyramids upon CLIP for Partially Relevant Video Retrieval. 519-533 - Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Fedor Splitt, Jiaao Li, Yoana Tsoneva, Sebastian Möller, Vera Schmitt:

Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI Systems. 534-555 - Yunyue Su, Zhang Jinshuai, Bowen Fang, Wen Ye, Jinghao Zhang, Bowen Song, Weiqiang Wang, Qiang Liu, Liang Wang:

Toolscaler: Scalable Generative Tool Calling via Structure-Aware Semantic Tokenization. 556-578 - Jie Sun, Tianyu Zhang, Houcheng Jiang, Kexin Huang, Xiang Shu, Zhibo Zhu, Lintao Ma, Xingyu Lu, Jun Zhou, Junkang Wu, Chi Luo, An Zhang, Jiancan Wu, Xiang Wang:

LaMP-Val: Large Language Models Empower Personalized Valuation in Auction. 579-595 - Yedi Hu, Yunzhi Yao, Ningyu Zhang, Huajun Chen, Shumin Deng:

Exploring Model Kinship for Merging Large Language Models. 596-625 - Xuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che:

MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering. 626-647 - Yupeng Chang, Chenlu Guo, Yi Chang, Yuan Wu:

LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization. 648-659 - Jinda Liu, Yi Chang, Yuan Wu:

R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-task Learning. 660-674 - Jinbo Su, Lingzhe Gao, Wei Li, Shihao Liu, Haojie Lei, Xinyi Wang, Yuanzhao Guo, Ke Wang, Daiting Shi, Dawei Yin:

RACQC: Advanced Retrieval-Augmented Generation for Chinese Query Correction. 675-689 - Ercong Nie, Helmut Schmid, Hinrich Schütze:

Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models. 690-706 - Weiyi Wu, Xinwen Xu, Chongyang Gao, Xingjian Diao, Siting Li, Lucas A. Salas, Jiang Gui:

Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models. 707-730 - Anyi Wang, Dong Shu, Yifan Wang, Yunpu Ma, Mengnan Du:

Improving LLM Reasoning through Interpretable Role-Playing Steering. 731-751 - Chenlong Bao, Shijie Li, Minghao Hu, Ming Qiao, Bin Zhang, Jin-Tao Tang, Shasha Li, Ting Wang:

R2A-TLS: Reflective Retrieval-Augmented Timeline Summarization with Causal-Semantic Integration. 752-766 - Minghao Liu, Zhitao He, Zhiyuan Fan, Qingyun Wang, Yi R. Fung:

MedEBench: Diagnosing Reliability in Text-Guided Medical Image Editing. 767-791 - Zahraa Al Sahili, Ioannis Patras, Matthew Purver:

FairCoT: Enhancing Fairness in Text-to-Image Generation via Chain of Thought Reasoning with Multimodal Large Language Models. 792-816 - Mufan Qiu, Zheyu Shen, Pingzhi Li, Ang Li, Tianlong Chen:

Bag of Tricks for Sparse Mixture-of-Experts: A Benchmark Across Reasoning, Efficiency, and Safety. 817-835 - Jinzhe Li, Gengxu Li, Yi Chang, Yuan Wu:

Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models. 836-869 - Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li:

Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning. 870-888 - Amrit Poudel, Maria Milkowski, Tim Weninger:

The Power of Framing: How News Headlines Guide Search Behavior. 889-900 - Tsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung:

DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models. 901-915 - Xin Zhang, Qiyu Wei, Yingjie Zhu, Fanyi Wu, Sophia Ananiadou:

THCM-CAL: Temporal-Hierarchical Causal Modelling with Conformal Calibration for Clinical Risk Prediction. 916-928 - Wen Ye, Zhaocheng Liu, Yuwei Gui, Tingyu Yuan, Yunyue Su, Bowen Fang, Chaoyang Zhao, Qiang Liu, Liang Wang:

GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation. 929-958 - Haibo Wang, Zhiyang Xu, Yu Cheng, Shizhe Diao, Yufan Zhou, Yixin Cao, Qifan Wang, Weifeng Ge, Lifu Huang:

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models. 959-975 - Xiaojun Bi, Shuo Li, Junyao Xing, Ziyue Wang, Fuwen Luo, Weizheng Qiao, Lu Han, Ziwei Sun, Peng Li, Yang Liu:

DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms. 976-990 - Zezhou Wang, Yaxin Du, Xingjun Ma, Yu-Gang Jiang, Zhuzhong Qian, Siheng Chen:

Optimizing Cross-Client Domain Coverage for Federated Instruction Tuning of Large Language Models. 991-1011 - Shichen Li, Jiawei Zhang, Zhongqing Wang, Peifeng Li:

Aligning Black-Box LLMs for Aspect Sentiment Quad Prediction. 1012-1025 - Yusheng Zhao, Xiao Luo, Junyu Luo, Weizhi Zhang, Zhiping Xiao, Wei Ju, Philip S. Yu, Ming Zhang:

Multifaceted Evaluation of Audio-Visual Capability for MLLMs: Effectiveness, Efficiency, Generalizability and Robustness. 1026-1041 - Veronika Ganeeva, Kuzma Khrabrov, Artur Kadurin, Elena Tutubalina:

Two Steps from Hell: Compositionality on Chemical LMs. 1042-1049 - Min Zeng, Jingfei Sun, Xueyou Luo, Shiqi Zhang, Li Xie, Caiquan Liu, Xiaoxin Chen:

GTA: Supervised-Guided Reinforcement Learning for Text Classification with Large Language Models. 1050-1060 - Zhaohui Yang, Yuxiao Ye, Shilei Jiang, Shihong Deng, Chen Hu, Linjing Li, Daxin Jiang:

Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning. 1061-1075 - Yuhang Pei, Tao Ren, Yifan Wang, Zhipeng Sun, Wei Ju, Chong Chen, XianSheng Hua, Xiao Luo:

LEAF: Large Language Diffusion Model for Time Series Forecasting. 1076-1091 - Yuhao Zhang, Shaoming Duan, Jinhang Su, Chuanyi Liu, Peiyi Han:

SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning. 1092-1110 - Yifei Song, William Soto Martinez, Anna Nikiforovskaya, Evan Parker Kelly Chapple, Claire Gardent:

Multilingual Verbalisation of Knowledge Graphs. 1111-1162 - Leqi Zheng, Chaokun Wang, Canzhi Chen, Jiajun Zhang, Cheng Wu, Zixin Song, Shannan Yan, Ziyang Liu, Hongwei Li:

LAGCL4Rec: When LLMs Activate Interactions Potential in Graph Contrastive Learning for Recommendation. 1163-1184 - Zekai Zhang, Yiduo Guo, Jiuheng Lin, Shanghaoran Quan, Huishuai Zhang, Dongyan Zhao:

English as Defense Proxy: Mitigating Multilingual Jailbreak via Eliciting English Safety Knowledge. 1185-1196 - Xurui Song, Zhixin Xie, Shuo Huai, Jiayi Kong, Jun Luo:

Dagger Behind Smile: Fool LLMs with a Happy Ending Story. 1197-1229 - Shuo Li, Jiajun Sun, Guodong Zheng, Xiaoran Fan, Yujiong Shen, Yi Lu, Zhiheng Xi, Yuming Yang, Wenming Tan, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang:

Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations. 1230-1247 - Yulong Wu, Viktor Schlegel, Riza Batista-Navarro:

Natural Context Drift Undermines the Natural Language Understanding of Large Language Models. 1248-1259 - Patryk Marszalek, Klaudia Balazy, Jacek Tabor, Tomasz Kusmierczyk:

Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA. 1260-1271 - Jiahao Cheng, Tiancheng Su, Jia Yuan, Guoxiu He, Jiawei Liu, Xinqi Tao, Jingwen Xie, Huaxia Li:

Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation. 1272-1305 - Yahan Li, Tingyu Xia, Yuan Wu, Yi Chang:

Large Language Model Evaluation via Matrix Nuclear-Norm. 1306-1323 - Xiuchao Sui, Daiying Tian, Qi Sun, Ruirui Chen, Dongkyu Choi, Kenneth Kwok, Soujanya Poria:

From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems. 1324-1340 - Fanfan Wang, Xiangqing Shen, Jianfei Yu, Rui Xia:

Flexible Thinking for Multimodal Emotional Support Conversation via Reinforcement Learning. 1341-1356 - Rana Muhammad Shahroz, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen:

ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion. 1357-1370 - Chenlu Guo, Yi Chang, Yuan Wu:

NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models. 1371-1385 - Sampoorna Poria, Xiaolei Huang:

Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia - Current Stage and Challenges. 1386-1406 - Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang:

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data. 1407-1419 - Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung:

What Makes for Good Image Captions? 1420-1437 - Jinhao Pan, Chahat Raj, Ziyu Yao, Ziwei Zhu:

What's Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs. 1438-1459 - Rasul Dent, Pedro Ortiz Suarez, Thibault Clérice, Benoît Sagot:

Identifying Rare Languages in Common Crawl Data is a Needles-in-a-Haystack Problem. 1460-1473 - Tian Lan, Wenwei Zhang, Chengqi Lyu, Shuaibin Li, Chen Xu, Heyan Huang, Dahua Lin, Xian-Ling Mao, Kai Chen:

Training Language Models to Critique With Multi-agent Feedback. 1474-1501 - Soumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha, Dinesh Manocha:

RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples. 1502-1517 - Jihao Zhao, Chunlai Zhou, Daixuan Li, Shuaishuai Zu, Biao Qin:

Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering. 1518-1532 - Neha Srikanth, Victor S. Bursztyn, Puneet Mathur, Ani Nenkova:

SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps. 1533-1559 - Abhidip Bhattacharyya, Emma Markle, Shira Wein:

One More Modality: Does Abstract Meaning Representation Benefit Visual Question Answering? 1560-1572 - Mingchen Li, Heng Fan, Song Fu, Junhua Ding, Yunhe Feng:

DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting. 1573-1585 - Kepu Zhang, Guofu Xie, Weijie Yu, Mingyue Xu, Xu Tang, Yaxin Li, Jun Xu:

Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning. 1586-1598 - Cheng Qian, Hongyi Du, Hongru Wang, Xiusi Chen, Yuji Zhang, Avirup Sil, ChengXiang Zhai, Kathleen McKeown, Heng Ji:

ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges. 1599-1633 - Yuanchen Shi, Jiawang Hao, Fang Kong:

Beyond Coarse Labels: Fine-Grained Problem Augmentation and Multi-Dimensional Feedback for Emotional Support Conversation. 1634-1647 - Jiaxiang Chen, Mingxi Zou, Zhuo Wang, Qifan Wang, Danny Dongning Sun, Zhang Chi, Zenglin Xu:

FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making. 1648-1672 - Bohan Yu, Yekun Chai:

EvolKV: Evolutionary KV Cache Compression for LLM Inference. 1673-1689 - Dong Shu, Xuansheng Wu, Haiyan Zhao, Daking Rai, Ziyu Yao, Ninghao Liu, Mengnan Du:

A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models. 1690-1712 - Dong Shu, Haiyan Zhao, Jingyu Hu, Weiru Liu, Ali Payani, Lu Cheng, Mengnan Du:

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability. 1713-1735 - Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li:

Attention Consistency for LLMs Explanation. 1736-1750 - Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, Zhifei Zheng, Min Liu, Zhiyi Yin, Jianping Zhang:

Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs. 1751-1767 - Weihua Zheng, Roy Ka-Wei Lee, Zhengyuan Liu, Wu Kui, AiTi Aw, Bowei Zou:

CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation. 1768-1788 - Jinu Lee, Julia Hockenmaier:

Evaluating Step-by-step Reasoning Traces: A Survey. 1789-1814 - Kepu Zhang, Haoyue Yang, Xu Tang, Weijie Yu, Jun Xu:

Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning. 1815-1826 - Yixin Wan, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Rahul Gupta:

Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models. 1827-1835 - Kai Yin, Xiangjue Dong, Chengkai Liu, Lipai Huang, Yiming Xiao, Zhewei Liu, Ali Mostafavi, James Caverlee:

DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management. 1836-1867 - Yiming Liu, Yuhui Zhang, Dhruba Ghosh, Ludwig Schmidt, Serena Yeung-Levy:

Data or Language Supervision: What Makes CLIP Better than DINO? 1868-1874 - Chenye Zou, Xingyue Wen, Tianyi Hu, Qian Janice Wang, Daniel Hershcovich:

Do LLMs Understand Wine Descriptors Across Cultures? A Benchmark for Cultural Adaptations of Wine Reviews. 1875-1894 - Sona Elza Simon, Preethi Jyothi:

DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer. 1895-1909 - Jianjian Liu, Ying Li, Zhengtao Yu, Shun Su, Shengxiang Gao, Yuxin Huang:

Memory-enhanced Large Language Model for Cross-lingual Dependency Parsing via Deep Hierarchical Syntax Understanding. 1910-1923 - Jiyue Jiang, Alfred Kar Yin Truong, Yanyu Chen, Qinghang Bao, Sheng Wang, Pengan Chen, Jiuming Wang, Lingpeng Kong, Yu Li, Chuan Wu:

Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models. 1924-1944 - Haorui Yu, Ramon Ruiz-Dolz, Qiufeng Yi:

A Structured Framework for Evaluating and Enhancing Interpretive Capabilities of Multimodal LLMs in Culturally Situated Tasks. 1945-1971 - Weizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li:

Train a Unified Multimodal Data Quality Classifier with Synthetic Data. 1972-1986 - Shijian Deng, Kai Wang, Tianyu Yang, Harsh Singh, Yapeng Tian:

Self-Improvement in Multimodal Large Language Models: A Survey. 1987-2006 - Milan Bhan, Yann Choho, Jean-Noël Vittaut, Nicolas Chesneau, Pierre Moreau, Marie-Jeanne Lesot:

Towards Achieving Concept Completeness for Textual Concept Bottleneck Models. 2007-2024 - Daryna Dementieva, Nikolay Babakov, Alexander Fraser:

EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian. 2025-2048 - Yunyi Zhang, Ruozhen Yang, Siqi Jiao, SeongKu Kang, Jiawei Han:

Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking. 2049-2060 - Zeqiang Wang, Jon Johnson, Suparna De:

DLIR: Spherical Adaptation for Cross-Lingual Knowledge Transfer of Sociological Concepts Alignment. 2061-2075 - Qihang Zhang, Muchen Li, Ziao Wang, Renjie Liao, Lele Wang:

Test-Time Steering for Lossless Text Compression via Weighted Product of Experts. 2076-2088 - Philip Lippmann, Jie Yang:

Zero-Shot Contextual Embeddings via Offline Synthetic Corpus Generation. 2089-2104 - Linxin Song, Taiwei Shi, Jieyu Zhao:

The Hallucination Tax of Reinforcement Finetuning. 2105-2120 - Yihong Liu, Mingyang Wang, Amir Hossein Kargaran, Felicia Körner, Ercong Nie, Barbara Plank, François Yvon, Hinrich Schütze:

Tracing Multilingual Factual Knowledge Acquisition in Pretraining. 2121-2146 - Jun Zhuang, Hai Jin, Ye Zhang, Zhengjian Kang, Wenbin Zhang, Gaby G. Dagher, Haohan Wang:

Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation. 2147-2160 - Andrianos Michail, Simon Clematide, Rico Sennrich:

Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples. 2161-2170 - Ronald Seoh, Dan Goldwasser:

EmoGist: Efficient In-Context Learning for Visual Emotion Understanding. 2171-2182 - Haokun Chen, Sebastian Szyller, Weilin Xu, Nageen Himayat:

Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models. 2183-2192 - Yiming Zeng, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu:

Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications. 2193-2206 - Yaochen Zhu, Harald Steck, Dawen Liang, Yinhan He, Nathan Kallus, Jundong Li:

LLM-based Conversational Recommendation Agents with Collaborative Verbalized Experience. 2207-2220 - Hao Mark Chen, Wayne Luk, Yiu Ka Fai Cedric, Rui Li, Konstantin Mishchenko, Stylianos I. Venieris, Hongxiang Fan:

Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference. 2221-2238 - Jiseung Hong, Grace Byun, Seungone Kim, Kai Shu:

Measuring Sycophancy of Language Models in Multi-turn Dialogues. 2239-2259 - Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song:

On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions. 2260-2281 - Junda Wu, Yuxin Xiong, Xintong Li, Yu Xia, Ruoyu Wang, Yu Wang, Tong Yu, Sungchul Kim, Ryan A. Rossi, Lina Yao, Jingbo Shang, Julian J. McAuley:

Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent. 2282-2295 - Yating Huang, Ziyan Huang, Lintao Xiang, Qijun Yang, Hujun Yin:

PathoHR: Hierarchical Reasoning for Vision-Language Models in Pathology. 2296-2311 - Akshay Paruchuri, Maryam Aziz, Rohit Vartak, Ayman Ali, Best Uchehara, Xin Liu, Ishan Chatterjee, Monica Agrawal:

"What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets. 2312-2336 - Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang, William Yang Wang:

Dynamic Evaluation for Oversensitivity in LLMs. 2337-2344 - Jeonghun Cho, Deokhyung Kang, Hyounghun Kim, Gary Lee:

Self-Correcting Code Generation Using Small Language Models. 2345-2368 - Van-Thuy Phi, Yuji Matsumoto:

A Unified Framework for N-ary Property Information Extraction in Materials Science. 2369-2388 - Xin Tan, Bowei Zou, AiTi Aw:

A Benchmark for Translations Across Styles and Language Variants. 2389-2402 - Lisheng Huang, Yichen Liu, Jinhao Jiang, Rongxiang Zhang, Jiahao Yan, Junyi Li, Xin Zhao:

ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework. 2403-2417 - Shiki Sato, Jun Baba, Asahi Hentona, Shinji Iwata, Akifumi Yoshimoto, Koichiro Yoshino:

Proactive User Information Acquisition via Chats on User-Favored Topics. 2418-2443 - Zhichen Liu, Yongyuan Li, Yang Xu, Yu Wang, Yingfang Yuan, Zuhao Yang:

Evaluating Text Generation Quality Using Spectral Distances of Surprisal. 2444-2463 - Yuangang Li, Jiaqi Li, Zhuo Xiao, Tiankai Yang, Yi Nian, Xiyang Hu, Yue Zhao:

NLP-ADBench: NLP Anomaly Detection Benchmark. 2464-2474 - Prommy Sultana Hossain, Chahat Raj, Ziwei Zhu, Jessica Lin, Emanuela Marasco:

Toward Inclusive Language Models: Sparsity-Driven Calibration for Systematic and Interpretable Mitigation of Social Biases in LLMs. 2475-2508 - Xanh Ho, Sunisth Kumar, Yun-Ang Wu, Florian Boudin, Atsuhiro Takasu, Akiko Aizawa:

Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers. 2509-2517 - Chengyu Huang, Tanya Goyal:

DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization. 2518-2537 - Pengfei He, Zitao Li, Yue Xing, Yaliang Li, Jiliang Tang, Bolin Ding:

Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure Perspective. 2538-2566 - Dongning Rao, Songlin He, Zhihua Jiang, Ruishi Liang:

LLM-based Open Domain Planning by Leveraging Entity-Attribute-Level Domain Models. 2567-2588 - Lin Mu, Jun Shen, Li Ni, Lei Sang, Zhize Wu, Peiquan Jin, Yiwen Zhang:

DICP: Deep In-Context Prompt for Event Causality Identification. 2589-2599 - Weiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma:

Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation. 2600-2617 - Yuhang Tian, Pan Yang, Dandan Song, Zhijing Wu, Hao Wang:

GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware Validation. 2618-2632 - Jiong Wang, Shengquan Yu:

Improving Prompt Generalization for Cross-prompt Essay Trait Scoring from the Scoring-invariance Perspective. 2633-2646 - Cheongwoong Kang, Jongeun Baek, Yeonjea Kim, Jaesik Choi:

When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models. 2647-2667 - Amelia Hardy, Houjun Liu, Allie Griffith, Bernard Lange, Duncan Eddy, Mykel J. Kochenderfer:

ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts. 2668-2683 - Xiao Luo, Changhu Wang, Yizhou Sun, Wei Wang:

How Do Large Language Models Perform on PDE Discovery: A Coarse-to-fine Perspective. 2684-2697 - Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yuan Tian, Yi Chang, Junyang Lin:

Rethinking Data Selection at Scale: Random Selection is Almost All You Need. 2698-2711 - Zhifeng Jiang, Zhihua Jin, Guoliang He:

PromptKeeper: Safeguarding System Prompts for LLMs. 2712-2728 - Ding Xia, Xinyue Gui, Fan Gao, Dongyuan Li, Mark Colley, Takeo Igarashi:

Automating eHMI Action Design with LLMs for Automated Vehicle Communication. 2729-2752 - Xiaoying Song, Anirban Saha Anik, Eduardo Blanco, Vanessa Frías-Martínez, Lingzi Hong:

A Dynamic Fusion Model for Consistent Crisis Response. 2753-2768 - Chuhuai Yue, Jiajun Chai, Yufei Zhan, Zixiang Ding, Xihao Liang, Peixin Wang, Shihai Chen, Wang Yixuan, Wang Yanping, Guojun Yin, Wei Lin:

UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent System. 2769-2782 - Kunze Li, Yu Zhang:

CrossQG: Improving Difficulty-Controllable Question Generation through Consistency Enhancement. 2783-2798 - Yejin Jeon, Youngjae Kim, Jihyun Lee, Hyounghun Kim, Gary Lee:

Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech Synthesis. 2799-2811 - Xiaoying Song, Anirban Saha Anik, Dibakar Barua, Pengcheng Luo, Junhua Ding, Lingzi Hong:

Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL. 2812-2830 - Zijian Zheng, Yonghe Lu, Jian Yin:

FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text. 2831-2846 - Xiantao Zhang:

AuraDial: A Large-Scale Human-Centric Dialogue Dataset for Chinese AI Psychological Counseling. 2847-2863 - Wenbo Xu, Haifeng Zhu, Liang Yan, Chuanyi Liu, Peiyi Han, Shaoming Duan, Jeff Z. Pan:

TS-SQL: Test-driven Self-refinement for Text-to-SQL. 2864-2889 - Pengyu Zhu, Zhenhong Zhou, Yuanhe Zhang, Shilinlu Yan, Kun Wang, Sen Su:

DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent. 2890-2912 - Xinping Lei, Tong Zhou, Yubo Chen, Kang Liu, Jun Zhao:

MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM Ideation. 2913-2933 - Haz Sameen Shahgir, Chansong Lim, Jia Chen, Evangelos E. Papalexakis, Yue Dong:

ExpertGenQA: Open-ended QA generation in Specialized Domains. 2934-2955 - Yuansheng Ni, Ping Nie, Kai Zou, Xiang Yue, Wenhu Chen:

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation. 2956-2983 - Jiahuan Pei, Fanghua Ye, Xin Sun, Wentao Deng, Koen V. Hindriks, Junxiao Wang:

Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment. 2984-2997 - Michal Shlapentokh-Rothman, Yu-Xiong Wang, Derek Hoiem:

Visual Program Distillation with Template-Based Augmentation. 2998-3018 - Sicheng Wang, Wenyi Wu, Zibo Zhang:

NeighXLM: Enhancing Cross-Lingual Transfer in Low-Resource Languages via Neighbor-Augmented Contrastive Pretraining. 3019-3030 - Dezheng Gao, Xiaozheng Dong, Shuangtao Yang, Bo Fu:

ICLER: Intent CLassification with Enhanced Reasoning. 3031-3044 - Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen:

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation. 3045-3063 - Tianjiao Li, Mengran Yu, Chenyu Shi, Yanjun Zhao, Xiaojing Liu, Qi Zhang, Xuanjing Huang, Qiang Zhang, Jiayin Wang:

RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation. 3064-3079 - Siyue Zhang, Yuxiang Xue, Yiming Zhang, Xiaobao Wu, Anh Tuan Luu, Chen Zhao:

MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering. 3080-3118 - Feiyang Li, Peng Fang, Zhan Shi, Arijit Khan, Fang Wang, Weihao Wang, Xin Zhang, Cui Yongjian:

CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models. 3119-3171 - Changjiang Jiang, Fengchang Yu, Haihua Chen, Wei Lu, Jin Zeng:

TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data. 3172-3196 - Dawei Zhu, Xiyu Wei, Guangxiang Zhao, Wenhao Wu, Haosheng Zou, Junfeng Ran, Xun Wang, Lin Sun, Xiangzheng Zhang, Sujian Li:

Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision. 3197-3211 - Xiang Li, Runhai Jiao, Changyu Zhou, Shoupeng Qiao, Ruojiao Qiao, Ruifan Li:

Multimodal Document-level Triple Extraction via Dynamic Graph Enhancement and Relation-Aware Reflection. 3212-3223 - Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang:

Distill Visual Chart Reasoning Ability from LLMs to MLLMs. 3224-3250 - Minghao Hu, Junzhe Wang, Weisen Zhao, Qiang Zeng, Lannan Luo:

FlowMalTrans: Unsupervised Binary Code Translation for Malware Detection Using Flow-Adapter Architecture. 3251-3272 - Fengyuan Sun, Leqi Shen, Hui Chen, Sicheng Zhao, Jungong Han, Guiguang Ding:

AdaTP: Attention-Debiased Token Pruning for Video Large Language Models. 3273-3286 - Runchuan Zhu, Bowen Jiang, Lingrui Mei, Fangkai Yang, Lu Wang, Haoxiang Gao, Fengshuo Bai, Pu Zhao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang:

AdaptFlow: Adaptive Workflow Optimization via Meta-Learning. 3287-3302 - Jon Saad-Falcon, Rajan Vivek, William Berrios, Nandita Shankar Naik, Matija Franklin, Bertie Vidgen, Amanpreet Singh, Douwe Kiela, Shikib Mehri:

LMUNIT: Fine-grained Evaluation with Natural Language Unit Tests. 3303-3324 - Shan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, Yu Liu:

ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement. 3325-3347 - Jinwen Chen, Hainan Zhang, Fei Sun, Qinnan Zhang, Sijia Wen, Ziwei Wang, Zhiming Zheng:

Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language Models. 3348-3365 - Wenzhang Yang, Xiaoning Ren, Cuifeng Gao, Yinxing Xue:

Rust-doctor: Enhanced Feature for Rust Ownership and Lifetime Repair with Balanced Training Data Generation. 3366-3376 - Xifeng Yao, Chengyuan Ma, Dongyu Lang, Yinhao Ni, Zhiwei Xu, Huarui Xie, Zihao Chen, Guang Shen, Dandan Tu, Yi Bai, Changzheng Zhang:

SLIM: Subtrajectory-Level Elimination for More Effective Reasoning. 3377-3394 - Zihan Chen, Song Wang, Xingbo Fu, Chengshuai Shi, Zhenyu Lei, Cong Shen, Jun-Dong Li:

From Cross-Task Examples to In-Task Prompts: A Graph-Based Pseudo-Labeling Framework for In-context Learning. 3395-3410 - Yiyang Li, Yonghuang Wu, Ying Luo, Liangtai Sun, Zishu Qin, Lin Qiu, Xuezhi Cao, Xunliang Cai:

Instance-level Randomization: Toward More Stable LLM Evaluations. 3411-3425 - Zihao Li, Feihao Fang, Xitong Zhang, Jiaru Zou, Zhining Liu, Wei Xiong, Ziwei Wu, Baoyu Jing, Jingrui He:

Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity. 3426-3455 - Haonan Tong, Ke Liu, Chuang Zhang, Xinglin Zhang, Tao Chen, Jenq-Neng Hwang, Lei Li:

PAMN: Multi-phase Correlation Modeling for Contrast-Enhanced 3D Medical Image Retrieval. 3456-3467 - Cheng Wang, Yue Liu, Baolong Bi, Duzhen Zhang, Zhong-Zhi Li, Yingwei Ma, Yufei He, Shengju Yu, Xinfeng Li, Junfeng Fang, Jiaheng Zhang, Bryan Hooi:

Safety in Large Reasoning Models: A Survey. 3468-3482 - Bo Zhang, Cong Gao, Linkang Yang, Bingxu Han, Minghao Hu, Zhunchen Luo, Guotong Geng, Xiaoying Bai, Jun Zhang, Wen Yao, Zhong Wang:

SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models. 3483-3495 - Jinxu Zhang, Qiyuan Fan, Yu Zhang:

DocAssistant: Integrating Key-region Reading and Step-wise Reasoning for Robust Document Visual Question Answering. 3496-3511 - Ruijie Hou, Jiao Yueyang, Hanxu Hu, Yingming Li, Wai Lam, Huajian Zhang, Hongyuan Lu:

LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models. 3512-3528 - Michael van Supranes, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki:

Enhancing Hate Speech Classifiers through a Gradient-assisted Counterfactual Text Generation Strategy. 3529-3544 - Xiaohu Zhu, Qian Li, Lizhen Cui, Yuntao Du:

Learning SQL Like a Human: Structure-Aware Curriculum Learning for Text-to-SQL Generation. 3545-3559 - Jason S. Lucas, Ali Al-Lawati, Mahjabin Nahar, John Chen, Mahnoosh Mehrabani:

Chain-of-Interactions: Multi-step Iterative ICL Framework for Abstractive Task-Oriented Dialogue Summarization of Conversational AI Interactions. 3560-3599 - Zekun Fei, Biao Yi, Jianing Geng, Ruiqi He, Lihai Nie, Zheli Liu:

Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS Watermark. 3600-3614 - Youan Cong, Pritom Saha Akash, Cheng Wang, Kevin Chen-Chuan Chang:

Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models. 3615-3625 - Zhiqiang Liu, Enpei Niu, Yin Hua, Mengshu Sun, Lei Liang, Huajun Chen, Wen Zhang:

SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs. 3626-3640 - Yuanhe Zhang, Xinyue Wang, Haoran Gao, Zhenhong Zhou, Fanyu Meng, Yuyao Zhang, Sen Su:

PD³F: A Pluggable and Dynamic DoS-Defense Framework against resource consumption attacks targeting Large Language Models. 3641-3671 - Jiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhucong Li, Zhijian Zhou, Song Wang, Zenglin Xu:

From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs. 3672-3684 - Yi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jianfeng Qu, Jiajie Xu:

PIP: Perturbation-based Iterative Pruning for Large Language Models. 3685-3701 - Xinhao Wu, Jialin Liu, Yutai Duan, Jie Liu:

Convolutional LoRA Aggregation for Unseen Tasks Adaptation. 3702-3714 - Haosi Mo, Xinyu Ma, Xuebo Liu, Derek F. Wong, Yu Li, Jie Liu, Min Zhang:

CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task. 3715-3734 - Hongliang Li, Jinan Xu, Gengping Cui, Changhao Guan, Fengran Mo, Kaiyu Huang:

Multilingual Collaborative Defense for Large Language Models. 3735-3755 - Hongfei Du, Jiacheng Shi, Jacobo Myerston, Sidi Lu, Gang Zhou, Ashley Gao:

Role-Guided Annotation and Prototype-Aligned Representation Learning for Historical Literature Sentiment Classification. 3756-3768 - Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, Yunpeng Liu:

MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition. 3769-3779 - Ashish R. Mittal, Sunita Sarawagi, Preethi Jyothi:

RECAST: Retrieval-Augmented Contextual ASR via Decoder-State Keyword Spotting. 3780-3793 - Xubin Yue, Zhenhua Xu, Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han:

PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix Enhancement. 3794-3804 - Zichen Wu, Hsiu-Yuan Huang, Yunfang Wu:

Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing. 3805-3825 - Yun-Da Tsai, Ting-Yu Yen, Pei-Fu Guo, Zhe-Yan Li, Shou-De Lin:

Text-centric Alignment for Bridging Test-time Unseen Modality. 3826-3845 - Qian Zhang, Qinliang Su, Wei Zhu, Pang Yachun:

HierPrompt: Zero-Shot Hierarchical Text Classification with LLM-Enhanced Prototypes. 3846-3859 - Zhongzhan Huang, Guoming Ling, Yupei Lin, Yandong Chen, Shanshan Zhong, Hefeng Wu, Liang Lin:

RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs. 3860-3887 - Xingsheng Zhang, Luxi Xing, Chen Zhang, Yanbing Liu, Yifan Deng, Yunpeng Li, Yue Hu, Chenxu Niu:

Can We Steer Reasoning Direction by Thinking Intervention? 3888-3913 - Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, Xun Wang, Sujian Li:

MPO: Boosting LLM Agents with Meta Plan Optimization. 3914-3935 - Siyuan Zhang, Yichi Zhang, Yinpeng Dong, Hang Su:

Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization. 3936-3968 - S. M. Rafiuddin, Muntaha Nujat Khan:

Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models. 3969-3981 - Xiaoran Yin, Xu Luo, Hao Wu, Lianli Gao, Jingkuan Song:

Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach. 3982-4005 - Sichu Liang, Linhai Zhang, Hongyu Zhu, Wenwen Wang, Yulan He, Deyu Zhou:

RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering. 4006-4033 - Ruobing Yao, Yifei Zhang, Shuang Song, Neng Gao, Chenyang Tu:

EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation. 4034-4050 - Kaustubh Shivshankar Shejole, Pushpak Bhattacharyya:

StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings. 4051-4082 - Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao:

Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning. 4083-4103 - Xiangci Li, Jessica Ouyang:

How Does Knowledge Selection Help Retrieval Augmented Generation? 4104-4121 - Tianlong Li, Wenhao Liu, Muling Wu, Shihan Dou, Zhenghua Wang, Changze Lv, Xiaohua Wang, Xiaoqing Zheng, Xuanjing Huang:

UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation. 4122-4136 - Ruobing Yao, Yifei Zhang, Shuang Song, Yuhan Liu, Neng Gao, Chenyang Tu:

ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation. 4137-4151 - Fangxin Liu, Zongwu Wang, JinHong Xia, Junping Zhao, Shouren Zhao, Jinjin Li, Jian Liu, Li Jiang, Haibing Guan:

FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization. 4152-4161 - Jianjiang Yang, Yanshu Li, Ziyan Huang:

ReLoop: "Seeing Twice and Thinking Backwards" via Closed-loop Training to Mitigate Hallucinations in Multimodal understanding. 4162-4179 - Zhenqi Ye, Haopeng Ren, Yi Cai, Qingbao Huang, Jing Qin, Pinli Zhu, Songwen Gong:

Sequence Structure Aware Retriever for Procedural Document Retrieval: A New Dataset and Baseline. 4180-4198 - David Stap, Christof Monz:

The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation. 4199-4211 - Chenghao Liu, Qian Liu, Ziqin Zhu, Hao Fei, Aniket Mahanti:

David vs. Goliath: Cost-Efficient Financial QA via Cascaded Multi-Agent Reasoning. 4212-4229 - Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin:

Benchmarking Uncertainty Metrics for LLM Target-Aware Search. 4230-4238 - Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni:

ZOGRASCOPE: A New Benchmark for Semantic Parsing over Property Graphs. 4239-4246 - Ruosen Li, Ziming Luo, Xinya Du:

FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning. 4247-4278 - Zirui Wu, Xiao Liu, Jiayi Li, Lingpeng Kong, Yansong Feng:

Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions. 4279-4301 - Zhenhua Xu, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han:

Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models. 4302-4312 - Fang Wang, Zhengwei Tao, Ming Wang, Minghao Hu, Xiaoying Bai:

AELC: Adaptive Entity Linking with LLM-Driven Contextualization. 4313-4327 - Honglin Lin, Zhuoshi Pan, Qizhi Pei, Xin Gao, Yu Li, Mengzhang Cai, Conghui He, Lijun Wu:

MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer. 4328-4354 - Yunqing Liu, Wenqi Fan, Xiaoyong Wei, Li Qing:

GLProtein: Global-and-Local Structure Aware Protein Representation Learning. 4355-4372 - Changshuo Zhang, Ang Gao, Xiao Zhang, Yong Liu, Deyang Li, Fangchao Liu, Xinyu Zhang:

Reward Mixology: Crafting Hybrid Signals for Reinforcement Learning Driven In-Context Learning. 4373-4383 - Zhengzhao Lai, Youbin Zheng, Zhenyang Cai, Haonan Lyu, Jingpu Yang, Hongqing Liang, Yan Hu, Benyou Wang:

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization. 4384-4404 - Jeongsoo Lee, Daeyong Kwon, Kyohoon Jin:

GRADE: Generating multi-hop QA and fine-gRAined Difficulty matrix for RAG Evaluation. 4405-4424 - Zhaohan Meng, Zaiqiao Meng, Ke Yuan, Iadh Ounis:

FusionDTI: Fine-grained Binding Discovery with Token-level Fusion for Drug-Target Interaction. 4425-4444 - Birong Pan, Yongqi Li, Weiyu Zhang, Wenpeng Lu, Mayi Xu, Shen Zhou, Yuanyuan Zhu, Ming Zhong, Tieyun Qian:

A Survey on Training-free Alignment of Large Language Models. 4445-4461 - Massimo Rizzoli, Simone Alghisi, Olha Khomyn, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi:

CIVET: Systematic Evaluation of Understanding in VLMs. 4462-4480 - Yoshiki Takenami, Yin Jou Huang, Yugo Murawaki, Chenhui Chu:

How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations. 4481-4498 - Pengchao Feng, Ziyang Ma, Wenxi Chen, Yao Li, Sheng Wang, Kai Yu, Xie Chen:

Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation. 4499-4507 - Yulin Chen, Haoran Li, Yuan Sui, Yangqiu Song, Bryan Hooi:

Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods. 4508-4527 - Hao Wang, Dandan Song, Zhijing Wu, Yuhang Tian, Pan Yang:

Path-enhanced Pre-trained Language Model for Knowledge Graph Completion. 4528-4540 - Zhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou:

Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective. 4541-4557 - Chuming Shen, Wei Wei, Dong Wang, Zhong-Hao Wang:

Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought Reasoning. 4558-4573 - Song Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian:

Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning. 4574-4592 - Saeed Almheiri, Rania Elbadry, Mena Attia, Chenxi Wang, Preslav Nakov, Timothy Baldwin, Fajri Koto:

Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World. 4593-4614 - Long Zhang, Peipei Song, Jianfeng Dong, Kun Li, Xun Yang:

Enhancing Partially Relevant Video Retrieval with Robust Alignment Learning. 4615-4629 - Yebin Lim, Susik Yoon:

Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language Models. 4630-4655 - Jianing Wang, Jin Jiang, Yang Liu, Mengdi Zhang, Xunliang Cai:

Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning. 4656-4673 - Zijian Li, Xiaocheng Feng, Huixin Liu, Yichong Huang, Ting Liu, Bing Qin:

FroM: Frobenius Norm-Based Data-Free Adaptive Model Merging. 4674-4687 - Boyu Qiao, Kun Li, Wei Zhou, Songlin Hu:

Dynamic Simulation Framework for Disinformation Dissemination and Correction With Social Bots. 4688-4710 - Zhaohui Yang, Chenghua He, Xiaowen Shi, Shihong Deng, Linjing Li, Qiyue Yin, Daxin Jiang:

Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning. 4711-4728 - Youneng Ma, Junyi He, Haojun Fei:

PrAd: Prompt Adaptive Tuning for Decoder-only Language Models. 4729-4743 - Hang Su, Yun Yang, Tianyang Liu, Xin Liu, Peng Pu, Xuesong Lu:

Personalized Question Answering with User Profile Generation and Compression. 4744-4763 - Yue Zhao, Xiaoyu Wang, Dan Wang, Zhonglin Jiang, Qingqing Gu, Teng Chen, Ningyuan Xi, Jinxian Qu, Yong Chen, Luo Ji:

Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling. 4764-4781 - Junxi Wang, Yaxiong Wang, Lechao Cheng, Zhun Zhong:

FakeSV-VLM: Taming VLM for Detecting Fake Short-Video News via Progressive Mixture-Of-Experts Adapter. 4782-4798 - Zhen Wang, Xi Zhou, Yating Yang, Bo Ma, Lei Wang, Rui Dong, Azmat Anwar:

Beyond Inherent Cognition Biases in LLM-Based Event Forecasting: A Multi-Cognition Agentic Framework. 4799-4818 - Tzu-Ling Lin, Wei-Chih Chen, Teng-Fang Hsiao, Hou-I Liu, Ya-Hsin Yeh, Yu Kai Chan, Wen-Sheng Lien, Po-Yen Kuo, Philip S. Yu, Hong-Han Shuai:

Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks. 4819-4839 - He Li, Xiaojun Chen, Zhendong Zhao, Yunfei Yang, Xin Zhao, Jingcheng He:

Watermarking with Low-Entropy POS-Guided Token Partitioning and Z-Score-Driven Dynamic Bias for Large Language Models. 4840-4859 - Jinhu Fu, Kun Wang, Chongye Guo, Junfeng Fang, Wentao Zhang, Sen Su:

Knowledge Graph-Driven Memory Editing with Directional Interventions. 4860-4874 - Bofan Wei, Hongyuan Xu, Yuhang Niu, Jiarui Ren, Yanlong Wen, Xiaojie Yuan:

DTDES-KGE: Dual-Teacher Knowledge Distillation with Distinct Embedding Spaces for Knowledge Graph Embeddings. 4875-4887 - Ming Zhang, Yujiong Shen, Zelin Li, Huayu Sha, Binze Hu, Yuhui Wang, Chenhao Huang, Shichun Liu, Jingqi Tong, Changhao Jiang, Mingxu Chai, Zhiheng Xi, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang:

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation. 4888-4914 - Hongyan Chang, Hamed Hassani, Reza Shokri:

Watermark Smoothing Attacks against Language Models. 4915-4941 - Wenbin Hua, Rui Fan, Tingting He, Ming Dong:

PICD-Instruct: A Generative Instruction Learning Framework for Few-Shot Multi-Intent Spoken Language Understanding. 4942-4956 - Sheng Liu, Qiang Sheng, Danding Wang, Yang Liu, Guang Yang, Juan Cao:

Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks. 4957-4974 - Xi Ai, Mahardika Krisna Ihsani, Min-Yen Kan:

Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent? 4975-5011 - Dimitris Roussis, Leon Voukoutis, Georgios Paraskevopoulos, Sokratis Sofianopoulos, Prokopis Prokopidis, Vassilis P. Plagianakos, Athanasios Katsamanis, Stelios Piperidis, Vassilis Katsouros:

Krikri: Advancing Open Large Language Models for Greek. 5012-5033 - Quoc-An Nguyen, Xuan-Hung Le, Thi-Minh-Thu Vu, Hoang-Quynh Le:

Beyond the Scientific Document: A Citation-Aware Multi-Granular Summarization Approach with Heterogeneous Graphs. 5034-5046 - Haoyu Ma, Qinliang Su, Minhua Huang, Wu Kai:

Detecting Continuously Evolving Scam Calls under Limited Annotation: A LLM-Augmented Expert Rule Framework. 5047-5068 - Ziyang Zeng, Dun Zhang, Jiacheng Li, Panxiang Zou, Yudong Zhou, Yuqing Yang:

An Empirical Study of Position Bias in Modern Information Retrieval. 5069-5081 - Xuebing Liu, Shanbao Qiao, Seung-Hoon Na:

GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs. 5082-5097 - Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou:

CoRanking: Collaborative Ranking with Small and Large Ranking Agents. 5098-5110 - YiHan Jiao, ZheHao Tan, Dan Yang, DuoLin Sun, Jie Feng, Yue Shen, Jian Wang, Peng Wei:

HIRAG: Hierarchical-Thought Instruction-Tuning Retrieval-Augmented Generation. 5111-5130 - Tongyoung Kim, Jeongeun Lee, Soojin Yoon, SungHwan Kim, Dongha Lee:

Towards Personalized Conversational Sales Agents: Contextual User Profiling for Strategic Action. 5131-5154 - Minda Hu, Tianqing Fang, Jianshu Zhang, Jun-Yu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang, Haitao Mi, Dong Yu, Irwin King:

WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback. 5155-5173 - Yuxuan Zhang, Yangfu Zhu, Haorui Wang, Bin Wu:

Interesting Culture: Social Relation Recognition from Videos via Culture De-confounding. 5174-5184 - Guosheng Liang, Longguang Zhong, Ziyi Yang, Xiaojun Quan:

ThinkSwitcher: When to Think Hard, When to Think Fast. 5185-5201 - Nguyen Manh Hieu, Vu Lam Anh, Hung Pham Van, Nam Le Hai, Ngo Van Linh, Nguyen Thi Ngoc Diep, Thien Huu Nguyen:

MaGiX: A Multi-Granular Adaptive Graph Intelligence Framework for Enhancing Cross-Lingual RAG. 5202-5219 - Claire Barale, Leslie Barrett, Vikram Sunil Bajaj, Michael Rovatsos:

LexTime: A Benchmark for Temporal Ordering of Legal Events. 5220-5236 - Shiwen Zhang, Lingxiang Wang, Hainan Zhang, Ziwei Wang, Sijia Wen, Zhiming Zheng:

Beyond the Surface: A Solution-Aware Retrieval Model for Competition-level Code Generation. 5237-5246 - Xiaoya Lu, Dongrui Liu, Yi Yu, Luxin Xu, Jing Shao:

X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising Usability. 5247-5272 - Sagiv Antebi, Edan Habler, Asaf Shabtai, Yuval Elovici:

Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack. 5273-5286 - Xinyi Mou, Chen Qian, Wei Liu, Ling Yan, Yao Hu, Xuanjing Huang, Zhongyu Wei:

EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation. 5287-5304 - Seokhyun An, Minji Kim, Hyounghun Kim:

Revealing the Inherent Instructability of Pre-Trained Language Models. 5305-5336 - Shijia Zhou, Siyao Peng, Simon Luebke, Jörg Haßler, Mario Haim, Saif M. Mohammad, Barbara Plank:

What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change Discourse. 5337-5356 - Baiqiao Zhang, Zhifeng Liao, Xiangxian Li, Chao Zhou, Juan Liu, Xiaojuan Ma, Yulong Bian:

Rethinking Personality Assessment from Human-Agent Dialogues: Fewer Rounds May Be Better Than More. 5357-5380 - Zhenpeng Gao, Xiaofen Xing, Xiangmin Xu:

TailorRPA: A Retrieval-Based Framework for Eliciting Personalized and Coherent Role-Playing Agents in General Domain. 5381-5412 - Yanwen Huang, Yao Liu, Qiao Liu, Rui Hou, Tingting Dai:

SCE: Semantic Consistency Enhanced Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning. 5413-5425 - Soohyeong Kim, Seok Jun Hwang, JungHyoun Kim, Jeonghyeon Park, Yong Suk Choi:

ReGraphRAG: Reorganizing Fragmented Knowledge Graphs for Multi-Perspective Retrieval-Augmented Generation. 5426-5443 - Manuel Frank, Haithem Afli:

GASE: Generatively Augmented Sentence Encoding. 5444-5461 - Arianna Muti, Chris Emmery, Debora Nozza, Alberto Barrón-Cedeño, Tommaso Caselli:

The "r" in "woman" stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny. 5462-5479 - Yuanzhen Hao, Desheng Wu:

Fact Verification on Knowledge Graph via Programmatic Graph Reasoning. 5480-5495 - Tianmi Ma, Jiawei Du, Wenxin Huang, Wenjie Wang, Liang Xie, Xian Zhong, Joey Tianyi Zhou:

Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents. 5496-5514 - Arnav Attri, Anuj Attri, Suman Banerjee, Amey Patil, Muthusamy Chelliah, Nikesh Garera, Pushpak Bhattacharyya:

Why We Feel What We Feel: Joint Detection of Emotions and Their Opinion Triggers in E-commerce. 5515-5532 - Ján Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Mária Bieliková, Peter Brusilovsky:

Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation. 5533-5550 - Pramit Bhattacharyya, Arnab Bhattacharya:

BanglaByT5: Byte-Level Modelling for Bangla. 5551-5560 - Tien-Phat Nguyen, Vu Minh Ngo, Tung Nguyen, Linh Ngo Van, Duc Anh Nguyen, Dinh Viet Sang, Trung Le:

XTRA: Cross-Lingual Topic Modeling with Topic and Representation Alignments. 5561-5575 - Zihan Wang, Siyao Liu, Yang Sun, Ming Ding, Hongyan Li:

CodeContests+: High-Quality Test Case Generation for Competitive Programming. 5576-5600 - Yuhao Sun, Yifan Zhang, Quandong Wang, Qinzhuo Wu, Wei Liu, Jian Luan:

SPO: Self Preference Optimization with Self Regularization. 5601-5614 - Yijiong Yu, Zhixiao Qi, Yongfeng Huang, Wei Wang, Weifeng Liu, Ran Chen, Ji Pei:

Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning Steps. 5615-5634 - Blanca Calvo Figueras, Rodrigo Agerri:

Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models. 5635-5652 - Hao Kang, Chenyan Xiong:

ResearchArena: Benchmarking Large Language Models' Ability to Collect and Organize Information as Research Agents. 5653-5671 - Zipeng Ye, Wenjian Luo:

LLMs are Privacy Erasable. 5672-5692 - Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt:

How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models. 5693-5709 - Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt:

DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation. 5710-5723 - Ruiling Guo, Xinwei Yang, Chen Huang, Tong Zhang, Yong Hu:

CANDY: Benchmarking LLMs' Limitations and Assistive Potential in Chinese Misinformation Fact-Checking. 5724-5758 - Zeyang Liu, Jingfeng Xue, Xiuqi Yang, Wenbiao Du, Jiarun Fu, Junbao Chen, Wenjie Guo, Yong Wang:

E-Verify: A Paradigm Shift to Scalable Embedding-based Factuality Verification. 5759-5776 - Guorui Chen, Yifan Xia, Xiaojun Jia, Zhijiang Li, Philip Torr, Jindong Gu:

LLM Jailbreak Detection for (Almost) Free! 5777-5807 - Xiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Haodong Zhao, Hao Li, Jiansong Chen, Ke Zeng, Xunliang Cai:

When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning. 5808-5828 - Xixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira:

Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance. 5829-5842 - Sheng Jin, Haoming Wang, Zhiqi Gao, Yongbo Yang, Bao Chunjia, Chengliang Wang:

Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics. 5843-5857 - Jiaan Wang, Fandong Meng, Yingxue Zhang, Jie Zhou:

Retrieval-Augmented Machine Translation with Unstructured Knowledge. 5858-5871 - Chenghao Yang, Yinbo Luo, Zhoufutu Wen, Qi Chu, Tao Gong, Longxiang Liu, Kaiyuan Zhang, Jianpeng Jiao, Ge Zhang, Wenhao Huang, Nenghai Yu:

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation. 5872-5898 - Bo Yang, Qingping Yang, Yingwei Ma, Runtao Liu:

UTMath: A Benchmark for Math Evaluation with Unit Test. 5899-5915 - Andreas Guta, Frithjof Petrick, Peter Polák:

The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid Transformers. 5916-5931 - Zhen Yang, Ping Jian, Chengzhi Li, Chenxu Wang, Xinyue Zhang, Wenpeng Lu:

Constructing Your Model's Value Distinction: Towards LLM Alignment with Anchor Words Tuning. 5932-5948 - Caiyu Hu, Yikai Zhang, Tinghui Zhu, Yiwei Ye, Yanghua Xiao:

MCiteBench: A Multimodal Benchmark for Generating Text with Citations. 5949-5966 - Sijia Shen, Feiyan Jiang, Peiyan Wang, Yubo Feng, Yuchen Jiang, Chang Liu:

Do LLMs Know and Understand Domain Conceptual Knowledge? 5967-5976 - Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, Emad Barsoum:

Agent Laboratory: Using LLM Agents as Research Assistants. 5977-6043 - Haoyu Huang, Yongfeng Huang, Junjie Yang, Zhenyu Pan, Yongqiang Chen, Kaili Ma, Hongzhi Chen, James Cheng:

Retrieval-Augmented Generation with Hierarchical Knowledge. 6044-6060 - Haonan Sheng, Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu:

Regularized Contrastive Decoding with Hard Negative Samples for LLM Hallucination Mitigation. 6061-6073 - Xuyan Yin, Xinran Yang, Zihao Li, Lixin Zou, Chenliang Li:

CharacterCraft: Bridging the Literature-Reality Dialogue Gap for Practical Role-Playing Agents. 6074-6106 - Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung:

Drift: Decoding-time Personalized Alignments with Implicit User Preferences. 6107-6126 - Yunhao Zhang, Shaonan Wang, Nan Lin, Xinyi Dong, Chong Li, Chengqing Zong:

Discovering Semantic Subdimensions through Disentangled Conceptual Representations. 6127-6144 - Sheng Lu, Ilia Kuznetsov, Iryna Gurevych:

Identifying Aspects in Peer Reviews. 6145-6167 - Pengyu Ji, Yufei Liu, Xiang Hu, Kewei Tu:

Tree-Structured Non-Autoregressive Decoding for Sequence-to-Sequence Text Generation. 6168-6174 - Yijia Fan, Jusheng Zhang, Keze Wang:

Towards More Efficient Post-training via Fourier Domain Adapter Framework. 6175-6193 - Yushi Sun, Kai Sun, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang, Lei Chen:

KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering. 6194-6216 - Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci:

Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models. 6217-6242 - Yijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang:

CCG: Rare-Label Prediction via Neural SEM-Driven Causal Game. 6243-6256 - Chengyan Wu, Yiqiang Cai, Yang Liu, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, Bolei Ma:

Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects. 6257-6274 - Jiahao Zhang, Baoshuo Kan, Tao Gong, Fu Lee Wang, Tianyong Hao:

When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks. 6275-6300 - Guandong Li, Zhaobin Chu:

EditID: Training-Free Editable ID Customization for Text-to-Image Generation. 6301-6319 - Jusheng Zhang, Yijia Fan, Kaitong Cai, Xiaofei Sun, Keze Wang:

OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration. 6320-6337 - Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Jiansheng Wei, Huishuai Zhang, Dongyan Zhao:

VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format. 6338-6359 - Yuchen Yan, Aakash Kolekar, Sahika Genc, Wenju Xu, Edward W. Huang, Anirudh Srinivasan, Mukesh Jain, Qi He, Hanghang Tong:

To Answer or Not to Answer (TAONA): A Robust Textual Graph Understanding and Question Answering Approach. 6360-6376 - Wei Jie Yeo, Nirmalendu Prakash, Clement Neo, Ranjan Satapathy, Roy Ka-Wei Lee, Erik Cambria:

Understanding Refusal in Language Models with Sparse Autoencoders. 6377-6399 - Ori Ernst, Aviv Slobodkin, Meng Cao, Sihui Wei, Jackie CK Cheung:

Where Did That Come From? Sentence-Level Error-Tolerant Attribution. 6400-6417 - Haotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, Qian Zhang:

Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval. 6418-6427 - Leslie Barrett, Vikram Sunil Bajaj, Robert J. Kingan:

Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling. 6428-6435 - Xiaochen Wang, Heming Xia, Jialin Song, Longyu Guan, Qingxiu Dong, Rui Li, Yixin Yang, Yifan Pu, Weiyao Luo, Yiru Wang, Xiangdi Meng, Wenjie Li, Zhifang Sui:

Beyond Single Frames: Can LMMs Comprehend Implicit Narratives in Comic Strip? 6436-6452 - Zijie Lin, Bryan Hooi:

Enhancing Multi-Agent Debate System Performance via Confidence Expression. 6453-6471 - Aysan Aghazadeh, Adriana Kovashka:

The Face of Persuasion: Analyzing Bias and Generating Culture-Aware Ads. 6472-6500 - Zihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng:

SIFT: Grounding LLM Reasoning in Contexts via Stickers. 6501-6513 - Mengyi Deng, Xin Li, Tingyu Zhu, Zhicheng Yang, Zhijiang Guo, Wei Wang:

When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning. 6514-6523 - Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta:

LUME: LLM Unlearning with Multitask Evaluations. 6524-6535 - Siyang Wu, Zhewei Sun:

How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages. 6536-6559 - Siqu Ou, Hongcheng Liu, Pingjie Wang, Yusheng Liao, Chuan Xuan, Yanfeng Wang, Yu Wang:

Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning. 6560-6578 - Md. Shahidul Salim, Lian Fu, Arav Adikesh Ramakrishnan, Zonghai Yao, Hong Yu:

MedCOD: Enhancing English-to-Spanish Medical Translation of Large Language Models Using Enriched Chain-of-Dictionary Framework. 6579-6597 - Won Seok Jang, Hieu Tran, Manav Mistry, SaiKiran Gandluri, Yifan Zhang, Sharmin Sultana, Sunjae Kwon, Yuan Zhang, Zonghai Yao, Hong Yu:

Chatbot To Help Patients Understand Their Health. 6598-6627 - Alex Duchnowski, Ellie Pavlick, Alexander Koller:

A Knapsack by Any Other Name: Presentation impacts LLM performance on NP-hard problems. 6628-6651 - Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Sangwu Park, Kibum Kim, Chanyoung Park:

Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models. 6652-6671 - Amit Levi, Rom Himelstein, Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin:

Jailbreak Attack Initializations as Extractors of Compliance Directions. 6672-6705 - Xinmeng Hou, Lingyue Fu, Chenhao Meng, Kounianhua Du, Hai Hu:

Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet Extraction. 6706-6719 - Manar Aljohani, Jun Hou, Sindhura Kommu, Xuan Wang:

A Comprehensive Survey on the Trustworthiness of Large Language Models in Healthcare. 6720-6748 - Ziyan Zhang, Yang Hou, Chen Gong, Zhenghua Li:

Self-Correction Makes LLMs Better Parsers. 6749-6762 - Zhengyu Hu, Linxin Song, Jieyu Zhang, Zheyuan Xiao, Tianfu Wang, Zhengyu Chen, Nicholas Jing Yuan, Jianxun Lian, Kaize Ding, Hui Xiong:

Explaining Length Bias in LLM-Based Preference Evaluations. 6763-6794 - Maxwell A. Weinzierl, Sanda M. Harabagiu:

Investigating Controversy Framing across Topics on Social Media. 6795-6814 - Ruochang Li, Xiao Luo, Zhiping Xiao, Wei Ju, Ming Zhang:

HEAL: Hybrid Enhancement with LLM-based Agents for Text-attributed Hypergraph Self-supervised Representation Learning. 6815-6829 - Danlong Yuan, Jiahao Liu, Bei Li, Huishuai Zhang, Jingang Wang, Xunliang Cai, Dongyan Zhao:

ReMamba: Equip Mamba with Effective Long-Sequence Modeling. 6830-6840 - Yihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng:

QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory. 6841-6856 - Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Zhuoyan Xu, Jiale Zhao, Zhen Zhuang:

Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers. 6857-6894 - Kangda Wei, Hasnat Md Abdullah, Ruihong Huang:

Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs. 6895-6917 - Wanqiang Wang, Longzhu He, Wei Zheng:

Beyond the Textual: Generating Coherent Visual Options for MCQs. 6918-6935 - Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Heng Ji, Denghui Zhang:

SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals. 6936-6955 - Gleb V. Solovev, Alina B. Zhidkovskaya, Anastasia Orlova, Nina Gubina, Anastasia Vepreva, Rodion Golovinskii, Ilya Tonkii, Ivan Dubrovsky, Ivan Gurev, Dmitry Gilemkhanov, Denis Chistiakov, Timur A. Aliev, Ivan Poddiakov, Galina Zubkova, Ekaterina V. Skorb, Vladimir Vinogradov, Alexander Boukhanovsky, Nikolay O. Nikitin, Andrei Dmitrenko, Anna V. Kaluzhnaya, Andrey V. Savchenko:

MADD: Multi-Agent Drug Discovery Orchestra. 6956-6998 - Vinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik R. Narasimhan, Vishvak Murahari:

PersonaGym: Evaluating Persona Agents and LLMs. 6999-7022 - Chang Zhou, Yuheng Shan, Pengan Chen, Xiangyu Shi, Zikang Wang, Yanting Li, Jiyue Jiang:

LM2Protein: A Structure-to-Token Protein Large Language Model. 7023-7029 - Sohee Yang, Sang-Woo Lee, Nora Kassner, Daniela Gottesman, Sebastian Riedel, Mor Geva:

How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts? 7030-7047 - Dohyeon Lee, Yeonseok Jeong, Seung-won Hwang:

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval. 7048-7064 - Zeping Yu, Sophia Ananiadou:

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs. 7065-7078 - Qirun Dai, Dylan Zhang, Jiaqi W. Ma, Hao Peng:

Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities. 7079-7102 - Guangliang Liu, Zimo Qi, Xitong Zhang, Lei Jiang, Kristen Marie Johnson:

Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization. 7103-7117 - Guangliang Liu, Zimo Qi, Xitong Zhang, Kristen Marie Johnson:

Discourse Heuristics For Paradoxically Moral Self-Correction. 7118-7132 - Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li:

Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models. 7133-7147 - Wei Zhang, Jian Yang, Jiaxi Yang, Ya Wang, Zhoujun Li, Zeyu Cui, Binyuan Hui, Junyang Lin:

Turning the Tide: Repository-based Code Reflection. 7148-7164 - João Luís Lins, Jia Xu:

Reinforcement Learning with Supervised Alignment. 7165-7181 - Shenglan Li, Jia Xu, Mengjiao Zhang:

EmByte: Decomposition and Compression Learning for Small yet Private NLP. 7182-7201 - Yuanhao Ding, Esteban Garces Arias, Meimingwei Li, Julian Rodemann, Matthias Aßenmacher, Danlu Chen, Gaojuan Fan, Christian Heumann, Chongsheng Zhang:

GUARD: Glocal Uncertainty-Aware Robust Decoding for Effective and Efficient Open-Ended Text Generation. 7202-7226 - Yifei He, Yang Liu, Chen Liang, Hany Hassan Awadalla:

Efficiently Editing Mixture-of-Experts Models with Compressed Experts. 7227-7238 - Ying Li, Mengyu Wang, Miguel de Carvalho, Sotirios Sabanis, Tiejun Ma:

FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval. 7239-7255 - Amirhossein Abaskohi, Spandana Gella, Giuseppe Carenini, Issam H. Laradji:

FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering. 7256-7282 - Jinsung Yoon, Junhao Zeng, Sercan Ö. Arik:

SQUARE: Unsupervised Retrieval Adaptation via Synthetic Data. 7283-7297 - Che Liu, Cheng Ouyang, Zhongwei Wan, Haozhe Wang, Wenjia Bai, Rossella Arcucci:

Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs. 7298-7316 - Mahammed Kamruzzaman, Amanda Cercas Curry, Alba Cercas Curry, Flor Miriam Plaza del Arco:

Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models. 7317-7351 - Zahidul Islam, Mrigank Rochan:

AdaptMerge: Inference Time Adaptive Visual and Language-Guided Token Merging for Efficient Large Multimodal Models. 7352-7361 - Abhijit Chakraborty, Chahana Dahal, Vivek Gupta:

Federated Retrieval-Augmented Generation: A Systematic Mapping Study. 7362-7374 - Yuchen Su, Yonghua Zhu, Ruofan Wang, Zijian Huang, Diana Benavides-Prado, Michael J. Witbrock:

A Survey of Pun Generation: Datasets, Evaluations and Methodologies. 7375-7395 - Mansour Al Ghanim, Jiaqi Xue, Rochana Prih Hastuti, Mengxin Zheng, Yan Solihin, Qian Lou:

Evaluating the Robustness and Accuracy of Text Watermarking Under Real-World Cross-Lingual Manipulations. 7396-7416 - Xiangfeng Luo, Ruoxin Zheng, Jianqiang Huang, Hang Yu:

HDiff: Confidence-Guided Denoising Diffusion for Robust Hyper-relational Link Prediction. 7417-7434 - Yutong Gao, Maoyuan Shao, Xinyang Huang, Chuang Zhu, Yu Weng, Xuan Liu, Lijuan Sun, Guoshun Nan:

Spotlighter: Revisiting Prompt Tuning from a Representative Mining View. 7435-7449 - Ishan Jindal, Jayant Taneja, Chandana Badrinath, Vikas Kapur, Sachin Dev Sharma:

Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement. 7450-7458 - Chenlong Wang, Yuanning Feng, Dongping Chen, Zhaoyang Chu, Ranjay Krishna, Tianyi Zhou:

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency. 7459-7482 - Xinpeng Ti, Wentao Ye, Zhifang Zhang, Junbo Zhao, Chang Yao, Lei Feng, Haobo Wang:

Towards Reverse Engineering of Language Models: A Survey. 7483-7502 - Wenhao Zheng, Liaoyaqi Wang, Dongsheng Peng, Hongxia Xu, Yun Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao:

LIFTED: Multimodal Clinical Trial Outcome Prediction via Large Language Models and Mixture-of-Experts. 7503-7517 - Yao Yan:

Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs. 7518-7532 - Jinyuan Feng, Chaopeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu:

CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning. 7533-7551 - Xinrong Chen, Hengyuan Zhang, Yingmin Qiu, Xiao Liang, Ziyue Li, Guanyu Wang, Weiping Li, Tong Mo, Hayden Kwok-Hay So, Ngai Wong:

GuiLoMo: Allocating Experts and Ranks for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors. 7552-7567 - Euntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo:

Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer. 7568-7590 - Chengbing Wang, Yang Zhang, Zhicheng Wang, Tianhao Shi, Keqin Bao, Fuli Feng, Tat-Seng Chua:

Decoding in Latent Spaces for Efficient Inference in LLM-based Recommendation. 7591-7603 - Yanhong Li, Min Yang, Xiping Hu, Chengming Li:

Forget for Get: A Lightweight Two-phase Gradient Method for Knowledge Editing in Large Language Models. 7604-7623 - Ding-Chu Zhang, Xiaowen Zhang, Yue Fei, Renjun Hu, Xiao-Wen Yang, Zhi Zhou, Baixuan Li, Yu-Feng Li, Xing Shi, Wei Lin:

AutoEvolve: Automatically Evolving Queries for Applicable and Scalable Retrieval-Augmented Generation Benchmarking. 7624-7639 - Sanjay Govindan, Maurice Pagnucco, Yang Song:

Temporal Alignment of Time Sensitive Facts with Activation Engineering. 7640-7657 - Kyungmin Kim, Youngbin Choi, Hyounghun Kim, Dongwoo Kim, Sangdon Park:

ChronoBias: A Benchmark for Evaluating Temporal Group Bias in the Time-sensitive Knowledge of Large Language Models. 7658-7693 - Ziyao Xu, Zhe Yang, Houfeng Wang:

MC²: A Minimum-Coverage and Dataset-Agnostic Framework for Compositional Generalization of LLMs on Semantic Parsing. 7694-7706 - Yunzhe Qi, Jinjin Tian, Tianci Liu, Ruirui Li, Tianxin Wei, Hui Liu, Xianfeng Tang, Monica Xiao Cheng, Jingrui He:

Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs. 7707-7733 - Lekang Jiang, Chengzu Li, Stefan Goetz:

Enriching Patent Claim Generation with European Patent Dataset. 7734-7751 - Jaewook Lee, Dahyun Jung, Heuiseok Lim:

StepKE: Stepwise Knowledge Editing for Multi-Hop Question Answering. 7752-7765 - Lan Li, Liri Fang, Bertram Ludäscher, Vetle I. Torvik:

AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark. 7766-7780 - Pengzhou Cheng, Haowen Hu, Zheng Wu, Zongru Wu, Tianjie Ju, Daizong Ding, Zhuosheng Zhang, Gongshen Liu:

Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents. 7781-7805 - Zhuoyue Chen, Jihai Zhang, Ben Liu, Fangquan Lin, Wotao Yin:

Scale Down to Speed Up: Dynamic Data Selection for Reinforcement Learning. 7806-7817 - Jianzhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang, Buzhou Tang:

Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales. 7818-7835 - Seunghyuk Cho, Zhenyue Qin, Yang Liu, Youngbin Choi, Seungbeom Lee, Dongwoo Kim:

GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder. 7836-7851 - Jiang Li, Xiangdong Su, Guanglai Gao:

Leveraging 3D Gaussian for Temporal Knowledge Graph Embedding. 7852-7865 - Liangqi Yuan, Dong-Jun Han, Christopher G. Brinton, Sabine Brunswicker:

LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences. 7866-7894 - Jeesu Jung, Chanjun Park, Sangkeun Jung:

ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction. 7895-7911 - Jieyong Wang, Chunyao Song, Tingjian Ge:

Token Knowledge: A New Perspective For Knowledge in Large Language Models. 7912-7926 - Sheng Liang, Hang Lv, Zhihao Wen, Yaxiong Wu, Yongyue Zhang, Hao Wang, Yong Li:

Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation. 7927-7946 - Yuhan Chen, Bowei Zou, Yifan Fan, Yuchong Chen, Shujun Cao, Yu Hong:

Enhancing Attributed Question Answering using Tailored Progressive Curriculum Learning. 7947-7956 - Jianwen Luo, Yu Hong, Shuai Yang, Jianmin Yao:

REAR: Reinforced Reasoning Optimization for Event Argument Extraction with Relation-Aware Support. 7957-7972 - Rajvee Sheth, Himanshu Beniwal, Mayank Singh:

COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing. 7973-7992 - Aakash Sen Sharma, Debdeep Sanyal, Priyansh Srivastava, Sundar Athreya H, Shirish S. Karande, Mohan Kankanhalli, Murari Mandal:

Nine Ways to Break Copyright Law and Why Our LLM Won't: A Fair Use Aligned Generation Framework. 7993-8023 - Yifu Chen, Shengpeng Ji, Ziqing Wang, Hanting Wang, Zhou Zhao:

InteractSpeech: A Speech Dialogue Interaction Corpus for Spoken Dialogue Model. 8024-8033 - Shixin Liu, Haoyu Xu, Yu Hong:

Enhancing SQL Table Acquisition with Reverse Engineering for Text-to-SQL. 8034-8041 - Xiabin Zhou, Wenbin Wang, Minyan Zeng, Jiaxian Guo, Xuebo Liu, Li Shen, Min Zhang, Liang Ding:

DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs. 8042-8057 - Shuzhong Lai, Chenxi Li, Junhong Lai, Yucun Zhong, Chenyu Yan, Xiang Li, Haifeng Li, Gang Pan, Lin Yao, Yueming Wang:

ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention Dataset. 8058-8079 - Jie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li:

GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction. 8080-8091 - Jiebin Zhang, Dawei Zhu, Yifan Song, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu, Sujian Li:

More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression. 8092-8105 - Yilin Zhang, Xinran Zhao, Zora Zhiruo Wang, Chenyang Yang, Jiayi Wei, Tongshuang Wu:

cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree. 8106-8116 - Guanqun Bi, Yuqiang Xie, Lei Shen, Yanan Cao:

A Group Fairness Lens for Large Language Models. 8117-8139 - Zhanpeng Chen, Chengjin Xu, Yiyan Qi, Xuhui Jiang, Jian Guo:

VLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training. 8140-8158 - Jae Hyeon Cho, JunHyeok Oh, Myunsoo Kim, Byung-Jun Lee:

Rethinking DPO: The Role of Rejected Responses in Preference Misalignment. 8159-8176 - Jingsen Zhang, Zihang Tian, Xueyang Feng, Xu Chen, Chong Chen:

Enhancing Recommendation Explanations through User-Centric Refinement. 8177-8191 - Bao Nguyen, Binh T. Nguyen, Duy Nguyen, Viet Anh Nguyen:

Distributional Surgery for Language Model Activations. 8192-8212 - Sihan Yang, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao:

Improving Alignment in LVLMs with Debiased Self-Judgment. 8213-8232 - Hongyi Cai, Jie Li, Mohammad Mahdinur Rahman, Wenzhen Dong:

Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning. 8233-8240 - Yujin Choi, Youngjoo Park, Junyoung Byun, Jaewook Lee, Jinseong Park:

Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home? 8241-8258 - Amartya Roy, Devharish N, Shreya Ganguly, Kripabandhu Ghosh:

Causal-LLM: A Unified One-Shot Framework for Prompt- and Data-Driven Causal Graph Discovery. 8259-8279 - Om Dehlan T. Karthikeyan, Manish Gupta Mausam:

LRPLAN: A Multi-Agent Collaboration of Large Language and Reasoning Models for Planning with Implicit & Explicit Constraints. 8280-8310 - Dengyun Peng, Yuhang Zhou, Qiguang Chen, JinHao Liu, Jingjing Chen, Libo Qin, Wanxiang Che:

DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective. 8311-8334 - Mengting Hu, Jianfeng Wu, Ming Jiang, Yalan Xie, Zhunheng Wang, Rui Ying, Xiaoyi Liu, Ruixuan Xu, Hang Gao, Renhong Cheng:

Towards Robust Few-Shot Relation Classification: Incorporating Relation Description with Agreement. 8335-8349 - Julien Bezançon, Gaël Lejeune:

For a Fistful of Puns: Evaluating a Puns in Multiword Expressions Identification Algorithm Without Dedicated Dataset. 8350-8370 - Kyungryul Back, Seongbeom Park, Milim Kim, Mincheol Kwon, SangHyeok Lee, Hyunyoung Lee, Junhee Cho, Seunghyun Park, Jinkyu Kim:

Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding. 8371-8387 - Lui Yoshida:

Are the Reasoning Models Good at Automated Essay Scoring? 8388-8394 - Donghee Han, Hwanjun Song, Mun Yong Yi:

Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel Integration. 8395-8419 - Aviv Slobodkin, Hagai Taitelbaum, Yonatan Bitton, Brian Gordon, Michal Sokolik, Nitzan Bitton Guetta, Almog Gueta, Royi Rassin, Dani Lischinski, Idan Szpektor:

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation. 8420-8438 - Zoey Liu, Masoud Jasbi, Christan Grant, Kenji Sagae, Emily Prud'hommeaux:

What data should I include in my POS tagging training set? 8439-8455 - Lvzhou Luo, Yixuan Cao, Ping Luo:

AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation. 8456-8472 - Jiaqi Wu, Chen Chen, Chunyan Hou, Xiaojie Yuan:

SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention. 8473-8488 - Mengxiang Zhang, Lingyuan Liu:

Staged Knowledge Distillation Through Least-to-Most Prompting: Optimizing Teacher Guidance via Difficulty-Aware Training. 8489-8501 - Patrick Sutanto, Joan Santoso, Esther Irawati Setiawan, Aji Prasetya Wibawa:

LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering. 8502-8530 - Tianlong Wang, Junzhe Chen, Weibin Liao, Xueting Han, Jing Bai:

Teaching LLMs to Plan, Not Just Solve: Plan Learning Boosts LLMs Generalization in Reasoning Tasks. 8531-8545 - Tao Fan, Weijing Chen, Yan Kang, Guoqiang Ma, Hanlin Gu, Yuanfeng Song, Lixin Fan, Qiang Yang:

FedCoT: Federated Chain-of-Thought Distillation for Large Language Models. 8546-8557 - Yue Xin, Chen Shen, Shaotian Yan, Xiaosong Yuan, Yaoming Wang, Xiaofeng Zhang, Chenxi Huang, Jieping Ye:

SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning. 8558-8577 - Idan Kashani, Avi Mendelson, Yaniv Nemcovsky:

Representing LLMs in Prompt Semantic Task Space. 8578-8597 - Zheni Zeng, Jiayi Chen, Huimin Chen, Yukun Yan, Yuxuan Chen, Zhenghao Liu, Zhiyuan Liu, Maosong Sun:

PersLLM: A Personified Training Approach for Large Language Models. 8598-8617 - Zihao Guo, Hongtao Lv, Chaoli Zhang, Yibowen Zhao, Yixin Zhang, Lizhen Cui:

The Illusion of Randomness: How LLMs Fail to Emulate Stochastic Decision-Making in Rock-Paper-Scissors Games? 8618-8637 - Mingrui Xie, Tianxiang Xu, Qianhai Tang, Shanming Yao, Xiaofeng Zhang, Junliang Du:

DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMs. 8638-8649 - Alina Fastowski, Bardh Prenkaj, Gjergji Kasneci:

From Confidence to Collapse in LLM Factual Robustness. 8650-8667 - Yifei Xu, Yingjie Zong, Wang Zhonghua, Sirui Wu, Yuan Rao, Dan Zhang, Shuiguang Deng:

CtrlNews: LLM-based Multi-Agent Controllable News Writing via Knowledge Gravitational Field. 8668-8705 - Zhirui Chen, Wei Shen, Jiashui Huang, Ling Shao:

Joint Enhancement of Relational Reasoning for Long-Context LLMs. 8706-8720 - Yue Qiu, Yujan Ting, Pei Dong, Terrence Chen, Weijing Huang:

Training Medical QA Models Based on Mixed Rewards from Multiple-Choice and Open-Ended Questions. 8721-8729 - Chang Yang, Peng Zhang, Jing Zhang, Hui Gao, Changhao Song:

Rethink Rumor Detection in the Era of LLMs: A Review. 8730-8749 - Dongwon Noh, Donghyeok Koh, Junghun Yuk, Gyuwan Kim, Jae Yong Lee, KyungTae Lim, Cheoneum Park:

ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts. 8750-8782 - Jungyeon Lee, Kangmin Lee, Taeuk Kim:

MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation. 8783-8803 - Qingyun Jin, Xiaohui Song, Feng Zhou, Zengchang Qin:

Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA. 8804-8816 - Nuo Chen, Yufei Gao, Yongnan Jin, Yan Hu, Anningzhe Gao, Lingyong Yan, Benyou Wang:

DRBO: Mitigating Short Board Effect via Dynamic Reward Balancing in Multi-reward LLM Optimization. 8817-8841 - Mingkang Zhu, Xi Chen, Zhongdao Wang, Bei Yu, Hengshuang Zhao, Jiaya Jia:

Enhancing LLM Knowledge Learning through Generalization. 8842-8855 - Mingyang Song, Mao Zheng, Zheng Li, Wenjie Yang, Xuan Luo:

FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models. 8856-8866 - Mehmet Selman Baysan, Tunga Gungor:

TR-MTEB: A Comprehensive Benchmark and Embedding Model Suite for Turkish Sentence Representations. 8867-8887 - Wenzheng Zhang, Xi Victoria Lin, Karl Stratos, Wen-tau Yih, Mingda Chen:

ImpRAG: Retrieval-Augmented Generation with Implicit Queries. 8888-8900 - Yifu Huo, Chenglong Wang, Qiren Zhu, Shunjie Xing, Tong Xiao, Chunliang Zhang, Tongran Liu, JingBo Zhu:

HEAL: A Hypothesis-Based Preference-Aware Analysis Framework. 8901-8919 - Akash Ghosh, Debayan Datta, Sriparna Saha, Chirag Agarwal:

A Survey of Multilingual Reasoning in Language Models. 8920-8936 - Qi Xu, Qian Liu, Hao Fei, Hang Yu, Shuhao Guan, Xiao Wei:

CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs. 8937-8953 - Shuo Huang, William MacLean, Xiaoxi Kang, Qiongkai Xu, Zhuang Li, Xingliang Yuan, Gholamreza Haffari, Lizhen Qu:

NAP2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human. 8954-8970 - Long Li, Weiwen Xu, Jiayan Guo, Ruochen Zhao, Xingxuan Li, Yuqian Yuan, Boqiang Zhang, Yuming Jiang, Yifei Xin, Ronghao Dang, Yu Rong, Deli Zhao, Tian Feng, Lidong Bing:

Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents. 8971-9004 - Chuan Wu, Meng Su, Youxuan Fang, Shaolin Zhu:

Unveiling Multimodal Processing: Exploring Activation Patterns in Multimodal LLMs for Interpretability and Efficiency. 9005-9016 - Jinyu Xiang, Jiayi Zhang, Zhaoyang Yu, Xinbing Liang, Fengwei Teng, Jinhao Tu, Fashen Ren, Xiangru Tang, Sirui Hong, Chenglin Wu, Yuyu Luo:

Self-Supervised Prompt Optimization. 9017-9041 - Lukasz Grzybowski, Jakub Pokrywka, Michal Ciesiólka, Jeremi Kaczmarek, Marek Kubis:

Polish-English medical knowledge transfer: A new benchmark and results. 9042-9063 - Nandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin:

Hard Negatives, Hard Lessons: Revisiting Training Data Quality for Robust Information Retrieval with LLMs. 9064-9083 - Jie Gong, Biaoshuai Zheng, Qiwang Hu:

EventRelBench: A Comprehensive Benchmark for Evaluating Event Relation Understanding in Large Language Models. 9084-9099 - Liang Cheng, Tianyi Li, Zhaowei Wang, Mark Steedman:

S2LPP: Small-to-Large Prompt Prediction across LLMs. 9100-9115 - Weikai Xie, Li Zhang, Shihe Wang, Rongjie Yi, Mengwei Xu:

DroidCall: A Dataset for LLM-powered Android Intent Invocation. 9116-9134 - Yirong Zeng, Xiao Ding, Yutai Hou, Yuxian Wang, Li Du, Juyi Dai, Qiuyang Ding, Duyu Tang, Dandan Tu, Weiwen Liu, Bing Qin, Ting Liu:

Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch. 9135-9147 - Yuanlei Wang, Liuzhou Zhang, Haohao Luo, Ying Shen:

INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI Agent. 9148-9160 - Juraj Vladika, Mahdi Dhaini, Florian Matthes:

Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models. 9161-9174 - Shuo Huang, Xingliang Yuan, Gholamreza Haffari, Lizhen Qu:

Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search. 9175-9190 - Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim:

KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval. 9191-9217 - Yunsoo Kim, Michal W. S. Ong, Alex Shavick, Honghan Wu, Adam P. Levine:

HARE: an entity and relation centric evaluation framework for histopathology reports. 9218-9233 - Rishanth Rajendhran, Amir Zadeh, Matthew Sarte, Chuan Li, Mohit Iyyer:

VeriFastScore: Speeding up long-form factuality evaluation. 9234-9259 - Md. Tanzib Hosain, Md. Kishor Morol:

B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models. 9260-9274 - Nitesh Kumar, Usashi Chatterjee, Steven Schockaert:

Extracting Conceptual Spaces from LLMs Using Prototype Embeddings. 9275-9298 - Ziyi Zhang, Zhen Sun, Zongmin Zhang, Jihui Guo, Xinlei He:

FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts. 9299-9316 - Jonas Waldendorf, Barry Haddow, Alexandra Birch, Mateusz Klimaszewski:

Multilingual Data Filtering using Synthetic Data from Large Language Models. 9317-9334 - Samir Abdaljalil, Filippo Pallucchini, Andrea Seveso, Hasan Kurban, Fabio Mercorio, Erchin Serpedin:

SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs. 9335-9346 - Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra:

Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment. 9347-9364 - Gemma Boleda:

LLMs as a synthesis between symbolic and distributed approaches to language. 9365-9379 - Yujia Chen, Changsong Li, Yiming Wang, Tianjie Ju, Qingqing Xiao, Nan Zhang, Zifan Kong, Peng Wang, Binyu Yan:

MIND: Towards Immersive Psychological Healing with Multi-Agent Inner Dialogue. 9380-9413 - Davood Wadi, Marc Fredette:

A Monte-Carlo Sampling Framework For Reliable Evaluation of Large Language Models Using Behavioral Analysis. 9414-9432 - Yi Su, Jiayi Zhang, Shu Yang, Xinhai Wang, Lijie Hu, Di Wang:

Understanding How Value Neurons Shape the Generation of Specified Values in LLMs. 9433-9452 - Momose Oyama, Ryo Kishino, Hiroaki Yamagiwa, Hidetoshi Shimodaira:

Likelihood Variance as Text Importance for Resampling Texts to Map Language Models. 9453-9465 - Hoang Phan, Victor Li, Qi Lei:

Think Twice, Generate Once: Safeguarding by Progressive Self-Reflection. 9466-9483 - Chang Yang, Xinrun Wang, Qinggang Zhang, Qi Jiang, Xiao Huang:

Efficient Integration of External Knowledge to LLM-based World Models via Retrieval-Augmented Generation and Reinforcement Learning. 9484-9501 - Tyler Loakman, William Thorne, Chenghua Lin:

Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes. 9502-9518 - Iago Alves Brito, Julia Soares Dollis, Fernanda Bufon Färber, Pedro Schindler Freire Brasil Ribeiro, Rafael Teixeira Sousa, Arlindo Rodrigues Galvão Filho:

Modeling, Evaluating, and Embodying Personality in LLMs: A Survey. 9519-9532 - Shanshan Wang, Junchao Wu, Fengying Ye, Derek F. Wong, Jingming Yao, Lidia S. Chao:

Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry. 9533-9552 - Pooja Singh, Shashwat Bhardwaj, Vaibhav Sharma, Sandeep Kumar:

Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus. 9553-9579 - Mete Ismayilzada, Antonio Laverghetta Jr., Simone Luchini, Reet Patel, Antoine Bosselut, Lonneke van der Plas, Roger E. Beaty:

Creative Preference Optimization. 9580-9609 - Zhuo Liu, Moxin Li, Xun Deng, Qifan Wang, Fuli Feng:

Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge. 9610-9631 - Changle Qu, Sunhao Dai, Hengyi Cai, Yiyang Cheng, Jun Xu, Shuaiqiang Wang, Dawei Yin:

Uplift-RAG: Uplift-Driven Knowledge Preference Alignment for Retrieval-Augmented Generation. 9632-9644 - Yuhang Wu, Yu-Jie Xiong, Hao Zhang, Jia-Chen Zhang, Zheng Zhou:

Sugar-Coated Poison: Benign Generation Unlocks Jailbreaking. 9645-9665 - Zhaowei Wang, Hongming Zhang, Tianqing Fang, Ye Tian, Yue Yang, Kaixin Ma, Xiaoman Pan, Yangqiu Song, Dong Yu:

DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes. 9666-9686 - Joykirat Singh, Subhabrata Dutta, Tanmoy Chakraborty:

Data-scarce Behavior Editing of Language Models. 9687-9701 - Dongwei Wang, Zijie Liu, Song Wang, Yuxin Ren, Jianing Deng, Jingtong Hu, Tianlong Chen, Huanrui Yang:

FIER: Fine-Grained and Efficient KV Cache Retrieval for Long-context LLM Inference. 9702-9713 - Massa Baali, Sarthak Bisht, Francisco Teixeira, Kateryna Shapovalenko, Rita Singh, Bhiksha Raj:

SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions. 9714-9731 - Massa Baali, Xiang Li, Hao Chen, Syed Abdul Hannan, Rita Singh, Bhiksha Raj:

CAARMA: Class Augmentation with Adversarial Mixup Regularization. 9732-9742 - Li Siyan, Zhen Xu, Vethavikashini Chithrra Raghuram, Xuanming Zhang, Renzhe Yu, Zhou Yu:

Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants' Question-Answering in Asynchronous Learning Environments. 9743-9774 - Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch:

Demystifying Multilingual Reasoning in Process Reward Modeling. 9775-9788 - Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, Mingyu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park:

BehaviorSFT: Behavioral Token Conditioning for Health Agents Across the Proactivity Spectrum. 9789-9817 - Ho Yin Sam Ng, Edward Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee, Tong Yu, Sungchul Kim, Ryan A. Rossi, Ting-Hao Kenneth Huang:

LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles. 9818-9832 - Weitao Li, Xiangyu Zhang, Kaiming Liu, Xuanyu Lei, Weizhi Ma, Yang Liu:

Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation. 9833-9849 - Guy Mor-Lan, Naama Rivlin-Angert, Yael R. Kaplan, Tamir Sheafer, Shaul R. Shenhav:

HebID: Detecting Social Identities in Hebrew-language Political Text. 9850-9870 - Jeongsoo Choi, Jaehun Kim, Joon Son Chung:

Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing. 9871-9881 - Islam Eldifrawi, Shengrui Wang, Amine Trabelsi:

FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-Checking. 9882-9901 - Alexander Gill, Abhilasha Ravichander, Ana Marasovic:

What Has Been Lost with Synthetic Evaluation? 9902-9945 - Dongyu Zhang, Qingqing Hong, Bingxuan Hou, Jiayi Lin, Chenyang Zhang, Jialin Li, Junli Wang:

Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State. 9946-9959 - Pedro Schindler Freire Brasil Ribeiro, Iago Alves Brito, Rafael Teixeira Sousa, Fernanda Bufon Färber, Julia Soares Dollis, Arlindo Rodrigues Galvão Filho:

Proxy Barrier: A Hidden Repeater Layer Defense Against System Prompt Leakage and Jailbreaking. 9960-9975 - Hamdy Mubarak, Abubakr Mohamed, Majd Hawasly:

AraSafe: Benchmarking Safety in Arabic LLMs. 9976-9992 - Alberto Muñoz-Ortiz, David Vilares, Caio Corro, Carlos Gómez-Rodríguez:

Nested Named Entity Recognition as Single-Pass Sequence Labeling. 9993-10002 - Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Alexander Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran:

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations. 10003-10039 - Zhengxiang Wang, Nafis Irtiza Tripto, Solha Park, Zhenzhen Li, Jiawei Zhou:

Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors. 10040-10055 - Elaf Alhazmi, Quan Z. Sheng, Wei Emma Zhang, Mohammed I. Thanoon, Haojie Zhuang, Behnaz Soltani, Munazza Zaib:

Fine-Tuning Encoder-Decoder Models with Contrastive Learning for In-Context Distractor Generation. 10056-10072 - Siyi Liu, Dan Roth:

Conflicts in Texts: Data, Implications and Challenges. 10073-10091 - Wenbo Zhang, Zihang Xu, Hengrui Cai:

Recognizing Limits: Investigating Infeasibility in Large Language Models. 10092-10112 - Zhihui Zhang, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang:

VQA-Augmented Machine Translation with Cross-Modal Contrastive Learning. 10113-10124 - Zixin Guo, Jiayang Sun, Tzu-Jui Julius Wang, Abduljalil Radman, Selen Pehlivan, Min Cao, Jorma Laaksonen:

Learning to Describe Implicit Changes: Noise-robust Pre-training for Image Difference Captioning. 10125-10145 - Zichen Yuan, Lifan Sun, Yucen Zhuang, Yue Wang, Xinyuan Song, Tianqi Xu, Siyuan Li, Junchen Fu, Youhua Li, Sirui Hong, Jiaqi Chen, Joemon M. Jose, Yongxin Ni:

SOLAR: Serendipity Optimized Language Model Aligned for Recommendation. 10146-10169 - Qiuhai Zeng, Claire Jin, Xinyue Wang, Yuhan Zheng, Qunhua Li:

AIRepr: An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science. 10170-10201 - Ye Yang, Donghe Li, Zuchen Li, Fengyuan Li, Jingyi Liu, Li Sun, Qingyu Yang:

MisinfoBench: A Multi-Dimensional Benchmark for Evaluating LLMs' Resilience to Misinformation. 10202-10229 - Ping Chen, Xiang Li, Zhaoxiang Liu, Zezhou Chen, Xingpeng Zhang, Huan Hu, Zipeng Wang, Kai Wang, Shuming Shi, Shiguo Lian:

Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity. 10230-10240 - Yan Liu, Minghui Zhang, Bojian Xiong, Yifan Xiao, Yinong Sun, Yating Mei, Longyu Zeng, Jingchao Yang, Yang Wang, Deyi Xiong:

HighMATH: Evaluating Math Reasoning of Large Language Models in Breadth and Depth. 10241-10253 - Mingyu Chen, Jingkai Lin, Zhaojie Chu, Xiaofen Xing, Yirong Chen, Xiangmin Xu:

CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI Counseling. 10254-10286 - Debanjan Goswami, Ronast Subedi, Shayok Chakraborty:

MediVLM: A Vision Language Model for Radiology Report Generation from Medical Images. 10287-10304 - Yinghao Song, Xiangji Zeng, Shuai Cui, Lu Sun, Zhaowei Liu, Yuan Yuan, Yulu Wang, Hai Zhou, Zhaohan Gong:

AdDriftBench: A Benchmark for Detecting Data Drift and Label Drift in Short Video Advertising. 10305-10321 - Prawaal Sharma, Poonam Goyal, Navneet Goyal, Vidisha Sharma:

NIM: Neuro-symbolic Ideographic Metalanguage for Inclusive Communication. 10322-10340 - Zikang Liu, Kun Zhou, Wayne Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen:

ViFT: Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models. 10341-10366 - Jian Wang, Xiaofei Xie, Qiang Hu, Shangqing Liu, Yi Li:

Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models. 10367-10385 - Zikai Xiao, Fei Huang, Jianhong Tu, Jianhui Wei, Wen Ma, Yuxuan Zhou, Jian Wu, Bowen Yu, Zuozhu Liu, Junyang Lin:

LongWeave: A Long-Form Generation Benchmark Bridging Real-World Relevance and Verifiability. 10386-10417 - Vivek Iyer, Pinzhen Chen, Ricardo Rei, Alexandra Birch:

XL-Suite: Cross-Lingual Synthetic Training and Evaluation Data for Open-Ended Generation. 10418-10432 - Seyyed Saeid Cheshmi, Azal Ahmad Khan, Xinran Wang, Zirui Liu, Ali Anwar:

Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling. 10433-10447 - Xinyu Zhang, Pei Zhang, Shuang Luo, Jialong Tang, Yu Wan, Baosong Yang, Fei Huang:

CultureSynth: A Hierarchical Taxonomy-Guided and Retrieval-Augmented Framework for Cultural Question-Answer Synthesis. 10448-10467 - Zhu Wang, Homaira Huda Shomee, Sathya N. Ravi, Sourav Medya:

DesignCLIP: Multimodal Learning with CLIP for Design Patent Understanding. 10468-10490 - Yuan Li, Qi Luo, Xiaonan Li, Bufan Li, Qinyuan Cheng, Bo Wang, Yining Zheng, Yuxin Wang, Zhangyue Yin, Xipeng Qiu:

R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning. 10491-10507 - Sunwoo Kim, Soo Yong Lee, Jaemin Yoo, Kijung Shin:

'Hello, World!': Making GNNs Talk with LLMs. 10508-10526 - Dingjie Song, Sicheng Lai, Mingxuan Wang, Shunian Chen, Lichao Sun, Benyou Wang:

Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM. 10527-10542 - Aritra Dutta, Swapnanil Mukherjee, Deepanway Ghosal, Somak Aditya:

NLKI: A Lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA Tasks. 10543-10563 - Yanhong Li, Zixuan Lan, Jiawei Zhou:

Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text Inputs. 10564-10578 - Kyubyung Chae, Gihoon Kim, Gyuseong Lee, Taesup Kim, Jaejin Lee, Heejin Kim:

Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMs. 10579-10600 - Van Dai Do, Quan Hung Tran, Ahmed Kirmani, Lu Zhang, Hung Le:

Sample Efficient Alignment Learning With Episodic Control. 10601-10618 - ChaeHun Park, Hojun Cho, Jaegul Choo:

Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts. 10619-10627 - Seonho Lee, Jiho Choi, Inha Kang, Jiwook Kim, Junsung Park, Hyunjung Shim:

3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation. 10628-10647 - Jivnesh Sandhan, Fei Cheng, Tushar Sandhan, Yugo Murawaki:

CAPE: Context-Aware Personality Evaluation Framework for Large Language Models. 10648-10662 - Kangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang, Mengmeng Yang, Zheng Fu, Jinyu Miao, Yining Shi, He Zhe Lim, Li Liu, Tianbao Zhou, Hongyi Wang, Huang Yu, Yifei Hu, Guang Li, Guang Chen, Hao Ye, Lijun Sun, Diange Yang:

AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving. 10663-10682 - Bolei He, Xinran He, Run Shao, Shanfu Shu, Xianwei Xue, Mingquan Cheng, Haifeng Li, Zhen-Hua Ling:

Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering. 10683-10703 - Beom Jin Kang, Hyun Kim:

GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation. 10704-10718 - Xueyi Zhou, Qi Lu, Dong-Kyu Chae:

"Where Does This Strange Smell Come from?": Enabling Conversational Interfaces for Artificial Olfaction. 10719-10745 - Zirui Guo, Lianghao Xia, Yanhua Yu, Tu Ao, Chao Huang:

LightRAG: Simple and Fast Retrieval-Augmented Generation. 10746-10761 - Taehee Jeon:

Beyond Distribution: Investigating Language Models' Understanding of Sino-Korean Morphemes. 10762-10772 - Qi Yang, Jingjie Zeng, Liang Yang, Kai Ma, Hongfei Lin:

Sarcasm-R1: Enhancing Sarcasm Detection through Focused Reasoning. 10773-10785 - Guangwei Zhang, Qisheng Su, Jiateng Liu, Cheng Qian, Yanzhou Pan, Yanjie Fu, Denghui Zhang:

ISACL: Internal State Analyzer for Copyrighted Training Data Leakage. 10786-10807 - Zhenglin Hua, Jinghan He, Zijun Yao, Tianxu Han, Haiyun Guo, Yuheng Jia, Junfeng Fang:

Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation. 10808-10828 - Junteng Liu, Weihao Zeng, Xiwen Zhang, Yijun Wang, Zifei Shan, Junxian He:

On the Perception Bottleneck of VLMs for Chart Understanding. 10829-10841 - Sijia Cui, Aiyao He, Shuai Xu, Hongming Zhang, Yanna Wang, Qingyang Zhang, Yajing Wang, Bo Xu:

Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall. 10842-10854 - Yuxin Huang, Simeng Wu, Ran Song, Yan Xiang, Yantuan Xian, Shengxiang Gao, Zhengtao Yu:

Multilingual Generative Retrieval via Cross-lingual Semantic Compression. 10855-10866 - Hui Huang, Julien Velcin, Yacine Kessaci:

Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation. 10867-10881 - Cunli Mao, Xiaofei Gao, Ran Song, Shizhu He, Shengxiang Gao, Kang Liu, Zhengtao Yu:

Multilingual Knowledge Graph Completion via Efficient Multilingual Knowledge Sharing. 10882-10896 - Nakyung Lee, Yeongoon Kim, Minhae Oh, Suhwan Kim, Jin Woo Koo, Hyewon Jo, Jungwoo Lee:

Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation. 10897-10912 - Zhuang Yu, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang:

Imagination and Contemplation: A Balanced Framework for Semantic-Augmented Multimodal Machine Translation. 10913-10928 - Yuqing Zhang, Ecesu Ürker, Tessa Verhoef, Gemma Boleda, Arianna Bisazza:

NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language Use. 10929-10945 - Auguste Poiroux, Antoine Bosselut, Viktor Kuncak:

RLMEval: Evaluating Research-Level Neural Theorem Proving. 10946-10957 - Ranran Bu, Jian Cao, Jianqi Gao, Shiyou Qian, Hongming Cai:

KaeDe: Progressive Generation of Logical Forms via Knowledge-Aware Question Decomposition for Improved KBQA. 10958-10973 - Jen-tse Huang, Yuhang Yan, Linqi Liu, Yixin Wan, Wenxuan Wang, Kai-Wei Chang, Michael R. Lyu:

Where Fact Ends and Fairness Begins: Redefining AI Bias Evaluation through Cognitive Biases. 10974-10993 - Junyi Chen, Mengjia Wu, Qian Liu, Jing Sun, Ying Ding, Yi Zhang:

Equal Truth: Rumor Detection with Invariant Group Fairness. 10994-11007 - Geunyeong Jeong, Juoh Sun, Seonghee Lee, Harksoo Kim:

STEAM: A Semantic-Level Knowledge Editing Framework for Large Language Models. 11008-11023 - Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang:

SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models. 11024-11039 - Xiyan Fu, Wei Liu:

How Reliable is Multilingual LLM-as-a-Judge? 11040-11053 - Qingsong Wang, Tao Wu, Wang Lin, Yueying Feng, Gongsheng Yuan, Chang Yao, Jingyuan Chen:

Cognitive-Level Adaptive Generation via Capability-Aware Retrieval and Style Adaptation. 11054-11069 - Essa Jan, Moiz Ali, Muhammad Saram Hassan, Muhammad Fareed Zaffar, Yasir Zaki:

Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs. 11070-11077 - Dekun Wu, Frederik Brudy, Bang Liu, Yi Wang:

INDOORWORLD: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent Environment. 11078-11099 - Zeyu Zhang, Tianqi Cheng, Yuki Todo:

ARXSA: A General Negative Feedback Control Theory in Vision-Language Models. 11100-11110 - Xingcheng Ruan, Haoxiang Geng, Yunhui Xia, Bingran Zhao:

Breaking the Attention Trap in Code LLMs: A Rejection Sampling Approach to Enhance Code Execution Prediction. 11111-11120 - Shijie Zhang, Renhao Li, Songsheng Wang, Philipp Koehn, Min Yang, Derek F. Wong:

HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation. 11121-11145 - Gili Lior, Eliya Habba, Shahar Levy, Avi Caciularu, Gabriel Stanovsky:

ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments. 11146-11153 - Rares Dolga, Lucas Maystre, Tudor Berariu, David Barber:

From Characters to Tokens: Dynamic Grouping with Hierarchical BPE. 11154-11162 - Lei Shen, Xiaoyu Shen:

Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant. 11163-11174 - Or Shachar, Uri Katz, Yoav Goldberg, Oren Glickman:

NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings. 11175-11186 - Wenyang Luo, Wayne Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen:

MMATH: A Multilingual Benchmark for Mathematical Reasoning. 11187-11202 - Rrubaa Panchendrarajan, Rubén Míguez Pérez, Arkaitz Zubiaga:

MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters. 11203-11215 - Yongqiang Liu, Qiyao Peng, Binrong Liu, Hongtao Liu, XueWei Li, Wenjun Wang:

DS-MHP: Improving Chain-of-Thought through Dynamic Subgraph-Guided Multi-Hop Path. 11216-11230 - Robin Algayres, Charles-Éric Saint-James, Mahi Luthra, Jiayi Shen, Youssef Benchekroun, Dongyan Lin, Rashel Moritz, Juan Pino, Emmanuel Dupoux:

LongTail-Swap: benchmarking language models' abilities on rare words. 11231-11251 - Xiang Li, Xianfu Cheng, Dezhuang Miao, Xiaoming Zhang, Zhoujun Li:

TF-Mamba: Text-enhanced Fusion Mamba with Missing Modalities for Robust Multimodal Sentiment Analysis. 11252-11267 - Manon Reusens, Bart Baesens, David Jurgens:

Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs. 11268-11287 - Mohamad Ballout, Okajevo Wilfred, Seyedalireza Yaghoubi, Nohayr Abdelmoneim, Julius Mayer, Elia Bruni:

Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs. 11288-11309 - Sebastian Steindl, Fabian Brunner, Nada Sissouno, Dominik Schwagerl, Florian Schöler-Niewiera, Ulrich Schäfer:

On the Effectiveness of Prompt-Moderated LLMs for Math Tutoring at the Tertiary Level. 11310-11323 - Hairu Wang, Yuan Feng, Yukun Cao, Xike Xie, S. Kevin Zhou:

SkewRoute: Training-Free LLM Routing for Knowledge Graph Retrieval-Augmented Generation via Score Skewness of Retrieved Context. 11324-11340 - Daniel Braun:

Acquiescence Bias in Large Language Models. 11341-11355 - Niv Eckhaus, Uri Berger, Gabriel Stanovsky:

Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games. 11356-11368 - Matthieu Dubois, François Yvon, Pablo Piantanida:

How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study. 11369-11387 - Sebastian Steindl, André Kestler, Ulrich Schäfer, Bernd Ludwig:

An Improved, Strong Baseline for Pre-Trained Large Language Models as Task-Oriented Dialogue Systems. 11388-11398 - Marah Ghoummaid, Vladimir Tchuiev, Ofek Glick, Michal Moshkovitz, Dotan Di Castro:

MATCH: Task-Driven Code Evaluation through Contrastive Learning. 11399-11414 - Longfei Zuo, Pingjun Hong, Oliver Kraus, Barbara Plank, Robert Litschko:

Evaluating Large Language Models for Cross-Lingual Retrieval. 11415-11429 - Xiang Li, Keyu Yao, Gang Shen:

SGCD: Subtask-Guided Causal-Debiasing Framework for Robust Cross-Utterance Sentiment Quadruple Extraction in Dialogues. 11430-11440 - Erfan Zinvandi, Morteza Alikhani, Mehran Sarmadi, Zahra Pourbahman, Sepehr Arvin, Reza Kazemi, Arash Amini:

FaMTEB: Massive Text Embedding Benchmark in Persian Language. 11441-11468 - Kazuma Kobayashi, Zhen Wan, Fei Cheng, Tsuta Yuma, Xin Zhao, Junfeng Jiang, Jiahao Huang, Zhiyi Huang, Yusuke Oda, Rio Yokota, Yuki Arase, Daisuke Kawahara, Akiko Aizawa, Sadao Kurohashi:

Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training. 11469-11488 - Hu Xu, Zeyan Li, Rui Wang, Jianfeng Xu:

Structure Trumps Size: Rethinking Data Quality for LLM Reasoning. 11489-11513 - Prerna Agarwal, Srikanta Bedathur:

A Zero-Shot Neuro-Symbolic Approach for Complex Knowledge Graph Question Answering. 11514-11527 - Shuyang Hao, Yiwei Wang, Bryan Hooi, Jun Liu, Muhao Chen, Zi Huang, Yujun Cai:

Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization. 11528-11543 - Hyomin Kim, Yunhui Jang, Sungsoo Ahn:

MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization. 11544-11573 - Qiyao Peng, Hongtao Liu, Hua Huang, Jian Yang, Qing Yang, Minglai Shao:

A Survey on LLM-powered Agents for Recommender Systems. 11574-11583 - Xuan Ren, Qi Chen, Lingqiao Liu:

Efficiently Selecting Response Generation Strategies for Synthetic Data Construction by Self-Aligned Perplexity. 11584-11605 - Rubing Chen, Jiaxin Wu, Jian Wang, Xulu Zhang, Wenqi Fan, Chenghua Lin, Xiaoyong Wei, Li Qing:

Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond. 11606-11619 - Chihiro Yano, Kosuke Yamada, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda:

FrameEOL: Semantic Frame Induction using Causal Language Models. 11620-11632 - Di Wu Hebeu, Zhizhi Yu:

CaTER: A Framework for Context-aware Topology Entity Retrieval Contrastive Learning in End-to-End Task-Oriented Dialogue Systems. 11633-11648 - Feiyu Wang, Ziran Zhao, Dong Yu, Pengyuan Liu:

Attribution and Application of Multiple Neurons in Multimodal Large Language Models. 11649-11662 - Elisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, Julia Belikova:

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA. 11663-11682 - Yiyang Feng, Yichen Wang, Shaobo Cui, Boi Faltings, Mina Lee, Jiawei Zhou:

Unraveling Misinformation Propagation in LLM Reasoning. 11683-11707 - Qingsong Lv, Yangning Li, Zihua Lan, Zishan Xu, Jiwei Tang, Tingwei Lu, Yinghui Li, Wenhao Jiang, Hong-Gee Kim, Hai-Tao Zheng, Philip S. Yu:

RAISE: Reinforced Adaptive Instruction Selection For Large Language Models. 11708-11723 - Yangning Li, Tingwei Lu, Yinghui Li, Yankai Chen, Wei-Chieh Huang, Wenhao Jiang, Hui Wang, Hai-Tao Zheng, Philip S. Yu:

Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning. 11724-11741 - Mingqian Zheng, Wenjia Hu, Patrick Zhao, Motahhare Eslami, Jena D. Hwang, Faeze Brahman, Carolyn Rose, Maarten Sap:

Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences. 11742-11772 - Zekun Zhou, Xiaocheng Feng, Lei Huang, Xiachong Feng, Ziyun Song, Ruihan Chen, Liang Zhao, Weitao Ma, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Ting Liu, Bing Qin:

From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems. 11773-11803 - Zhibo Xu, Jianhao Zhu, Jingwen Xu, Changze Lv, Zhenghua Wang, Zisu Huang, Xiaohua Wang, Muling Wu, Qi Qian, Xiaoqing Zheng, Xuanjing Huang:

Enhancing Model Privacy in Federated Learning with Random Masking and Quantization. 11804-11816 - Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, Rossella Arcucci:

SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning. 11817-11844 - Pala Tej Deep, Vernon Toh, Rishabh Bhardwaj, Soujanya Poria:

Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique. 11845-11860 - Wen-Han Hsieh, Elvis Hsieh, Dantong Niu, Trevor Darrell, Roei Herzig, David M. Chan:

Do What? Teaching Vision-Language-Action Models to Reject the Impossible. 11861-11869 - Chunhao Tian, Yutong Wang, Xuebo Liu, Zhexuan Wang, Liang Ding, Miao Zhang, Min Zhang:

AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration. 11870-11902 - Auss Abbood, Zaiqiao Meng, Nigel Collier:

Time to Revisit Exact Match. 11903-11926 - Liyao Li, Jiaming Tian, Hao Chen, Wentao Ye, Chao Ye, Haobo Wang, Ningtao Wang, Xing Fu, Gang Chen, Junbo Zhao:

LongTableBench: Benchmarking Long-Context Table Reasoning across Real-World Formats and Domains. 11927-11965 - Boyu Jia, Junzhe Zhang, Huixuan Zhang, Xiaojun Wan:

Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language Models. 11966-11981 - Matthieu Tehenan, Eric Chamoun, Andreas Vlachos:

MPTA: MultiTask Personalization Assessment. 11982-11992 - Matthieu Tehenan:

Semantic Geometry of Sentence Embeddings. 11993-12004 - Ruijun Chen, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang, Xiaojun Quan:

ReAlign: Structured Revision for Small Language Model Alignment. 12005-12020 - Huilin Deng, Ding Zou, Xinghao Zhao, Rui Ma, Yanming Guo, Yang Cao, Yu Kang:

Curr-ReFT: Overcoming Training Bottlenecks in Small-scale Vision-Language Models via Curriculum Reinforcement Finetuning. 12021-12032 - Yan-Lun Chen, Yi-Ru Wei, Chia-Yi Hsu, Chi-Yu Li, Chun-Ying Huang, Ying-Dar Lin, Yu-Sung Wu, Wei-Bin Lee:

Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge. 12033-12054 - Zihan Zhou, Simon Kurz, Zhixue Zhao:

Revisiting Pruning vs Quantization for Small Language Models. 12055-12070 - Xinzhe Xu, Liang Zhao, Hongshen Xu, Chen Chen:

CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis. 12071-12103 - Anagha Savit, Harikrishna Sahu, Shivank Shukla, Wei Xiong, Rampi Ramprasad:

polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design. 12104-12119 - Yangning Li, Weizhi Zhang, Yuyao Yang, Wei-Chieh Huang, Yaozu Wu, Junyu Luo, Yuanchen Bei, Henry Peng Zou, Xiao Luo, Yusheng Zhao, Chunkit Chan, Yankai Chen, Zhongfen Deng, Yinghui Li, Hai-Tao Zheng, Dongyuan Li, Renhe Jiang, Ming Zhang, Yangqiu Song, Philip S. Yu:

A Survey of RAG-Reasoning Systems in Large Language Models. 12120-12145 - Omar Sharif, Joseph Gatto, Madhusudan Basak, Sarah Masud Preum:

REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction. 12146-12168 - Enshi Zhang, Christian Poellabauer:

Mitigating Interviewer Bias in Multimodal Depression Detection: An Approach with Adversarial Learning and Contextual Positional Encoding. 12169-12188 - Yuqi Zhang, Yuchun Miao, Zuchao Li, Liang Ding:

AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak Defenders. 12189-12199 - Khanh-Tung Tran, Nguyet-Hang Vu, Barry O'Sullivan, Hoang D. Nguyen:

Disentangling Language Understanding and Reasoning Structures in Cross-lingual Chain-of-Thought Prompting. 12200-12206 - Andrei-Marius Avram, Ema-Ioana Banescu, Anda-Teodora Robea, Dumitru-Clementin Cercel, Mihaela-Claudia Cercel:

MoRoVoc: A Large Dataset for Geographical Variation Identification of the Spoken Romanian Language. 12207-12216 - Lance Ying, Ryan Truong, Katherine M. Collins, Cedegao E. Zhang, Megan Wei, Tyler Brooke-Wilson, Tan Zhi-Xuan, Lionel Wong, Joshua B. Tenenbaum:

Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-the-fly. 12217-12235 - Zaid Alyafeai, Maged Saeed AlShaibani, Bernard Ghanem:

MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs. 12236-12264 - Mugilan Ganesan, Shane Segal, Ankur Aggarwal, Nish Sinnadurai, Sean Lie, Vithursan Thangarasa:

MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models. 12265-12276 - Debarpan Bhattacharya, Apoorva Kulkarni, Sriram Ganapathy:

FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs. 12277-12295 - Siying Zhou, Yiquan Wu, Hui Chen, Xueyu Hu, Kun Kuang, Adam Jatowt, Chunyan Zheng, Fei Wu:

ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation. 12296-12323 - Yifei Yuan, Jiatong Li, Weijia Zhang, Mohammad Aliannejadi, Evangelos Kanoulas, Renjun Hu:

Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular Classification. 12324-12348 - Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, Dan Roth:

Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty. 12349-12375 - Konstantine Arkoudas, Serafim Batzoglou:

Stress-Testing the Reasoning Competence of Language Models With Formal Proofs. 12376-12394 - Chuyuan Li, Austin Xu, Shafiq Joty, Giuseppe Carenini:

Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization. 12395-12412 - Deema Alnuhait, Neeraja Kirtane, Muhammad Khalifa, Hao Peng:

FACTCHECKMATE: Preemptively Detecting and Mitigating Hallucinations in LMs. 12413-12428 - Fahim Faisal, Md Mushfiqur Rahman, Antonios Anastasopoulos:

Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties. 12429-12452 - Pushkar Shukla, Aditya Chinchure, Emily Diana, Alexander Tolbert, Kartik Hosanagar, Vineeth N. Balasubramanian, Leonid Sigal, Matthew A. Turk:

Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models. 12453-12472 - Yuchun Fan, Yilin Wang, Yongyu Mu, Lei Huang, Bei Li, Xiaocheng Feng, Tong Xiao, JingBo Zhu:

Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language Models. 12473-12500 - Sikun Guo, Amir Hassan Shariatmadari, Peng Wang, Albert Huang, Aidong Zhang:

InfAL: Inference Time Adversarial Learning for Improving Research Ideation. 12501-12522 - Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li:

Speculative Decoding for Multi-Sample Inference. 12523-12533 - Hangliang Ren:

LSRL: Process-Supervised GRPO on Latent Recurrent States Improves Mathematical Reasoning. 12534-12545 - Meinan Liu, Yunfang Dong, Xixian Liao, Bonnie Webber:

Multi-token Mask-filling and Implicit Discourse Relations. 12546-12560 - Bohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño-Peñuela, Elena Simperl:

Schema Generation for Large Knowledge Graphs Using Large Language Models. 12561-12580 - Yunhai Hu, Yilun Zhao, Chen Zhao, Arman Cohan:

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search. 12581-12597 - Xinyi Chen, Yifei Yuan, Jiaang Li, Serge J. Belongie, Maarten de Rijke, Anders Søgaard:

What if Othello-Playing Language Models Could See? 12598-12609 - Thomas Berkane, Marie-Laure Charpignon, Maimuna S. Majumder:

LLM-Based Web Data Collection for Research Dataset Creation. 12610-12622 - Shang Ma, Tianyi Ma, Jiahao Liu, Wei Song, Zhenkai Liang, Xusheng Xiao, Yanfang Ye:

PsyScam: A Benchmark for Psychological Techniques in Real-World Scams. 12623-12637 - Zhangming Li, Qinghao Hu, Yiqun Chen, Peisong Wang, Yifan Zhang, Jian Cheng:

LoRaDA: Low-Rank Direct Attention Adaptation for Efficient LLM Fine-tuning. 12638-12655 - Cheng Yan, Feng Zhao, Ruilin Zhao, Hong Zhang:

Inductive Reasoning on Few-Shot Knowledge Graphs with Task-Aware Language Models. 12656-12666 - Zi Yu, Shaoxiang Wang, Guozheng Li, Yu Zhang, Chi Harold Liu:

ForestCast: Open-Ended Event Forecasting with Semantic News Forest. 12667-12681 - Mohammad R. Rezaei, Reza Saadati Fard, Jayson Parker, Rahul G. Krishnan, Milad Lankarany:

Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge. 12682-12701 - Yang Cao, Sikun Yang, Yu-Jiu Yang, Lianyong Qi, Ming Liu:

Text Anomaly Detection with Simplified Isolation Kernel. 12702-12713 - Shin-nosuke Ishikawa, Masato Todo, Taiki Ogihara, Hirotsugu Ohba:

Idola Tribus of AI: Large Language Models tend to perceive order where none exists. 12714-12727 - Sungeun Hahm, Heejin Kim, Gyuseong Lee, Hyunji M. Park, Jaejin Lee:

Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments. 12728-12755 - Yaozu Wu, Dongyuan Li, Yankai Chen, Renhe Jiang, Henry Peng Zou, Wei-Chieh Huang, Yangning Li, Liancheng Fang, Zhen Wang, Philip S. Yu:

Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances, Resources, and Future Directions. 12756-12773 - Shohei Higashiyama, Masao Utiyama:

Comprehensive Evaluation on Lexical Normalization: Boundary-Aware Approaches for Unsegmented Languages. 12774-12799 - Huaming Du, Lei Yuan, Cancan Feng, Guisong Liu, Gang Kou, Carl Yang:

Explainable Text Classification with LLMs: Enhancing Performance through Dialectical Prompting and Explanation-Guided Training. 12800-12816 - Qing Wang, Xue Han, Jiahui Wang, Lehao Xing, Qian Hu, Lianlian Zhang, Chao Deng, Junlan Feng:

MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts. 12817-12828 - Ryan Shea, Zhou Yu:

AutoSpec: An Agentic Framework for Automatically Drafting Patent Specification. 12829-12840 - Hyeonseok Moon, Jaehyung Seo, Seonmin Koo, Jinsung Kim, Young-kyoung Ham, Jiwon Moon, Heuiseok Lim:

LimaCost: Data Valuation for Instruction Tuning of Large Language Models. 12841-12854 - Lanxin Bi, Yunqi Zhang, Luyi Wang, Yake Niu, Hui Zhao:

Two Challenges, One Solution: Robust Multimodal Learning through Dynamic Modality Recognition and Enhancement. 12855-12867 - Yuhan Kang, Yang Shi, Mei Wen, Jun He, Jianchao Yang, Zeyu Xue, Jing Feng, Xinwang Liu:

SwiftPrune: Hessian-Free Weight Pruning for Large Language Models. 12868-12879 - Yang Wu, Yifan Zhang, Yurong Wu, Yuran Wang, Junkai Zhang, Jian Cheng:

Training LLMs for Optimization Modeling via Iterative Data Synthesis and Structured Validation. 12880-12896 - Meina Chen, Yihong Tang, Kehai Chen:

Exploiting Prompt-induced Confidence for Black-Box Attacks on LLMs. 12897-12903 - Wei Huang, Anda Cheng, Zhao Zhang, Yinggui Wang:

DPF-CM: A Data Processing Framework with Privacy-Preserving Vector Databases for Chinese Medical LLMs Training and Deployment. 12904-12916 - Han Weng, Puzhen Wu, Longjie Cui, Yi Zhan, Boyi Liu, Yuanfeng Song, Dun Zeng, Yingxiang Yang, Qianru Zhang, Dong Huang, Xiaoming Yin, Yang Sun, Xing Chen:

Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward. 12917-12943 - Dan Zhu, Tianqiao Liu, Zitao Liu:

StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts. 12944-12954 - Chengyao Wen, Qiang Cheng, Shaofei Wang, Zhizhen Liu, Deng Zhao, Lei Liang:

Logic-Thinker: Teaching Large Language Models to Think more Logically. 12955-12969 - Chen Chen, Xinlong Hao, Weiwen Liu, Xu Huang, Xingshan Zeng, Shuai Yu, Dexun Li, Yuefeng Huang, Xiangcheng Liu, Xinzhi Wang, Wu Liu:

ACEBench: A Comprehensive Evaluation of LLM Tool Usage. 12970-12998 - Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai:

RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis. 12999-13011 - Wei Huang, Huang Wei, Yinggui Wang:

DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model Compression. 13012-13027 - Jianfeng Pan, Senyou Deng, Shaomang Huang:

CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning. 13028-13045 - Duo Xu, Hao Cheng, Xin Lin, Zhen Xie, Hao Henry Wang:

ChartM³: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension. 13046-13068 - Gayeon Jung, HyeonSeok Lim, Minjun Kim, Joon-ho Lim, KyungTae Lim, Hansaem Kim:

Can LLMs Truly Plan? A Comprehensive Evaluation of Planning Capabilities. 13069-13084 - Donghai Zhang, Shuangtao Yang, Xiaozheng Dong, Wei Song, Bo Fu:

MARIO-0.5B: A Multi-Agent Lightweight Model for Real-Time Open Information Extraction in Low-Resource Settings. 13085-13094 - Xiaotian Wang, Takehito Utsuro, Masaaki Nagata:

BiMax: Bidirectional MaxSim Score for Document-Level Alignment. 13095-13116 - Zirui Li, Siwei Wu, Yizhi Li, Xingyu Wang, Yi Zhou, Chenghua Lin:

DocMMIR: A Framework for Document Multi-modal Information Retrieval. 13117-13130 - Hailay Kidu Teklehaymanot, Dren Fazlija, Wolfgang Nejdl:

MoVoC: Morphology-Aware Subword Construction for Ge'ez Script Languages. 13131-13144 - Kehang Jia, Juntao Li, Xiaobo Liang, Yisheng Xiao, Yixuan Yang, Min Zhang:

MMA: Cross-Domain Knowledge Integration via Mixture of Multi-Domain Agents. 13145-13160 - Seonmin Koo, Jinsung Kim, Chanjun Park, Heuiseok Lim:

HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts. 13161-13184 - Hao Zhang, Bo Huang, Zhenjia Li, Xi Xiao, Hui Yi Leong, Zumeng Zhang, Xinwei Long, Tianyang Wang, Hao Xu:

Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models. 13185-13199 - Yang Wu, Huayi Zhang, Yizheng Jiao, Lin Ma, Xiaozhong Liu, Jinhong Yu, Dongyu Zhang, Dezhi Yu, Wei Xu:

ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning. 13200-13219 - Nishant Subramani, Alfredo Gomez, Mona T. Diab:

SimBA: Simplifying Benchmark Analysis Using Performance Matrices Alone. 13220-13233 - Anuj Kumar, Mohammed Faisal Sayed, Satyadev Ahlawat, Yamuna Prasad:

MarathiEmoExplain: A Dataset for Sentiment, Emotion, and Explanation in Low-Resource Marathi. 13234-13243 - Yang Wu, Raha Moraffah, Rujing Yao, Jinhong Yu, Zhimin Tao, Xiaozhong Liu:

Active Domain Knowledge Acquisition with 100-Dollar Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive Domains. 13244-13257 - Mengyang Chen, Lingwei Wei, Wei Zhou, Songlin Hu:

Structure-aware Propagation Generation with Large Language Models for Fake News Detection. 13258-13272 - Sangmin Lee, Woojin Chung, Seyun Um, Hong-Goo Kang:

UniCoM: A Universal Code-Switching Speech Generator. 13273-13288 - Yunhai Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, Sai Qian Zhang:

Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models. 13289-13304 - Nathan Inkiriwang, Necva Bölücü, Garth Tarr, Maciej Rybinski:

Do We Really Need All Those Dimensions? An Intrinsic Evaluation Framework for Compressed Embeddings. 13305-13323 - Zitao Wang, Xinyi Wang, Wei Hu:

Mixture of LoRA Experts for Continual Information Extraction with LLMs. 13324-13339 - Tatsuya Hiraoka, Kentaro Inui:

Spelling-out is not Straightforward: LLMs' Capability of Tokenization from Token to Characters. 13340-13353 - He Zhu, Tianrui Qin, King Zhu, Heyuan Huang, Yeyi Guan, Jinxiang Xia, Hanhao Li, Yi Yao, Ningning Wang, Pai Liu, Tianhao Peng, Xin Gui, Xiaowan Li, Yuhui Liu, Xiangru Tang, Jian Yang, Ge Zhang, Xitong Gao, Yuchen Eleanor Jiang, Changwang Zhang, Jun Wang, Jiaheng Liu, Wangchunshu Zhou:

OAgents: An Empirical Study of Building Effective Agents. 13354-13369 - Vildan Saburov, Daniil Vodolazsky, Danil Sazanakov, Alena Fenogenova:

2Columns1Row: A Russian Benchmark for Textual and Multimodal Table Understanding and Reasoning. 13370-13389 - Wenrui Bao, Kai Wang, Siqiang Luo, Xiang Li:

Permitted Knowledge Boundary: Evaluating the Knowledge-Constrained Responsiveness of Large Language Models. 13390-13405 - Sriram Balasubramanian, Samyadeep Basu, Soheil Feizi:

A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models. 13406-13439 - Geng Zhang, Yizhou Ying, Sihang Jiang, Jiaqing Liang, Guanglei Yue, Yifei Fu, Hailin Hu, Yanghua Xiao:

From Remembering to Metacognition: Do Existing Benchmarks Accurately Evaluate LLMs? 13440-13457 - Tatsuro Inaba, Go Kamoda, Kentaro Inui, Masaru Isonuma, Yusuke Miyao, Yohei Oseki, Yu Takagi, Benjamin Heinzerling:

How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders. 13458-13470 - Xuan Lu, Sifan Liu, Bochao Yin, Yongqi Li, Xinghao Chen, Hui Su, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen:

MultiConIR: Towards Multi-Condition Information Retrieval. 13471-13494 - Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao:

HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning. 13495-13518 - Zheni Zeng, Yuxuan Chen, Shi Yu, Ruobing Wang, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun:

KBAlign: Efficient Self Adaptation on Specific Textual Knowledge Bases. 13519-13532 - Xiang Cheng, Chengyan Pan, Minjun Zhao, Deyang Li, Fangchao Liu, Xinyu Zhang, Xiao Zhang, Yong Liu:

Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot. 13533-13554 - Hao Xiang, Tianyi Tang, Yang Su, Bowen Yu, An Yang, Fei Huang, Yichang Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Jingren Zhou, Junyang Lin, Le Sun:

RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing. 13555-13571 - Huatong Song, Jinhao Jiang, Wenqing Tian, Zhipeng Chen, Yuhuan Wu, Jiahao Zhao, Yingqian Min, Xin Zhao, Lei Fang, Ji-Rong Wen:

Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning. 13572-13586 - Cheng Jiayang, Qianqian Zhuang, Haoran Li, Chunkit Chan, Xin Liu, Lin Qiu, Yangqiu Song:

InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding. 13587-13602 - Gailun Zeng, Ziyang Luo, Hongzhan Lin, Yuchen Tian, Kaixin Li, Ziyang Gong, Jianxiong Guo, Jing Ma:

MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique. 13603-13630 - Enrique Amigó, Adrián Ghajari, Alejandro Benito-Santos, Diego De La Fuente Rodríguez:

On the Correspondence between the Squared Norm and Information Content in Text Embeddings. 13631-13643 - Fenghua Weng, Jian Lou, Jun Feng, Minlie Huang, Wenjie Wang:

Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training. 13644-13657 - Mengxue Yang, Chun Yang, Jiaqi Zhu, Jiafan Li, Jingqi Zhang, Yuyang Li, Ying Li:

SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion. 13658-13671 - Yiqun Shen, Song Yuan, Zhengze Zhang, Xiaoliang Wang, Daxin Jiang, Cam-Tu Nguyen:

LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation. 13672-13692 - Yining Huang, Bin Li, Keke Tang, Meilian Chen:

LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning. 13693-13704 - Shuang Sun, Huatong Song, Yuhao Wang, Ruiyang Ren, Jinhao Jiang, Junjie Zhang, Fei Bai, Jia Deng, Xin Zhao, Zheng Liu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen:

SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis. 13705-13720 - Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Jinsong Su:

LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning. 13721-13735 - Xiangyu Xi, Deyang Kong, Jian Yang, Jiawei Yang, Zhengyu Chen, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye:

SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity. 13736-13758 - Yinghao Hu, Yaoyao Yu, Leilei Gan, Bin Wei, Kun Kuang, Fei Wu:

Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek-R1, and Beyond. 13759-13781 - Zhendong Chu, Shen Wang, Jian Xie, Tinghui Zhu, Yibo Yan, Jingheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang, Philip S. Yu, Qingsong Wen:

LLM Agents for Education: Advances and Applications. 13782-13810 - Yuxiang Zhou, Hainiu Xu, Desmond C. Ong, Maria Liakata, Petr Slovák, Yulan He:

Modeling Subjectivity in Cognitive Appraisal with Language Models. 13811-13833 - Lotem Peled-Cohen, Maya Zadok, Nitay Calderon, Hila Gonen, Roi Reichart:

Dementia Through Different Eyes: Explainable Modeling of Human and LLM Perceptions for Early Awareness. 13834-13860 - Yifan Lu, Ziqi Zhang, Chunfeng Yuan, Jun Gao, Congxuan Zhang, Xiaojuan Qi, Bing Li, Weiming Hu:

Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations. 13861-13877 - Kunhang Li, Jason Naradowsky, Yansong Feng, Yusuke Miyao:

How Much Do Large Language Models Know about Human Motion? A Case Study in 3D Avatar Control. 13878-13921 - Garima Gaur, Oana Balalau, Ioana Manolescu, Prajna Upadhyay:

The Search for Conflicts of Interest: Open Information Extraction in Scientific Publications. 13922-13936 - Peng Chen, Bang Wang:

On Collaborating Small and Large Models For Few-shot Intent Detection. 13937-13953 - Maria Teleki, Vedangi Bengali, Xiangjue Dong, Sai Janjur, Haoran Liu, Tian Liu, Cong Wang, Ting Liu, Yin Zhang, Frank Shipman, James Caverlee:

A Survey on LLMs for Story Generation. 13954-13966 - Chengrui Xiang, Tengfei Ma, Xiangzheng Fu, Yiping Liu, Bosheng Song, Xiangxiang Zeng:

From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug Repurposing. 13967-13982 - Xiaotong Xu, Yizhao Wang, Yunfei Liu, Shengyang Li:

SKRAG: A Retrieval-Augmented Generation Framework Guided by Reasoning Skeletons over Knowledge Graphs. 13983-13994 - Changjiang Zhou, Ruqing Zhang, Jiafeng Guo, Yuan Liu, Fan Zhang, Ganyuan Luo, Xueqi Cheng:

A Generative Framework for Personalized Sticker Retrieval. 13995-14009 - Zhiyue Liu, Wenkai Zhou:

Bridging Semantic and Modality Gaps in Zero-Shot Captioning via Retrieval from Synthetic Data. 14010-14023 - Yuriel Ryan, Rui Yang Tan, Kenny Tsu Wei Choo, Roy Ka-Wei Lee:

Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics. 14024-14050 - Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled Aldahmani, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman H. Khan, Timothy Baldwin, Hisham Cholakkal:

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities. 14051-14071 - Yuhan Liu, Cong Xu, Lu Liu, Yihua Wang, Feiyu Chen, Qi Jia, Yaqian Zhao, Zhichun Wang, Xiang Li:

DeMAC: Enhancing Multi-Agent Coordination with Dynamic DAG and Manager-Player Feedback. 14072-14098 - Paul Piwek, Jacopo Amidei, Svetlana Stoyanchev:

Coherence of Argumentative Dialogue Snippets: A New Method for Large Scale Evaluation with an Application to Inference Anchoring Theory. 14099-14119 - Evgeniia Tokarchuk, Sergey Troshin, Vlad Niculae:

Angular Dispersion Accelerates k-Nearest Neighbors Machine Translation. 14120-14132 - Qiongqiong Wang, Hardik B. Sailor, Tianchi Liu, Wenyu Zhang, Muhammad Huzaifah, Nattadaporn Lertcheva, Shuo Sun, Nancy F. Chen, Jinyang Wu, AiTi Aw:

Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data. 14133-14148 - Joshua Jose Dias Barreto, Abhik Jana:

This is not a Disimprovement: Improving Negation Reasoning in Large Language Models via Prompt Engineering. 14149-14156 - Robert Litschko, Verena Blaschke, Diana Burkhardt, Barbara Plank, Diego Frassinelli:

Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora. 14157-14174 - Yuqing Huang, Rongyang Zhang, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, Xuyang Zhi, Guiquan Liu, Xin Li, Hao Wang, Enhong Chen:

SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment. 14175-14190 - Matej Martinc, Tran Thi Hong Hanh, Senja Pollak, Boshko Koloski:

SEKE: Specialised Experts for Keyword Extraction. 14191-14205 - Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu:

1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models. 14206-14220 - Xiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, Quanzeng You:

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning. 14221-14231 - Wei Zhao, Zhe Li, Yige Li, Jun Sun:

Zero-Shot Defense Against Toxic Images via Inherent Multimodal Alignment in LVLMs. 14232-14246 - Dimitrios Siskos, Stavros Papadopoulos, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Anastasios Drosou:

Retrieval Augmented Generation based context discovery for ASR. 14247-14254 - Hangyu He, Xin Yuan, Kai Wu, Ren Ping Liu, Wei Ni:

pFedRAG: A Personalized Federated Retrieval-Augmented Generation System with Depth-Adaptive Tiered Embedding Tuning. 14255-14268 - Zhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu, Qi Shi, Yukun Yan, Shuo Wang, Furong Peng, Ge Yu:

ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization. 14269-14282 - Aysenur Kocak, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci:

CURE: Controlled Unlearning for Robust Embeddings - Mitigating Conceptual Shortcuts in Pre-Trained Language Models. 14283-14297 - Yunfei Wang, Yeqin Zhang, Yuyang Wu, Liang Lu, Phi Le Nguyen, Xiaoliang Wang, Nguyen Cam-Tu:

MLAlgo-Bench: Can Machines Implement Machine Learning Algorithms? 14298-14329 - Ruilin Luo, Tianle Gu, Lin Wang, Yunfeng Zhou, Songtao Jiang, Lei Wang, Yujiu Yang:

Fair Text-Attributed Graph Representation Learning. 14330-14353 - Zekun Wang, Jingjie Zeng, Yingxu Li, Liang Yang, Hongfei Lin:

Human-Inspired Obfuscation for Model Unlearning: Local and Global Strategies with Hyperbolic Representations. 14354-14366 - Zhe Li, Wei Zhao, Yige Li, Jun Sun:

Do Influence Functions Work on Large Language Models? 14367-14382 - Jiho Park, Jongyoon Song, Minjin Choi, Kyuho Heo, Taehun Huh, Ji Won Kim:

TRUEBench: Can LLM Response Meet Real-world Constraints as Productivity Assistant? 14383-14409 - Qi Chai, Zhang Zheng, Junlong Ren, Deheng Ye, Zichuan Lin, Hao Wang:

CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks. 14410-14426 - Bang Trinh Tran To, Thai Le:

Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models. 14427-14439 - Nicola Arici, Luca Putelli, Ejdis Gjinika, Ivan Serina, Alfonso Gerevini:

Learning Trajectories of Figurative Language for Pre-Trained Language Models. 14440-14461 - Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei:

BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion. 14462-14472 - Guimin Hu, Daniel Hershcovich, Hasti Seifi:

HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals. 14473-14489 - Hanghai Hong, Yibo Xie, Jiawei Zheng, Xiaoli Wang:

SubDocTrans: Enhancing Document-level Machine Translation with Plug-and-play Multi-granularity Knowledge Augmentation. 14490-14506 - Rem Hida, Masahiro Kaneko, Naoaki Okazaki:

Social Bias Evaluation for Large Language Models Requires Prompt Variations. 14507-14530 - Taowen Liu, Marta Andronic, Deniz Gündüz, George Anthony Constantinides:

Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding. 14531-14546 - Radu Marinescu, Debarun Bhattacharjya, Junkyu Lee, Tigran T. Tchrakian, Javier Carnerero-Cano, Yufang Hou, Elizabeth M. Daly, Alessandra Pascale:

FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models. 14547-14577 - Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao:

Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA. 14578-14586 - Ruihan Jin, Pengpeng Shao, Zhengqi Wen, Jinyang Wu, Mingkuan Feng, Shuai Zhang, Jianhua Tao:

RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing. 14587-14600 - Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe:

Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models. 14601-14613 - Hiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, Byron C. Wallace:

Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare. 14614-14631 - Yerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung:

Can You Trick the Grader? Adversarial Persuasion of LLM Judges. 14632-14651 - Yusuf Sali, Sitki Can Toraman:

Navigating the Unknown: Intent Classification and Out-of-Distribution Detection Using Large Language Models. 14652-14664 - Adi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov:

Trust Me, I'm Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer. 14665-14688 - Mohamed Imed Eddine Ghebriout, Gaël Guibon, Ivan Lerner, Emmanuel Vincent:

QUARTZ: QA-based Unsupervised Abstractive Refinement for Task-oriented Dialogue Summarization. 14689-14706 - Yinhong Liu, Jianfeng He, Hang Su, Ruixue Lian, Yi Nian, Jake W. Vincent, Srikanth Vishnubhotla, Robinson Piramuthu, Saab Mansour:

MDSEval: A Meta-Evaluation Benchmark for Multimodal Dialogue Summarization. 14707-14727 - Chenzhuo Zhao, Ziqian Liu, Xinda Wang, Junting Lu, Chaoyi Ruan:

PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models. 14728-14761 - Armin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh:

Evaluating the Creativity of LLMs in Persian Literary Text Generation. 14762-14774 - Taichi Aida, Danushka Bollegala:

SCDTour: Embedding Axis Ordering and Merging for Interpretable Semantic Change Detection. 14775-14785 - Bhiman Kumar Baghel, Emma Jordan, Zheyuan Ryan Shi, Xiang Lorraine Li:

Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing. 14786-14808 - Yongju Jia, Jiarui Ma, Xiangxian Li, Baiqiao Zhang, Xianhui Cao, Juan Liu, Yulong Bian:

LLM-empowered Dynamic Prompt Routing for Vision-Language Models Tuning under Long-Tailed Distributions. 14809-14822 - Guang Yang, Yujie Zhu:

HGAdapter: Hypergraph-based Adapters in Language Models for Code Summarization and Clone Detection. 14823-14833 - Takateru Yamakoshi, Thomas L. Griffiths, R. Thomas McCoy, Robert D. Hawkins:

Evaluating distillation methods for data-efficient syntax learning. 14834-14847 - Eojin Jeon, Mingyu Lee, Sangyun Kim, Junho Kim, Wanzee Cho, Tae-Eui Kam, SangKeun Lee:

"Going to a trap house" conveys more fear than "Going to a mall": Benchmarking Emotion Context Sensitivity for LLMs. 14848-14869 - Dimitra Niaouri, Rayane Ghilene, Michele Linardi, Julien Longhi:

[MASK]ED - Language Modeling for Explainable Classification and Disentangling of Socially Unacceptable Discourse. 14870-14883 - Archie Sage, Jeroen Keppens, Helen Yannakoudakis:

A Survey of Cognitive Distortion Detection and Classification in NLP. 14884-14899 - Weiyuan Li, Xintao Wang, Siyu Yuan, Rui Xu, Jiangjie Chen, Qingqing Dong, Yanghua Xiao, Deqing Yang:

Curse of Knowledge: Your Guidance and Provided Knowledge are biasing LLM Judges in Complex Evaluation. 14900-14924 - Hyosoon Jang, Yunhui Jang, Sungjae Lee, Jungseul Ok, Sungsoo Ahn:

Self-Training Large Language Models with Confident Reasoning. 14925-14939 - Pala Tej Deep, Panshul Sharma, Amir Zadeh, Chuan Li, Soujanya Poria:

Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision. 14940-14954 - Weicheng Ma, Hefan Zhang, Shiyu Ji, Farnoosh Hashemi, Qichao Wang, Ivory Yang, Joice Chen, Juanwen Pan, Michael Macy, Saeed Hassanpour, Soroush Vosoughi:

Enhancing LLM-Based Persuasion Simulations with Cultural and Speaker-Specific Information. 14955-14976 - Yang Han, Jacqueline C. K. Lam, Victor On Kwok Li, Lawrence Y. L. Cheung:

An LLM-based Temporal-spatial Data Generation and Fusion Approach for Early Detection of Late Onset Alzheimer's Disease (LOAD) Stagings Especially in Chinese and English-speaking Populations. 14977-14990 - Shaswati Saha, Sourajit Saha, Manas Gaur, Tejas Gokhale:

Side Effects of Erasing Concepts from Diffusion Models. 14991-15007 - Jiashi Lin, Changhong Jiang, Yixiao Wang, Xinyi Zhu, Zhongtian Hu, Wei Zhang:

SaCa: A Highly Compatible Reinforcing Framework for Knowledge Graph Embedding via Structural Pattern Contrast. 15008-15021 - Yitong Wang, Zhongping Zhang, Margherita Piana, Zheng Zhou, Peter Gerstoft, Bryan A. Plummer:

Real, Fake, or Manipulated? Detecting Machine-Influenced Text. 15022-15037 - Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao:

Character is Destiny: Can Persona-assigned Language Models Make Personal Choices? 15038-15059 - Saba Ghanbari Haez, Mauro Dragoni:

Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios. 15060-15088 - Jun Hou, Le Wang, Xuan Wang:

BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model Integration. 15089-15103 - Kaustubh Olpadkar, Vikram Sunil Bajaj, Leslie Barrett:

Can LLMs Be Efficient Predictors of Conversational Derailment? 15104-15112 - Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, Zhenhua Dong:

Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision. 15113-15128 - Rochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou:

Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical Texts. 15129-15147 - Rui Xu, Mingyu Wang, Xintao Wang, Dakuan Lu, Xiaoyu Tan, Wei Chu, Yinghui Xu:

Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents. 15148-15168 - Yixiao Zhou, Ziyu Zhao, Dongzhou Cheng, Zhiliang Wu, Jie Gui, Yi Yang, Fei Wu, Yu Cheng, Hehe Fan:

Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs. 15169-15186 - Xiaoqing Cheng, Ruizhe Chen, Hongying Zan, Yuxiang Jia, Min Peng:

BiasFilter: An Inference-Time Debiasing Framework for Large Language Models. 15187-15205 - Wenqi Zhou, Kai Cao, Hao Zheng, Yunze Liu, Xinyi Zheng, Miao Liu, Per Ola Kristensson, Walterio W. Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen:

X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding. 15206-15222 - Zhihong Zhu, Fan Zhang, Yunyan Zhang, Jinghan Sun, Zhiqi Huang, Qingqing Long, Bowen Xing, Xian Wu:

A Survey on Multi-modal Intent Recognition: Recent Advances and New Frontiers. 15223-15236 - Amir Homayounirad, Enrico Liscio, Tong Wang, Catholijn M. Jonker, Luciano Cavalcante Siebert:

Will Annotators Disagree? Identifying Subjectivity in Value-Laden Arguments. 15237-15252 - Sho Takishita, Jay P. Gala, Abdelrahman Mohamed, Kentaro Inui, Yova Kementchedjhieva:

LLMs Can Compensate for Deficiencies in Visual Representations. 15253-15272 - Dylan Gaines, Keith Vertanen:

Adapting Large Language Models for Character-based Augmentative and Alternative Communication. 15273-15291 - Elena Merdjanovska, Alan Akbik:

Token-Level Metrics for Detecting Incorrect Gold Annotations in Named Entity Recognition. 15292-15304 - Eugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio, Massimo Zancanaro:

Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMs. 15305-15318 - Zhexiong Liu, Diane J. Litman:

Efficient Layer-wise LLM Fine-tuning for Revision Intention Prediction. 15319-15334 - Ahatsham Hayat, Bilal Khan, Mohammad Rashedul Hasan:

ConText-LE: Cross-Distribution Generalization for Longitudinal Experiential Data via Narrative-Based LLM Representations. 15335-15360 - Weixiang Zhao, Xingyu Sui, Xinyang Han, Yang Deng, Yulin Hu, Jiahe Guo, Libo Qin, Qianyun Du, Shijin Wang, Yanyan Zhao, Bing Qin, Ting Liu:

Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter. 15361-15381 - Luca Rolshoven, Vishvaksenan Rasiah, Srinanda Brügger Bose, Sarah Hostettler, Lara Burkhalter, Matthias Stürmer, Joel Niklaus:

Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland. 15382-15411 - Nahid Hossain, Md Faisal Kabir:

Context Minimization for Resource-Constrained Text Classification: Optimizing Performance-Efficiency Trade-offs through Linguistic Features. 15412-15426 - Gunjan Jalori, Preetika Verma, Sercan Ö Arik:

FLAIRR-TS - Forecasting LLM-Agents with Iterative Refinement and Retrieval for Time Series. 15427-15437 - Longfei Yun, Letian Peng, Jingbo Shang:

ULTRABENCH: Benchmarking LLMs under Extreme Fine-grained Text Generation. 15438-15453 - Longfei Yun, Chenyang An, Zilong Wang, Letian Peng, Jingbo Shang:

The Price of Format: Diversity Collapse in LLMs. 15454-15468 - Nikolay Mikhaylovskiy:

Zipf's and Heaps' Laws for Tokens and LLM-generated Texts. 15469-15481 - Rushil Gupta, Jason Hartford, Bang Liu:

LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet? 15482-15510 - Roxana Petcu, Samarth Bhargav, Maarten de Rijke, Evangelos Kanoulas:

A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers. 15511-15533 - Maeda F. Hanafi, Ishan Jindal, Yannis Katsis, Lucian Popa, Huaiyu Zhu:

Identifying Noise in Human-Created Datasets using Training Dynamics from Generative Models. 15534-15550 - Yang Nan, Pengfei He, Ravi Tandon, Han Xu:

Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty? 15551-15569 - Tadesse Destaw Belay, Israel Abebe Azime, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Idris Abdulmumin, Abinew Ali Ayele, Shamsuddeen Hassan Muhammad, Seid Muhie Yimam:

AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text. 15570-15587 - Tenghao Huang, Sihao Chen, Muhao Chen, Jonathan May, Longqi Yang, Mengting Wan, Pei Zhou:

Teaching Language Models To Gather Information Proactively. 15588-15599 - Dorothea French, Robert G. Moulder, Kelechi Ezema, Katharina von der Wense, Sidney K. D'Mello:

Linguistic Alignment Predicts Learning in Small Group Tutoring Sessions. 15600-15611 - Sanchit Ahuja, Praneetha Vaddamanu, Barun Patra:

EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning. 15612-15624 - Elahe Rahimi, Hassan Sajjad, Domenic Rosati, Abeer Badawi, Elham Dolatabadi, Frank Rudzicz:

Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM Summarization. 15625-15641 - Yifeng He, Jicheng Wang, Yuyang Rong, Hao Chen:

FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation. 15642-15655 - Fenglin Liu, Zheng Li, Hongjian Zhou, Qingyu Yin, Jingfeng Yang, Xin Liu, Zhengyang Wang, Xianfeng Tang, Shiyang Li, Xiang He, Ruijie Wang, Bing Yin, Xiao Gu, Lei A. Clifton, David A. Clifton:

DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical Reasoning. 15656-15668 - Wei Liu, Sony Trenous, Leonardo F. R. Ribeiro, Bill Byrne, Felix Hieber:

XRAG: Cross-lingual Retrieval-Augmented Generation. 15669-15690 - Dhananjay Ashok, Ashutosh Chaubey, Hirona Jacqueline Arai, Jonathan May, Jesse Thomason:

Can VLMs Recall Factual Associations From Visual References? 15691-15708 - Jackson Trager, Francielle Vargas, Diego Alves, Matteo Guida, Mikel K. Ngueajio, Ameeta Agrawal, Yalda Daryani, Farzan Karimi-Malekabadi, Flor Miriam Plaza del Arco:

MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech Explanation. 15709-15740 - Ivan Vykopal, Matús Pikuliak, Simon Ostermann, Tatiana Anikina, Michal Gregor, Marián Simko:

Large Language Models for Multilingual Previously Fact-Checked Claim Detection. 15741-15765 - Ashutosh Adhikari, Mirella Lapata:

Debating for Better Reasoning in Vision-Language Models. 15766-15784 - Farsheed Haque, Zhe Fu, Depeng Xu, Shuhan Yuan, Xi Niu:

Fine-tuning LLMs with Cross-Attention-based Weight Decay for Bias Mitigation. 15785-15798 - Jikai Long, Ming Liu, Xiusi Chen, Jialiang Xu, Shenglan Li, Zhaozhuo Xu, Denghui Zhang:

Profiling LLM's Copyright Infringement Risks under Adversarial Persuasive Prompting. 15799-15823 - Peter Zeng, Pegah Alipoormolabashi, Jihu Mun, Gourab Dey, Nikita Soni, Niranjan Balasubramanian, Owen Rambow, H. Andrew Schwartz:

Residualized Similarity for Faithfully Explainable Authorship Verification. 15824-15837 - Tunazzina Islam, Dan Goldwasser:

Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation. 15838-15859 - Haonan Ge, Yiwei Wang, Ming-Hsuan Yang, Yujun Cai:

MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMs. 15860-15879 - Debarun Bhattacharjya, Balaji Ganesan, Junkyu Lee, Radu Marinescu, Katsiaryna Mirylenka, Michael R. Glass, Xiao Shou:

SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models. 15880-15894 - Abdulla Alshabanah, Murali Annavaram:

Mind the Dialect: NLP Advancements Uncover Fairness Disparities for Arabic Users in Recommendation Systems. 15895-15903 - Mustafa Eyceoz, Nikhil Shivakumar Nayak, Hao Wang, Ligong Han, Akash Srivastava:

Hopscotch: Discovering and Skipping Redundancies in Language Models. 15904-15913 - Yuyang Jiang, Chacha Chen, Shengyuan Wang, Feng Liu, Zecong Tang, Benjamin M. Mervak, Lydia Chelala, Christopher M. Straus, Reve Chahine, Samuel G. Armato III, Chenhao Tan:

CLEAR: A Clinically Grounded Tabular Framework for Radiology Report Evaluation. 15914-15933 - Olga Kellert, Nemika Tyagi, Muhammad Imran, Nelvin Licona-Guevara, Carlos Gómez-Rodríguez:

Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource Languages. 15934-15949 - Runsong Jia, Mengjia Wu, Ying Ding, Jie Lu, Yi Zhang:

HetGCoT: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Academic Question Answering. 15950-15963 - Dacheng Li, Shiyi Cao, Chengkun Cao, Xiuyu Li, Shangyin Tan, Kurt Keutzer, Jiarong Xing, Joseph E. Gonzalez, Ion Stoica:

S*: Test Time Scaling for Code Generation. 15964-15978 - Dacheng Li, Shiyi Cao, Tyler Griggs, Shu Liu, Xiangxi Mo, Eric Tang, Sumanth Hegde, Kourosh Hakhamaneshi, Shishir G. Patil, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica:

Language Models Can Easily Learn to Reason from Demonstrations. 15979-15997 - Ximena Gutierrez, Mikel Segura Elizalde, Victor Mijangos:

FSTs vs ICL: Generalisation in LLMs for an under-resourced language. 15998-16006 - Fu Zhang, Panfeng Zhang, Jingwei Cheng:

SRM-LLM: Semantic Relationship Mining with LLMs for Temporal Knowledge Graph Extrapolation. 16007-16021 - Ji Soo Lee, Byungoh Ko, Jaewon Cho, Howoong Lee, Jaewoon Byun, Hyunwoo J. Kim:

Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization. 16022-16039 - Chimaobi Okite, Naihao Deng, Kiran Bodipati, Huaidian Hou, Joyce Chai, Rada Mihalcea:

Benchmarking and Improving LLM Robustness for Personalized Generation. 16040-16072 - Jeongsik Park, Khoi P. N. Nguyen, Jihyung Park, Minseok Kim, Jaeheon Lee, Jae Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, Vincent Ng:

MemeInterpret: Towards an All-in-One Dataset for Meme Understanding. 16073-16087 - Zaiyi Zheng, Song Wang, Zihan Chen, Yaochen Zhu, Yinhan He, Liangjie Hong, Qi Guo, Jundong Li:

CoRAG: Enhancing Hybrid Retrieval-Augmented Generation through a Cooperative Retriever Architecture. 16088-16101 - Miaoran Li, Jiangning Chen, Minghua Xu, Xiaolong Wang:

Hallucination Detection in Structured Query Generation via LLM Self-Debating. 16102-16113 - Jongwoo Kim, SeongYeub Chu, Bryan Wong, Mun Yong Yi:

Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge Tracing. 16114-16128 - Seongho Joo, Hyukhun Koh, Kyomin Jung:

Public Data Assisted Differentially Private In-Context Learning. 16129-16152 - Jian Wang, Yanjie Liang, Yuqing Sun, Bin Gong:

Inducing Argument Facets for Faithful Opinion Summarization. 16153-16166 - Nicholas Lourie, Michael Y. Hu, Kyunghyun Cho:

Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check. 16167-16180 - Dongwon Jung, Qin Liu, Tenghao Huang, Ben Zhou, Muhao Chen:

Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation. 16181-16196 - Huu Tuong Tu, Huan Vu, Nguyen Tien Cuong, Dien Hy Ngo, Nguyen Thi Thu Trang:

O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion. 16197-16208 - Jiatong Han, Neil Band, Muhammed Razzak, Jannik Kossen, Tim G. J. Rudner, Yarin Gal:

Simple Factuality Probes Detect Hallucinations in Long-Form Natural Language Generation. 16209-16226 - Yifan Wang, Shen Gao, Jiabao Fang, Rui Yan, Billy Chiu, Shuo Shang:

CESRec: Constructing Pseudo Interactions for Sequential Recommendation via Conversational Feedback. 16227-16239 - Chengrui Huang, Shen Gao, Zhengliang Shi, Dongsheng Wang, Shuo Shang:

TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation. 16240-16255 - Yi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu:

Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition. 16256-16272 - Kuan Lok Zhou, Jiayi Chen, Siddharth Suresh, Reuben Narad, Timothy T. Rogers, Lalit K. Jain, Robert D. Nowak, Bob Mankoff, Jifan Zhang:

Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs. 16273-16287 - Iva Bojic, Qi Chwen Ong, Stephanie Hilary Xinyi Ma, Lin Ai, Zheng Liu, Ziwei Gong, Julia Hirschberg, Andy Hau Yan Ho, Andy W. H. Khong:

SMARTMiner: Extracting and Evaluating SMART Goals from Low-Resource Health Coaching Notes. 16288-16305 - Jialin Chen, Houyu Zhang, Seongjun Yun, Alejandro Mottini, Rex Ying, Xiang Song, Vassilis N. Ioannidis, Zheng Li, Qingjun Cui:

GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models. 16306-16319 - Jiabao Kang, Xinye Li, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui:

Exploring Deductive and Inductive Reasoning Capabilities of Large Language Models in Procedural Planning. 16320-16341 - Xian Peng, Pan Yuan, Dong Li, Junlong Cheng, Qin Fang, Zhi Liu:

KELE: A Multi-Agent Framework for Structured Socratic Teaching with Large Language Models. 16342-16362 - Hao Chen, Tianyu Shi, Pengran Huang, Zeyuan Li, Jiahui Pan, Qianglong Chen, Lewei He:

VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation. 16363-16394 - Yulong Hui, Yihao Liu, Yao Lu, Huanchen Zhang:

OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Question Answering. 16395-16409 - Jiuzhou Han, Wray L. Buntine, Ehsan Shareghi:

VerifiAgent: a Unified Verification Agent in Language Model Reasoning. 16410-16431 - Yongkang Xiao, Sinian Zhang, Yi Dai, Huixue Zhou, Jue Hou, Jie Ding, Rui Zhang:

DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains. 16432-16445 - Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu:

Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism. 16446-16474 - Jingxian Xu, Mengyu Zhou, Weichang Liu, Hanbing Liu, Shi Han, Dongmei Zhang:

TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance. 16475-16489 - Minh Pham Dinh, Michael G. Yankoski, Munira Syed, Trenton W. Ford:

DAVIS: Planning Agent with Knowledge Graph-Powered Inner Monologue. 16490-16505 - Keno Harada, Yudai Yamazaki, Masachika Taniguchi, Edison Marrese-Taylor, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo:

When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following. 16506-16526 - Kaiying Kevin Lin, Hsi-Yu Chen, Haopeng Zhang:

FormosanBench: Benchmarking Low-Resource Austronesian Languages in the Era of Large Language Models. 16527-16539 - Jun Rao, Yunjie Liao, Xuebo Liu, Zepeng Lin, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang:

SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models. 16540-16557 - Jifeng Song, Arun Das, Ge Cui, Yufei Huang:

FigEx: Aligned Extraction of Scientific Figures and Captions. 16558-16571 - Wanru Zhuang, Wenbo Li, Zhibin Lan, Xu Han, Peng Li, Jinsong Su:

PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models. 16572-16588 - Hua Farn, Hsuan Su, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee:

Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging. 16589-16602 - Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu, Vladimir Braverman:

Self-Ensemble: Mitigating Confidence Distortion for Large Language Models. 16603-16615 - Yuu Jinnai, Ukyo Honda:

Annotation-Efficient Language Model Alignment via Diverse and Representative Response Texts. 16616-16659 - Sheldon Yu, Yuxin Xiong, Junda Wu, Xintong Li, Tong Yu, Xiang Chen, Ritwik Sinha, Jingbo Shang, Julian J. McAuley:

Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning Dynamics. 16660-16667 - Xiusi Chen, Shanyong Wang, Cheng Qian, Hongru Wang, Peixuan Han, Heng Ji:

DecisionFlow: Advancing Large Language Model as Principled Decision Maker. 16668-16692 - Jiaxin Guo, Daimeng Wei, Yuanchang Luo, Hengchao Shang, Zongyao Li, Jinlong Yang, Zhanglin Wu, Zhiqiang Rao, Shimin Tao, Hao Yang:

M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models. 16693-16711 - Qian Xiong, Yuekai Huang, Ziyou Jiang, Zhiyuan Chang, Yu Zheng, Tianhao Li, Mingyang Li:

Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems. 16712-16729 - Yitao Long, Tiansheng Hu, Yilun Zhao, Arman Cohan, Chen Zhao:

FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering. 16730-16750 - Xu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei Li, Shujian Huang, Fei Yuan:

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models. 16751-16774 - Ramya Keerthy Thatikonda, Wray L. Buntine, Ehsan Shareghi:

Assessing the Sensitivity and Alignment of FOL Closeness Metrics. 16775-16785 - Juli Bakagianni, Korbinian Randl, Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Salvatore Trani, Aron Henriksson, Anna Romanova, John Pavlopoulos:

FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and Analysis. 16786-16804 - Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao:

Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios. 16805-16829 - Abhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed, Kevin Zhu, Austen Liao, Sean O'Brien:

EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models. 16830-16855 - Runchao Li, Yao Fu, Mu Sheng, Xianxuan Long, Haotian Yu, Pan Li:

FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache Compression. 16856-16866 - Ikuya Yamada, Ryokan Ri, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo:

Dynamic Injection of Entity Knowledge into Dense Retrievers. 16867-16879 - Yijiang River Dong, Tiancheng Hu, Yinhong Liu, Ahmet Üstün, Nigel Collier:

When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning. 16880-16894 - Yifan Zhu, Chao Zhang, Xin Shi, Xueqiao Zhang, Yi Yang, Yawei Luo:

MASTER: Multi-Agent Security Through Exploration of Roles and Topological Structures - A Comprehensive Framework. 16895-16921 - Patara Trirat, Jae-Gil Lee:

MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices. 16922-16950 - Valentin Barrière, Nahuel Gomez, Léo Hemamou, Sofía Callejas, Brian Ravenet:

StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos. 16951-16959 - Zhihui Yang, Yupei Wang, Kaijie Mo, Zhe Zhao, Renfen Hu:

Does Visual Grounding Enhance the Understanding of Embodied Knowledge in Large Language Models? 16960-16978 - Qinhong Lin, Zhongliang Yang, Yuang Cai, Dingfu Yu, Xuan Xu, Yu Li, Linna Zhou:

Semantic Contribution-Aware Adaptive Retrieval for Black-Box Models. 16979-16994 - Elias Bassani, Ignacio Sanchez:

On Guardrail Models' Robustness to Mutations and Adversarial Attacks. 16995-17006 - Bo Peng, Zhiheng Wang, Heyang Gong, Chaochao Lu:

IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data. 17007-17040 - Hanqing Li, Sharika Mahadevan, Kiran Sheena Jyothi, Henry Liang, Diego Klabjan:

Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs. 17041-17054 - Shouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan:

Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents. 17055-17074 - Yujun Zhou, Jiayi Ye, Zipeng Ling, Yufei Han, Yue Huang, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang:

Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study. 17075-17098 - Razvan-Gabriel Dumitru, Darius Peteleaza, Vikas Yadav, Liangming Pan:

ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models. 17099-17123 - Zili Wang, Tianyu Zhang, Haoli Bai, Lu Hou, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu:

Faster and Better LLMs via Latency-Aware Test-Time Scaling. 17124-17137 - Zonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang, Xiangzheng Zhang, Xianglong Liu, Dacheng Tao:

Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models. 17138-17157 - Ukyo Honda, Soichiro Murakami, Peinan Zhang:

Distilling Many-Shot In-Context Learning into a Cheat Sheet. 17158-17178 - Xiaofan Zheng, Huixuan Zhang, Xiaojun Wan:

Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models. 17179-17191 - Charlott Jakob, David Harbecke, Patrick Parschan, Pia Wenzel Neves, Vera Schmitt:

PolBiX: Detecting LLMs' Political Bias in Fact-Checking through X-phemisms. 17192-17210 - Ruiqi Yan, Xiquan Li, Wenxi Chen, Zhikang Niu, Chen Yang, Ziyang Ma, Kai Yu, Xie Chen:

URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models. 17211-17242 - Yujian Gan, Yuan Liang, Jinxia Xie, Yanni Lin, Juntao Yu, Massimo Poesio:

Low-Hallucination and Efficient Coreference Resolution with LLMs. 17243-17256 - Yishan Wang, Amanda Cercas Curry, Flor Miriam Plaza del Arco:

Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM Responses. 17257-17270 - Weihang Wang, Xinhao Li, Ziyue Wang, Yan Pang, Jielei Zhang, Peiyi Li, Qiang Zhang, Longwen Gao:

Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models. 17271-17289 - Song Dai, Yibo Yan, Jiamin Su, Dongfang Zihao, Yubo Gao, Yonghua Hei, Jungang Li, Junyan Zhang, Sicheng Tao, Zhuoran Gao, Xuming Hu:

PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions. 17290-17316 - Yongil Kim, Heuiyeen Yeen, Hyeongu Yun, Jinsik Lee:

Ko-LongRAG: A Korean Long-Context RAG Benchmark Built with a Retrieval-Free Approach. 17317-17329 - Annika Bush, Meltem Aksoy, Markus Pauly, Greta Ontrup:

Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI. 17330-17341 - Yuxuan Ye, Raúl Santos-Rodríguez, Edwin Simpson:

Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect Metrics. 17342-17355 - Chiyu Ma, Enpei Zhang, Yilun Zhao, Wenjun Liu, Yaning Jia, Peijun Qing, Lin Shi, Arman Cohan, Yujun Yan, Soroush Vosoughi:

Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplification and Resistance in Multi-Agent Based LLM-as-Judge. 17356-17392 - Meghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, Debanjan Ghosh, Henning Wachsmuth:

Investigating the Impact of Conceptual Metaphors on LLM-based NLI through Shapley Interactions. 17393-17403 - Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou, Martino Dazzi:

KurTail: Kurtosis-based LLM Quantization. 17404-17419 - Zhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li, Yu Yin:

VIVA+: Human-Centered Situational Decision-Making. 17420-17437 - Xiangyu Li, Yawen Zeng, Xiaofen Xing, Jin Xu, Xiangmin Xu:

QuantAgents: Towards Multi-agent Financial System via Simulated Trading. 17438-17464 - Ruby Ostrow, Adam Lopez:

LLMs Reproduce Stereotypes of Sexual and Gender Minorities. 17465-17477 - Israel Abebe Azime, Deborah Dormah Kanubala, Tejumade Afonja, Mario Fritz, Isabel Valera, Dietrich Klakow, Philipp Slusallek:

Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches. 17478-17503 - Linzhu Yu, Huan Li, Ke Chen, Lidan Shou:

Transfer-Aware Data Selection for Domain Adaptation in Text Retrieval. 17504-17519 - Weronika Lajewska, Momchil Hardalov, Laura Aina, Neha Anna John, Hang Su, Lluís Màrquez:

Understanding and Improving Information Preservation in Prompt Compression for LLMs. 17520-17541 - Kanishka Jain, Ashwini Vaidya:

A Benchmark for Hindi Verb-Argument Structure Alternations. 17542-17549 - Jingyang Deng, Ran Chen, Jo-Ku Cheng, Jinwen Ma:

Beyond Binary Preferences: Semi-Online Label-Free GRACE-KTO with Group-Wise Adaptive Calibration for High-Quality Long-Text Generation. 17550-17562 - Zuzanna Dubanowska, Maciej Zelaszczyk, Michal Brzozowski, Paolo Mandica, Michal P. Karpowicz:

Representation-based Broad Hallucination Detectors Fail to Generalize Out of Distribution. 17563-17575 - Mingrui Xie, Lulu Xu, Junliang Du:

MAFMO: Multi-modal Adaptive Fusion with Meta-template Optimization for Vision-Language Models. 17576-17585 - Yejin Son, Saejin Kim, Dongjun Min, Youngjae Yu:

Multimodal UNcommonsense: From Odd to Ordinary and Ordinary to Odd. 17586-17609 - Manuel Couto, Marcos Fernández-Pichel, Mario Ezra Aragón, David E. Losada:

Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior. 17610-17619 - Yuanbo Xie, Yingjie Zhang, Tianyun Liu, Duohe Ma, Tingwen Liu:

Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction. 17620-17632 - Lewei Jin, Kui Zhang, Yongqi Chen, Yifan Zhuo, Renjie Li, Yi Gao, Bowei Yang, Zhengong Cai, Wei Dong:

Distributed LLM Serving on Consumer-Grade GPUs by Reconciling Computation and Communication. 17633-17642 - Hongfei Xia, Hongru Wang, Zeming Liu, Qian Yu, Yuhang Guo, Haifeng Wang:

SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs. 17643-17660 - An Wang, Ruobing Xie, Shuaipeng Li, Xingwu Sun, Zhanhui Kang:

Sparsifying Mamba. 17661-17667 - Heehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang:

Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts. 17668-17700 - Xin Li, Huangming Xu, Fu Zhang, Jingwei Cheng:

ET-MIER: Entity Type-guided Key Mention Identification and Evidence Retrieval for Document-level Relation Extraction. 17701-17714 - Runsong Zhao, Xin Liu, Xinyu Liu, Pengcheng Huang, Chunyang Xiao, Tong Xiao, JingBo Zhu:

Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language Models. 17715-17734 - Daniele Potertì, Andrea Seveso, Fabio Mercorio:

Can Role Vectors Affect LLM Behaviour? 17735-17747 - Florian Eichin, Carolin M. Schuster, Georg Groh, Michael A. Hedderich:

Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling. 17748-17771 - Yibin Lei, Tao Shen, Andrew Yates:

ThinkQE: Query Expansion via an Evolving Thinking Process. 17772-17781 - Jiwei Zhang, Jianxun Lian, Haiming Qin, Mingyang Zhou, Kezhong Lu, Rui Mao, Hao Liao:

Hierarchical Reward Modeling for Fault Localization in Large Code Repositories. 17782-17796 - Neo Eyal, Nachum Dershowitz, Kfir Bar:

Layer Duplication in LLMs. 17797-17807 - Yangyang Zhao, Ben Niu, Yuxuan Tan, Shihan Wang, Libo Qin:

Semantic-Aware Action Space Compression via LLM-DRL Synergy for Efficient Task-oriented Dialogue Policy Exploration. 17808-17820 - Jianshu She, Xinyue Li, Eric P. Xing, Zhengzhong Liu, Qirong Ho:

Linear Steerability in Language Models: When It Emerges and How It Evolves. 17821-17846 - Xiaobao Wu:

A Comprehensive Survey on Learning from Rewards for Large Language Models: Reward Models and Learning Strategies. 17847-17875 - Roi Cohen, Russa Biswas, Gerard de Melo:

InFact: Informativeness Alignment for Improved LLM Factuality. 17876-17888 - Yifei Dong, Fengyi Wu, Kunlin Zhang, Yilong Dai, Sanjian Zhang, Wanghao Ye, Sihan Chen, Zhi-Qi Cheng:

Large Language Model Agents in Finance: A Survey Bridging Research, Practice, and Real-World Deployment. 17889-17907 - Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst:

Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs. 17908-17927 - Zehan Li, Fu Zhang, Tianyue Peng, He Liu, Jingwei Cheng:

Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation Extraction. 17928-17941 - Wei Chen, Zhi Zheng, Lili Zhao, Huijun Hou, Tong Xu:

Following Occam's Razor: Dynamic Combination of Structured Knowledge for Multi-Hop Question Answering using LLMs. 17942-17956 - Xuan Luo, Jing Li, Zhong Wenzhong, Geng Tu, Ruifeng Xu:

Large Language Models as Reader for Bias Detection. 17957-17967 - Jiawen Xie, Haiyang Wu, Deyi Ji, Yuekui Yang, Shaoping Ma:

LOHRec: Leveraging Order and Hierarchy in Generative Sequential Recommendation. 17968-17983 - Haonan He, Yuchen Ren, Yining Tang, Ziyang Xu, Junxian Li, Minghao Yang, Di Zhang, Dong Yuan, Tao Chen, Shufei Zhang, Yuqiang Li, Nanqing Dong, Wanli Ouyang, Dongzhan Zhou, Peng Ye:

Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models. 17984-18016 - An Luo, Xun Xian, Jin Du, Fangqiao Tian, Ganghua Wang, Ming Zhong, Shengchun Zhao, Xuan Bi, Zirui Liu, Jiawei Zhou, Jayanth Srinivasa, Ashish Kundu, Charles Fleming, Mingyi Hong, Jie Ding:

AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science. 18017-18060 - Alessandra Urbinati, Mirko Lai, Simona Frenda, Marco Stranisci:

Are you sure? Measuring models bias in content moderation through uncertainty. 18061-18076 - Sabrina McCallum, Amit Parekh, Alessandro Suglia:

FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks. 18077-18101 - Bhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md. Shad Akhtar:

Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities. 18102-18118 - Hengwei Liu, Yongliang Shen, Zhe Zheng, Haoyuan Ma, Xingyu Wu, Yin Zhang, Weiming Lu:

Logic: Long-form Outline Generation via Imitative and Critical Self-refinement. 18119-18144 - Mengxuan Hu, Hongyi Wu, Ronghang Zhu, Zihan Guan, Dongliang Guo, Daiqing Qi, Sheng Li:

No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users. 18145-18170 - Rao Ma, Tongzhou Chen, Kartik Audhkhasi, Bhuvana Ramabhadran:

LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors. 18171-18186 - Xing Zhang, Jiaheng Wen, Fangkai Yang, Yu Kang, Pu Zhao, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang:

Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation. 18187-18198 - Wenchao Dong, Megha Sundriyal, Seongchan Park, Jaehong Kim, Meeyoung Cha, Tanmoy Chakraborty, Wonjae Lee:

Parallel Communities Across the Surface Web and the Dark Web. 18199-18218 - Olia Toporkov, Alan Akbik, Rodrigo Agerri:

Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data. 18219-18232 - Zelong Yu, Xiaoming Zhang, Litian Zhang, Yu Yuan, Chaozhuo Li:

LlmFixer: Fix the Helpfulness of Defensive Large Language Models. 18233-18247 - Rao Ma, Mengjie Qian, Vyas Raina, Mark J. F. Gales, Kate M. Knill:

Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs. 18248-18262 - Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Man Luo, Sungduk Yu, Chendi Xue, Vasudev Lal:

Probing Semantic Routing in Large Mixture-of-Expert Models. 18263-18278 - Siyu Tian, Kaijie Mo, Yupei Wang, Renfen Hu:

CMT-Eval: A Novel Chinese Multi-turn Dialogue Evaluation Dataset Addressing Real-world Conversational Challenges. 18279-18303 - Yixiong Fang, Tianran Sun, Yuling Shi, Min Wang, Xiaodong Gu:

LastingBench: Defend Benchmarks Against Knowledge Leakage. 18304-18317 - Bhrij Patel, Ashish Jagmohan, Aditya Vempaty:

Learning API Functionality from In-Context Demonstrations for Tool-based Agents. 18318-18336 - Kevin Ren, Santiago Cortes-Gomez, Carlos Miguel Patiño, Ananya Joshi, Ruiqi Lyu, Jingjing Tang, Alistair Turcan, Khurram Yamin, Steven Wu, Bryan Wilder:

Predicting Language Models' Success at Zero-Shot Probabilistic Prediction. 18337-18363 - Ali Al-Lawati, Jason Lucas, Zhiwei Zhang, Prasenjit Mitra, Suhang Wang:

GAMIC: Graph-Aligned Molecular In-context Learning for Molecule Analysis via LLMs. 18364-18378 - Keren Artiaga, Sabyasachi Kamila, Haithem Afli, Conor Lynch, Mohammed Hasanuzzaman:

Rethinking Sign Language Translation: The Impact of Signer Dependence on Model Evaluation. 18379-18391 - Tong Li, Shu Yang, Junchao Wu, Jiyao Wei, Lijie Hu, Mengdi Li, Derek F. Wong, Joshua R. Oltmanns, Di Wang:

Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation. 18392-18413 - Anthony Sicilia, Malihe Alikhani:

Adaptive Platt Scaling with Causal Interpretations for Self-Reflective Language Model Uncertainty Estimates. 18414-18422 - Li Li, Jiashu Qu, Linxin Song, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, Yue Zhao:

Treble Counterfactual VLMs: A Causal Approach to Hallucination. 18423-18434 - Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal:

Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning. 18435-18449 - Mohsinul Kabir, Tasfia Tahsin, Sophia Ananiadou:

From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling. 18478-18498 - Mitchell Plyler, Yilun Zhang, Alexander Tuzhilin, Saoud Khalifah, Sen Tian:

SENTRA: Selected-Next-Token Transformer for LLM Text Detection. 18499-18516 - Zhizhuo Kou, Holam Yu, Junyu Luo, Jingshu Peng, Xujia Li, Chengzhong Liu, Juntao Dai, Lei Chen, Sirui Han, Yike Guo:

Automate Strategy Finding with LLM in Quant Investment. 18517-18533 - Xuyang Wu, Jinming Nian, Ting-Ruen Wei, Zhiqiang Tao, Hsin-Tai Wu, Yi Fang:

Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning. 18534-18555 - Zhaopeng Feng, Jiahan Ren, Jiayuan Su, Jiamei Zheng, Hongwei Wang, Zuozhu Liu:

MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling. 18556-18567 - Nivedha Sivakumar, Natalie Mackraz, Samira Khorshidi, Krishna Patel, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff:

Bias after Prompting: Persistent Discrimination in Large Language Models. 18568-18593 - Dayin Gou, Sanghyun Byun, Nilesh Malpeddi, Gabrielle De Micheli, Prathamesh Vaste, Jacob Song, Woo Seong Chung:

CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression. 18594-18604 - Yi Fan, Michael Strube:

Consistent Discourse-level Temporal Relation Extraction Using Large Language Models. 18605-18622 - Afrina Tabassum, Bin Guo, Xiyao Ma, Hoda Eldardiry, Ismini Lourentzou:

MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State Reasoning. 18623-18639 - Dmitrii Troitskii, Koyena Pal, Chris Wendler, Callum McDougall:

Internal states before wait modulate reasoning patterns. 18640-18649 - Jesus Rios, Pierre L. Dognin, Ronny Luss, Karthikeyan Natesan Ramamurthy:

Sparsity May Be All You Need: Sparse Random Parameter Adaptation. 18650-18666 - Panagiotis Kaliosis, John Pavlopoulos:

Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition. 18667-18684 - Zhaopeng Feng, Shaosheng Cao, Jiahan Ren, Jiayuan Su, Ruizhe Chen, Yan Zhang, Jian Wu, Zuozhu Liu:

MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning. 18685-18702 - Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari:

Discrete Minds in a Continuous World: Do Language Models Know Time Passes? 18703-18729 - Xiaoke Wang, Fu Zhang, Jingwei Cheng, Yiwen Chi, Jiashun Peng, Yingsong Ning:

DLTKG: Denoising Logic-based Temporal Knowledge Graph Reasoning. 18730-18743 - Pengcheng Li, Botao Zhao, Zuheng Kang, Junqing Peng, Xiaoyang Qu, Yayun He, Jianzong Wang:

EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition. 18744-18754 - Heuiyeen Yeen, Seokhee Hong, Hyeongu Yun, Jinsik Lee:

MANTA: A Scalable Pipeline for Transmuting Massive Web Corpora into Instruction Datasets. 18755-18770 - Wei Huang, Yizhe Xiong, Xin Ye, Zhijie Deng, Hui Chen, Zijia Lin, Guiguang Ding:

Fast Quiet-STaR: Thinking Without Thought Tokens. 18771-18781 - Yuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shen Gao, Shuo Shang:

Lock on Target! Precision Unlearning via Directional Control. 18782-18794 - Gun Il Kim, Jong Wook Kim, Beakcheol Jang:

UniRAG: A Unified RAG Framework for Knowledge-Intensive Queries with Decomposition, Break-Down Reasoning, and Iterative Rewriting. 18795-18810 - Zhiyuan Chang, Mingyang Li, Xiaojun Jia, Junjie Wang, Yuekai Huang, Ziyou Jiang, Yang Liu, Qing Wang:

One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems. 18811-18825 - Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong:

From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment. 18826-18853 - Jingyuan Deng, Yujiu Yang:

MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding. 18854-18866 - Zige Wang, Qi Zhu, Fei Mi, Minghui Xu, Ruochun Jin, Wenjing Yang:

ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs. 18867-18880 - Hyundong Jin, Sicheol Sung, Shinwoo Park, Seung-Yeop Baik, Yo-Sub Han:

TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents. 18881-18897 - Ahmed Abul Hasanaath, Aisha Alansari, Ahmed Ashraf, Salmane Chafik, Hamzah Luqman, Saad Ezzini:

AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP. 18898-18914 - Rezvaneh Rezapour, Sullam Jeoung, Zhiwen You, Jana Diesner:

Tales of Morality: Comparing Human- and LLM-Generated Moral Stories from Visual Cues. 18915-18933 - Wenfeng Feng, Chuzhan Hao, Yuewei Zhang, Guochao Jiang, Jingyi Song:

AirRAG: Autonomous Strategic Planning and Reasoning Steer Retrieval Augmented Generation. 18934-18953 - Mohammadtaher Safarzadeh, Afshin Oroojlooy, Dan Roth:

Evaluating NL2SQL via SQL2NL. 18954-18968 - Haoyuan Ma, Yongliang Shen, Hengwei Liu, Wenqi Zhang, Haolei Xu, Qiuying Peng, Jun Wang, Weiming Lu:

DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL. 18969-18979 - Junyan Zhang, Yiming Huang, Shuliang Liu, Yubo Gao, Xuming Hu:

Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs? 18980-18989 - Jiale Liu, Yifan Zeng, Shaokun Zhang, Chi Zhang, Malte Højmark-Bertelsen, Marie Normann Gadeberg, Huazheng Wang, Qingyun Wu:

Divide, Optimize, Merge: Scalable Fine-Grained Generative Optimization for LLM Agents. 18990-19012 - Atharva Kulkarni, Yuan Zhang, Joel Ruben Antony Moniz, Xiou Ge, Bo-Hsiang Tseng, Dhivya Piraviperumal, Swabha Swayamdipta, Hong Yu:

Evaluating Evaluation Metrics - The Mirage of Hallucination Detection. 19013-19032 - Tianruo Rose Xu, Vedant Gaur, Liu Leqi, Tanya Goyal:

The Progress Illusion: Revisiting meta-evaluation standards of LLM evaluators. 19033-19043 - Yupeng Qi, Ziyu Lyu, Min Yang, Yanlin Wang, Lu Bai, Lixin Cui:

MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework. 19044-19066 - Seokhee Hong, Sunkyoung Kim, Guijin Son, Soyeon Kim, Yeonjung Hong, Jinsik Lee:

From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM Evaluation. 19067-19096 - Fei Zhao, Chengqiang Lu, Yufan Shen, Qimeng Wang, Yicheng Qian, Haoxin Zhang, Yan Gao, Wu Yi, Yao Hu, Zhen Wu, Shangyu Xing, Xinyu Dai:

RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios. 19097-19115 - Mong Yuan Sim, Wei Emma Zhang, Xiang Dai, Biaoyan Fang, Sarbin Ranjitkar, Arjun Burlakoti, Jamie Taylor, Haojie Zhuang:

The More, The Better? A Critical Study of Multimodal Context in Radiology Report Summarization. 19116-19131 - Mayukh Borana, Junyi Liang, Sai Sathiesh Rajan, Sudipta Chattopadhyay:

Localizing Malicious Outputs from CodeLLM. 19132-19143 - Chunhui Zhang, Zhongyu Ouyang, Xingjian Diao, Zheyuan Liu, Soroush Vosoughi:

Knowing More, Acting Better: Hierarchical Representation for Embodied Decision-Making. 19144-19155 - Juhyun Oh, Inha Cha, Michael Saxon, Hyunseung Lim, Shaily Bhatt, Alice Oh:

Culture is Everywhere: A Call for Intentionally Cultural Evaluation. 19156-19168 - Hend Elghazaly, Bahman Mirheidari, Heidi Christensen, Nafise Sadat Moosavi:

Fairness in Automatic Speech Recognition Isn't a One-Size-Fits-All. 19169-19178 - Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, Inha Cha, William Yang Wang, Alice Oh:

Uncovering Factor-Level Preference to Improve Human-Model Alignment. 19179-19203 - Xiaobo Wang, Zixia Jia, Jiaqi Li, Qi Liu, Zilong Zheng:

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor. 19204-19225 - Oussama Gabouj, Kamel Charaf, Ivan Zakazov, Nicolas Mario Baldwin, Robert West:

GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning. 19226-19244 - Yingqi Peng, Kaijie Gong, Yi Gao, Hao Wang, Wei Dong:

IoTMigrator: LLM-driven Embedded IoT Code Migration across Different OSes for Cloud-device Integration. 19245-19257 - Hao Chen, Yukun Yan, Sen Mei, Wanxiang Che, Zhenghao Liu, Qi Shi, Xinze Li, Yuchun Fan, Pengcheng Huang, Qiushi Xiong, Zhiyuan Liu, Maosong Sun:

ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation. 19258-19278 - Ibrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori:

BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text. 19279-19294 - Liyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, Jie Zhou:

Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings. 19295-19305 - HyeongSik Kim, Xu Yanheng, Chaoqun Dong, Fei Du:

Over-Generation and Compaction: A Prompting Strategy for Procedural Text Adaptation with Large Language Models. 19306-19337 - Julien Knafou, Luc Mottin, Anaïs Mottaz, Alexandre Flament, Patrick Ruch:

TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling. 19338-19354 - Jaehoon Oh, Dokwan Oh:

Beyond Fixed-Length Calibration for Post-Training Compression of LLMs. 19355-19366 - Guangzeng Han, Weisi Liu, Xiaolei Huang:

Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation. 19367-19389 - Hannah Sterz, Fabian David Schmidt, Goran Glavas, Ivan Vulic:

ReCoVeR the Target Language: Language Steering without Sacrificing Task Performance. 19390-19405 - Sheikh Jubair, Arwa Omayrah, Amal Alshammari, Alhanoof Althnian, Abdulhamed Alothaimen, Norah A. Alzahrani, Shahad D. Alzaidi, Nora Al-Twairesh, Abdulmohsen Al-Thubaity:

LC-Eval: A Bilingual Multi-Task Evaluation Benchmark for Long-Context Understanding. 19406-19439 - Monika Wysoczanska, Shyamal Buch, Anurag Arnab, Cordelia Schmid:

OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models. 19440-19457 - Yang Chen, Shuwan Yang, Yan Xiang, Ran Song, Yuxin Huang, Zhengtao Yu:

GRPO-Guided Modality Selection Enhanced LoRA-Tuned LLMs for Multimodal Emotion Recognition. 19458-19471 - Tongyu Wen, Chenglong Wang, Xiyuan Yang, Haoyu Tang, Yueqi Xie, Lingjuan Lyu, Zhicheng Dou, Fangzhao Wu:

Defending against Indirect Prompt Injection by Instruction Detection. 19472-19487 - Seyoung Song, Seogyeong Jeong, Eunsu Kim, Jiho Jin, Dongkwan Kim, Jay Shin, Alice Oh:

MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language. 19488-19514 - Sunguk Choi, Yonghoon Kwon, Heondeuk Lee:

CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive Tasks. 19515-19530 - Ikhyun Cho, Gaeul Kwon, Julia Hockenmaier:

On the Versatility of Sparse Autoencoders for In-Context Learning. 19531-19538 - Shahar Levy, Nir Mazor, Lihi Shalmon, Michael Hassid, Gabriel Stanovsky:

More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG. 19539-19547 - Thomas Huber, Christina Niklaus:

CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language Models. 19548-19568 - Shiyu Xiang, Tong Zhang, Ronghao Chen:

ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense. 19569-19587 - SungHwan Kim, Kwangwook Seo, Tongyoung Kim, Jinyoung Yeo, Dongha Lee:

Stop Playing the Guessing Game! Evaluating Conversational Recommender Systems via Target-free User Simulation. 19588-19605 - Jonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, Sarit Kraus:

Out-of-Context Reasoning in Large Language Models. 19606-19615 - Seung-Yeop Baik, Joonghyuk Hahn, Jungin Kim, Aditi, Mingi Jeon, Yo-Sub Han, Sang-Ki Ko:

CodeComplex: Dataset for Worst-Case Time Complexity Prediction. 19616-19638 - Jianing Lin, Yuanfang Guo, Shunning Liu, Zeming Liu, Yunhong Wang:

Weak2Wise: An Automated, Lightweight Framework for Weak-LLM-Friendly Reasoning Synthesis. 19639-19657 - Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins, André F. T. Martins, Marcely Zanon Boito:

From Tower to Spire: Adding the Speech Modality to a Translation-Specialist LLM. 19658-19673 - Jinhee Jang, Ayoung Moon, Minkyoung Jung, YoungBin Kim, Seung Jin Lee:

LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring. 19674-19687 - Ruobing Wang, Qingfei Zhao, Yukun Yan, Daren Zha, Yuxuan Chen, Shi Yu, Zhenghao Liu, Yixuan Wang, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun:

DeepNote: Note-Centric Deep Retrieval-Augmented Generation. 19688-19715 - Aastik, Topu Sai Meghana, Chinmay Prakash Kulkarni, Pragya Paramita Sahu:

NormAL LoRA: What is the perfect size? 19716-19731 - Vindhya Singh, Sabine Schulte im Walde, Ksenia Keplinger:

Inclusive Leadership in the Age of AI: A Dataset and Comparative Study of LLMs vs. Real-Life Leaders in Workplace Action Planning. 19732-19753 - Jihao Gu, Yingyao Wang, Meng Cao, Pi Bu, Jun Song, Bo Zheng, Yancheng He, Shilong Li:

Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation. 19754-19767 - Advait Joglekar, Divyanshu Singh, Rooshil Rohit Bhatia, Srinivasan Umesh:

EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion. 19768-19774 - Sangjun Moon, Dasom Choi, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura:

Length Representations in Large Language Models. 19775-19793 - Nianqi Li, Zujie Liang, Siyu Yuan, Jiaqing Liang, Feng Wei, Yanghua Xiao:

MultiLingPoT: Boosting Mathematical Reasoning in LLMs through Multilingual Program Integration. 19794-19811 - Pia Sommerauer, Giulia Rambelli, Tommaso Caselli:

Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text. 19812-19831 - Zhikai Ding, Shiyu Ni, Keping Bi:

Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs. 19832-19848 - Utsav Maskey, Chencheng Zhu, Usman Naseem:

Benchmarking Large Language Models for Cryptanalysis and Side-Channel Vulnerabilities. 19849-19865 - Anshul Singh, Chris Biemann, Jan Strich:

MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space. 19866-19891 - Yiran Zhang, Mo Wang, Xiaoyang Li, Kaixuan Ren, Chencheng Zhu, Usman Naseem:

TurnBench-MS: A Benchmark for Evaluating Multi-Turn, Multi-Step Reasoning in Large Language Models. 19892-19924 - Hyeon Hwang, Yewon Cho, Chanwoong Yoon, Yein Park, Minju Song, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang:

Assessing LLM Reasoning Steps via Principal Knowledge Grounding. 19925-19948 - Paramita Mirza, Lucas Weber, Fabian Küch:

Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy. 19949-19974 - Lujie Niu, Haochen Sun, Fangkun Zhao, Sheng Chen, Zimeng Bai, Jiawei Zhang, Caixia Yuan, Xiaojie Wang:

CoTD-PO: Chain-of-Thought Distillation with Preference Optimization. 19975-19986 - Hangdi Xing, Feiyu Gao, Qi Zheng, Zhaoqing Zhu, Zirui Shao, Ming Yan:

Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout Grounding. 19987-19998 - Xiaoyi Wang, Jiwei Zhang, Guangtao Zhang, Honglei Guo:

Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT Sessions. 19999-20017 - Sangmin Song, Juhwan Choi, JungMin Yun, YoungBin Kim:

Beyond Single-User Dialogue: Assessing Multi-User Dialogue State Tracking Capabilities of Large Language Models. 20018-20029 - Davide Testa, Giovanni Bonetta, Raffaella Bernardi, Alessandro Bondielli, Alessandro Lenci, Alessio Miaschi, Lucia C. Passaro, Bernardo Magnini:

All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark. 20030-20050 - Filippo Momentè, Alessandro Suglia, Mario Giulianelli, Ambra Ferrari, Alexander Koller, Oliver Lemon, David Schlangen, Raquel Fernández, Raffaella Bernardi:

Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests. 20051-20072 - Rumana Ferdous Munne, Md. Mostafizur Rahman, Yuji Matsumoto:

Entity Profile Generation and Reasoning with LLMs for Entity Alignment. 20073-20086 - Frederic Kirstein, Sonu Kumar, Terry Ruas, Bela Gipp:

Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions. 20087-20137 - Chongxin Li, Hanzhang Wang, Yuchun Fang:

Attack as Defense: Safeguarding Large Vision-Language Models from Jailbreaking by Adversarial Attacks. 20138-20152 - Bohao Yang, Kun Zhao, Dong Liu, Chen Tang, Liang Zhan, Chenghua Lin:

Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation. 20153-20169 - Minghang Liu, Yinghan Shen, Zihe Huang, Yuanzhuo Wang, Xuhui Jiang, Huawei Shen:

Differentiated Vision: Unveiling Entity-Specific Visual Modality Requirements for Multimodal Knowledge Graph. 20170-20183 - Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto:

Post Persona Alignment for Multi-Session Dialogue Generation. 20184-20192 - Mayank Kulkarni, Vittorio Mazzia, Judith Gaspers, Chris Hench, Jack FitzGerald:

MASSIVE-Agents: A Benchmark for Multilingual Function-Calling in 52 Languages. 20193-20215 - Bohao Yang, Dong Liu, Chenghao Xiao, Kun Zhao, Chen Tang, Chao Li, Lin Yuan, Yang Guang, Chenghua Lin:

Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework. 20216-20240 - Priyanka Dey, Aayush Bothra, Yugal Khanter, Jieyu Zhao, Emilio Ferrara:

Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment. 20241-20262 - Guanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, Feng Chen:

Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them. 20263-20278 - Jirui Qi, Shan Chen, Zidi Xiong, Raquel Fernández, Danielle S. Bitterman, Arianna Bisazza:

When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of Accuracy. 20279-20296 - Xinyi Liu, Weiguang Wang, Hangfeng He:

The Role of Model Confidence on Bias Effects in Measured Uncertainties for Vision-Language Models. 20297-20313 - Horacio Jesús Jarquín-Vásquez, Hugo Jair Escalante, Manuel Montes, Mario Ezra Aragón:

GAttention: Gated Attention for the Detection of Abusive Language. 20314-20329 - Chu Fei Luo, Samuel Dahan, Xiaodan Zhu:

Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback. 20330-20339 - Seung-Won Seo, Soon-Sun Kwon:

ProtoXTM: Cross-Lingual Topic Modeling with Document-Level Prototype-based Contrastive Learning. 20340-20354 - Mengyu Wang, Sotirios Sabanis, Miguel de Carvalho, Shay B. Cohen, Tiejun Ma:

One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning. 20355-20369 - Mikhail Seleznyov, Mikhail Chaichuk, Gleb Ershov, Alexander Panchenko, Elena Tutubalina, Oleg Somov:

When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs. 20370-20385 - Kaishuai Xu, Wenjun Hou, Yi Cheng, Wenjie Li:

RAR²: Retrieval-Augmented Medical Reasoning via Thought-Driven Retrieval. 20386-20396 - Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang:

The Security Threat of Compressed Projectors in Large Vision-Language Models. 20397-20407 - Nuno Guimarães, Purificação Silvano, Ricardo Campos, Alípio Mário Jorge, Ana Filipa Pacheco, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Roman Yangarber, Elisa Sartori, Nicolas Stefanovitch, Preslav Nakov, Jakub Piskorski, Giovanni Da San Martino:

NarratEX Dataset: Explaining the Dominant Narratives in News Texts. 20408-20434 - Salam Khalifa, Nizar Habash, Owen Rambow:

Radical Allomorphy: Phonological Surface Forms without Phonology. 20435-20441 - Mihaela Petre-Vlad, Cornelia Caragea, Florentina Hristea:

Model Calibration for Emotion Detection. 20442-20457 - Volodymyr Mudryi, Yurii Laba:

From Benchmark to Better Embeddings: Leveraging Synonym Substitution to Enhance Multimodal Models in Ukrainian. 20458-20468 - Zineddine Tighidet, Andrea Mogini, Hédi Ben-Younes, Jiali Mei, Patrick Gallinari, Benjamin Piwowarski:

Context Copying Modulation: The Role of Entropy Neurons in Managing Parametric and Contextual Knowledge Conflicts. 20469-20481 - Shiyu Ji, Farnoosh Hashemi, Joice Chen, Juanwen Pan, Weicheng Ma, Hefan Zhang, Sophia Pan, Ming Cheng, Shubham Mohole, Saeed Hassanpour, Soroush Vosoughi, Michael Macy:

A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling. 20482-20503 - Jiayou Wang, Rundong Liu, Yue Hu, Huijia Wu, Zhaofeng He:

SecDecoding: Steerable Decoding for Safer LLM Generation. 20504-20521 - Tuo Wang, Adithya Kulkarni, Tyler Cody, Peter A. Beling, Yujun Yan, Dawei Zhou:

GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models. 20522-20541 - Madhav Krishan Garg, Tejash Prasad, Tanmay Singhal, Chhavi Kirtani, Murari Mandal, Dhruv Kumar:

ReviewEval: An Evaluation Framework for AI-Generated Reviews. 20542-20564 - Abhinay Shankar Belde, Rohit Ramkumar, Jonathan Rusert:

Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select Algorithms. 20565-20598 - Talia Sternberg, Michael London, David Omer, Yossi Adi:

GmSLM: Generative Marmoset Spoken Language Modeling. 20599-20618 - Jacob Dineen, Aswin RRV, Qin Liu, Zhikun Xu, Xiao Ye, Ming Shen, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen, Ben Zhou:

QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA. 20619-20642 - Patrick Schilcher, Dominik Karasin, Michael Schöpf, Haisam Saleh, Antonela Tommasel, Markus Schedl:

Characterizing Positional Bias in Large Language Models: A Multi-Model Evaluation of Prompt Order Effects. 20643-20664 - Yun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao:

You Only Use Reactive Attention Slice When Retrieving From Long Context. 20665-20686 - Shuxin Lin, Dhaval C. Patel, Christodoulos Constantinides:

Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring. 20687-20700 - Zicong Tang, Ziyang Ma, Suqing Wang, Zuchao Li, Lefei Zhang, Hai Zhao, Yun Li, Qianren Wang:

CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models. 20701-20714 - Maya Kruse, Shiyue Hu, Nicholas Derby, Yifu Wu, Samantha Stonbraker, Bingsheng Yao, Dakuo Wang, Elizabeth M. Goldberg, Yanjun Gao:

Large Language Models with Temporal Reasoning for Longitudinal Clinical Summarization and Prediction. 20715-20735 - Benedikt Ebing, Christian Goldschmied, Goran Glavas:

TransAlign: Machine Translation Encoders are Strong Word Aligners, Too. 20736-20749 - Yao Fu, Runchao Li, Xianxuan Long, Haotian Yu, Xiaotian Han, Yu Yin, Pan Li:

Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMs. 20750-20768 - Yujian Liu, Jiabao Ji, Tong Yu, Ryan A. Rossi, Sungchul Kim, Handong Zhao, Ritwik Sinha, Yang Zhang, Shiyu Chang:

Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion. 20769-20786 - Trisevgeni Papakonstantinou, Antonina Zhiteneva, Ana Yutong Ma, Derek Powell, Zachary Horne:

Evaluating Large Language Models for Belief Inference: Mapping Belief Networks at Scale. 20787-20795 - Ahmad Jabbar, Cleo Condoravdi, Christopher Potts:

Distinguishing fair from unfair compositional generalization tasks. 20796-20807 - Guanlin Li, Wenhao Shao, Praboda Rajapaksha, Noël Crespi:

SA-CLIP: Language Guided Image Spatial and Action Feature Learning. 20808-20814 - Batu El, Mert Yüksekgönül, James Zou:

Inefficiencies of Meta Agents for Agent Design. 20815-20824 - Xinyu Zhang, Changzhi Zhou, Linmei Hu, Luhao Zhang, Xiancai Chen, Haomin Fu, Yang Yang, Mengdi Zhang:

SCoder: Progressive Self-Distillation for Bootstrapping Small-Scale Data Synthesizers to Empower Code LLMs. 20825-20841 - Mohamed Elgaar, Hadi Amiri:

Linguistically-Controlled Paraphrase Generation. 20842-20864 - Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Babu Bodapati, Gourav Datta, Peter Anthony Beerel:

LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling. 20865-20881 - Eileen Pan, Anna Seo Gyeong Choi, Maartje ter Hoeve, Skyler Seto, Allison Koenecke:

Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks. 20882-20893 - Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Chenhao Zhu, Xinzhe Juan, Ling Yang, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang:

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling. 20894-20917 - Shravan Nayak, Mehar Bhatia, Xiaofeng Zhang, Verena Rieser, Lisa Anne Hendricks, Sjoerd van Steenkiste, Yash Goyal, Karolina Stanczak, Aishwarya Agrawal:

CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics. 20918-20953 - Chenkun Tan, Pengyu Wang, Shaojun Zhou, Botian Jiang, Zhaowei Li, Dong Zhang, Xinghao Wang, Yaqian Zhou, Xipeng Qiu:

Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLMs. 20954-20970 - JuneYoung Park, Minjae Kang, Seongbae Lee, Haegang Lee, Seongwan Kim, Jaeho Lee:

Riemannian Optimization for LoRA on the Stiefel Manifold. 20971-20985 - Suhas BN, Dominik Mattioli, Andrew M. Sherrill, Rosa I. Arriaga, Christopher W. Wiese, Saeed Abdullah:

How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure Dialogues. 20986-20995 - Vishal Dey, Xiao Hu, Xia Ning:

Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization. 20996-21023 - Gauri Kambhatla, Chantal Shaib, Venkata Govindarajan:

Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona Prompting. 21024-21033 - Aofan Liu, Shiyuan Song, Haoxuan Li, Cehao Yang, Yiyan Qi:

Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification. 21034-21049 - Jiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, Ting Wang:

Watermark under Fire: A Robustness Evaluation of LLM Watermarking. 21050-21074 - Jikun Hu, Dongsheng Guo, Yuli Liu, Qingyao Ai, Lixuan Wang, Xuebing Sun, Qilei Zhang, Quan Zhou, Cheng Luo:

PEPE: Long-context Extension for Large Language Models via Periodic Extrapolation Positional Encodings. 21075-21085 - Yin Jou Huang, Rafik Hadfi:

Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models. 21086-21101 - Jia-Huei Ju, Suzan Verberne, Maarten de Rijke, Andrew Yates:

Controlled Retrieval-augmented Context Evaluation for Long-form RAG. 21102-21121 - Xiangyang Li, Xiaopeng Li, Kuicai Dong, Quanhu Zhang, Rongju Ruan, Xinyi Dai, Yasheng Wang, Ruiming Tang:

Humanity's Last Code Exam: Can Advanced LLMs Conquer Human's Hardest Code Competition? 21122-21137 - Julie Kallini, Dan Jurafsky, Christopher Potts, Martijn Bartelds:

False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models. 21138-21154 - Yue Zuo, Yuxiao Fei, Wanting Ning, Jiayi Huang, Yubo Feng, Lishuang Li:

Rule-Guided Extraction: A Hierarchical Rule Optimization Framework for Document-Level Event Argument Extraction. 21155-21171 - Shuyang Wang, Somayeh Moazeni, Diego Klabjan:

SOPL: A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models. 21172-21185 - Xinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, Xiangxin Zhu, Xianzhi Du:

CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling. 21186-21200 - Shuhui Qu, Jie Wang, Kincho Law:

A Category-Theoretic Approach to Neural-Symbolic Task Planning with Bidirectional Search. 21201-21225 - Trishna Chakraborty, Udita Ghosh, Xiaopan Zhang, Fahim Faisal Niloy, Yue Dong, Jiachen Li, Amit Roy-Chowdhury, Chengyu Song:

HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models. 21226-21243 - Reza Sanayei, Srdjan Vesic, Eduardo Blanco, Mihai Surdeanu:

Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics. 21244-21262 - Zhuohan Long, Siyuan Wang, Shujun Liu, Yuhang Lai:

How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation. 21263-21290 - Jiamian Wang, Ziqi Zhou, Chaithanya Kumar Mummadi, Sohail A. Dianat, Majid Rabbani, Raghuveer Rao, Chen Qiu, Zhiqiang Tao:

Visual Self-Refinement for Autoregressive Models. 21291-21300 - Wenjie Yang, Ruiyuan Huang, Jiaxing Guo, Zicheng Lyu, Tongshan Xu, Shengzhong Zhang, Lun Du, Da Zheng, Zengfeng Huang:

Retrieval-Augmented Language Models are Mimetic Theorem Provers. 21301-21313 - Charles Yu, Qingyun Wang, Yuting Hu, Jinjun Xiong, Heng Ji:

LORE: Continual Logit Rewriting Fosters Faithful Generation. 21314-21328 - Namyoung Kim, Kai Tzu-iunn Ong, Yeonjun Hwang, Minseok Kang, Iiseo Jihn, Gayoung Kim, Minju Kim, Jinyoung Yeo:

PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents. 21329-21368 - Nghiem Thanh Pham, Tung Kieu, Duc-Manh Nguyen, Ha Xuan Son, Nghia Duong-Trung, Danh Le Phuoc:

SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts. 21369-21392 - Lingxi Zhang, Yu-Neng Chuang, Guanchu Wang, Ruixiang Tang, Xuanting Cai, Rajesh Shenoy, Xia Hu:

A Decoupled Multi-Agent Framework for Complex Text Style Transfer. 21393-21403 - Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati:

Mamba Drafters for Speculative Decoding. 21404-21418 - Xidong Wang, Dingjie Song, Shunian Chen, Junying Chen, Zhenyang Cai, Chen Zhang, Lichao Sun, Benyou Wang:

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture. 21419-21436 - Daewon Choi, Jimin Lee, Jihoon Tack, Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati:

Think Clearly: Improving Reasoning via Redundant Token Pruning. 21437-21451 - Zhaxi Zerong, Chenxi Li, Xinyi Liu, Ju-hui Chen, Fei Xia:

A Systematic Survey of Claim Verification: Corpora, Systems, and Case Studies. 21452-21474 - Ruizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao:

Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach. 21475-21488 - Shangyin Tan, Lakshya A. Agrawal, Arnav Singhvi, Liheng Lai, Michael J. Ryan, Daniel Klein, Omar Khattab, Koushik Sen, Matei Zaharia:

LangProBe: a Language Program Benchmark. 21489-21509 - Jingbao Luo, Ming Liu, Aoli Huo, Fujing Hu, Gang Li, Wu Peng:

Exploring and Detecting Self-disclosure in Multi-modal posts on Chinese Social Media. 21510-21527 - Sumin Ha, Jun Hyeong Kim, Yinhua Piao, Changyun Cho, Sun Kim:

MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language Model. 21528-21549 - Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent:

Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics. 21550-21564 - Bhavik Chandna, Mariam Aboujenane, Usman Naseem:

ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content. 21565-21579 - Kurt Micallef, Nizar Habash, Claudia Borg:

Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data. 21580-21590 - Yang Liu, Chenhui Chu:

Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs. 21591-21628 - Minsoo Kim, Seung-won Hwang:

CoEx - Co-evolving World-model and Exploration. 21629-21651 - Jiaqi Duan, Xiaoda Yang, Kaixuan Luan, Hongshun Qiu, Weicai Yan, Xueyi Zhang, Youliang Zhang, Zhaoyang Li, Donglin Huang, Junyu Lu, Ziyue Jiang, Xifeng Yang:

BrainLoc: Brain Signal-Based Object Detection with Multi-modal Alignment. 21652-21662 - Ning Wang, Lei Xie, Sanglu Lu, Shiwei Gan:

PVTNL: Prompting Vision Transformers with Natural Language for Generalizable Person Re-identification. 21663-21674 - Jaemu Heo, Eldor Fozilov, Hyunmin Song, Taehwan Kim:

RingFormer: Rethinking Recurrent Transformer with Adaptive Level Signals. 21675-21686 - Jiajun Chen, Yangyang Wu, Xiaoye Miao, Mengying Zhu, Meng Xi:

TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete Modalities. 21687-21699 - Kevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago, Francesca Toni:

Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models. 21700-21711 - Jiefu Ou, William Gantt Walden, Kate Sanders, Zhengping Jiang, Kaiser Sun, Jeffrey Cheng, William Jurayj, Miriam Wanner, Shaobo Liang, Candice Morgan, Seunghoon Han, Weiqi Wang, Chandler May, Hannah Recknor, Daniel Khashabi, Benjamin Van Durme:

CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers? 21712-21735 - Junbao Huang, Weizhen Li, Peijie Huang, Yuhong Xu:

From Noise to Clarity: Filtering Real and LLM-Generated Samples for Enhanced Intent Detection. 21736-21746 - Brihi Joshi, Xiang Ren, Swabha Swayamdipta, Rik Koncel-Kedziorski, Tim Paek:

Improving Language Model Personas via Rationalization with Psychological Scaffolds. 21747-21770 - Zhen Zhang, Xinyu Wang, Yong Jiang, Zile Qiao, Zhuo Chen, Guangyu Li, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang:

KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models. 21771-21782 - Manan Roy Choudhury, Anirudh Iyengar Kaniyar Narayana Iyengar, Shikhhar Siingh, Sugeeth Puranam, Vivek Gupta:

TABARD: A Novel Benchmark for Tabular Anomaly Analysis, Reasoning and Detection. 21783-21817 - Ge Chen, Zhongqing Wang, Guodong Zhou:

Aspect-based Sentiment Analysis via Synthetic Image Generation. 21818-21829 - Xingwei Tan, Mahathi Parvatham, Chiara Gambi, Gabriele Pergola:

IntrEx: A Dataset for Modeling Engagement in Educational Conversations. 21830-21845 - Minghang Zhu, Zhengliang Shi, Zhiwei Xu, Shiguang Wu, Lingjie Wang, Pengjie Ren, Zhaochun Ren, Zhumin Chen:

Bridging the Capability Gap: Joint Alignment Tuning for Harmonizing LLM-based Multi-Agent Systems. 21846-21861 - Makesh Narsimhan Sreedhar, Traian Rebedea, Christopher Parisien:

Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models. 21862-21880 - Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz:

Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables. 21881-21918 - Zehua Liu, Han Wu, Yuxuan Yao, Xiaojin Fu, Ruifeng She, Xiongwei Han, Tao Zhong, Mingxuan Yuan:

LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging. 21919-21926 - Shunfeng Zheng, Yudi Zhang, Meng Fang, Zihan Zhang, Zhitan Wu, Mykola Pechenizkiy, Ling Chen:

Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving. 21927-21956 - Akriti Jain, Saransh Sharma, Koyel Mukherjee, Soumyabrata Pal:

FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction. 21957-21975 - Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery:

PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels. 21976-21993 - Seunguk Yu, JungMin Yun, Jinhee Jang, YoungBin Kim:

From Ground Trust to Truth: Disparities in Offensive Language Judgments on Contemporary Korean Political Discourse. 21994-22014 - Zhijie Du, Daizong Liu, Pan Zhou:

Misalignment Attack on Text-to-Image Models via Text Embedding Optimization and Inversion. 22015-22032 - César González-Gutiérrez, Ariadna Quattoni:

Domain Pre-training Impact on Representations. 22033-22049 - Jun Seo Kim, Hye Hyeon Kim:

KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM Negotiation. 22050-22078 - Dmitry Popov, Vladislav Negodin, Ekaterina Enikeeva, Iana Matrosova, Nikolay Karpachev, Max Ryabinin:

Refined Assessment for Translation Evaluation: Rethinking Machine Translation Evaluation in the Era of Human-Level Systems. 22079-22095 - Sangyeop Kim, Yohan Lee, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho:

Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue. 22096-22113 - Jiacheng Guo, Yue Wu, Jiahao Qiu, Kaixuan Huang, Xinzhe Juan, Ling Yang, Mengdi Wang:

Temporal Consistency for LLM Reasoning Process Error Identification. 22114-22129 - Zhijin Guo, Chenhao Xue, Zhaozhen Xu, Hongbo Bo, Yuxuan Ye, Janet B. Pierrehumbert, Martha Lewis:

Quantifying Compositionality of Classic and State-of-the-Art Embeddings. 22130-22146 - Siddhesh Pawar, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein:

Presumed Cultural Identity: How Names Shape LLM Responses. 22147-22172 - Mamta Mamta, Oana Cocarascu:

I-GUARD: Interpretability-Guided Parameter Optimization for Adversarial Defense. 22173-22188 - Chao Zhang, Xin Shi, Xueqiao Zhang, Yifan Zhu, Yi Yang, Yawei Luo:

DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization. 22189-22215 - Tom Kempton, Stuart Burrell:

Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models. 22216-22231 - Ainulla Khan, Moyuru Yamada, Akella Srinidhi:

BRIT: Bidirectional Retrieval over Unified Image-Text Graph. 22232-22248 - Boyoung Kim, Dosung Lee, Sumin An, Jinseong Jeong, Paul Hongsuck Seo:

ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking. 22249-22277 - Yongquan Ji, Jingwei Cheng, Fu Zhang, Chenglong Lu:

Capturing Latent Modal Association For Multimodal Entity Alignment. 22278-22293 - Mariia Fedorova, Andrey Kutuzov, Francesco Periti, Yves Scherrer:

Explaining novel senses using definition generation with open language models. 22294-22302 - Seoyeon Kim, Huiseo Kim, Chanjun Park, Jinyoung Yeo, Dongha Lee:

Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching. 22303-22327 - Armel Randy Zebaze, Benoît Sagot, Rachel Bawden:

Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation. 22328-22357 - Armel Randy Zebaze, Benoît Sagot, Rachel Bawden:

TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation. 22358-22381 - Mahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee:

Fast, Not Fancy: Rethinking G2P with Rich Data and Statistical Models. 22382-22408 - Ayan Banerjee, Sandeep Gupta:

Personalized open world plan generation for safety-critical human centered autonomous systems: A case study on Artificial Pancreas. 22409-22422 - Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Vladimir Araujo, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A. Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofía Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio:

CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation. 22423-22441 - Seojin Kim, Hyeontae Song, Jaehyun Nam, Jinwoo Shin:

Training Text-to-Molecule Models with Context-Aware Tokenization. 22442-22460 - Sung Won Kim, Daniel Khashabi:

Challenging the Evaluator: LLM Sycophancy Under User Rebuttal. 22461-22478 - Yilin Cao, Ruike Zhang, Penghui Wei, Qingchao Kong, Wenji Mao:

Perspective-driven Preference Optimization with Entropy Maximization for Diverse Argument Generation. 22479-22496 - Sanjay Booshanam, Kelly Chen, Ondrej Klejch, Thomas Reitmaier, Dani Kalarikalayil Raju, Electra Wallington, Nina Markl, Jennifer Pearson, Matt Jones, Simon Robinson, Peter Bell:

Spoken Document Retrieval for an Unwritten Language: A Case Study on Gormati. 22497-22509 - MSVPJ Sathvik, Zuhair Hasan Shaik, Vivek Gupta:

M-Help: Using Social Media Data to Detect Mental Health Help-Seeking Signals. 22510-22520 - Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling:

Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models. 22521-22543 - Xiaojun Wu, Junxi Liu, Huan-Yi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo:

Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models. 22544-22560 - Mengying Wang, Andreas Spitz:

Quantifying the Risks of LLM- and Tool-assisted Rephrasing to Linguistic Diversity. 22561-22574 - Changyu Zeng, Yifan Wang, Zimu Wang, Wei Wang, Zhengni Yang, Muyi Bao, Jimin Xiao, Anh Nguyen, Yutao Yue:

NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities. 22575-22590 - Emilio Villa-Cueva, S. M. Masrur Ahmed, Rendi Chevi, Jan Christian Blaise Cruz, Kareem Elzeky, Fermin Cristobal, Alham Fikri Aji, Skyler Wang, Rada Mihalcea, Thamar Solorio:

MoMentS: A Comprehensive Multimodal Benchmark for Theory of Mind. 22591-22611 - Andreas Geert Motzfeldt, Joakim Edin, Casper L. Christensen, Christian Hardmeier, Lars Maaløe, Anna Rogers:

Code Like Humans: A Multi-Agent Solution for Medical Coding. 22612-22627 - Michal Stefánik, Timothee Mickus, Michal Spiegel, Marek Kadlcík, Josef Kuchar:

Can Out-of-Distribution Evaluations Uncover Reliance on Prediction Shortcuts? A Case Study in Question Answering. 22628-22635 - Shoubin Yu, Yue Zhang, Ziyang Wang, Jaehong Yoon, Mohit Bansal:

MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation. 22636-22652 - Akshat Gupta, Phudish Prateepamornkul, Maochuan Lu, Ahmed M. Alaa, Thomas Hartvigsen, Gopala Anumanchipalli:

Lifelong Knowledge Editing requires Better Regularization. 22653-22675 - Wenyan Li, Raphael Tang, Chengzu Li, Caiqi Zhang, Ivan Vulic, Anders Søgaard:

Lost in Embeddings: Information Loss in Vision-Language Models. 22676-22693 - Skyler Seto, Maartje ter Hoeve, Maureen de Seyssel, David Grangier:

Assessing the Role of Data Quality in Training Bilingual Language Models. 22694-22720 - Rongzhi Zhang, Chenwei Zhang, Xinyang Zhang, Liang Qiu, Haoming Jiang, Yuchen Zhuang, Qingru Zhang, Hyokun Yun, Xian Li, Bing Yin, Tuo Zhao, Chao Zhang:

DORM: Preference Data Weights Optimization for Reward Modeling in LLM Alignment. 22721-22739 - Marc Felix Brinner, Tarek Al Mustafa, Sina Zarrieß:

Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them. 22740-22754 - Don (Dong Won) Lee, Hae Won Park, Cynthia Breazeal, Louis-Philippe Morency:

Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition. 22755-22787 - Sarfraz Ahmad, Hasan Iqbal, Momina Ahsan, Numaan Naeem, Muhammad Ahsan Riaz Khan, Arham Riaz, Muhammad Arslan Manzoor, Yuxia Wang, Preslav Nakov:

UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking. 22788-22802 - Avneet Kaur:

Echoes of Agreement: Argument Driven Sycophancy in Large Language Models. 22803-22812 - Derin Ozer, Nicolas Gutowski, Benoit Da Mota, Thomas Cauchy, Sylvain Lamprier:

Rethinking NLP for Chemistry: A Critical Look at the USPTO Benchmark. 22813-22825 - Aashaka Desai, Daniela Massiceti, Richard E. Ladner, Hal Daumé III, Danielle Bragg, Alex Xijie Lu:

Investigating Dictionary Expansion for Video-based Sign Language Dictionaries. 22826-22841 - Najrin Sultana, Md. Rafi Ur Rashid, Kang Gu, Shagufta Mehnaz:

From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation. 22842-22859 - Reza Esfandiarpoor, George Zerveas, Ruochen Zhang, Macton Mgonzo, Carsten Eickhoff, Stephen H. Bach:

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance. 22860-22882 - Yuto Nishida, Masaru Isonuma, Yusuke Oda:

Instability in Downstream Task Performance During LLM Pretraining. 22883-22895 - Neal Lawton, Alfy Samuel, Anoop Kumar, Daben Liu:

A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation. 22896-22904 - William P. McCarthy, Saujas Vaduguru, Karl D. D. Willis, Justin Matejka, Judith E. Fan, Daniel Fried, Yewen Pu:

mrCAD: Multimodal Communication to Refine Computer-aided Designs. 22905-22921 - Muntasir Wahed, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Nirav Diwan, Gang Wang, Dilek Hakkani-Tür, Ismini Lourentzou:

MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts? 22922-22948 - Venkatesh Mishra, Amir Saeidi, Satyam Raj, Mutsumi Nakamura, Gaowen Liu, Ali Payani, Jayanth Srinivasa, Chitta Baral:

How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on tau-bench. 22949-22972 - Xuyang Wu, Yuan Wang, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang:

Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and Prompts. 22973-22991 - Tania Chakraborty, Eylon Caplan, Dan Goldwasser:

VIBE: Can a VLM Read the Room? 22992-23008 - Hongyi Liu, Shaochen Zhong, Xintong Sun, Minghao Tian, Mohsen Hariri, Zirui Liu, Ruixiang Tang, Zhimeng Jiang, Jiayi Yuan, Yu-Neng Chuang, Li Li, Soo-Hyun Choi, Rui Chen, Vipin Chaudhary, Xia Hu:

LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem. 23009-23047 - Fakhraddin Alwajih, Samar Mohamed Magdy, Abdellah El Mekki, Omer Nacar, Youssef Nafea, Safaa Taher Abdelfadil, Abdulfattah Mohammed Yahya, Hamzah Luqman, Nada AlMarwani, Samah Aloufi, Baraah Qawasmeh, Houdaifa Atou, Serry Sibaee, Hamzah A. Alsayadi, Walid Al-Dhabyani, Maged Saeed AlShaibani, Aya El aatar, Nour Qandos, Rahaf Alhamouri, Samar Ahmad, Mohammed Anwar Al-Ghrawi, Aminetou Yacoub, Ruwa AbuHweidi, Vatimetou Mohamed Lemin, Reem Abdel-Salam, Ahlam Bashiti, Adel Ammar, Aisha Alansari, Ahmed Ashraf, Nora Alturayeif, Alcides Alcoba Inciarte, AbdelRahim A. Elmadany, Mohamedou Cheikh Tourad, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed:

Pearl: A Multimodal Culturally-Aware Arabic Instruction Dataset. 23048-23079 - Yijia Xiao, Wanjia Zhao, Junkai Zhang, Yiqiao Jin, Han Zhang, Zhicheng Ren, Renliang Sun, Haixin Wang, Guancheng Wan, Pan Lu, Xiao Luo, Yu Zhang, James Zou, Yizhou Sun, Wei Wang:

Protein Large Language Models: A Comprehensive Survey. 23080-23103 - Raoyuan Zhao, Beiduo Chen, Barbara Plank, Michael A. Hedderich:

MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs. 23104-23136 - Manishit Kundu, Sumit Shekhar, Pushpak Bhattacharyya:

Looking Beyond the Pixels: Evaluating Visual Metaphor Understanding in VLMs. 23137-23158 - Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song:

AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents. 23159-23172 - Victor Wang, Michael JQ Zhang, Eunsol Choi:

Improving LLM-as-a-Judge Inference with the Judgment Distribution. 23173-23199 - Wanqian Yang, Aahlad Manas Puli, Rajesh Ranganath:

Learning Is Not A Race: Improving Retrieval in Language Models via Equal Learning. 23200-23211 - Marlene Lutz, Indira Sen, Georg Ahnert, Elisa Rogers, Markus Strohmaier:

The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models. 23212-23237 - Mingze Zhong, Meng Fang, Zijing Shi, Yuxuan Huang, Shunfeng Zheng, Yali Du, Ling Chen, Jun Wang:

Spiral of Silence in Large Language Model Agents. 23238-23253 - Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schütze:

Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing. 23254-23280 - Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz, Eliu A. Huerta, Hao Peng:

Context Length Alone Hurts LLM Performance Despite Perfect Retrieval. 23281-23298 - Luke Yoffe, Alfonso Amayuelas, William Yang Wang:

DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics. 23299-23315 - Kazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha, Sharmila Duppala, Saimadhav Naga Sakhamuri, Zhicheng Liu:

ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision-Language Model Performance. 23316-23348 - Tianyi Zhang:

Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks. 23349-23360 - Xingjian Tao, Yiwei Wang, Yujun Cai, Zhicheng Yang, Jing Tang:

Understanding GUI Agent Localization Biases through Logit Sharpness. 23361-23374 - Sophie Wu, Jan Philip Wahle, Saif M. Mohammad:

The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part Mentions. 23375-23399 - Chuan He, Zhuozhao Li, Song Guo, Xiaocheng Lu, Jinxiang Lai:

HomoGraphAdapter: A Homogeneous Graph Neural Network as an Effective Adapter for Vision-Language Models. 23400-23414 - Xiaoxue Han, Pengfei Hu, Chang Lu, Jun-En Ding, Feng Liu, Yue Ning:

No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery. 23415-23427 - Joshua Tint:

PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response Semantics. 23428-23438 - Lu Wang, Chiming Duan, Pu Zhao, Fangkai Yang, Yong Shi, Xuefeng Luo, Bingjing Xu, Weiwei Deng, Qingwei Lin, Dongmei Zhang:

ICL-Bandit: Relevance Labeling in Advertisement Recommendation Systems via LLM. 23439-23449 - Vishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik:

Intent-aware Schema Generation and Refinement for Literature Review Tables. 23450-23472 - Joshua Tint:

NLP Needs Diversity outside of 'Diversity'. 23473-23479 - Mohammad Saim, Phan Anh Duong, Cat Luong, Aniket Bhanderi, Tianyu Jiang:

Anatomy of a Feeling: Narrating Embodied Emotions via Large Vision-Language Models. 23480-23495 - Tianchun Li, Tianci Liu, Xingchen Wang, Rongzhe Wei, Pan Li, Lu Su, Jing Gao:

Towards Universal Debiasing for Language Models-based Tabular Data Generation. 23496-23512 - Narmeen Oozeer, Luke Marks, Fazl Barez, Amir Abdullah:

Beyond Linear Steering: Unified Multi-Attribute Control for Language Models. 23513-23557 - Yixuan Liu, Abel Elekes, Jianglin Lu, Rodrigo Dorantes Gilardi, Albert-László Barabási:

Unequal Scientific Recognition in the Age of LLMs. 23558-23568 - Md. Atabuzzaman, Andrew Zhang, Christopher Thomas:

Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models. 23569-23582 - Wonjin Yoon, Ian Bulovic, Timothy A. Miller:

Using tournaments to calculate AUROC for zero-shot classification with LLMs. 23583-23591 - Gyunyeop Kim, Sangwoo Kang:

Exploration-Driven Reinforcement Learning for Expert Routing Improvement in Mixture-of-Experts Language Models. 23592-23605 - Yoel Ashkenazi, Etzion Harari, Regev Yehezkel Imra, Naphtali Abudarham, Dekel Cohen, Yoram Louzoun:

D2CS - Documents Graph Clustering using LLM supervision. 23606-23623 - Sahiti Yerramilli, Nilay Pande, Rynaa Grover, Jayant Sravan Tamarapalli:

GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning. 23624-23639 - Anushka Sivakumar, Andrew Zhang, Zaber Ibn Abdul Hakim, Christopher Thomas:

SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models. 23640-23665 - Juhyeong Kim, Sangyeon Yu, Gyunyeop Kim, Sangwoo Kang:

FractalLLM: Lossless Self-Speculative Decoding with Layer Embedded Self-Compression. 23666-23673 - Ryan Solgi, Kai Zhen, Rupak Vignesh Swaminathan, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang:

Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models. 23674-23683 - Simin Hong, Jun Sun, Hongyang Chen:

Third-Person Appraisal Agent: Simulating Human Emotional Reasoning in Text with Large Language Models. 23684-23701 - Hanxu Hu, Jannis Vamvas, Rico Sennrich:

Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents. 23702-23712 - Fengxiang Cheng, Chuan Zhou, Xiang Li, Alina Leidinger, Haoxuan Li, Mingming Gong, Fenrong Liu, Robert van Rooij:

Mitigating Spurious Correlations via Counterfactual Contrastive Learning. 23713-23722 - Chanwoo Choi, Jinsoo Kim, Sukmin Cho, Soyeong Jeong, Buru Chang:

The RAG Paradox: A Black-Box Attack Exploiting Unintentional Vulnerabilities in Retrieval-Augmented Generation Systems. 23723-23744 - Zhenxi Lin, Ziheng Zhang, Jian Wu, Yefeng Zheng, Xian Wu:

Guiding Large Language Models for Biomedical Entity Linking via Restrictive and Contrastive Decoding. 23745-23759 - Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu:

Cut the Deadwood Out: Backdoor Purification via Guided Module Substitution. 23760-23783 - Jingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo, Xiangrong Zhu, Yuanfang Guo, Yunhong Wang, Haifeng Wang:

RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models. 23784-23813 - Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Zhiwei Li, Qingsong Lv, Changxuan Sun, Jiaqi Zeng, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo:

FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMs. 23814-23854 - Maram Hasanain, Md. Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, Firoj Alam:

PropXplain: Can LLMs Enable Explainable Propaganda Detection? 23855-23863 - Qin Hua, Jiaqi Sun, Shiyou Qian, Dingyu Yang, Jian Cao, Guangtao Xue:

EoT: Evolution of Thoughts for Complex Reasoning Tasks. 23864-23886 - Linxi Xie, Xin Teng, Shichang Ke, Hongyi Wen, Shenji Wan:

Reveal and Release: Iterative LLM Unlearning with Self-generated Data. 23887-23899 - Sujin Chen, Kang Wang, Zixuan Zhou, Xiangyu Duan, Wanqun Zhang, Hao Yang, Jinsong Su, Min Zhang:

An Evaluation Resource for Grounding Translation Errors. 23900-23916 - Sunkyung Lee, Seongmin Park, Jonghyo Kim, Mincheol Yoon, Jongwuk Lee:

Enhancing Time Awareness in Generative Recommendation. 23917-23933 - Pranoy Panda, Raghav Magazine, Chaitanya Devaguptapu, Sho Takemori, Vishal Sharma:

Adaptive LLM Routing under Budget Constraints. 23934-23949 - Mohamed Insaf Ismithdeen, Muhammad Uzair Khattak, Salman Khan:

Promptception: How Sensitive Are Large Multimodal Models to Prompts? 23950-23985 - Wenkai Guo, Xuefeng Liu, Haolin Wang, Jianwei Niu, Shaojie Tang, Jing Yuan:

Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation. 23986-24013 - Qingyu Lu, Liang Ding, Siyi Cao, Xuebo Liu, Kanjian Zhang, Jinxia Zhang, Dacheng Tao:

Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments. 24014-24027 - Lei Li, Xiangxu Zhang, Xiao Zhou, Zheng Liu:

AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels. 24028-24047 - Settaluri Lakshmi Sravanthi, Pulkit Agarwal, Debjyoti Mondal, Rituraj Singh, Subhadarshi Panda, Ankit Mishra, Kiran Pradeep, Srihari K. B, Godawari Sudhakar Rao, Pushpak Bhattacharyya:

RG-VQA: Leveraging Retriever-Generator Pipelines for Knowledge Intensive Visual Question Answering. 24048-24060 - Shuyu Guo, Shuo Zhang, Zhaochun Ren:

Enhancing RAG Efficiency with Adaptive Context Compression. 24061-24076 - Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro:

Revealing the impact of synthetic native samples and multi-tasking strategies in Hindi-English code-mixed humour and sarcasm detection. 24077-24107 - Zhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong:

CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models. 24108-24125 - Sungjae Lee, Hoyoung Kim, Jeongyeon Hwang, Eunhyeok Park, Jungseul Ok:

Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs. 24126-24144 - Hiroto Otake, Peinan Zhang, Yusuke Sakai, Masato Mita, Hiroki Ouchi, Taro Watanabe:

BannerBench: Benchmarking Vision Language Models for Multi-Ad Selection with Human Preferences. 24145-24159 - Jian Chen, Zhenyan Chen, Xuming Hu, Peilin Zhou, Yining Hua, Han Fang, Cissy Hing Yee Choy, Xinmei Ke, Jingfeng Luo, Zixuan Yuan:

DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction. 24160-24176 - Junlin Li, Bo Peng, Yu-Yin Hsu:

Facilitating Cross-lingual Transfer of Empathy through Language-independent Latent Diffusion: A Case Study in Chinese. 24177-24192 - Pranav Bhagat, K. N. Ajay Shastry, Pranoy Panda, Chaitanya Devaguptapu:

Evaluating Compound AI Systems through Behaviors, Not Benchmarks. 24193-24222 - Joshua Alan Flashner, Adithya Kulkarni, Dawei Zhou:

SciCompanion: Graph-Grounded Reasoning for Structured Evaluation of Scientific Arguments. 24223-24244 - Zhihao Zhang, Yiran Zhang, Xiyue Zhou, Liting Huang, Imran Razzak, Preslav Nakov, Usman Naseem:

From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation. 24245-24260 - Lorenzo Proietti, Stefano Perrella, Vilém Zouhar, Roberto Navigli, Tom Kocmi:

Estimating Machine Translation Difficulty. 24261-24285 - Kun Zhang, Liqiang Niu, Zhen Cao, Fandong Meng, Jie Zhou:

TIU-Bench: A Benchmark for Evaluating Large Multimodal Models on Text-rich Image Understanding. 24286-24295 - Kavin R. V., Pawan Goyal:

Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics. 24296-24304 - Jipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou:

ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL Dialects. 24305-24326 - Chenxi Wang, Yixuan Zhang, Lang Gao, Zixiang Xu, Zirui Song, Yanbo Wang, Xiuying Chen:

Under the Shadow of Babel: How Language Shapes Reasoning in LLMs. 24327-24344 - Primakov Chungkham, Venktesh V, Vinay Setty, Avishek Anand:

Think Right, Not More: Test-Time Scaling for Numerical Claim Verification. 24345-24363 - Nikolas Gritsch, Qizhen Zhang, Acyr Locatelli, Sara Hooker, Ahmet Üstün:

Nexus: Adaptive Upcycling to Efficiently Pretrain Mixture of Experts. 24364-24381 - Ritvik Choudhary, Rem Hida, Masaki Hamada, Hayato Futami, Toshiyuki Sekiya:

Exploring Context Strategies in LLMs for Discourse-Aware Machine Translation. 24382-24391 - Elisa Sartori, Serena Tardelli, Maurizio Tesconi, Mauro Conti, Alessandro Galeazzi, Stefano Cresci, Giovanni Da San Martino:

Insights into using temporal coordinated behaviour to explore connections between social media posts and influence. 24392-24404 - Junhan Shi, Yijia Zhu, Zhenning Shi, Dan Zhao, Qing Li, Yong Jiang:

SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration. 24405-24415 - Sang Min Jung, Kaixiang Zhang, Cristian Danescu-Niculescu-Mizil:

A Similarity Measure for Comparing Conversational Dynamics. 24416-24447 - Le Huy Khiem, Ting Hua, Nitesh V. Chawla:

AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization. 24448-24458 - Fukun Ma, Kaibin Tian, Jieting Xue, Xiaoyi Wang, Ye Ma, Quan Chen, Peng Jiang, Lijie Wen:

Improving Preference Alignment of LLM with Inference-Free Self-Refinement. 24459-24473 - Ahmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud:

Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees. 24474-24488 - Haiyu Zhao, Zhenyu Guo, Chunhong Zhang, Ziyu Zhou, Zheng Hu:

StructuThink: Reasoning with Task Transition Knowledge for Autonomous LLM-Based Agents. 24489-24506 - Jizhi Zhang, Chongming Gao, Wentao Shi, Xi-Lin Chen, Jingang Wang, Xunliang Cai, Fuli Feng:

Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning. 24507-24521 - Zhongbin Xie, Thomas Lukasiewicz:

Investigating Multi-layer Representations for Dense Passage Retrieval. 24522-24536 - Mengqi Zhang, Bowen Fang, Qiang Liu, Xiaotian Ye, Shu Wu, Pengjie Ren, Zhumin Chen, Liang Wang:

KELE: Residual Knowledge Erasure for Enhanced Multi-hop Reasoning in Knowledge Editing. 24537-24552 - Ansh Poonia, Maeghal Jain:

Dissecting Persona-Driven Reasoning in Language Models via Activation Patching. 24553-24566 - Yaoshu Wang, Mengyi Yan, Wei Wang:

PUER: Boosting Few-shot Positive-Unlabeled Entity Resolution with Reinforcement Learning. 24567-24579 - Aina Garí Soler, Matthieu Labeau, Chloé Clavel:

Toward the Automatic Detection of Word Meaning Negotiation Indicators in Conversation. 24580-24596 - Shiji Yang, Shu Zhao, Congyao Mei, Zhen Yang, Jie Chen, Fulan Qian, Zhen Duan, Yan-ping Zhang:

Forget the Unneeded: Backdooring Large Language Models via Contrastive-enhanced Machine Unlearning. 24597-24607 - Lingnan Xu, Chong Feng, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng:

Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness. 24608-24631 - Woojun Jung, Junyeong Kim:

QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering. 24632-24642 - Cong Liu, Wenchang Chai, Hejun Wu, Yan Pan, Pengxu Wei, Liang Lin:

Thinking Before You Speak: A Proactive Test-time Scaling Approach. 24643-24650 - Wei-Hsiang Lin, Sheng-Lun Wei, Hen-Hsen Huang, Hsin-Hsi Chen:

Do Before You Judge: Self-Reference as a Pathway to Better LLM Evaluation. 24651-24672 - Muhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, Shady Shehata:

Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models. 24673-24695 - Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo:

ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions. 24696-24727 - Hyewon Jeon, Jay-Yoon Lee:

GraphCheck: Multipath Fact-Checking with Entity-Relationship Graphs. 24728-24745 - Parker Seegmiller, Kartik Mehta, Soumya Saha, Chenyang Tao, Shereen Oraby, Arpit Gupta, Tagyoung Chung, Mohit Bansal, Nanyun Peng:

FLAMES: Improving LLM Math Reasoning via a Fine-Grained Analysis of the Data Synthesis Pipeline. 24746-24766 - Leif Azzopardi, Yashar Moshfeghi:

POW: Political Overton Windows of Large Language Models. 24767-24773 - Ting Cai, Stephen Sheen, AnHai Doan:

Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language Models. 24774-24792 - Juan Pablo Muñoz, Jinjie Yuan:

RTTC: Reward-Guided Collaborative Test-Time Compute. 24793-24809 - Ziqing Wang, Chengsheng Mao, Xiaole Wen, Yuan Luo, Kaize Ding:

AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering. 24810-24832 - Pouya Pezeshkpour, Moin Aminnaseri, Estevam Hruschka:

Mixed Signals: Decoding VLMs' Reasoning and Underlying Bias in Vision-Language Conflict. 24833-24848 - Jianfei Zhao, Feng Zhang, Xin Sun, Chong Feng:

Mitigating Hallucination in Large Vision-Language Models through Aligning Attention Distribution to Information Flow. 24849-24863 - Rahul Atul Bhope, Praveen Venkateswaran, K. R. Jayaram, Vatche Isahagian, Vinod Muthusamy, Nalini Venkatasubramanian:

OptiSeq: Ordering Examples On-The-Fly for In-Context Learning. 24864-24887 - Devvrat Joshi, Islem Rekik:

Dependency Parsing-Based Syntactic Enhancement of Relation Extraction in Scientific Texts. 24888-24897 - Ixak Sarasua, Ander Corral, Xabier Saralegi:

DIPLomA: Efficient Adaptation of Instructed LLMs to Low-Resource Languages via Post-Training Delta Merging. 24898-24912 - Takumi Goto, Yusuke Sakai, Taro Watanabe:

Reliability Crisis of Reference-free Metrics for Grammatical Error Correction. 24913-24926 - Ananya Malik, Kartik Sharma, Shaily Bhatt, Lynnette Hui Xian Ng:

Who Speaks Matters: Analysing the Influence of the Speaker's Linguistic Identity on Hate Classification. 24927-24937 - Ananya Malik, Nazanin Sabri, Melissa Karnaze, Mai ElSherief:

Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's Empathy. 24938-24959 - Diyam Akra, Mohammed Khalilia, Mustafa Jarrar:

Active Learning for Multidialectal Arabic POS Tagging. 24960-24973 - Jessica Maghakian, Raunak Sinha, Max Schettewi, Gunkirat Kaur:

Embedding-Free RAG. 24974-24985 - Rajarshi Haldar, Julia Hockenmaier:

Rating Roulette: Self-Inconsistency in LLM-As-A-Judge Frameworks. 24986-25004 - Yangyi Li, Mengdi Huai:

Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering. 25005-25013 - Patrícia Schmidtová, Ondrej Dusek, Saad Mahamood:

Real-World Summarization: When Evaluation Reaches Its Limits. 25014-25026 - Arti Rani, Shweta Singh, Nihar Ranjan Sahoo, Gaurav Kumar Nayak:

Open-DeBias: Toward Mitigating Open-Set Bias in Language Models. 25027-25051 - Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie:

SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization. 25052-25065 - Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S. M. Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson:

Jailbreak Distillation: Renewable Safety Benchmarking. 25066-25089 - Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang:

Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems. 25090-25098 - Odysseas S. Chlapanis, Dimitris Galanis, Nikolaos Aletras, Ion Androutsopoulos:

GreekBarBench: A Challenging Benchmark for Free-Text Legal Reasoning and Citations. 25099-25119 - Yongdong Chi, Hanqing Wang, Yun Chen, Yan Yang, Jian Yang, Zonghan Yang, Xiao Yan, Guanhua Chen:

Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages. 25120-25144 - Changmao Li, Jeffrey Flanigan:

RAC: Efficient LLM Factuality Correction with Retrieval Augmentation. 25145-25159 - James Ford, Anthony Rios:

Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach. 25160-25173 - Abdessalam Ed-dib, Zhanibek Datbayev, Amine Mohamed Aboussalah:

GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning. 25174-25196 - Arun Verma, Zhaoxuan Wu, Zijian Zhou, Xiaoqiang Lin, Zhiliang Chen, Rachael Hwee Ling Sim, Rui Qiao, Jingtan Wang, Nhung Bui, Xinyuan Niu, Wenyang Hu, Gregory Kang Ruey Lau, Zi-Yu Khoo, Zitong Zhao, Xinyi Xu, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low:

Uncovering Scaling Laws for Large Language Models via Inverse Problems. 25197-25211 - Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, Zhumin Chen:

UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets. 25212-25227 - Natasha Johnson, Amanda Bertsch, Maria-Emil Deal, Emma Strubell:

FicSim: A Dataset for Multi-Faceted Semantic Similarity in Long-Form Fiction. 25228-25246 - Chao Feng, Zihao Wei, Andrew Owens:

Masked Diffusion Captioning for Visual Feature Learning. 25247-25263 - Bohan Yao, Vikas Yadav:

Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math Reasoning. 25264-25282 - Didi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu:

Enhancing Goal-oriented Proactive Dialogue Systems via Dynamic Multi-dimensional Consistency Optimization. 25283-25296 - Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen:

Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey. 25297-25311 - Calvin Bao, Connor Baumler, Hal Daumé III, Marine Carpuat:

Who's the Author? How Explanations Impact User Reliance in AI-Assisted Authorship Attribution. 25312-25330 - Zhengyan Sheng, Zhihao Du, Heng Lu, Shiliang Zhang, Zhen-Hua Ling:

UniSpeaker: A Unified Approach for Multimodality-driven Speaker Generation. 25331-25346 - Surgan Jandial, Yinong Oliver Wang, Andrea Bajcsy, Fernando De la Torre:

On the Fine-Grained Planning Abilities of VLM Web Agents. 25347-25380 - Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou:

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models with Human Feedback. 25381-25400 - Jiazhou Ji, Xinru Lu:

ReFLAIR: Enhancing Multimodal Reasoning via Structured Reflection and Reward-Guided Learning. 25401-25413 - Bowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle H. Ungar, Camillo Jose Taylor:

ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations. 25414-25425 - Beibei Yu, Tao Shen, Ling Chen:

STA-CoT: Structured Target-Centric Agentic Chain-of-Thought for Consistent Multi-Image Geological Reasoning. 25426-25444 - Chi Han, Xin Liu, Haodong Wang, Shiyang Li, Jingfeng Yang, Haoming Jiang, Zhengyang Wang, Qingyu Yin, Liang Qiu, Changlong Yu, Yifan Gao, Zheng Li, Bing Yin, Jingbo Shang, Heng Ji:

Can Language Models Follow Multiple Turns of Entangled Instructions? 25445-25460 - Claudio Borile, Carlo Abrate:

How to Generalize the Detection of AI-Generated Text: Confounding Neurons. 25461-25476 - Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang:

SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks. 25477-25503 - Priyanshu Priya, Saurav Dudhate, Desai Yasheshbhai, Asif Ekbal:

We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for Tourism. 25504-25536 - Tereza Vrabcová, Marek Kadlcík, Petr Sojka, Michal Stefánik, Michal Spiegel:

Towards the Roots of the Negation Problem: A Multilingual NLI Dataset and Model Scaling Analysis. 25537-25551 - Sai Ashish Somayajula, Bokai Hu, Qi Cao, Xin Pan, Pengtao Xie:

Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning. 25552-25567 - Hasan Kerem Seker, Gökçe Uludogan, Pelin Önal, Arzucan Özgür:

HATECAT-TR: A Hate Speech Span Detection and Categorization Dataset for Turkish. 25568-25579 - Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, Akmmahbubur Rahman, Aman Chadha, Tariq Iqbal, M. Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali:

DM-Codec: Distilling Multimodal Representations for Speech Tokenization. 25580-25602 - Shuli Zhang, Zhiqiang You, Xiao Xiang Qi, Peng Liu, Gaode Wu, Kan Xia, Shenguang Huang:

LCAN: A Label-Aware Contrastive Attention Network for Multi-Intent Recognition and Slot Filling in Task-Oriented Dialogue Systems. 25603-25612 - Andrei Kucharavy, Sherine Seppey, Cyril Vallez, Dimitri Percia David, Ljiljana Dolamic:

Low-Resource Languages LLM Disinformation is Within Reach: The Case of Walliserdeutsch. 25613-25625 - Kuanchao Chu, Yi-Pei Chen, Hideki Nakayama:

Exploring and Controlling Diversity in LLM-Agent Conversation. 25626-25644 - Sneheel Sarangi, Chetan Talele, Hanan Salam:

Agentic-ToM: Cognition-Inspired Agentic Processing For Enhancing Theory of Mind Reasoning. 25645-25661 - Xinhao Yi, Jake Lever, Kevin Bryson, Zaiqiao Meng:

Can We Edit LLMs for Long-Tail Biomedical Knowledge? 25662-25679 - Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Deli Zhao, Anh Tuan Luu, Yu Rong:

GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning. 25680-25688 - Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou:

CM-Align: Consistency-based Multilingual Alignment for Large Language Models. 25689-25702 - Nearchos Potamitis, Lars Henning Klein, Bardia Mohammadi, Chongyang Xu, Attreyee Mukherjee, Niket Tandon, Laurent Bindschaedler, Akhil Arora:

Cache Saver: A Modular Framework for Efficient, Affordable, and Reproducible LLM Inference. 25703-25724 - Melika Nobakhtian, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar:

Evaluating Cultural Knowledge and Reasoning in LLMs Through Persian Allusions. 25725-25737 - Craig Thomson, Ehud Reiter, João Sedoc, Anya Belz:

Evolving Stances on Reproducibility: A Longitudinal Study of NLP and ML Researchers' Views and Experience of Reproducibility. 25738-25760 - Yajing Yang, Tony Deng, Min-Yen Kan:

KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration. 25761-25785
