video-SALMONN S: Streaming Audio-Visual LLMs Beyond Length Limits via Memory
arXiv:2510.11129v1 Announce Type: cross Abstract: Continuous, high-frame-rate, high-resolution processing of long video streams is critical for future AI agents, yet…
arXiv:2510.09541v2 Announce Type: replace-cross Abstract: Diffusion large language models (dLLMs) are emerging as an efficient alternative to autoregressive models due…
arXiv:2510.09278v1 Announce Type: cross Abstract: Training expert LLMs in domains with scarce data is difficult, often relying on multiple-choice questions…
arXiv:2510.08279v2 Announce Type: replace-cross Abstract: Recent advances in neural scene representations have led to unprecedented quality in 3D reconstruction and…
arXiv:2510.09302v1 Announce Type: cross Abstract: Geometric reasoning remains a core challenge for Multimodal Large Language Models (MLLMs). Even the most…
arXiv:2510.08619v1 Announce Type: new Abstract: Large-scale scientific datasets — spanning health biobanks, cell atlases, Earth reanalyses, and more — create…
arXiv:2510.08539v2 Announce Type: replace-cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR), which uses simple binary feedback to post-train large language…
arXiv:2510.07231v2 Announce Type: replace-cross Abstract: Causal reasoning is fundamental for Large Language Models (LLMs) to understand genuine cause-and-effect relationships beyond…
arXiv:2510.07024v2 Announce Type: replace-cross Abstract: LLMs are remarkable artifacts that have revolutionized a range of NLP and AI tasks. A…
arXiv:2510.07331v1 Announce Type: new Abstract: This paper introduces Truth-Aware Decoding (TAD), a verification-oriented decoding scheme that aligns neural language generation…