Collective Communication for 100k+ GPUs
arXiv:2510.20171v4 Announce Type: replace-cross Abstract: The increasing scale of large language models (LLMs) necessitates highly efficient collective communication frameworks, particularly…
Simulation-Free PSRO: Removing Game Simulation from Policy Space Response Oracles
arXiv:2601.05279v1 Announce Type: cross Abstract: Policy Space Response Oracles (PSRO) combines game-theoretic equilibrium computation with learning and is effective in…
IIB-LPO: Latent Policy Optimization via Iterative Information Bottleneck
arXiv:2601.05870v1 Announce Type: cross Abstract: Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) for Large Language Model (LLM) reasoning…
Memorization in Large Language Models in Medicine: Prevalence, Characteristics, and Implications
arXiv:2509.08604v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated significant potential in medicine, with many studies adapting them…
NeoAMT: Neologism-Aware Agentic Machine Translation with Reinforcement Learning
arXiv:2601.03790v1 Announce Type: cross Abstract: Neologism-aware machine translation aims to translate source sentences containing neologisms into target languages. This field…
MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models
arXiv:2601.03331v1 Announce Type: cross Abstract: Recent advances in Vision-Language Models (VLMs) have improved performance in multi-modal learning, raising the question…
OnlineMate: An LLM-Based Multi-Agent Companion System for Cognitive Support in Online Learning
arXiv:2509.14803v4 Announce Type: replace-cross Abstract: In online learning environments, students often lack personalized peer interactions, which are crucial for cognitive…
