RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
arXiv:2509.21128v2 Announce Type: replace Abstract: Large language models (LLMs) are typically trained by reinforcement learning (RL) with verifiable rewards (RLVR)…
arXiv:2505.19342v2 Announce Type: replace-cross Abstract: Multi-device inference can reduce Transformer latency by parallelizing computation. However, existing methods require high inter-device…
Nature Machine Intelligence, Published online: 28 May 2026; doi:10.1038/s42256-026-01248-2 Human–AI interactions reshape the self and our social networks
arXiv:2603.25415v2 Announce Type: replace Abstract: Semantic world models enable embodied agents to reason about objects, relations, and spatial context beyond…
arXiv:2602.12833v2 Announce Type: replace-cross Abstract: Longitudinal clinical reasoning over electronic health records requires tracking evolving physiological measurements, laboratory results, and…
arXiv:2511.04711v2 Announce Type: replace-cross Abstract: Large-scale vision-language models, especially CLIP, have demonstrated remarkable performance across diverse downstream tasks. Soft prompts,…
arXiv:2603.28345v2 Announce Type: replace-cross Abstract: LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that…
arXiv:2604.08059v5 Announce Type: replace-cross Abstract: Software systems built from versioned AI components increasingly need lifecycle-time governance: when a capability module…