Epistemic Traps: Rational Misalignment Driven by Model Misspecification
arXiv:2602.17676v1 Announce Type: new Abstract: The rapid deployment of Large Language Models and AI agents across critical societal and technical…
arXiv:2602.17037v2 Announce Type: replace-cross Abstract: Autonomous coding agents, powered by large language models (LLMs), are increasingly being adopted in the…
arXiv:2602.18277v1 Announce Type: cross Abstract: This work studies heterogeneous Multi-Objective Reinforcement Learning (MORL), where objectives can differ sharply in temporal…
arXiv:2602.18262v1 Announce Type: cross Abstract: While mechanistic interpretability has developed powerful tools to analyze the internal workings of Large Language…
arXiv:2602.15997v2 Announce Type: replace-cross Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across…
arXiv:2602.17557v1 Announce Type: cross Abstract: Alzheimer’s disease (AD) and Lewy body dementia (LBD) present overlapping clinical features yet require distinct…
arXiv:2602.17395v1 Announce Type: cross Abstract: Generalized Category Discovery (GCD) aims to identify novel categories in unlabeled data while leveraging a…