The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation
arXiv:2603.24124v1 Announce Type: cross Abstract: RLHF-aligned language models exhibit response homogenization: on TruthfulQA (n=790), 40-79% of questions produce a single…
Upper Entropy for 2-Monotone Lower Probabilities
arXiv:2603.23558v1 Announce Type: cross Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifying…
PLDR-LLMs Reason At Self-Organized Criticality
arXiv:2603.23539v1 Announce Type: new Abstract: We show that PLDR-LLMs pretrained at self-organized criticality exhibit reasoning at inference time. The characteristics…
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
arXiv:2603.23064v2 Announce Type: replace-cross Abstract: We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered…
Cost-Sensitive Neighborhood Aggregation for Heterophilous Graphs: When Does Per-Edge Routing Help?
arXiv:2603.24291v1 Announce Type: cross Abstract: Recent work distinguishes two heterophily regimes: adversarial, where cross-class edges dilute class signal and harm…
The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents
arXiv:2603.24284v1 Announce Type: cross Abstract: When multiple LLM-based code agents independently implement parts of the same class, they must agree…
Tiny Inference-Time Scaling with Latent Verifiers
arXiv:2603.22492v2 Announce Type: replace-cross Abstract: Inference-time scaling has emerged as an effective way to improve generative models at test time…
