Exploration and Exploitation Errors Are Measurable for Language Model Agents
arXiv:2604.13151v1 Announce Type: new Abstract: Language Model (LM) agents are increasingly used in complex open-ended decision-making tasks, from AI coding…
arXiv:2508.05153v2 Announce Type: replace-cross Abstract: Category-level generalization for robotic garment manipulation, such as bimanual smoothing, remains a significant hurdle due…
arXiv:2601.03523v2 Announce Type: replace Abstract: One of the most important queries in knowledge compilation is weighted model counting (WMC), which…
arXiv:2604.11465v2 Announce Type: replace Abstract: Large language model (LLM) agents show promise on realistic tool-use tasks, but deploying capable agents…
arXiv:2604.13924v1 Announce Type: cross Abstract: Time-series anomaly detection (TSAD) is critical in domains such as industrial monitoring, healthcare, and cybersecurity,…
arXiv:2604.14128v1 Announce Type: cross Abstract: Rhetorical questions are asked not to seek information but to persuade or signal stance. How…