Japanese AI Agent System on Human Papillomavirus Vaccination: System Design
arXiv:2601.10718v1 Announce Type: new Abstract: Human papillomavirus (HPV) vaccine hesitancy poses significant public health challenges, particularly in Japan where proactive…
OctoBench: Benchmarking Scaffold-Aware Instruction Following in Repository-Grounded Agentic Coding
arXiv:2601.10343v2 Announce Type: replace-cross Abstract: Modern coding scaffolds turn LLMs into capable software agents, but their ability to follow scaffold-specified…
Relational Linearity is a Predictor of Hallucinations
arXiv:2601.11429v1 Announce Type: cross Abstract: Hallucination is a central failure mode in large language models (LLMs). We focus on hallucinations…
The Great March 100: 100 Detail-oriented Tasks for Evaluating Embodied AI Agents
arXiv:2601.11421v1 Announce Type: cross Abstract: Recently, with the rapid development of robot learning and imitation learning, numerous datasets and methods…
Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling
arXiv:2601.09566v2 Announce Type: replace-cross Abstract: Large language models typically represent Chinese characters as discrete index-based tokens, largely ignoring their visual…
AI Survival Stories: a Taxonomic Analysis of AI Existential Risk
arXiv:2601.09765v1 Announce Type: new Abstract: Since the release of ChatGPT, there has been a lot of debate about whether AI…
Bridging Semantic Understanding and Popularity Bias with LLMs
arXiv:2601.09478v2 Announce Type: replace-cross Abstract: Semantic understanding of popularity bias is a crucial yet underexplored challenge in recommender systems, where…
