VideoDeepResearch: Long Video Understanding With Agentic Tool Using
arXiv:2506.10821v1 Announce Type: cross Abstract: Long video understanding (LVU) presents a significant challenge for current multi-modal large language models (MLLMs)…
arXiv:2506.09820v2 Announce Type: replace-cross Abstract: Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language…
arXiv:2506.10130v1 Announce Type: new Abstract: This article introduces a conjecture that formalises a fundamental trade-off between provable correctness and broad…
arXiv:2506.07563v2 Announce Type: replace-cross Abstract: Personalized recommendation systems must adapt to user interactions across different domains. Traditional approaches like MLoRA…
arXiv:2506.08822v1 Announce Type: cross Abstract: Generative modeling-based visuomotor policies have been widely adopted in robotic manipulation attributed to their ability…
arXiv:2506.08827v1 Announce Type: cross Abstract: The extraction of information about traffic accidents from legal documents is crucial for quantifying insurance…
arXiv:2506.07976v2 Announce Type: replace-cross Abstract: The current paradigm of test-time scaling relies on generating long reasoning traces (“thinking” more) before…
arXiv:2506.08026v1 Announce Type: new Abstract: This paper proposes TIP-Search, a time-predictable inference scheduling framework for real-time market prediction under uncertain…