MASEval: Extending Multi-Agent Evaluation from Models to Systems
arXiv:2603.08835v1 Announce Type: new Abstract: The rapid adoption of LLM-based agentic systems has produced a rich ecosystem of frameworks (smolagents,…
arXiv:2603.05437v2 Announce Type: replace-cross Abstract: Weakly-Supervised Dense Video Captioning aims to localize and describe events in videos trained only on…
arXiv:2603.08059v1 Announce Type: cross Abstract: With the rapid advancement of commercial multi-modal models, image editing has garnered significant attention due…
arXiv:2603.08090v1 Announce Type: cross Abstract: Significant progress has been achieved in subject-driven text-to-image (T2I) generation, which aims to synthesize new…
arXiv:2603.05768v2 Announce Type: replace-cross Abstract: Model merging integrates multiple task-specific models into a single consolidated one. Recent research has made…
arXiv:2603.06587v1 Announce Type: new Abstract: The deployment of autonomous AI agents in derivatives markets has widened a practical gap between…
arXiv:2603.04413v2 Announce Type: replace-cross Abstract: Meaning in human language is relational, context dependent, and emergent, arising from dynamic systems of…
arXiv:2603.06351v1 Announce Type: cross Abstract: Diffusion Transformers process images as fixed-length sequences of tokens produced by a static $\textit{patchify}$ operation.…
arXiv:2603.06361v1 Announce Type: cross Abstract: Accurate fault detection in high-dimensional industrial environments remains a major challenge due to the inherent…
arXiv:2603.05504v2 Announce Type: replace-cross Abstract: Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces…