DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models
arXiv:2603.18048v1 Announce Type: new Abstract: Recent Audio Multimodal Large Language Models (Audio MLLMs) demonstrate impressive performance on speech benchmarks, yet…
TDAD: Test-Driven Agentic Development – Reducing Code Regressions in AI Coding Agents via Graph-Based Impact Analysis
arXiv:2603.17973v2 Announce Type: replace-cross Abstract: AI coding agents can resolve real-world software issues, yet they frequently introduce regressions — breaking…
Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework
arXiv:2603.18677v1 Announce Type: cross Abstract: Artificial intelligence is increasingly embedded in human decision-making, where it can either enhance human reasoning…
Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation
arXiv:2603.18655v1 Announce Type: cross Abstract: Medical ultrasound image segmentation faces significant challenges due to limited labeled data and characteristic imaging…
SCALE: Scalable Conditional Atlas-Level Endpoint transport for virtual cell perturbation prediction
arXiv:2603.17380v2 Announce Type: replace-cross Abstract: Virtual cell models aim to enable in silico experimentation by predicting how cells respond to…
