ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding
arXiv:2508.21496v2
Abstract: Video multimodal large language models (Video-MLLMs) have achieved remarkable progress in video understanding. However, they…
