BAGEL: Benchmarking Animal Knowledge Expertise in Language Models
arXiv:2604.16241v1 Announce Type: cross Abstract: Large language models have shown strong performance on broad-domain knowledge and reasoning benchmarks, but it…
arXiv:2604.15495v1 Announce Type: new Abstract: Navigating complex, densely packed environments like retail stores, warehouses, and hospitals poses a significant spatial…
arXiv:2604.14646v2 Announce Type: replace Abstract: Recent advances in reinforcement learning (RL) have improved the reasoning capabilities of large language models…
arXiv:2510.27617v2 Announce Type: replace Abstract: Automation of Register Transfer Level (RTL) design can help developers meet increasing computational demands. Large…
arXiv:2505.21569v3 Announce Type: replace-cross Abstract: Although LLM-based agents are proven to master tool orchestration in scientific fields, particularly chemistry, their…
arXiv:2604.15456v1 Announce Type: new Abstract: Trustworthiness and transparency are essential for the clinical adoption of artificial intelligence (AI) in healthcare…
arXiv:2604.14967v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) extends Large Vision-Language Models (LVLMs) with external visual knowledge. However, existing visual…
arXiv:2604.16104v1 Announce Type: cross Abstract: Lung cancer remains one of the leading causes of cancer-related mortality worldwide. Conventional computed tomography…
arXiv:2604.16090v1 Announce Type: cross Abstract: Probabilistic Synchronous Parallel (PSP) is a technique in distributed learning systems to reduce synchronization bottlenecks…
arXiv:2604.14373v2 Announce Type: replace-cross Abstract: Rural environmental risks are shaped by place-based conditions (e.g., housing quality, road access, land-surface patterns),…