Leibniz’s Monadology as Foundation for the Artificial Age Score: A Formal Architecture for AI Memory Evaluation
arXiv:2511.17541v1 Announce Type: new Abstract: This paper develops a mathematically rigorous, philosophically grounded framework for evaluating artificial memory systems, rooted…
GRAPHIC–Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity
arXiv:2511.17443v2 Announce Type: replace-cross Abstract: Artificial Intelligence (AI) has been increasingly applied to creative domains, leading to the development of…
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
arXiv:2511.18890v1 Announce Type: cross Abstract: Efficient deployment of small language models (SLMs) is essential for numerous real-world applications with stringent…
CoreEval: Automatically Building Contamination-Resilient Datasets with Real-World Knowledge toward Reliable LLM Evaluation
arXiv:2511.18889v1 Announce Type: cross Abstract: Data contamination poses a significant challenge to the fairness of LLM evaluations in natural language…
Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT
arXiv:2511.17405v2 Announce Type: replace-cross Abstract: Multiple-choice question answering (MCQA) has been a popular format for evaluating and reinforcement fine-tuning (RFT)…
Stable diffusion models reveal a persisting human and AI gap in visual creativity
arXiv:2511.16814v1 Announce Type: new Abstract: While recent research suggests Large Language Models match human creative performance in divergent thinking tasks,…
WER is Unaware: Assessing How ASR Errors Distort Clinical Understanding in Patient Facing Dialogue
arXiv:2511.16544v2 Announce Type: replace-cross Abstract: As Automatic Speech Recognition (ASR) is increasingly deployed in clinical dialogue, standard evaluations still rely…
Preventing Shortcut Learning in Medical Image Analysis through Intermediate Layer Knowledge Distillation from Specialist Teachers
arXiv:2511.17421v1 Announce Type: cross Abstract: Deep learning models are prone to learning shortcut solutions to problems using spuriously correlated yet…
DS-Span: Single-Phase Discriminative Subgraph Mining for Efficient Graph Embeddings
arXiv:2511.17419v1 Announce Type: cross Abstract: Graph representation learning seeks to transform complex, high-dimensional graph structures into compact vector spaces that…
VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference
arXiv:2511.16449v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models have shown great promise for embodied AI, yet the heavy computational cost…