ChessArena: A Chess Testbed for Evaluating Strategic Reasoning Capabilities of Large Language Models
arXiv:2509.24239v4 (Announce Type: replace-cross)
Abstract: Recent large language models (LLMs) have shown strong reasoning capabilities. However, a critical question remains:…
