NeedleChain: Measuring Intact Context Comprehension Capability of Large Language Models
arXiv:2507.22411v2
Abstract: Recent reports suggest that LLMs can handle increasingly long contexts. However, many existing benchmarks for…