Going All-In on LLM Accuracy: Fake Prediction Markets, Real Confidence Signals
arXiv:2512.05998v1 Announce Type: new Abstract: Large language models are increasingly used to evaluate other models, yet these judgments typically lack…
arXiv:2512.05397v2 Announce Type: replace-cross Abstract: Major life transitions demand high-stakes decisions, yet people often struggle to imagine how their future…
arXiv:2512.07400v1 Announce Type: cross Abstract: A persistent paradox in continual learning (CL) is that neural networks often retain linearly separable…
arXiv:2512.07371v1 Announce Type: cross Abstract: Behavior-cloning based visuomotor policies enable precise manipulation but often inherit the slow, cautious tempo of…
arXiv:2512.05103v2 Announce Type: replace-cross Abstract: Video generation models are rapidly advancing, but can still struggle with complex video outputs that…
arXiv:2512.05122v1 Announce Type: new Abstract: Small and medium-sized enterprises (SMEs) still depend heavily on tacit, experience-based know-how that rarely makes…
arXiv:2512.03728v2 Announce Type: replace-cross Abstract: The 3rd Generation Partnership Project (3GPP), the standards body for mobile networks, is in the…
arXiv:2512.05693v1 Announce Type: cross Abstract: The development of foundation models for embodied intelligence critically depends on access to large-scale, high-quality…
arXiv:2512.05681v1 Announce Type: cross Abstract: Retrieving case law is a time-consuming task predominantly carried out by querying databases. We provide…
arXiv:2512.03102v2 Announce Type: replace-cross Abstract: In emergency response and other high-stakes societal applications, early-stage state estimates critically shape downstream outcomes.…