BrowseConf: Confidence-Guided Test-Time Scaling for Web Agents
arXiv:2510.23458v2 Announce Type: replace-cross Abstract: Confidence in LLMs is a useful indicator of model uncertainty and answer reliability. Existing work…
arXiv:2510.23409v2 Announce Type: replace-cross Abstract: Data valuation has become central in the era of data-centric AI. It drives efficient training…
arXiv:2510.21280v2 Announce Type: replace-cross Abstract: While recent sound event detection (SED) systems can identify baleen whale calls in marine audio,…
arXiv:2510.23028v1 Announce Type: cross Abstract: AutoRegressive (AR) models have demonstrated competitive performance in image generation, achieving results comparable to those…
arXiv:2510.21720v1 Announce Type: new Abstract: The confluence of Artificial Intelligence and Computational Psychology presents an opportunity to model, understand, and…
arXiv:2510.20819v2 Announce Type: replace-cross Abstract: Recent advances in generative modeling have positioned diffusion models as state-of-the-art tools for sampling from…
arXiv:2510.23034v1 Announce Type: cross Abstract: Binarized Neural Networks (BNNs) are a class of deep neural networks designed to utilize minimal…
arXiv:2506.07736v3 Announce Type: replace Abstract: Large Language Models (LLMs) continue to exhibit vulnerabilities despite deliberate safety alignment efforts, posing significant…
arXiv:2510.20706v2 Announce Type: replace-cross Abstract: Model-free reinforcement learning (RL) has enabled adaptable and agile quadruped locomotion; however, policies often converge…
arXiv:2510.20838v1 Announce Type: new Abstract: This study introduces a human-in-the-loop pipeline that converts unscaled, hand-drawn floor plan sketches into semantically…