Robust Safety Monitoring of Language Models via Activation Watermarking
arXiv:2603.23171v1 Announce Type: cross Abstract: Large language models (LLMs) can be misused to reveal sensitive information, such as weapon-making instructions…
arXiv:2603.21825v2 Announce Type: replace-cross Abstract: Evaluating badminton performance often requires expert coaching, which is rarely accessible for amateur players. We…
arXiv:2603.22920v1 Announce Type: cross Abstract: The EU AI Act constitutes an important development in shaping the Union’s digital regulatory architecture.…
arXiv:2603.22966v1 Announce Type: cross Abstract: Large language models (LLMs) inherently operate over a large generation space, yet conventional usage typically…
arXiv:2603.22153v2 Announce Type: replace-cross Abstract: Recent advances in cross-view geo-localization (CVGL) methods have shown strong potential for supporting unmanned aerial…
arXiv:2603.22306v1 Announce Type: new Abstract: Affective judgment in real interaction is rarely a purely local prediction problem. Emotional meaning often…
Nature Machine Intelligence, Published online: 25 March 2026; doi:10.1038/s42256-026-01195-y Frank et al. introduce Euclidean fast attention, a linear-scaling framework for…
Nature Machine Intelligence, Published online: 25 March 2026; doi:10.1038/s42256-026-01204-0 Muzellec and Kar use reverse predictivity to show that only a…
arXiv:2603.20991v1 Announce Type: cross Abstract: A single matrix out of 468 in GPT-2 Small can increase perplexity by 20,000x when…
arXiv:2603.21108v1 Announce Type: cross Abstract: Molecular property prediction constitutes a cornerstone of drug discovery and materials science, necessitating models capable…