FBS: Modeling Native Parallel Reading inside a Transformer

By Admin

Apr 9, 2026

THE AI TODAY

arXiv:2601.21708v2
Abstract: Large language models (LLMs) excel across many tasks, yet inference is still dominated by strictly token-by-token autoregression. Existing acceleration methods largely patch this pipeline and miss core human-reading ingredients: content-adaptive foresight, chunk-structure-aware compute allocation, and train-test consistency for preview/skimming. We propose the Fovea-Block-Skip Transformer (FBS), which injects a causal, trainable loop into Transformers via Parafovea-Attention Window (PAW), Chunk-Head (CH), and Skip-Gate (SG). Across diverse benchmarks, FBS improves the quality-efficiency trade-off without increasing parameters, and ablations show the three modules are complementary.
