MiniMax Sparse Attention (MSA): a Two-Branch Block-Sparse Attention Trained on a 109B-Parameter MoE With a 3T-Token Budget
Low Severity
Global
Date Occurred: Jun 17, 2026 07:44 UTC
Event Type: AI News
Source: AI News
Recorded: Jun 17, 2026
Full Description
<p>MiniMax released MSA, a sparse attention mechanism built on Grouped Query Attention (GQA). A lightweight Index Branch selects the top-k key-value blocks for each query and GQA group; the Main Branch attends only to those b