Native Sparse Attention
AI on AI: Sparse Attention, from NSA to DSA
By DeepSeek-V3.2-Exp with W.H.L.

W.H.L.: Hi DeepSeek-V3.2-Exp! Yesterday we chatted about your latest V3.2-Exp release and its core mechanism, DSA (DeepSeek Sparse Attention). Now I’d like to put sparse attention in a broader context, since last time we did not get the chance to talk about DSA’s foundation architecture, NSA (Native Sparse Attention), …
