Fixed typo in MQA explanation

#5
by foldl - opened
Files changed (1)
  1. app/src/content/article.mdx +1 -1
app/src/content/article.mdx CHANGED
@@ -532,7 +532,7 @@ The [GQA paper](https://arxiv.org/abs/2305.13245) explains how grouped-query att
 
   </Sidenote>
 
- NanoChat uses Multi-Query Attention (MQA) to reduce the memory footprint of the KV cache, using 6 query heads but only 6 key/value heads (in the default config). This is a common configuration for smaller models like nanochat.
+ NanoChat uses Multi-Query Attention (MQA) to reduce the memory footprint of the KV cache, using 6 query heads but only 1 key/value head (in the default config). This is a common configuration for smaller models like nanochat.
 
   <Sidenote>
 
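
For context on why the key/value head count in this sentence matters, here is a rough sketch of how the KV cache scales with it. Only the 6 query heads come from the changed line. The layer count, head dimension, sequence length, and 2-byte element size are illustrative assumptions, and `kv_cache_bytes` is a hypothetical helper, not part of nanochat.

```python
def kv_cache_bytes(n_layer, n_kv_head, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Approximate KV cache size: one K and one V tensor per layer."""
    return 2 * n_layer * n_kv_head * head_dim * seq_len * batch_size * bytes_per_elem


# Assumed, illustrative values (not taken from the nanochat config).
n_layer, head_dim, seq_len = 20, 128, 2048
n_query_heads = 6  # from the sentence changed in this PR

# Standard multi-head attention: one K/V head per query head.
mha = kv_cache_bytes(n_layer, n_kv_head=n_query_heads,
                     head_dim=head_dim, seq_len=seq_len)
# Multi-query attention: all query heads share a single K/V head.
mqa = kv_cache_bytes(n_layer, n_kv_head=1,
                     head_dim=head_dim, seq_len=seq_len)

print(f"MHA: {mha / 2**20:.0f} MiB, MQA: {mqa / 2**20:.0f} MiB, "
      f"ratio: {mha // mqa}x")
```

Under these assumptions the cache shrinks by a factor equal to the number of query heads. With 6 key/value heads for 6 query heads there would be no saving at all.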