Understanding Attention Heads in LLMs

March 14, 2026 · Riya

Attention heads live inside the Transformer layer

Each Transformer layer contains two main parts:

1. Multi-Head Self-Attention
2. Feed-Forward Neural Network
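
To make the two-part structure concrete, here is a minimal pure-Python sketch of how the parts could be chained. It is a toy under stated assumptions, not a real implementation: `mix_tokens` is a hypothetical stand-in for multi-head self-attention that simply averages each token with the earlier ones, the weights `w1` and `w2` are made up, and biases, residual connections and layer normalization are left out.

```python
def mix_tokens(tokens):
    """Placeholder for multi-head self-attention (assumption): each token
    becomes the plain average of itself and all earlier tokens.
    Real attention heads learn these mixing weights instead."""
    mixed = []
    for i in range(len(tokens)):
        visible = tokens[: i + 1]
        mixed.append([sum(col) / len(visible) for col in zip(*visible)])
    return mixed

def feed_forward(token, w1, w2):
    """Feed-forward network applied to one token: linear, ReLU, linear."""
    hidden = [max(0.0, sum(x * w for x, w in zip(token, row))) for row in w1]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]

def transformer_layer(tokens, w1, w2):
    """Part 1 mixes information across tokens; part 2 transforms each token alone."""
    attended = mix_tokens(tokens)
    return [feed_forward(t, w1, w2) for t in attended]

# Hypothetical toy weights: 2-dimensional tokens, 3 hidden units.
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]  # one row per hidden unit
w2 = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]     # one row per output dimension
tokens = [[1.0, 2.0], [3.0, 0.0]]
out = transformer_layer(tokens, w1, w2)
```

The point of the sketch is the division of labour: the first part is the only place where tokens exchange information, while the second part processes every token independently.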

Attention heads operate inside each Transformer layer, determining how the tokens in the input sequence relate to one another before the model predicts the next token.
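
As a rough illustration of how heads relate tokens to each other, the sketch below implements scaled dot-product attention with several heads in pure Python. Everything specific in it is an assumption for the example: the sizes (`d_model = 4`, two heads of size 2), the random weights, and the causal mask that lets each token look only at itself and earlier tokens, which is the usual setup for next-token prediction. The output projection that normally follows the concatenation is omitted.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(vec, mat):
    """Multiply a row vector of length n by an n x m matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def random_matrix(rows, cols, rng):
    """Made-up weights; a trained model would have learned these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def attention_head(tokens, w_q, w_k, w_v):
    """One head: returns (outputs, weights) for a list of token vectors."""
    queries = [matvec(t, w_q) for t in tokens]
    keys = [matvec(t, w_k) for t in tokens]
    values = [matvec(t, w_v) for t in tokens]
    d_head = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        # Causal mask (assumed): token i only scores tokens 0..i.
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_head)
                  for j in range(i + 1)]
        w = softmax(scores)
        # The head's output is a weighted sum of the visible value vectors.
        outputs.append([sum(w[j] * values[j][c] for j in range(i + 1))
                        for c in range(d_head)])
        # Pad with zeros so every row has one weight per token.
        weights.append(w + [0.0] * (len(tokens) - i - 1))
    return outputs, weights

def multi_head_attention(tokens, heads):
    """Run every head and concatenate their outputs for each token."""
    per_head = [attention_head(tokens, *h) for h in heads]
    concat = [sum((outs[i] for outs, _ in per_head), [])
              for i in range(len(tokens))]
    return concat, [w for _, w in per_head]

rng = random.Random(0)
d_model, n_heads, d_head = 4, 2, 2
tokens = [[rng.uniform(-1.0, 1.0) for _ in range(d_model)] for _ in range(3)]
heads = [tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(n_heads)]
concat, all_weights = multi_head_attention(tokens, heads)
```

Each row of `all_weights[h]` shows how strongly head `h` ties one token to the others. Because every head has its own query, key and value matrices, different heads can pick up different relationships over the same tokens.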