An Understanding of Attention Heads in LLMs
Attention heads live inside the Transformer layer. Each Transformer layer contains two main parts:

1. Multi-Head Self-Attention
2. Feed-Forward Neural Network

Attention heads operate inside each Transformer layer to determine how the tokens in a sentence relate to each other before the model predicts the next token.
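To make this concrete, here is a minimal sketch of multi-head self-attention in NumPy. It is an illustration, not a production implementation: the projection weights are random stand-ins (a trained model learns them), and the function name `multi_head_self_attention` and shapes are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, n_heads, rng):
    """Minimal multi-head self-attention over a token sequence.

    x: (seq_len, d_model) token embeddings.
    Weights are random for illustration; a real model learns them.
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    outputs = []
    for _ in range(n_heads):
        # Each head has its own query/key/value projections.
        Wq = rng.standard_normal((d_model, d_head))
        Wk = rng.standard_normal((d_model, d_head))
        Wv = rng.standard_normal((d_model, d_head))
        Q, K, V = x @ Wq, x @ Wk, x @ Wv
        # Scores measure how strongly every token attends to every other token.
        scores = Q @ K.T / np.sqrt(d_head)
        weights = softmax(scores, axis=-1)   # rows sum to 1, shape (seq_len, seq_len)
        outputs.append(weights @ V)          # shape (seq_len, d_head)
    # Concatenating the per-head outputs restores the d_model width.
    return np.concatenate(outputs, axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))              # 4 tokens, d_model = 8
out = multi_head_self_attention(x, n_heads=2, rng=rng)
print(out.shape)                             # (4, 8)
```

Each head computes its own attention pattern over the token sequence, which is why different heads can specialize in different token-to-token relationships.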