Multi-Head Attention
Intermediate
Allows the model to attend to information from different subspaces simultaneously.
Full Definition
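Multi-head attention runs several attention operations ("heads") in parallel, each with its own learned projections of the queries, keys, and values. Because each head works in a different representation subspace, the model can attend to information from several subspaces simultaneously. The head outputs are concatenated and passed through a final linear projection.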
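To make the definition concrete, here is a minimal NumPy sketch of multi-head self-attention. It assumes scaled dot-product attention within each head, a model width that divides evenly by the number of heads, and no masking, biases, or batching. The names multi_head_attention, w_q, w_k, w_v, w_o, and num_heads are illustrative choices, not taken from any particular library.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x of shape (seq_len, d_model).

    Hypothetical helper for illustration: all weight matrices are
    assumed to have shape (d_model, d_model).
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Project the input, then give each head its own slice (subspace).
    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights


# Usage with random weights: 5 tokens, model width 8, 2 heads.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v, w_o = (rng.normal(size=(8, 8)) for _ in range(4))
out, weights = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=2)
```

Each head produces its own (seq_len, seq_len) attention pattern in weights, which is what lets different heads focus on different relationships between the same tokens. The output has the same shape as the input.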