Multi-Head Attention

Intermediate

Allows a model to attend to information from different representation subspaces simultaneously.

Full Definition

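Multi-head attention runs several attention operations ("heads") in parallel. Each head applies its own learned linear projections to the queries, keys, and values, so it can attend to information from a different representation subspace. The head outputs are concatenated and passed through a final linear projection, which lets the model combine information from all subspaces at once.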
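
The idea can be illustrated with a minimal NumPy sketch of scaled dot-product self-attention split across heads. It is not a reference implementation: the function name `multi_head_attention`, the weight names `w_q`, `w_k`, `w_v`, `w_o`, the toy sizes, and the random weights are all assumptions made for illustration, and masking, biases, dropout, and batching are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x of shape (seq_len, d_model).

    Hypothetical helper for illustration; weights are (d_model, d_model).
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Project the input, then give each head its own slice (subspace).
    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, seq, d_head)

    # Concatenate the heads and mix them with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights

# Toy example with assumed sizes and random (untrained) weights.
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, weights = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape)      # (4, 8)
print(weights.shape)  # (2, 4, 4)
```

Each head produces its own `seq_len × seq_len` attention pattern over a `d_head`-dimensional slice of the projections, which is what lets different heads focus on different relationships in the same input.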
