Results for "ReLU"
Activation Function
Intermediate
Nonlinear functions that enable networks to approximate complex mappings; ReLU and its variants dominate modern deep learning.
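
A minimal sketch of three common activations, written in NumPy; the function names (sigmoid, tanh, relu) are illustrative and not taken from the entry itself:

    import numpy as np

    def sigmoid(x):
        # squashes inputs to (0, 1); saturates for large |x|
        return 1.0 / (1.0 + np.exp(-x))

    def tanh(x):
        # squashes inputs to (-1, 1); zero-centered but still saturating
        return np.tanh(x)

    def relu(x):
        # max(0, x): identity for positive inputs, zero otherwise
        return np.maximum(0.0, x)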
Vanishing Gradient
Intermediate
Gradients shrink as they propagate backward through layers, slowing learning in early layers; mitigated by ReLU, residual connections, and normalization.
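
A toy illustration of the effect, assuming a chain of sigmoid layers (not from the entry itself): each layer contributes a chain-rule factor of at most 0.25, so the product across many layers collapses toward zero, while ReLU's factor of 1 on the positive side does not:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    depth = 30
    x, grad = 0.5, 1.0
    for _ in range(depth):
        s = sigmoid(x)
        grad *= s * (1.0 - s)  # chain-rule factor for this layer (<= 0.25)
        x = s
    print(grad)  # on the order of 1e-20: the "vanishing" gradient

    # With ReLU on positive inputs the per-layer factor is 1,
    # so the same product stays at 1 instead of shrinking.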
ReLU
Intermediate
The activation function max(0, x); its non-saturating positive region improves gradient flow and training speed in deep nets.
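
A short sketch of ReLU and its subgradient in NumPy; relu_grad is an illustrative helper name, not part of the entry:

    import numpy as np

    def relu(x):
        # elementwise max(0, x)
        return np.maximum(0.0, x)

    def relu_grad(x):
        # subgradient: 1 where x > 0, else 0
        return (x > 0).astype(x.dtype)

    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(x))       # [0.  0.  0.  0.5 2. ]
    print(relu_grad(x))  # [0. 0. 0. 1. 1.]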