Results for "rectified linear"
Activation function f(x) = max(0, x); mitigates vanishing gradients and speeds training in deep networks.
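The definition above can be sketched in a few lines of NumPy; the function name `relu` and the sample inputs are illustrative.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: passes positive values through, zeroes out negatives.
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
y = relu(x)  # negatives become 0.0; 1.5 passes through unchanged
```

Because the derivative is exactly 1 for positive inputs, gradients pass through active units unattenuated, which is the "improves gradient flow" property.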
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
Mathematical foundation for ML involving vector spaces, matrices, and linear transformations.
Recursive optimal state estimator for linear dynamical systems with Gaussian noise.
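A minimal 1-D sketch of the predict/update recursion, assuming an illustrative constant-state model x_t = x_{t-1} observed with Gaussian noise; the parameter names (`R`, `Q`, `P0`) and values are assumptions, not from the source.

```python
# Minimal 1D Kalman filter for a constant hidden state observed with noise.
def kalman_1d(measurements, R=1.0, Q=1e-4, x0=0.0, P0=1000.0):
    x, P = x0, P0                  # state estimate and its variance
    estimates = []
    for z in measurements:
        P = P + Q                  # predict: uncertainty grows by process noise Q
        K = P / (P + R)            # Kalman gain: how much to trust the measurement
        x = x + K * (z - x)        # update the estimate toward the measurement
        P = (1 - K) * P            # shrink uncertainty after incorporating z
        estimates.append(x)
    return estimates

est = kalman_1d([4.9, 5.2, 5.0, 5.1, 4.95])  # estimates settle near 5.0
```

With a large initial variance `P0`, the filter trusts the first measurement almost entirely, then averages subsequent ones with decreasing gain.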
Nonzero vector whose direction is unchanged (only scaled) by a linear transformation.
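The defining property A v = λ v can be checked directly with NumPy; the matrix here is an arbitrary illustrative example.

```python
import numpy as np

# A linear transformation represented as a matrix.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)
v = eigvecs[:, 0]      # an eigenvector (a column of eigvecs)
lam = eigvals[0]       # its eigenvalue
# A @ v equals lam * v: the direction is preserved, only the length scales.
assert np.allclose(A @ v, lam * v)
```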
Covariance normalized by the product of the standard deviations, giving a value in [−1, 1].
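A small sketch of the normalization, with illustrative data; the function name `correlation` is an assumption.

```python
import numpy as np

def correlation(x, y):
    # Normalized covariance: cov(x, y) / (std(x) * std(y)), always in [-1, 1].
    x = x - x.mean()
    y = y - y.mean()
    return (x @ y) / (np.sqrt(x @ x) * np.sqrt(y @ y))

a = np.array([1.0, 2.0, 3.0, 4.0])
b = 2 * a + 1             # perfectly linearly related to a
r = correlation(a, b)     # 1.0 up to floating point
```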
Optimal control for linear systems with quadratic cost.
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
Systematic error introduced by simplifying assumptions in a learning algorithm.
Set of vectors closed under vector addition and scalar multiplication (and therefore containing the zero vector).
Control that remains stable under model uncertainty.
Incremental capability growth.
Inferring the agent’s internal state from noisy sensor data.
Learning a function from labeled input-output pairs, with the goal of accurately predicting outputs for unseen inputs.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
A parameterized mapping from inputs to outputs; includes architecture + learned parameters.
Networks using convolution operations with weight sharing and locality, effective for images and signals.
Local surrogate explanation method approximating model behavior near a specific input.
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
Optimization problems where any local minimum is global.
Gradually increasing the learning rate at the start of training to avoid early divergence.
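A minimal linear-warmup schedule as a sketch; the function name, `base_lr`, and `warmup_steps` values are assumptions for illustration.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=100):
    # Linear warmup: ramp the learning rate from near 0 up to base_lr,
    # then hold it constant (a full schedule would usually decay afterwards).
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

lrs = [warmup_lr(s) for s in range(200)]  # ramps for 100 steps, then flat
```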
Allows the model to attend to information from different representation subspaces simultaneously.
Neural networks that operate on graph-structured data by propagating information along edges.
Extension of convolution to graph domains using adjacency structure.
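One common form of this extension (the GCN-style layer: add self-loops, symmetrically normalize the adjacency, propagate, apply a nonlinearity) can be sketched as follows; the graph, features, and weights are illustrative.

```python
import numpy as np

def gcn_layer(A, X, W):
    # One graph-convolution step: D^{-1/2} (A + I) D^{-1/2} X W, then ReLU.
    A_hat = A + np.eye(A.shape[0])            # adjacency with self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)

# Tiny 3-node path graph 0 - 1 - 2 with 2-dimensional node features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.random.default_rng(0).normal(size=(3, 2))
W = np.eye(2)
H = gcn_layer(A, X, W)  # each node's new features mix in its neighbors'
```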
Controls the amount of noise added at each diffusion step.
Generates audio waveforms from spectrograms.
Sequential data indexed by time.
Models time evolution via hidden states.
Monte Carlo method for state estimation.
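A minimal bootstrap particle filter (propagate, weight by likelihood, resample) for an illustrative 1-D random-walk state observed with Gaussian noise; the model and all parameter values are assumptions, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n=1000, proc_std=0.1, obs_std=0.5):
    particles = rng.normal(0.0, 1.0, n)       # initial belief over the state
    estimates = []
    for z in observations:
        particles = particles + rng.normal(0.0, proc_std, n)   # propagate
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)    # likelihood weights
        w /= w.sum()
        particles = particles[rng.choice(n, size=n, p=w)]      # resample
        estimates.append(particles.mean())    # posterior mean estimate
    return estimates

est = particle_filter([0.9, 1.1, 1.0, 0.95, 1.05])  # concentrates near 1.0
```

Resampling is what keeps the particle set concentrated on high-likelihood regions; without it, most weights collapse toward zero.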