Results for "input variable"
Activation function computing max(0, x); improves gradient flow and training speed in deep networks.
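A minimal sketch of the max(0, x) activation above, using numpy (the function name `relu` is the standard term for this activation):

```python
import numpy as np

def relu(x):
    # Elementwise max(0, x): negative inputs are zeroed out,
    # positive inputs pass through unchanged (gradient 1).
    return np.maximum(0, x)

relu(np.array([-2.0, 0.0, 3.0]))  # -> array([0., 0., 3.])
```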
Mechanism that computes context-aware mixtures of representations; scales well and captures long-range dependencies.
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
Attacks that manipulate model instructions (especially via retrieved content) to override system goals or exfiltrate data.
Model that compresses input into a latent space and reconstructs it.
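A sketch of the compress-then-reconstruct idea, under the simplifying assumption of a purely linear model: the best linear autoencoder is given by PCA, with the encoder projecting onto the top principal components and the decoder mapping back (data, names, and the latent width `k` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 8))  # correlated toy data

# Principal components of the centered data give the linear encoder/decoder.
mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 3                                     # latent dimension (bottleneck width)
encode = lambda x: (x - mu) @ Vt[:k].T    # compress: 8 dims -> 3
decode = lambda z: z @ Vt[:k] + mu        # reconstruct: 3 dims -> 8

z = encode(X[0])        # 3-dim latent code
x_hat = decode(z)       # approximate reconstruction of the 8-dim input
```

The reconstruction error equals the variance discarded by the dropped components, so it is always below that of predicting the mean alone.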
Sensitivity of a function to input perturbations.
Small prompt changes cause large output changes.
Fast approximation of costly simulations.
Learning a function from input-output pairs (labeled data), optimizing performance on predicting outputs for unseen inputs.
A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
The numeric values of a model, adjusted during training to minimize a loss function.
A parameterized function composed of interconnected units organized in layers with nonlinear activations.
Methods to set starting weights to preserve signal/gradient scales across layers.
Networks using convolution operations with weight sharing and locality, effective for images and signals.
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
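A minimal numpy sketch of the definition above: queries, keys, and values are all projections of the same sequence, and each output token is a softmax-weighted mixture of value vectors (shapes and weight matrices are illustrative toy sizes):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Q, K, V all come from the same sequence X (hence "self"-attention).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # token-to-token similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)     # softmax over keys
    return w @ V                           # context-aware mixture of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                 # 4 tokens, 8-dim embeddings
W = [rng.normal(size=(8, 8)) for _ in range(3)]
out = self_attention(X, *W)                 # shape (4, 8): one vector per token
```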
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
Studying internal mechanisms or input influence on outputs (e.g., saliency maps, SHAP, attention analysis).
Local surrogate explanation method approximating model behavior near a specific input.
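A simplified sketch of the local-surrogate idea (not the LIME library's API; sampling scale, kernel, and helper names are illustrative assumptions): perturb the input, weight samples by proximity, and fit a weighted linear model whose slopes act as local feature attributions.

```python
import numpy as np

def local_surrogate(f, x0, n=500, scale=0.1, seed=0):
    # Sample perturbations around x0 and query the black-box model f.
    rng = np.random.default_rng(seed)
    Xp = x0 + rng.normal(scale=scale, size=(n, x0.size))
    y = np.array([f(x) for x in Xp])
    # Proximity kernel: samples closer to x0 get larger weight.
    w = np.exp(-np.sum((Xp - x0) ** 2, axis=1) / (2 * scale ** 2))
    A = np.hstack([Xp, np.ones((n, 1))])          # add intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1], coef[-1]                    # local slopes, intercept

# Near x0 = (1, 0), f has gradient (2, 3); the surrogate's slopes recover it.
f = lambda x: x[0] ** 2 + 3 * x[1]
slopes, b = local_surrogate(f, np.array([1.0, 0.0]))
```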
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
A narrow hidden layer forcing compact representations.
Allows the model to attend to information from different representation subspaces simultaneously.
Routes inputs to subsets of parameters for scalable capacity.
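A toy sketch of the routing idea, assuming a simple top-k softmax gate over linear experts (all names, shapes, and the gating scheme are illustrative): only the selected experts run, so parameter count can grow without proportional compute per input.

```python
import numpy as np

def moe_layer(x, gate_W, experts, k=2):
    # Gate scores select the top-k experts for this input.
    logits = gate_W @ x
    top = np.argsort(logits)[-k:]               # indices of the k best experts
    p = np.exp(logits[top] - logits[top].max())
    p /= p.sum()                                # softmax over selected experts
    # Only the chosen experts are evaluated; outputs are mixed by gate weight.
    return sum(pi * experts[i](x) for pi, i in zip(p, top))

rng = np.random.default_rng(0)
# Eight linear "experts"; each captures its own weight matrix.
experts = [lambda x, W=rng.normal(size=(4, 4)): W @ x for _ in range(8)]
x = rng.normal(size=4)
y = moe_layer(x, rng.normal(size=(8, 4)), experts, k=2)
```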
Probabilistic graphical model for structured prediction.
Detects trigger phrases in audio streams.
Control without feedback after execution begins.
Running predictions on large datasets periodically.
Using markers to isolate context segments.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Reusing knowledge from a source task/domain to improve learning on a target task/domain, typically via pretrained models.