Results for "deep learning"
Deep Learning
Intermediate
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Deep learning is a type of machine learning that uses neural networks, structures loosely inspired by how the human brain works. Imagine a series of layers, each of which learns to recognize different features of an image: first edges, then shapes, and eventually whole objects. This is how ...
AI supporting legal research, drafting, and analysis.
AI-assisted review of legal documents.
AI selecting next experiments.
AI tacitly coordinating prices.
Rate at which AI capabilities improve.
Research ensuring AI remains safe.
A conceptual framework describing error as the sum of systematic error (bias) and sensitivity to data (variance).
Updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.
Measures the ability of a model class to fit random noise; used to bound generalization error.
The learned numeric values of a model adjusted during training to minimize a loss function.
A scalar measure optimized during training, typically expected loss over data, sometimes with regularization terms.
Minimizing average loss on training data; can overfit when data is limited or biased.
When a model fits noise/idiosyncrasies of training data and performs poorly on unseen data.
When a model cannot capture underlying structure, performing poorly on both training and test data.
How well a model performs on new data drawn from the same (or similar) distribution as training.
Separating data into training (fit), validation (tune), and test (final estimate) to avoid leakage and optimism bias.
An evaluation technique that trains and evaluates across multiple splits to estimate both performance and its variability.
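As a rough illustration of evaluating across multiple splits, the sketch below generates k-fold train/validation index sets in plain Python. The function name `k_fold_indices` and the contiguous, unshuffled fold layout are assumptions made for this example; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds (hypothetical helper)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)  # the first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Each sample lands in exactly one validation fold.
folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print(len(train_idx), len(val_idx))
```

Training one model per fold and looking at the spread of the validation scores gives the variability estimate the entry refers to.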
A table summarizing classification outcomes, foundational for metrics like precision, recall, specificity.
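A minimal sketch of how such a table feeds the metrics named above, assuming binary labels encoded as 0/1. The helper names and the toy labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


# Toy data, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision = safe_div(tp, tp + fp)    # of predicted positives, how many are right
recall = safe_div(tp, tp + fn)       # of actual positives, how many are found
specificity = safe_div(tn, tn + fp)  # of actual negatives, how many are found
print(tp, fp, tn, fn, precision, recall, specificity)
```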
Scalar summary of the ROC curve (the area under it); measures ranking ability, not calibration.
Plots true positive rate vs false positive rate across thresholds; summarizes separability.
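One way to see why the area under the ROC curve reflects ranking rather than calibration: it equals the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half. The sketch below computes it by direct pairwise comparison, which is quadratic in the number of examples and meant only for illustration; `roc_auc` and the toy scores are assumptions of this example.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, invented for this example.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

# Rescaling the scores keeps their order, so the AUC is unchanged
# even though the values no longer resemble the original probabilities.
auc_rescaled = roc_auc(labels, [s / 10 for s in scores])
print(auc, auc_rescaled)
```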
A proper scoring rule measuring squared error of predicted probabilities for binary outcomes.
Penalizes confident wrong predictions heavily; standard for classification and language modeling.
Average of squared residuals; common regression objective.
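The three losses described above can be compared side by side. The sketch below assumes binary labels in {0, 1} for the two probabilistic losses and uses natural logarithms; the function names and the clipping constant `eps` are choices made for this example.

```python
import math


def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def log_loss(y_true, probs, eps=1e-12):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of squared residuals for real-valued targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# A confident wrong prediction: the true label is 1 but the model says 0.01.
confident_wrong_brier = brier_score([1], [0.01])
confident_wrong_log = log_loss([1], [0.01])
regression_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(confident_wrong_brier, confident_wrong_log, regression_mse)
```

The Brier score of a single prediction can never exceed 1, while cross-entropy grows without bound as the predicted probability of the true class approaches 0, which is the heavy penalty for confident errors.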
Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
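A minimal sketch of this idea on a one-dimensional problem, assuming the exponential-moving-average form of the update (`v = beta * v + (1 - beta) * g`). Other formulations omit the `(1 - beta)` factor; the function name, hyperparameters, and test objective here are illustrative choices, not a reference implementation.

```python
def sgd_momentum(grad_fn, w0, lr=0.1, beta=0.9, steps=300):
    """Gradient descent where each step follows an EMA of past gradients."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        grad = grad_fn(w)
        # Exponential moving average of gradients: recent gradients weigh more.
        velocity = beta * velocity + (1.0 - beta) * grad
        w -= lr * velocity
    return w


# Minimize f(w) = w**2, whose gradient is 2*w; the minimum is at w = 0.
w_final = sgd_momentum(lambda w: 2.0 * w, w0=5.0)
print(w_final)
```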
One complete traversal of the training dataset during training.
Halting training when validation performance stops improving to reduce overfitting.
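The stopping rule can be sketched as a patience counter over a history of validation losses. The names `patience` and `min_delta` follow common usage but are assumptions of this example, as is the made-up loss history.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) given per-epoch validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Invented validation losses: improvement stalls after epoch 2.
history = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=2)
print(stop_epoch, best_epoch)
```

With `patience=2` the loop halts at epoch 4 and never sees the later improvement at epoch 6, which shows the trade-off in choosing the patience value. Implementations typically restore the weights saved at `best_epoch`.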
Methods to set starting weights to preserve signal/gradient scales across layers.
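As one example of such a method, Glorot (Xavier) uniform initialization draws each weight from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), which gives a weight variance of 2 / (fan_in + fan_out). The sketch below is one scheme among several, not the only option; the function name and layer sizes are illustrative.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


rng = random.Random(0)  # seeded for reproducibility
weights = glorot_uniform(64, 32, rng)
limit = math.sqrt(6.0 / (64 + 32))  # 0.25 for this layer shape
print(len(weights), len(weights[0]), limit)
```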
Networks with recurrent connections for sequences; largely supplanted by Transformers for many tasks.
Converting text into discrete units (tokens) for modeling; subword tokenizers balance vocabulary size and coverage.
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
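The self-attention step at the core of this architecture can be sketched as scaled dot-product attention: each query is compared against every key, the scores are normalized with a softmax, and the result is a weighted average of the values. This simplified example leaves out the learned query/key/value projections, multiple heads, and masking; all names and inputs are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a softmax-weighted average of the values."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# With identical keys every position gets equal weight,
# so the output is simply the mean of the values.
queries = [[1.0, 0.0]]
keys = [[1.0, 1.0], [1.0, 1.0]]
values = [[0.0, 2.0], [4.0, 6.0]]
attended = scaled_dot_product_attention(queries, keys, values)
print(attended)
```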