Results for "deep learning"
Deep Learning
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Deep Learning is a type of machine learning that uses structures called neural networks, which are inspired by how the human brain works. Imagine a series of layers where each layer learns to recognize different features of an image, like edges, shapes, and eventually, whole objects. This is how ...
Sharp minimum
A narrow minimum in the loss landscape, often associated with poorer generalization.
Scaling laws
Empirical laws linking model size, data, and compute to performance.
Model watermarking
Embedding signals into a model's weights or outputs to prove ownership.
Embodied AI
AI systems that perceive and act in the physical world through sensors and actuators.
Computer-aided diagnosis
Automated assistance in identifying disease indicators.
Protein structure prediction
Predicting a protein's 3D structure from its amino-acid sequence.
Embedding
A continuous vector encoding of an item (word, image, user) such that semantic similarity corresponds to geometric closeness.
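As a minimal sketch of "semantic similarity corresponds to geometric closeness": embeddings are typically compared by cosine similarity. The 3-d vectors below are illustrative toy values, not output from a real embedding model.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # close to 1 for similar items, near 0 for unrelated ones.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" (hypothetical values for illustration only).
king = np.array([0.9, 0.8, 0.1])
queen = np.array([0.85, 0.75, 0.2])
apple = np.array([0.1, 0.2, 0.9])

sim_kq = cosine_similarity(king, queen)  # semantically related pair
sim_ka = cosine_similarity(king, apple)  # unrelated pair
```

With these values, the related pair scores higher than the unrelated one, which is the property retrieval and recommendation systems rely on.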
Neural network
A parameterized function composed of interconnected units organized in layers with nonlinear activations.
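A minimal sketch of that definition: two linear layers with a nonlinearity in between, using NumPy and randomly initialized weights (the sizes 4, 8, and 3 are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(42)

# Parameters of a 2-layer network: weights and biases per layer.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    h = np.tanh(x @ W1 + b1)  # hidden layer: linear map + nonlinear activation
    return h @ W2 + b2        # output layer: raw scores (logits)

# A batch of 5 inputs, each with 4 features.
logits = forward(rng.normal(size=(5, 4)))
```

Without the `tanh` between the layers, the two matrix multiplications would collapse into one linear map; the nonlinearity is what lets stacked layers learn hierarchical representations.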
Activation function
A nonlinear function enabling networks to approximate complex mappings; ReLU variants dominate modern deep learning.
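A quick sketch of ReLU and one common variant, implemented elementwise in NumPy:

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x) elementwise; negative inputs are zeroed out.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU keeps a small slope for negative inputs,
    # avoiding completely "dead" units.
    return np.where(x > 0, x, alpha * x)

out = relu(np.array([-2.0, 0.0, 3.0]))
```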
Dropout
Randomly zeroing activations during training to reduce co-adaptation and overfitting.
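A sketch of the standard "inverted dropout" formulation: at train time each activation is zeroed with probability p and the survivors are rescaled by 1/(1-p), so the expected activation is unchanged and no rescaling is needed at inference (the fixed seed is only for reproducibility here).

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    # Inference: dropout is a no-op.
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p        # keep each unit with prob 1 - p
    return x * mask / (1.0 - p)            # rescale survivors

acts = np.ones(1000)
dropped = dropout(acts, p=0.5)
```

Roughly half the activations become 0 and the rest become 2.0, preserving the mean in expectation.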
Convolutional neural network (CNN)
A network using convolution operations with weight sharing and locality, effective for images and signals.
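To make "weight sharing and locality" concrete, here is a 1-D convolution sketch: the same small kernel slides across the signal, so each output depends only on a local window and the kernel weights are reused at every position. The kernel values are toy choices for illustration.

```python
import numpy as np

def conv1d_valid(x, w):
    # Slide the shared kernel w over x; each output uses only a
    # local window of the input ("valid" padding, stride 1).
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

edge_detector = np.array([-1.0, 1.0])            # responds to local changes
signal = np.array([0.0, 0.0, 1.0, 1.0, 0.0])     # a step up, then a step down
response = conv1d_valid(signal, edge_detector)
```

The response peaks where the signal changes, which is the 1-D analogue of the edge detectors early CNN layers learn on images.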
Softmax
Converts logits to probabilities by exponentiation and normalization; common in classification and language models.
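The "exponentiation and normalization" step as a minimal NumPy sketch (subtracting the max logit first is the standard trick for numerical stability; it does not change the result):

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)  # stability shift; output is unchanged
    e = np.exp(z)
    return e / e.sum()           # normalize so probabilities sum to 1

probs = softmax(np.array([2.0, 1.0, 0.1]))
```

The largest logit maps to the largest probability, and the outputs always sum to 1.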
Long short-term memory (LSTM)
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
Multimodal models
Models that process or generate multiple modalities, enabling vision-language tasks, speech, and video understanding.
Pruning
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
Object detection
Identifying and localizing objects in images, often with confidence scores and bounding boxes.
Image segmentation
Assigning labels per pixel (semantic segmentation) or per object instance (instance segmentation) to map object boundaries.
Loss landscape
The shape of the loss function over parameter space.
Natural language processing (NLP)
The AI subfield dealing with understanding and generating human language, including syntax, semantics, and pragmatics.
Flat minimum
A wide basin in the loss landscape, often correlated with better generalization.
Automatic speech recognition (ASR)
Converting spoken audio into text, often using encoder-decoder or transducer architectures.
Graph neural network (GNN)
A neural network that operates on graph-structured data by propagating information along edges.
Text-to-speech (TTS)
Generating speech audio from text, with control over prosody, speaker identity, and style.
Multimodal fusion
Combining signals from multiple modalities.
Graph convolution
Extension of convolution to graph domains using adjacency structure.
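One graph-convolution step can be sketched as neighborhood averaging followed by a shared linear map. The sketch below uses a tiny 3-node path graph with one-hot features and an all-ones weight matrix; these are toy values chosen so the propagation is easy to follow.

```python
import numpy as np

# Adjacency matrix of a 3-node path graph: 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

A_hat = A + np.eye(3)                      # add self-loops so each node keeps its own features
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # inverse degree matrix for averaging

X = np.eye(3)                              # one-hot node features
W = np.ones((3, 2))                        # shared weight matrix (same for every node)

# Propagate: average each node's neighborhood, then apply shared weights.
H = D_inv @ A_hat @ X @ W
```

Because `D_inv @ A_hat` is row-stochastic and `W` is all ones, every output entry here is exactly 1.0; with learned `W` and real features, repeated steps mix information from progressively larger neighborhoods.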
Speech synthesis
Generating human-like speech from text.
Graph attention network (GAT)
A GNN using attention to weight neighbor contributions dynamically.
Voice conversion
Changing speaker characteristics while preserving the spoken content.
Acoustic model
Maps audio signals to linguistic units such as phonemes.
Speaker identification
Identifying who is speaking in an audio recording.