Results for "contrastive vision-language"

AdvertisementAd space — search-top

99 results

CLIP Intermediate

Joint vision-language model aligning images and text.

Computer Vision
Multimodal Model Intermediate

Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.

Foundations & Theory
Computer Vision Intermediate

AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.

Computer Vision
Boltzmann Machine Intermediate

Probabilistic energy-based neural network with hidden variables.

Model Architectures
Restricted Boltzmann Machine Intermediate

Simplified Boltzmann Machine with bipartite structure.

Model Architectures
Language Model Intermediate

A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.

Large Language Models
Exteroception Advanced

External sensing of surroundings (vision, audio, lidar).

Robotics & Embodied AI
Deep Learning Intermediate

A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.

Deep Learning
Vision Transformer Intermediate

Transformer applied to image patches.

Computer Vision
Natural Language Instruction Frontier

Controlling robots via language.

World Models & Cognition
NLP Intermediate

AI subfield dealing with understanding and generating human language, including syntax, semantics, and pragmatics.

Foundations & Theory
Object Detection Intermediate

Identifying and localizing objects in images, often with confidence scores and bounding rectangles.

Computer Vision
Segmentation Intermediate

Assigning labels per pixel (semantic) or per instance (instance segmentation) to map object boundaries.

Computer Vision
Large Language Model Intermediate

A high-capacity language model trained on massive corpora, exhibiting broad generalization and emergent behaviors.

Large Language Models
Transfer Learning Intermediate

Reusing knowledge from a source task/domain to improve learning on a target task/domain, typically via pretrained models.

Machine Learning
Data Augmentation Intermediate

Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.

Foundations & Theory
Conditional Random Field Intermediate

Probabilistic graphical model for structured prediction.

Model Architectures
Narrow AI Frontier

AI limited to specific domains.

AGI & General Intelligence
Vocabulary Intermediate

The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.

Transformers & LLMs
Masked Language Model Intermediate

Predicts masked tokens in a sequence, enabling bidirectional context; often used for embeddings rather than generation.

Foundations & Theory
Prompt Intermediate

The text (and possibly other modalities) given to an LLM to condition its output behavior.

Prompting & Instructions
Prompt Engineering Intermediate

Crafting prompts to elicit desired behavior, often using role, structure, constraints, and examples.

Prompting & Instructions
Speech Recognition Intermediate

Converting audio speech into text, often using encoder-decoder or transducer architectures.

Speech & Audio AI
Convolutional Neural Network Intermediate

Networks using convolution operations with weight sharing and locality, effective for images and signals.

Neural Networks Computer Vision
Adversarial Example Intermediate

Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.

Foundations & Theory
Image Classification Intermediate

Assigning category labels to images.

Computer Vision
Instance Segmentation Intermediate

Pixel-level separation of individual object instances.

Computer Vision
Semantic Segmentation Intermediate

Pixel-wise classification of image regions.

Computer Vision
Optical Flow Intermediate

Pixel motion estimation between frames.

Computer Vision
SLAM Intermediate

Simultaneous Localization and Mapping for robotics.

Computer Vision

Welcome to AI Glossary

The free, self-building AI dictionary. Help us keep it free—click an ad once in a while!

Search

Type any question or keyword into the search bar at the top.

Browse

Tap a letter in the A–Z bar to browse terms alphabetically, or filter by domain, industry, or difficulty level.

3D WordGraph

Fly around the interactive 3D graph to explore how AI concepts connect. Click any word to read its full definition.