Results for "contrastive vision-language"

15 results

Deep Learning Intermediate

A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.

Deep Learning
Multimodal Model Intermediate

Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.

Foundations & Theory
CLIP Intermediate

Joint vision-language model trained contrastively to align images and text in a shared embedding space.

Computer Vision
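As an illustration of the CLIP entry above, here is a minimal sketch of a symmetric contrastive loss over paired image and text embeddings. The toy 2-D embeddings, the function names, and the temperature of 0.07 are assumptions chosen for illustration, not details taken from this glossary.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_row(logits, target):
    """Cross-entropy of one row of logits against the index of the true match."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric loss: image i should match text i, and text i should match image i."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    image_to_text = sum(cross_entropy_row(logits[k], k) for k in range(n)) / n
    text_to_image = sum(
        cross_entropy_row([logits[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

# Hypothetical toy batch: correctly paired vs. deliberately mismatched embeddings.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is low when each image is most similar to its own caption and high when the pairs are mismatched, which is the sense in which the embeddings are "aligned".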
Large Language Model Intermediate

A high-capacity language model trained on massive corpora, exhibiting broad generalization and emergent behaviors.

Large Language Models
Natural Language Instruction Frontier

Controlling robots through natural-language commands.

World Models & Cognition
Adversarial Example Intermediate

Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.

Foundations & Theory
Sensor Advanced

Devices measuring physical quantities (vision, lidar, force, IMU, etc.).

Robotics & Embodied AI
Exteroception Advanced

External sensing of surroundings (vision, audio, lidar).

Robotics & Embodied AI
Artificial Intelligence Intermediate

The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algori...

Foundations & Theory
Log Loss Intermediate

Cross-entropy loss that heavily penalizes confident wrong predictions; standard for classification and language modeling.

Optimization
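To make the Log Loss entry above concrete, the following sketch computes binary log loss and compares a confident wrong prediction with a hesitant one. The function name, the clipping constant `eps`, and the example probabilities are assumptions for illustration.

```python
import math

def log_loss(y_true, p_pred, eps=1e-12):
    """Mean binary log loss; y_true holds 0/1 labels, p_pred holds predicted P(y=1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# The true label is 1 in both cases; only the predicted confidence differs.
confident_wrong = log_loss([1], [0.01])  # model is almost sure the label is 0
hesitant_wrong = log_loss([1], [0.4])    # model leans the wrong way, but mildly
print(confident_wrong, hesitant_wrong)
```

The confident mistake costs roughly five times as much as the hesitant one, because the penalty grows as the negative logarithm of the probability assigned to the true label.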
NLP Intermediate

AI subfield dealing with understanding and generating human language, including syntax, semantics, and pragmatics.

Foundations & Theory
Computer Vision Intermediate

AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.

Computer Vision
Vision Transformer Intermediate

Transformer applied to a sequence of image patches, with each patch treated as a token.

Computer Vision
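The patch step mentioned in the Vision Transformer entry above can be sketched as follows: an image is cut into non-overlapping square patches, and each patch is flattened into one vector that the Transformer would then treat as a token. The function name and the 4x4 single-channel toy image are assumptions; a real model would also apply a learned projection and position embeddings, which are omitted here.

```python
def image_to_patches(image, patch):
    """Split a 2-D image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for r in range(0, height, patch):          # top-left row of each patch
        for c in range(0, width, patch):       # top-left column of each patch
            patches.append([image[r + dr][c + dc]
                            for dr in range(patch)
                            for dc in range(patch)])
    return patches

# Hypothetical 4x4 image whose pixel values are just their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, 2)
print(len(patches), patches[0])
```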
Language Model Intermediate

A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.

Large Language Models
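As a sketch of what "assigns probabilities to sequences of tokens" means in the Language Model entry above, the example below scores a sequence by multiplying next-token probabilities (summing their logs). The bigram table, its probabilities, and the `<s>` start token are hypothetical; a real model would condition on the full preceding context rather than only the previous token.

```python
import math

# Hypothetical next-token probabilities P(token | previous token).
BIGRAM = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("cat", "sat"): 0.1,
}

def sequence_log_prob(tokens, table, start="<s>"):
    """Log-probability of a sequence as the sum of next-token log-probabilities."""
    prev = start
    log_p = 0.0
    for tok in tokens:
        log_p += math.log(table[(prev, tok)])
        prev = tok
    return log_p

log_p = sequence_log_prob(["the", "cat", "sat"], BIGRAM)
print(math.exp(log_p))  # product of the three conditional probabilities
```

Next-token prediction training adjusts the model so that this log-probability is as high as possible on the training text.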
Masked Language Model Intermediate

Predicts masked tokens in a sequence, enabling bidirectional context; often used for embeddings rather than generation.

Foundations & Theory