Results for "vision"

Computer Vision

Intermediate

AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.

Computer vision is like giving a computer the ability to see and understand pictures and videos, similar to how humans do. Just as you can recognize your friend's face in a crowd or identify a cat in a photo, computer vision allows machines to perform these tasks. It uses special algorithms and m...

26 results

Computer Vision Intermediate

AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.

Computer Vision

Multimodal Model Intermediate

Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.

Foundations & Theory

Exteroception Advanced

External sensing of surroundings (vision, audio, lidar).

Robotics & Embodied AI

Object Detection Intermediate

Identifying and localizing objects in images, typically by predicting a class label, a confidence score, and a bounding box for each object.

Computer Vision

Deep Learning Intermediate

A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.

Vision Transformer Intermediate

Transformer architecture applied to images by splitting them into fixed-size patches and treating the patches as a sequence of tokens.

Computer Vision

Segmentation Intermediate

Assigning a label to every pixel, either by class (semantic segmentation) or by individual object (instance segmentation), to delineate object boundaries.

Computer Vision

Transfer Learning Intermediate

Reusing knowledge from a source task/domain to improve learning on a target task/domain, typically via pretrained models.

Machine Learning

Convolutional Neural Network Intermediate

Networks using convolution operations with weight sharing and locality, effective for images and signals.

Neural Networks, Computer Vision

Data Augmentation Intermediate

Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.

Foundations & Theory

Adversarial Example Intermediate

Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.

Foundations & Theory

Conditional Random Field Intermediate

Probabilistic graphical model for structured prediction.

Model Architectures

Image Classification Intermediate

Assigning category labels to images.

Computer Vision

Instance Segmentation Intermediate

Pixel-level separation of individual object instances.

Computer Vision

Semantic Segmentation Intermediate

Pixel-wise classification of image regions.

Computer Vision

CLIP Intermediate

Joint vision-language model aligning images and text.

Computer Vision

Optical Flow Intermediate

Pixel motion estimation between frames.

Computer Vision

SLAM Intermediate

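Simultaneous Localization and Mapping: building a map of an unknown environment while tracking the agent's position within it, widely used in robotics.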

Computer Vision

3D Reconstruction Intermediate

Recovering 3D structure from images.

Computer Vision

Particle Filter Intermediate

Sequential Monte Carlo method that estimates a system's state from noisy observations using a set of weighted samples (particles).

Sensor Advanced

Devices measuring physical quantities (vision, lidar, force, IMU, etc.).

Robotics & Embodied AI

Perception Stack Advanced

Software pipeline converting raw sensor data into structured representations.

Robotics & Embodied AI

Synthetic Sensors Advanced

Artificial sensor data generated in simulation.

Simulation & Sim-to-Real

Affordance Frontier

Perceived actions an environment allows.

World Models & Cognition

Gesture Recognition Frontier

Interpreting human gestures.

World Models & Cognition

Narrow AI Frontier

AI limited to specific domains.

AGI & General Intelligence