Core Concepts

Computer Vision

The field of AI focused on enabling machines to interpret and understand images and video.

Computer vision is the branch of AI that teaches machines to analyze visual data such as images, video, medical scans, and camera feeds. Tasks include object detection, image classification, segmentation, OCR, tracking, and scene understanding.

Its applications span healthcare, manufacturing, autonomous vehicles, retail, security, robotics, and consumer apps. Modern computer vision systems are built on deep neural networks, chiefly convolutional neural networks (CNNs) and vision transformers.
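To make the convolutional idea concrete, here is a minimal sketch of the sliding-window operation at the core of convolutional models, written in plain Python. The function name conv2d, the toy image, and the hand-written kernel are illustrative assumptions; real networks learn their kernel values from data and use optimized library routines rather than nested loops.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1.

    This is the cross-correlation form that deep learning libraries
    usually call "convolution". Both arguments are lists of rows.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Toy image (assumed data): dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written kernel (assumed): right neighbor minus left neighbor,
# so it responds where brightness increases from left to right.
edge_kernel = [[-1, 1]]

feature_map = conv2d(image, edge_kernel)
print(feature_map)  # [[0, 9, 0], [0, 9, 0], [0, 9, 0]]
```

The nonzero column in the result marks the position of the edge. A convolutional network stacks many such filters and learns their weights instead of fixing them by hand.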

Goal: turn raw pixels into structured, usable information, whether that means recognizing a face, reading a document, or analyzing a live video stream.

Common Computer Vision Tasks

  • Classification — assign a label to an image
  • Detection — find and locate objects within an image
  • Segmentation — label regions or pixels precisely
  • OCR — extract text from images and scans
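The first three tasks above differ mainly in the shape of their output. The sketch below illustrates this on a toy 4x4 grayscale image, using simple brightness thresholding as a stand-in for a trained model. The image data, the THRESHOLD value, and the function names are all assumptions made for illustration; production systems use learned models, and OCR is omitted because it does not reduce to a few lines.

```python
# Toy 4x4 grayscale image (values 0-255) containing one bright 2x2 object.
IMAGE = [
    [0,   0,   0, 0],
    [0, 200, 220, 0],
    [0, 210, 230, 0],
    [0,   0,   0, 0],
]

THRESHOLD = 128  # assumed brightness cutoff separating object from background


def segment(image, threshold=THRESHOLD):
    """Segmentation: one label per pixel (1 = object, 0 = background)."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def detect(image, threshold=THRESHOLD):
    """Detection: one bounding box (x_min, y_min, x_max, y_max), or None."""
    coords = [
        (x, y)
        for y, row in enumerate(image)
        for x, px in enumerate(row)
        if px > threshold
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def classify(image, threshold=THRESHOLD):
    """Classification: a single label for the whole image."""
    return "object" if detect(image, threshold) is not None else "empty"


print(classify(IMAGE))  # object
print(detect(IMAGE))    # (1, 1, 2, 2)
print(segment(IMAGE))   # [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

Classification returns one label per image, detection returns coordinates for each object found, and segmentation returns a mask with the same dimensions as the input.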

Computer vision is increasingly blending with language models to create multimodal AI systems that can both see and explain what they see.
