Computer Vision Resource Collection

A curated collection of resources by Reshma Abraham of our Education team

Introduction

Computer vision is a sub-field of artificial intelligence that focuses on enabling computers to understand and interpret visual information from the real world in the form of images and videos and make decisions or predictions based on that understanding. In other words, computer vision is all about teaching computers to 'see' and understand the world, much like how humans do. The goal of computer vision is to replicate human vision using machine learning and other techniques. To do this, it uses methods from mathematics, physics, computer science, and more.

Let's say you're looking at a photo of a park. As a human, you can easily identify things like trees, people, the sky, maybe a bench or a walking path. You can also understand more complex things: you can tell if it's day or night, you can guess the weather, you can tell whether the people in the park are adults or children, whether they're playing a game or just sitting, and so on. But for a computer, this photo is just a grid of numbers. Each pixel in the photo has a number (or a few numbers) representing its colour, and that's all a computer sees initially. Computer vision is the field that tries to teach computers to understand these numbers the way we understand the park scene.
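To make the "photo as numbers" idea concrete, here is a minimal sketch in plain Python. The tiny 3x2 image and its pixel values are made up for illustration, and the `brightness` helper is a hypothetical name, not part of any library.

```python
# A tiny 3x2 "photo": each pixel is an (R, G, B) triple of values from 0 to 255.
# The values below are invented for illustration only.
image = [
    [(135, 206, 235), (135, 206, 235), (255, 255, 255)],  # sky, sky, cloud
    [(34, 139, 34), (34, 139, 34), (139, 69, 19)],        # grass, grass, bench
]


def brightness(pixel):
    """Average of the three colour channels, from 0 (black) to 255 (white)."""
    r, g, b = pixel
    return (r + g + b) / 3


height = len(image)
width = len(image[0])

# Without any "understanding", all the computer can do is arithmetic on numbers.
mean_brightness = sum(brightness(p) for row in image for p in row) / (height * width)

print(f"{width}x{height} pixels, mean brightness {mean_brightness:.1f}")
```

Everything a computer vision system infers about the scene (trees, people, weather) has to be derived from grids of numbers like this one, only much larger.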

Objectives

The field of computer vision has many applications, including autonomous vehicles, image editing, health diagnostics, and facial recognition. It's an active area of research that continues to advance to tackle more complex and diverse visual understanding tasks. With that in mind, this collection has three objectives:

  • Demystify Core Concepts: The primary objective of this computer vision collection is to simplify and elucidate core concepts and principles, helping practitioners to gain a fundamental understanding of computer vision technology, its theoretical foundations, and various sub-fields.

  • Develop Practical Skills: The resources aim to equip AI practitioners with the necessary practical skills to implement and leverage computer vision technology. This includes hands-on coding exercises, detailed walk-throughs of popular computer vision algorithms, and training on specific tools and libraries like OpenCV, TensorFlow, and PyTorch.

  • Encourage Innovation: The collection of resources seeks to inspire AI practitioners to innovate and push the boundaries of what's possible with computer vision. This includes resources on cutting-edge research, case studies of innovative uses of computer vision, and challenges to encourage experimentation.

Who is this for?

This collection of resources on computer vision is designed to cater to a broad spectrum of individuals with diverse needs and interests. Here's a look at who might find this information especially beneficial:

  • Beginners in AI and Machine Learning: If you're new to the field of Artificial Intelligence (AI) and Machine Learning (ML) and are keen to understand how machines can be trained to 'see' and interpret images and videos, these resources will provide an excellent foundation.

  • Software Developers and Engineers: Professionals who are interested in expanding their skill set to include computer vision or are tasked with developing applications that involve image or video processing will find the information valuable.

  • AI Enthusiasts and Hobbyists: If you're intrigued by the capabilities of AI and want to understand more about how machines interpret visual data, these resources will satisfy your curiosity and may inspire you to delve deeper.

  • Companies and Industry Professionals: For organisations and professionals working in industries where computer vision has applications - such as autonomous vehicles, healthcare, security, and retail - these resources can help in understanding the potential benefits and challenges of implementing computer vision technologies.

In essence, anyone with an interest in understanding and applying computer vision, whether for personal knowledge, academic pursuits, or professional growth, will find this collection invaluable.

Resources

(please note that each resource contains a link external to the Diverse AI website that will open in a new window)

General Certified Courses

Image Classification 

This is the task of categorising an image into one of many different categories. For example, identifying whether a photo is of a dog, cat, or car.
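As a rough illustration of what "categorising an image" means in code, the sketch below assigns an image to the category whose typical colour is closest to the image's average colour. This is a deliberately simple nearest-prototype rule, not how modern classifiers work (those usually learn their features from data). The categories, prototype colours, and function names are all assumptions made up for this example.

```python
import math

# Hypothetical class prototypes: a typical average (R, G, B) colour per category.
PROTOTYPES = {
    "sky": (135, 206, 235),
    "grass": (34, 139, 34),
    "sand": (237, 201, 175),
}


def mean_colour(image):
    """Average (R, G, B) over every pixel of a nested-list image."""
    pixels = [p for row in image for p in row]
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def classify(image):
    """Return the label whose prototype colour is closest to the image's mean colour."""
    feature = mean_colour(image)
    return min(PROTOTYPES, key=lambda label: math.dist(feature, PROTOTYPES[label]))


# A 2x2 image that is mostly green with one bluish pixel.
mostly_green = [
    [(30, 140, 40), (40, 130, 30)],
    [(35, 150, 35), (120, 200, 230)],
]
print(classify(mostly_green))
```

The structure is the same in real systems: turn the image into a feature, then pick exactly one label from a fixed set. Only the feature extraction and the decision rule become far more sophisticated.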

Papers and articles and tutorials

Courses

Activities and exercises

Object Detection 

This task involves identifying the presence, location, and type of one or more objects in an image.
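A detector's output is commonly represented as a list of labelled bounding boxes, and one widely used way to judge whether a predicted location is good is intersection over union (IoU): the overlap between the predicted box and the true box divided by the area they cover together. The sketch below assumes boxes in `(x_min, y_min, x_max, y_max)` form; the example detections and the ground-truth box are invented for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical detector output: what (label), where (box), and how sure (score).
detections = [
    {"label": "person", "box": (10, 20, 50, 100), "score": 0.92},
    {"label": "bench", "box": (60, 70, 120, 100), "score": 0.81},
]

# Hypothetical hand-annotated box for the person in the same image.
ground_truth_person = (12, 22, 52, 98)
print(iou(detections[0]["box"], ground_truth_person))
```

An IoU of 1.0 means the boxes coincide exactly and 0.0 means they do not overlap at all. Evaluation protocols typically count a detection as correct when its IoU with a ground-truth box exceeds a chosen threshold.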

Papers and articles and tutorials

Courses

Activities and exercises

Image Segmentation 

Segmentation is the task of partitioning an image into multiple segments or "superpixels". Semantic segmentation assigns a class to each pixel in the image, while instance segmentation identifies each distinct object of each class.
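The difference between semantic and instance segmentation can be sketched with two small masks. In the example below, the semantic mask stores one class id per pixel, and a simple flood fill then gives each connected blob of that class its own instance id. This is a simplification: connected components cannot separate objects that touch, which real instance segmentation models are designed to do. The mask values and the function name are assumptions for illustration.

```python
from collections import deque

# Semantic mask: one class id per pixel (0 = background, 1 = "tree").
# Both trees share the same class id, so they are indistinguishable here.
semantic = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]


def instances_from_semantic(mask, class_id):
    """Give each 4-connected blob of class_id its own instance id (1, 2, ...)."""
    h, w = len(mask), len(mask[0])
    instance = [[0] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] != class_id or instance[y][x]:
                continue  # wrong class, or already assigned to an instance
            next_id += 1
            instance[y][x] = next_id
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill over neighbouring pixels
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and mask[ny][nx] == class_id and not instance[ny][nx]:
                        instance[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance, next_id


instance_mask, count = instances_from_semantic(semantic, class_id=1)
for row in instance_mask:
    print(row)
print("tree instances:", count)
```

The semantic mask answers "which pixels are tree?", while the instance mask also answers "which tree does each pixel belong to?".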

Papers and articles and tutorials

Activities and exercises

Facial Recognition 

A task involving the identification or verification of a person's identity based on their face.
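Identification ("who is this?") and verification ("is this the person they claim to be?") can be contrasted in a short sketch. It assumes each face has already been converted into a numeric vector (an embedding) by some model, which is a common approach but is not shown here. The gallery vectors, the threshold of 0.9, and all function names are hypothetical values chosen for illustration.

```python
import math

# Hypothetical enrolled people and their face embeddings (made-up 3-D vectors;
# real embeddings typically have many more dimensions).
GALLERY = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}
THRESHOLD = 0.9  # assumed similarity cut-off; real systems tune this on data


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(embedding, claimed_name):
    """One-to-one check: does the face match the claimed identity?"""
    return cosine_similarity(embedding, GALLERY[claimed_name]) >= THRESHOLD


def identify(embedding):
    """One-to-many search: return the best-matching name, or None if nobody is close."""
    best = max(GALLERY, key=lambda name: cosine_similarity(embedding, GALLERY[name]))
    return best if cosine_similarity(embedding, GALLERY[best]) >= THRESHOLD else None


probe = [0.85, 0.15, 0.35]  # hypothetical embedding of a new photo
print(verify(probe, "alice"), verify(probe, "bob"), identify(probe))
```

Verification compares the probe against a single enrolled identity, whereas identification searches the whole gallery and must also be able to answer "nobody known".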

Papers and articles and tutorials

Courses

Activities and exercises

Action Recognition 

This task involves identifying a human action (such as running or jumping) in an image or video.

Papers and articles and tutorials

Activities and exercises

Optical Character Recognition (OCR)

This task involves converting images of typed, handwritten, or printed text into machine-encoded text. In other words, it is about understanding the text in images.

Papers and articles and tutorials

Activities and exercises

Image Generation: Generative Adversarial Networks

This task focuses on creating new images that were not part of the original dataset but still closely resemble some given real data. Applications include image synthesis, super-resolution, and style transfer.

Papers and articles and tutorials

Activities and exercises