Computer Vision

Oct 3, 2021

An Introduction to the Technology

In today’s world, all of us use this technology in one way or another in our day-to-day lives. In this session, I’ll give you a basic introduction to computer vision: how it differs from image processing, why it is so difficult to implement, and which applications of computer vision are changing the world around us.

The content we are covering today is:

What is Computer Vision?
Computer Vision vs Image Processing
Importance of Computer Vision
Challenges of Computer Vision
Applications of Computer Vision
Tools used in Computer Vision
Future Scope of Computer Vision

Computer Vision, often abbreviated as CV, is defined as a field of study that seeks to develop techniques to help computers “see” and understand the content of digital images such as photographs and videos.

Human vision is amazingly complex, the product of millions of years of evolution. Humans have eyes to capture light, receptors to access it, and a visual cortex to process the visual content. Today we are able to mimic the human eye with cameras, but it turns out that is the easy part. Understanding what is in the photo is much more difficult.

Consider this picture: my human brain can look at it and immediately know it’s a flower. Our brains are cheating, since we have millions of years’ worth of evolutionary context to immediately understand what it is. Computers don’t have that advantage. To a computer, the image looks like this:

Just a massive array of integer values that represent intensities across the colour spectrum. There is no context here, just a massive pile of data.
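To make the "array of integers" idea concrete, here is a minimal sketch in plain Python. The tiny 2x3 image and the helper names `shape` and `flatten` are made up for illustration; real images are far larger and are usually loaded with an imaging library rather than typed by hand.

```python
# A hypothetical 2x3 colour image: each pixel is (red, green, blue),
# and each channel is an integer intensity from 0 to 255.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]


def shape(img):
    """Return (height, width, channels) of a nested-list image."""
    return len(img), len(img[0]), len(img[0][0])


def flatten(img):
    """Return every channel value as one flat list, row by row."""
    return [value for row in img for pixel in row for value in pixel]


print(shape(image))         # (2, 3, 3)
print(flatten(image)[:6])   # [255, 0, 0, 0, 255, 0]
```

Nothing in these numbers says "flower" or "car"; that meaning has to be learned.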

It turns out that context is the crux of getting algorithms to understand images the way the human brain does. To make this work, we use machine learning, with algorithms that operate in a way loosely similar to the brain.

Machine learning allows us to effectively train the context for a data set, so that an algorithm can understand what all those numbers, in a specific arrangement, actually represent.

How does computer vision work?

In a nutshell, deep learning algorithms are trained on annotated data: humans draw shapes around specific classes (“car,” “human,” “dog”) in every image, and neural networks are trained on those annotations. The trained image recognition algorithm is then able to find and return those classes in new images.
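The idea of learning from human-labelled examples can be sketched with a single artificial neuron (a perceptron). This is a deliberately tiny stand-in, not a deep network: the 2x2 "images", the "bright"/"dark" labels, and the function names are all assumptions made for illustration.

```python
# Hypothetical annotated dataset: flattened 2x2 grayscale images (0.0-1.0),
# each paired with a human-provided label: 1 = "bright", 0 = "dark".
TRAINING_DATA = [
    ([0.9, 0.8, 1.0, 0.9], 1),
    ([0.8, 1.0, 0.7, 0.9], 1),
    ([0.1, 0.0, 0.2, 0.1], 0),
    ([0.2, 0.1, 0.0, 0.3], 0),
]


def predict(weights, bias, pixels):
    """Return 1 if the weighted sum of the pixels is positive, else 0."""
    activation = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1 if activation > 0 else 0


def train(data, epochs=10, learning_rate=0.1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in data:
            error = label - predict(weights, bias, pixels)
            weights = [w + learning_rate * error * p
                       for w, p in zip(weights, pixels)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)

# Classify two images the model has never seen.
print(predict(weights, bias, [0.95, 0.9, 0.85, 1.0]))  # 1 ("bright")
print(predict(weights, bias, [0.05, 0.1, 0.0, 0.15]))  # 0 ("dark")
```

Real systems follow the same loop of predicting, comparing against the annotation, and adjusting, but with millions of parameters and far richer labels.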

The most popular image recognition algorithms are pre-trained and benchmarked on massive public datasets of already annotated images. Manual image annotation, that is, labelling photos or video frames by hand, is needed for specific use cases or to retrain an algorithm to further increase its detection accuracy.

Computer Vision and Image Processing

Computer vision is distinct from image processing. Image processing is the process of creating a new image from an existing image, typically simplifying or enhancing the content in some way. It is a type of digital signal processing and is not concerned with understanding the content of an image. A given computer vision system may require image processing to be applied to raw input, e.g. pre-processing images.

Examples of image processing include:

Normalizing photometric properties of the image, such as brightness or colour.
Cropping the bounds of the image, such as centring an object in a photograph.
Removing digital noise from an image, such as digital artefacts from low light levels.
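The three operations above can be sketched in plain Python on a small grayscale image. This is only an illustration under simple assumptions: the function names and the 4x4 sample are invented here, contrast stretching stands in for brightness normalization, and a 3x3 median filter stands in for noise removal. Each function returns a new image and makes no attempt to understand its content.

```python
from statistics import median_low


def normalize(img):
    """Stretch intensities linearly so they span the full 0-255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in img]


def crop(img, top, left, height, width):
    """Return the sub-image starting at (top, left) with the given size."""
    return [row[left:left + width] for row in img[top:top + height]]


def median_denoise(img):
    """Replace each pixel with the median of its in-bounds 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    return [
        [
            median_low([img[j][i]
                        for j in range(max(0, y - 1), min(h, y + 2))
                        for i in range(max(0, x - 1), min(w, x + 2))])
            for x in range(w)
        ]
        for y in range(h)
    ]


# Hypothetical 4x4 grayscale image with one noisy pixel (255) at row 1, col 1.
gray = [
    [50, 50, 50, 50],
    [50, 255, 50, 50],
    [50, 50, 100, 100],
    [50, 50, 100, 100],
]

print(median_denoise(gray)[1][1])  # 50: the spike is smoothed away
print(crop(gray, 2, 2, 2, 2))      # [[100, 100], [100, 100]]
print(normalize(gray)[0][0])       # 0: the darkest value maps to 0
```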

Why is computer vision important?

Technologically, computer vision is one of the most advanced fields in the modern artificial intelligence space. This is expected to translate into enormous commercial value, peaking over the next 5 to 10 years. The computer vision market is projected to reach 27bn by 2028.

Even today, computer vision enables applications across every industry, from agriculture to retail and from insurance to construction. It will be applied to a rapidly growing range of industry-specific use cases to automate products and services.

Why is computer vision development difficult?

One reason is that we don’t have a strong grasp of how human vision works. Studying biological vision requires an understanding of the perception organs like the eyes, as well as the interpretation of the perception within the brain. Much progress has been made, both in charting the process and in terms of discovering the tricks and shortcuts used by the system, although like any study that involves the brain, there is a long way to go.

Another reason it is such a challenging problem is the complexity inherent in the visual world. A given object may be seen from any orientation, in any lighting conditions, with any type of occlusion from other objects, and so on. A true vision system must be able to “see” in any of an infinite number of scenes and still extract something meaningful.

Applications

OCR: Optical character recognition (OCR) technology is a business solution for automating data extraction from printed or written text from a scanned document or image file and then converting the text into a machine-readable form to be used for data processing like editing or searching.

Medical-Imaging: Computer vision techniques have shown great promise in surgery and in the therapy of some diseases. Recently, three-dimensional (3D) modelling and rapid prototyping technologies have driven the development of medical imaging modalities, such as CT and MRI.

Biometrics: Biometric systems play an important role in identifying a person, thus contributing to global security. There are many possible biometrics, for example height, DNA, and handwriting, but computer vision-based biometrics have found an important place in the domain of human identification. These include identification by face, fingerprint, iris, etc., which are used to build efficient authentication systems.
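One common way such an authentication check is built is to compare numeric feature vectors extracted from two images. The sketch below assumes some upstream model has already turned each face, fingerprint, or iris image into a vector; the vectors, the 0.9 threshold, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_same_person(enrolled, probe, threshold=0.9):
    """Accept the probe if it is similar enough to the enrolled template."""
    return cosine_similarity(enrolled, probe) >= threshold


# Hypothetical feature vectors; a real system would compute these from images.
enrolled = [0.2, 0.8, 0.1, 0.5]
probe_same = [0.25, 0.75, 0.1, 0.55]   # a new capture of the same person
probe_other = [0.9, 0.1, 0.7, 0.0]     # a different person

print(is_same_person(enrolled, probe_same))   # True
print(is_same_person(enrolled, probe_other))  # False
```

The threshold trades false accepts against false rejects, so in practice it would be tuned on real data rather than fixed in advance.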

3D-modelling: 3D reconstruction is the process of capturing the shape and appearance of real objects. This process can be accomplished by either active or passive methods. If the model is allowed to change its shape over time, this is referred to as non-rigid or spatio-temporal reconstruction.

Object-Recognition: Object recognition is a computer vision technique for identifying objects in images or videos. Object recognition is a key output of deep learning and machine learning algorithms. When humans look at a photograph or watch a video, we can readily spot people, objects, scenes, and visual details.

Tools

Future Scope

With further research on and refinement of the technology, the future of computer vision will see it perform a broader range of functions. Not only will computer vision technologies be easier to train, but they will also be able to discern more from images than they do now.

Computer vision will play a vital role in the development of artificial general intelligence (AGI) and artificial superintelligence (ASI) by giving them the ability to process information as well as or even better than the human visual system.
The technology is also expected to scale in the fields of manufacturing and agriculture.

What We Discussed

Gentle introduction to the field of computer vision.
The goal of the field of computer vision and its distinctness from image processing.
What makes the problem of computer vision difficult.
Typical problems or tasks pursued in computer vision.