An Intro to KerasCV: AI with vision baked in

Aug 18, 2023
Andrew O’Shei

In order to advance research in computer vision and optimize CPU performance for vision-related tasks, Intel Research launched the OpenCV project in 1999, with version 1.0 finally arriving in 2006. Since then, OpenCV has become a mainstay of computer vision research. Though it is written in C++ and focuses largely on traditional image processing techniques, OpenCV has managed to remain relevant even in the age of AI.
OpenCV’s Python language bindings allow the creation of portable code that plugs easily into the most popular machine learning frameworks while maintaining fast performance under the hood. I have been using OpenCV for years in both research and professional projects. However, when the Keras development team announced the prerelease of KerasCV v0.1.0 back in April 2022, it made me wonder whether OpenCV’s days were numbered.

For the unfamiliar, Keras was originally released by AI researcher François Chollet in 2015. Keras is designed to simplify building and training deep neural networks. It provides a Python-based interface and is used primarily with the TensorFlow machine learning library as its backend.

So, will KerasCV replace OpenCV?

The short answer is no. The longer answer is yes for some tasks.

Unlike OpenCV, which aims to be a comprehensive set of tools for building computer vision applications, KerasCV aims to provide a set of tools to simplify building neural networks for vision-related tasks. However, KerasCV does add a few nice features to Keras which may eliminate the need for OpenCV in some projects.

KerasCV consists of a few key components:

  1. Pretrained Networks: A set of neural network models with pretrained weights designed for computer vision tasks. The pretrained models include CSPDarkNet, EfficientNetV2, MobileNetV3, ResNet50 and YOLOv8.
  2. Computer Vision Layers: Special Keras layers that allow baking computer vision processes directly into a model.
    1. Augmentation Layers: Automate data augmentation of input images when training a model.
    2. Preprocessing Layers: Preprocess images before they pass through the model.
    3. Regularization Layers: Fine-tune how a model learns features within a dataset.
  3. Bounding Box Utility: Allows the user to configure the format of the bounding boxes returned by the model. It also provides some additional features, like checking boxes against image bounds and computing intersections.
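
To make the bounding box item more concrete, here is a minimal plain-Python sketch of the kind of work such a utility takes off your hands: converting between box formats, clipping a box to the image, and measuring how much two boxes overlap. This is not KerasCV's API; the function names and the sample boxes are my own, chosen purely for illustration.

```python
# Illustrative helpers only; these names are hypothetical, not KerasCV functions.

def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def clip_to_image(box, image_width, image_height):
    """Clamp a (left, top, right, bottom) box so it stays inside the image."""
    left, top, right, bottom = box
    return (
        max(0, left),
        max(0, top),
        min(image_width, right),
        min(image_height, bottom),
    )


def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two made-up detections on a 128x128 image; the second one spills past the edge.
box_a = xywh_to_xyxy((10, 10, 100, 100))
box_b = clip_to_image(xywh_to_xyxy((60, 60, 100, 100)), 128, 128)
overlap = iou(box_a, box_b)
print(box_a, box_b, round(overlap, 3))
```

None of this is hard to write, but it is exactly the sort of code that gets rewritten, slightly differently, in every detection project.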

KerasCV is currently on release v0.6.1, so it is still in prerelease. That being said, I’m happy with what has been presented so far. Though none of the current features are groundbreaking, they offer some good quality-of-life improvements that will accelerate neural network prototyping for computer vision tasks. A common use of OpenCV in machine learning, for example, is formatting the size and color space of images before passing them to a model for classification or detection. The KerasCV preprocessing layers seem to eliminate the need for this OpenCV boilerplate code.
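
As a rough sketch of what moving preprocessing into the model looks like, the snippet below resizes and rescales a batch of images using layers instead of a separate OpenCV step. To keep it simple I am assuming the `Resizing` and `Rescaling` layers that ship with core Keras in TensorFlow rather than anything KerasCV-specific, and the 224x224 target size and the random input batch are placeholders. Color space conversion is left out.

```python
import numpy as np
import tensorflow as tf

# Resizing and rescaling run as layers, so they travel with the model
# instead of living in a separate preprocessing script.
preprocess = tf.keras.Sequential([
    tf.keras.layers.Resizing(224, 224),      # assumed target size, for illustration
    tf.keras.layers.Rescaling(1.0 / 255.0),  # map 0-255 pixel values into 0-1
])

# Placeholder batch: two 480x640 RGB images filled with random pixel values.
images = np.random.randint(0, 256, size=(2, 480, 640, 3)).astype("float32")
batch = preprocess(images)
print(batch.shape)
```

In a real project these layers would sit at the front of the classification or detection model, so the saved model accepts raw images directly.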

In a future blog post I plan to benchmark KerasCV, in particular its use in data augmentation. Last year I worked on a project where I automated the augmentation of an image dataset using OpenCV. Though I was happy with the result, it was a time-consuming process. It appears that I will be able to achieve the same result with far fewer lines of code using KerasCV. If the resulting trained model performs similarly to what I produced using OpenCV, then KerasCV is a clear win.

I am also curious as to how the KerasCV layers will perform on embedded systems. OpenCV is already a well-optimized library. However, TensorFlow, and by extension Keras, can benefit from TPUs (Tensor Processing Units), such as Google’s Coral (coral.ai) devices. These AI accelerators are more readily available than GPUs in the embedded space. If TPUs can help to accelerate image processing in addition to machine learning, then that is another potential win.

About the author

Technical Lead – Robotics & AI | France
As Technical Lead for Robotics & AI, Andrew combines his extensive experience in embedded systems and mechatronics with artificial intelligence to develop innovative technical solutions.
