YOLO, which stands for “You Only Look Once,” is a popular real-time object detection algorithm in computer vision, introduced in 2016. YOLO revolutionized object detection with a different approach from traditional methods: it divides the image into a grid and predicts bounding boxes and class probabilities directly from the input image in a single pass.
YOLO operates at high speed because it avoids the multiple passes through the image required by two-stage region-proposal detectors such as Fast R-CNN and Faster R-CNN. The single forward pass of the YOLO network allows it to achieve real-time object detection on both images and videos.
Over the years, several versions of YOLO have been released, including the state-of-the-art (SOTA) YOLOv6, YOLOv7, YOLOv8, PP-YOLO, YOLOR, and the latest, YOLO-NAS. Each version introduced improvements and architectural changes to enhance both speed and accuracy.
YOLO-NAS (YOLO Neural Architecture Search) is the latest state-of-the-art YOLO model, released by Deci in May 2023, and it outperforms its predecessors.
Some of the key features of the YOLO-NAS algorithm are described below:
- The architecture was discovered using the company’s proprietary AutoNAC technology, a neural architecture search (NAS) method.
- AutoNAC was used to determine the optimal sizes and structures of the stages, including the block type, the number of blocks, and the number of channels in each stage.
- During the NAS process, quantization-aware RepVGG blocks (the QSP and QCI blocks) were introduced into the model architecture, ensuring compatibility with Post-Training Quantization (PTQ) with minimal accuracy loss.
- It uses a hybrid quantization method that selectively quantizes certain parts of a model, reducing information loss and balancing latency and accuracy.
- It was trained on Objects365, a diverse dataset for object detection consisting of 2 million images across 365 categories with 30 million bounding boxes.
- It was also trained on the Roboflow 100 (RF100) dataset, a collection of 100 datasets from diverse domains, to demonstrate its ability to handle complex object detection tasks.
- It also incorporates attention mechanisms, knowledge distillation, and Distribution Focal Loss (DFL) to enhance its training process.
- It is fully compatible with high-performance inference engines like NVIDIA TensorRT and supports INT8 quantization for unprecedented runtime performance. This allows YOLO-NAS to excel in real-world scenarios, such as autonomous vehicles, robotics, and video analytics applications, where low latency and efficient processing are essential.
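The structural re-parameterization behind the RepVGG-style blocks mentioned above can be shown in miniature. The sketch below fuses a training-time three-branch block (3×3 conv, 1×1 conv, identity) into a single 3×3 conv for inference; it uses one channel and no batch norm, so it is a simplified illustration of the trick, not the actual QSP/QCI block structure.

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2D correlation on a single-channel feature map."""
    p = k.shape[0] // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
k3 = rng.standard_normal((3, 3))   # 3x3 branch
k1 = rng.standard_normal((1, 1))   # 1x1 branch

# Training-time output: sum of the 3x3, 1x1, and identity branches.
multi_branch = conv2d(x, k3) + conv2d(x, k1) + x

# Re-parameterization: fold the 1x1 kernel and the identity map into
# the center of the 3x3 kernel, leaving one conv for inference.
fused = k3.copy()
fused[1, 1] += k1[0, 0] + 1.0      # identity == a 1x1 kernel with weight 1
single_branch = conv2d(x, fused)

assert np.allclose(multi_branch, single_branch)
```

The fused block computes exactly the same function with a third of the convolutions, which is why this family of blocks is attractive for low-latency inference.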
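The hybrid INT8 quantization idea from the bullets above can be sketched as a simple policy: round-trip each layer's weights through symmetric INT8 quantization, measure the error introduced, and keep sensitive layers in floating point. The layer names, error budget, and selection rule here are illustrative assumptions, not Deci's actual algorithm.

```python
import numpy as np

def int8_roundtrip(w):
    """Symmetric per-tensor INT8 quantization followed by dequantization."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
layers = {
    "stem": rng.standard_normal(256) * 0.02,                   # well-behaved weights
    "head": np.append(rng.standard_normal(256) * 0.02, 8.0),   # one large outlier
}

# Hybrid policy sketch: quantize a layer only if the round-trip error
# stays under a budget; otherwise leave it in floating point.
BUDGET = 1e-6
plan = {}
for name, w in layers.items():
    err = np.mean((w - int8_roundtrip(w)) ** 2)
    plan[name] = "int8" if err < BUDGET else "float"
print(plan)
```

The outlier in the "head" layer inflates the quantization scale, so its small weights collapse to zero and the layer stays in float; that asymmetry is the intuition for quantizing a model selectively rather than uniformly.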
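The knowledge distillation mentioned in the training bullets can be illustrated with the classic Hinton-style loss: the KL divergence between temperature-softened teacher and student distributions. The logits and temperature below are made up for the example, and YOLO-NAS's exact distillation setup may differ.

```python
import numpy as np

def softmax(z, t=1.0):
    z = np.asarray(z, dtype=float) / t
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """KL divergence from the softened teacher to the student.

    The t**2 factor keeps gradient magnitudes comparable across
    temperatures (standard Hinton-style distillation scaling).
    """
    p = softmax(teacher_logits, t)   # soft teacher targets
    q = softmax(student_logits, t)   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * t ** 2)

teacher = [4.0, 1.0, 0.2]
aligned = distillation_loss([4.0, 1.0, 0.2], teacher)   # student matches teacher
drifted = distillation_loss([0.2, 1.0, 4.0], teacher)   # student disagrees
assert aligned < drifted
```

Minimizing this loss alongside the usual detection losses lets a smaller student absorb the "dark knowledge" in the teacher's soft class probabilities.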
Below is a per-category breakdown of YOLO-NAS’s performance on the RF100 dataset, compared to the performance of the YOLOv5/v7/v8 models:
Refer to this notebook to run inference and custom training with the YOLO-NAS architecture.