YOGA: Deep Object Detection in the Wild with Lightweight Feature Learning and Multiscale Attention

Abstract

We introduce YOGA, a deep-learning-based yet lightweight object detection model that can operate on low-end edge devices while still achieving competitive accuracy. The YOGA architecture consists of a two-phase feature learning pipeline with a cheap linear transformation, which learns feature maps using only half of the convolution filters required by conventional convolutional neural networks. In addition, it performs multi-scale feature fusion in its neck using an attention mechanism instead of the naive concatenation used by conventional detectors. YOGA is a flexible model that can be easily scaled up or down by several orders of magnitude to fit a broad range of hardware constraints. We evaluate YOGA on the COCO-val and COCO-testdev datasets against over 10 state-of-the-art object detectors. The results show that YOGA strikes the best trade-off between model size and accuracy (up to 22% increase in AP and 23–34% reduction in parameters and FLOPs), making it an ideal choice for deployment in the wild on low-end edge devices. This is further affirmed by our hardware implementation and evaluation on NVIDIA Jetson Nano.
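
The half-filter idea in the abstract can be illustrated with a rough parameter count for a single layer. This is only a sketch under stated assumptions: the abstract does not specify the cheap linear transformation, so a depthwise 3x3 convolution is assumed here, and the function names and channel sizes are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def two_phase_params(c_in, c_out, k, cheap_k=3):
    """Weight count of a two-phase layer (assumed design, not the paper's code).

    Phase 1: an ordinary convolution produces half of the output channels.
    Phase 2: a cheap per-channel (depthwise) cheap_k x cheap_k filter is
             applied to those channels to generate the other half.
    Assumes c_out is even.
    """
    half = c_out // 2
    primary = conv_params(c_in, half, k)
    cheap = half * cheap_k * cheap_k  # one small filter per primary channel
    return primary + cheap


# Hypothetical layer: 128 input channels, 256 output channels, 3x3 kernel.
standard = conv_params(128, 256, 3)        # 294912 weights
two_phase = two_phase_params(128, 256, 3)  # 148608 weights
ratio = two_phase / standard               # about 0.504

print(standard, two_phase, round(ratio, 3))
```

Under these assumptions the two-phase layer needs roughly half the weights of the standard one. This per-layer figure is not the model-level reduction reported in the abstract, which also covers the neck and head of the detector.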

Department(s)

Computer Science

International Standard Serial Number (ISSN)

0031-3203

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2023 Elsevier, All rights reserved.

Publication Date

01 Jul 2023
