American Sign Language Alphabet Recognition using Convolutional Neural Networks with Multiview Augmentation and Inference Fusion
American Sign Language (ASL) alphabet recognition by computer vision is a challenging task due to the complexity of ASL signs, high interclass similarities, large intraclass variations, and constant occlusions. This paper describes a method for ASL alphabet recognition using convolutional neural networks (CNNs) with multiview augmentation and inference fusion, applied to depth images captured by a Microsoft Kinect. Our approach augments the original data by generating additional perspective views, which makes training more effective and reduces potential overfitting. During inference, our approach combines information from multiple views into the final prediction to resolve confusing cases caused by orientational variations and partial occlusions. On two public benchmark datasets, our method outperforms the state of the art.
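The inference-fusion step described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's exact rule: it assumes each synthesized view yields a softmax probability vector from the CNN, and that the fused prediction is the class-wise average across views; the function name `fuse_predictions` and the toy numbers are hypothetical.

```python
import numpy as np

def fuse_predictions(view_probs):
    """Fuse per-view CNN outputs into one prediction.

    view_probs: array-like of shape (n_views, n_classes), where each row
    is a softmax probability vector for one perspective view.
    Returns the fused class index and the averaged probability vector.
    """
    probs = np.asarray(view_probs, dtype=float)
    fused = probs.mean(axis=0)          # average class scores across views
    return int(np.argmax(fused)), fused

# Toy example: three synthesized views of one depth image. View 0 alone is
# ambiguous, but combining the views yields a confident prediction.
views = [
    [0.40, 0.35, 0.25],
    [0.20, 0.60, 0.20],
    [0.30, 0.50, 0.20],
]
label, fused = fuse_predictions(views)
```

Averaging is one simple fusion rule; a product of probabilities or a majority vote over per-view argmax labels would be drop-in alternatives with the same interface.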
W. Tao et al., "American Sign Language Alphabet Recognition using Convolutional Neural Networks with Multiview Augmentation and Inference Fusion," Engineering Applications of Artificial Intelligence, vol. 76, pp. 202-213, Elsevier Ltd, Nov 2018.
The definitive version is available at https://doi.org/10.1016/j.engappai.2018.09.006
Mechanical and Aerospace Engineering
Intelligent Systems Center
Keywords and Phrases
American Sign Language; Convolutional Neural Networks (CNN); Data augmentation; Fusion
International Standard Serial Number (ISSN)
Article - Journal
© 2018 Elsevier Ltd. All rights reserved.
01 Nov 2018
This research work was supported by the National Science Foundation, United States grant CMMI-1646162 on cyber–physical systems and also by the Intelligent Systems Center at Missouri University of Science and Technology, United States.