A Minimax Approach for Classification with Big-Data

Abstract

In this paper, a novel methodology is presented for reducing the generalization errors that arise from domain shift in big-data classification. The reduction is achieved by introducing a suitably selected domain shift into the training data through what is referred to as a "distortion model". The distortions are applied through an affine transformation, which yields additional data samples. Next, a deep neural network (NN), referred to as the "classifier", is used to classify both the original and the additional data samples. By learning from both, the classifier compensates for the domain shift while maintaining its performance on the original data. However, the exact magnitude of the shift that would be encountered in real applications is unknown a priori and difficult to predict. The objective is therefore to compensate for the optimal shift that the distortion model can introduce without significantly degrading the performance of the model. A two-player zero-sum game is thus designed in which the first player is the distortion model, whose aim is to increase the domain shift, and the second player is the classifier, whose aim is to minimize the impact of the domain shift. Finally, a direct error-driven learning scheme is utilized to minimize the impact of the domain shift on the classifier while maximizing the domain shift. A comprehensive simulation study is presented in which a 12% improvement in the presence of domain shift is demonstrated. The proposed approach is also shown to improve generalization by 6%.
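The two-player game described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not taken from the paper: a logistic-regression classifier stands in for the deep NN, plain alternating gradient descent and ascent stand in for the direct error-driven learning scheme, and a hypothetical quadratic `penalty` keeps the affine distortion close to the identity so that the shift stays bounded. The function names, the toy data, and all hyperparameters are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_blobs(n_per_class=200, seed=0):
    """Toy two-class data: Gaussian blobs centred at (-2, -2) and (2, 2)."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=-2.0, size=(n_per_class, 2))
    X1 = rng.normal(loc=2.0, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y

def accuracy(X, y, w, c):
    return float(np.mean((sigmoid(X @ w + c) > 0.5) == (y > 0.5)))

def train_minimax(X, y, steps=200, lr_clf=0.5, lr_dist=0.05, penalty=1.0, seed=0):
    """Alternate a classifier descent step with a distortion ascent step.

    Classifier (assumed linear): p = sigmoid(w . x + c).
    Distortion model (affine):   x' = A x + b, initialised to the identity map.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=0.01, size=d)
    c = 0.0
    A = np.eye(d)
    b = np.zeros(d)
    for _ in range(steps):
        Xd = X @ A.T + b  # additional, distorted samples

        # Player 2 (classifier): descend the cross-entropy loss on
        # the original and the distorted samples together.
        X_all = np.vstack([X, Xd])
        y_all = np.concatenate([y, y])
        err = sigmoid(X_all @ w + c) - y_all  # d(loss)/d(logit) per sample
        w = w - lr_clf * (X_all.T @ err) / len(y_all)
        c = c - lr_clf * err.mean()

        # Player 1 (distortion model): ascend the same loss on the
        # distorted samples, minus the assumed penalty on the shift size.
        err_d = sigmoid(Xd @ w + c) - y
        grad_A = np.outer(w, err_d @ X) / n - penalty * (A - np.eye(d))
        grad_b = w * err_d.mean() - penalty * b
        A = A + lr_dist * grad_A
        b = b + lr_dist * grad_b
    return w, c, A, b

X, y = make_blobs()
w, c, A, b = train_minimax(X, y)
print("accuracy on original data: ", accuracy(X, y, w, c))
print("accuracy on distorted data:", accuracy(X @ A.T + b, y, w, c))
```

In this sketch the penalty term plays the role of the requirement that the shift not significantly degrade performance; a larger `penalty` keeps `A` and `b` closer to the identity map, while a smaller one lets the distortion model push the samples further toward the decision boundary.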

Meeting Name

2018 IEEE International Conference on Big Data, Big Data 2018 (2018: Dec. 10-13, Seattle, WA)

Department(s)

Electrical and Computer Engineering

Second Department

Mathematics and Statistics

Research Center/Lab(s)

Intelligent Systems Center

Second Research Center/Lab

Center for High Performance Computing Research

Comments

This research was supported in part by NSF I/UCRC award IIP 1134721 and by the Intelligent Systems Center.

Keywords and Phrases

Big data; Deep neural networks; Metadata; Affine transformations; Data classification; Error-driven learning; Generalization Error; Minimax approach; Novel methodology; Real applications; Simulation studies; Classification (of information)

International Standard Book Number (ISBN)

978-1-5386-5035-6

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2018 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.

Publication Date

13 Dec 2018
