Structured Iterative Hard Thresholding For Categorical And Mixed Data Types

Thy Nguyen
Tayo Obafemi-Ajayi, Missouri University of Science and Technology

Abstract

In many applications, data comes in a mixed data type format, i.e., a combination of nominal (categorical) and numerical features. A common practice for working with categorical features is to use an encoding method to transform the discrete values into a numeric representation. However, a numeric representation often neglects the innate structure of categorical features, potentially degrading the performance of learning algorithms. Relying on the numeric representation can also limit interpretation of the learned model, such as finding the most discriminative categorical features or filtering irrelevant attributes. In this work, we extend the iterative hard thresholding (IHT) algorithm to quantify the structure of categorical features. The proposed structured hard thresholding algorithm is evaluated empirically on both real and synthetic data sets, in comparison with the original hard thresholding algorithm, LASSO, and Random Forest. The results demonstrate improved performance over the original IHT.
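
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of plain IHT for least squares next to a generic group-wise variant in which all encoded columns of one categorical feature are kept or dropped together. It is not the authors' algorithm, whose details are not given on this page. The function names, the `groups` argument, and the step-size choice are assumptions made for illustration only.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and zero the rest."""
    out = np.zeros_like(w)
    if k > 0:
        keep = np.argsort(np.abs(w))[-k:]
        out[keep] = w[keep]
    return out

def group_hard_threshold(w, groups, k):
    """Keep the k groups with the largest Euclidean norm and zero the rest.

    `groups` is a list of index lists (hypothetical layout), e.g. one list
    per categorical feature holding the positions of its one-hot columns.
    """
    norms = np.array([np.linalg.norm(w[g]) for g in groups])
    out = np.zeros_like(w)
    if k > 0:
        for i in np.argsort(norms)[-k:]:
            out[groups[i]] = w[groups[i]]
    return out

def iht(X, y, k, groups=None, n_iter=200):
    """Least-squares IHT: gradient step followed by a thresholding step.

    Uses entry-wise thresholding by default and group-wise thresholding
    when `groups` is given. The step size 1 / ||X||_2^2 is an assumed,
    conservative choice.
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = w + step * X.T @ (y - X @ w)  # gradient step on 0.5 * ||y - Xw||^2
        if groups is None:
            w = hard_threshold(w, k)
        else:
            w = group_hard_threshold(w, groups, k)
    return w
```

With entry-wise thresholding, individual one-hot columns of the same categorical feature can be selected or discarded independently, which is the loss of structure the abstract refers to. Group-wise thresholding is one common way to make selection operate on whole features instead.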

Recommended Citation

T. Nguyen and T. Obafemi-Ajayi, "Structured Iterative Hard Thresholding For Categorical And Mixed Data Types," 2019 IEEE Symposium Series on Computational Intelligence, SSCI 2019, pp. 2541 - 2547, article no. 9002948, Institute of Electrical and Electronics Engineers, Dec 2019.

The definitive version is available at https://doi.org/10.1109/SSCI44817.2019.9002948

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

categorical data types; sparse linear model; thresholding; feature selection.

International Standard Book Number (ISBN)

978-172812485-8

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

01 Dec 2019

Electrical and Computer Engineering Faculty Research & Creative Works
