An Adaptive Physics-based Feature Engineering Approach For Machine Learning-assisted Alloy Discovery


This study investigated the importance of integrating a physics-based perspective into feature engineering for machine learning applications in materials science. Specifically, we studied the encoding of the temper designation variable, which contains critical alloy manufacturing information and is commonly included as an important feature in machine learning models that predict alloy properties. Popular encoding methods such as one-hot encoding and ordinal encoding neglect the physical mechanism behind temper designations: one-hot encoding treats the designations as entirely independent categories, while ordinal encoding treats them as a single ordered sequence. Following the underlying physical mechanism of the temper designation variable, we propose an adaptive encoding method that first decomposes each temper designation into categorical and numerical subunits, which are then encoded by one-hot encoding and ordinal encoding, respectively. The proposed adaptive encoding method was evaluated on two independent aluminum alloy datasets covering various mechanical and technological properties. Our investigations showed that the proposed adaptive encoding method outperforms traditional encoding methods in the prediction of both mechanical and technological properties. As a general encoding method, this adaptive encoding method can be applied to a variety of decomposable variables to help advance machine-learning-assisted alloy design.
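The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes that a temper designation consists of one leading letter followed by optional digits (for example "T6" or "H32"), that the letter is the categorical subunit, and that each digit position is a separate ordinal subunit. The abstract does not specify the actual decomposition rules, so the letter list `TEMPER_LETTERS`, the function `encode_temper`, and the parameter `max_digits` are hypothetical names introduced only for illustration.

```python
import re

# Assumed set of basic temper letters; the source does not list its categories.
TEMPER_LETTERS = ["F", "O", "H", "W", "T"]


def encode_temper(designation, max_digits=3):
    """Encode a temper designation as [one-hot letter..., ordinal digits...].

    Assumption: the designation is one letter followed by optional digits.
    Digits beyond max_digits are dropped; missing positions are padded with 0.
    """
    match = re.fullmatch(r"([A-Z])(\d*)", designation.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized temper designation: {designation!r}")
    letter, digits = match.groups()
    if letter not in TEMPER_LETTERS:
        raise ValueError(f"unknown temper letter: {letter!r}")

    # Categorical subunit: one-hot over the assumed letter list.
    one_hot = [1 if letter == t else 0 for t in TEMPER_LETTERS]

    # Numerical subunits: each digit position kept as its own ordinal feature.
    ordinal = [int(d) for d in digits[:max_digits]]
    ordinal += [0] * (max_digits - len(ordinal))

    return one_hot + ordinal


if __name__ == "__main__":
    for temper in ["O", "T6", "H32"]:
        print(temper, encode_temper(temper))
```

Under these assumptions, "T6" and "T4" share the same one-hot part and differ only in an ordinal digit, while "T6" and "H32" differ in the categorical part. Padding with 0 does not distinguish an absent digit from a digit equal to zero; a real implementation might use a separate missing-value indicator.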


Department

Materials Science and Engineering

Second Department

Engineering Management and Systems Engineering

Keywords and Phrases

Aluminum alloy; Categorical variable encoding; Feature engineering; Machine learning; Material property; Temper designations

Document Type

Article - Journal

© 2023 Elsevier, All rights reserved.

Publication Date

25 Jun 2023