Data-driven approaches for State of Charge (SOC) prediction have advanced considerably in recent years. However, selecting an appropriate training dataset remains a challenge for model development and validation because lithium-ion batteries vary considerably in materials, cell types, and operating conditions. This work focuses on optimizing the training dataset using simple, measurable data, which is important for prediction accuracy, reduced training time, and application to online estimation. It is found that a randomly generated dataset can be used effectively for training; it need not follow the format of conventional predefined battery testing protocols such as constant current cycling, the Highway Fuel Economy Cycle, and the Urban Dynamometer Driving Schedule. A model trained on the randomly generated data can be successfully applied to various dynamic battery operating conditions. XGBoost is used as the machine learning (ML) algorithm, with Random Forest, an Artificial Neural Network, and a reduced-order physical battery model used for comparison. With the optimal training dataset, the XGBoost method shows excellent SOC prediction performance: the fastest training time (under 1 s), a short running time of 0.03 s, and a Mean Absolute Percentage Error of 0.358%, outperforming the other data-driven approaches and the physics-based model.
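
The idea of a randomly generated training profile and the error metric named above can be illustrated with a minimal sketch. It is not the authors' procedure: the piecewise-constant signal shape, the hold times, the sign convention (positive current means discharge), the coulomb-counting labels, and all function names and parameter values are assumptions made for illustration only.

```python
import random


def random_current_profile(n_steps, i_max_a, seed=0):
    """Piecewise-constant random current in amperes.

    Assumption: positive values mean discharge, negative values mean charge.
    """
    rng = random.Random(seed)
    profile = []
    while len(profile) < n_steps:
        level = rng.uniform(-i_max_a, i_max_a)  # random amplitude
        hold = rng.randint(1, 30)               # random hold time, in steps
        profile.extend([level] * hold)
    return profile[:n_steps]


def coulomb_count_soc(current_a, dt_s, capacity_ah, soc0=0.8):
    """Label every step with SOC by integrating current over time.

    SOC is a fraction in [0, 1]; clipping at the limits is a simplification.
    """
    soc = soc0
    labels = []
    for i in current_a:
        soc -= i * dt_s / (capacity_ah * 3600.0)  # Ah capacity converted to As
        soc = min(1.0, max(0.0, soc))
        labels.append(soc)
    return labels


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; y_true must be nonzero."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical cell: 2.5 Ah capacity, currents up to 2 A, 1 s sampling.
profile = random_current_profile(n_steps=1000, i_max_a=2.0)
soc_labels = coulomb_count_soc(profile, dt_s=1.0, capacity_ah=2.5)
```

In a setup like this, the measured signals recorded while the random profile is applied would serve as model inputs, the SOC labels as regression targets, and `mape` would compare predictions against reference SOC on a separate dynamic test profile.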


Department

Electrical and Computer Engineering

Second Department

Mechanical and Aerospace Engineering


The authors gratefully acknowledge financial support from the National Science Foundation (Award Nos. 1538415 and 1610396).

Keywords and Phrases

Battery Modeling; Battery SOC; Dynamic Current; Estimation; Machine Learning; Random Signal; XGBoost


Document Type

Article - Journal

Document Version

Final Version


© 2021 The Authors, All rights reserved.

Creative Commons Licensing

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Publication Date

01 Sep 2021