Steam flooding is a complex process that is regarded as an effective enhanced oil recovery technique in both heavy oil and light oil reservoirs. Many studies have analyzed sets of steam flooding projects with conventional data analysis methods, but machine learning algorithms have rarely been applied to find hidden patterns in these data. In this study, a hierarchical clustering algorithm (HCA) coupled with principal component analysis (PCA) is used to analyze steam flooding projects worldwide. The goal is to group similar steam flooding projects into the same cluster so that operational design experience and production performance from analogue cases can be referenced for decision-making. In addition, the data characteristics of each cluster can reveal hidden patterns in steam flooding applications under different reservoir/fluid conditions. PCA projects the original data onto a new feature space in which two principal components represent the eight reservoir/fluid parameters (8D) while retaining about 90% of the variance. HCA is implemented with an optimized design of five clusters, Euclidean distance, and Ward's linkage method. The clustering results show that each cluster covers a distinct range of each property, and the analogue cases indicate that fields under similar reservoir/fluid conditions could share similar operational designs and production performance.
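The workflow described in the abstract (principal component analysis followed by hierarchical clustering with Euclidean distance and Ward's linkage) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic stand-ins for the eight reservoir/fluid parameters, the standardization step is an assumption, and the helper names `standardize`, `pca_project`, and `ward_clusters` are hypothetical. Only the choices of two components, five clusters, Euclidean distance, and Ward's linkage come from the abstract.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def standardize(X):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca_project(X, n_components=2):
    """Project standardized data onto its leading principal components via SVD."""
    Z = standardize(X)
    # Rows of Vt are the principal directions; s**2 is proportional to variance.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_ratio = s**2 / np.sum(s**2)
    scores = Z @ Vt[:n_components].T
    return scores, explained_ratio[:n_components]


def ward_clusters(scores, n_clusters=5):
    """Agglomerative clustering with Ward's linkage on Euclidean distances."""
    tree = linkage(scores, method="ward", metric="euclidean")
    # Cut the dendrogram so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in data: 60 hypothetical projects, 8 parameters that are
# driven by 2 latent factors plus noise. These are NOT real field data.
rng = np.random.default_rng(0)
n_projects, n_params = 60, 8
latent = rng.normal(size=(n_projects, 2))
mixing = rng.normal(size=(2, n_params))
X = latent @ mixing + 0.1 * rng.normal(size=(n_projects, n_params))

scores, explained = pca_project(X, n_components=2)
labels = ward_clusters(scores, n_clusters=5)

print("fraction of variance retained:", explained.sum())
print("cluster sizes:", np.bincount(labels)[1:])
```

With real project data, the retained variance and the cluster memberships would depend on which parameters are included and how they are scaled, so the figures quoted in the abstract should not be expected from this synthetic example.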


Geosciences and Geological and Petroleum Engineering


National Natural Science Foundation of China, Grant 2021KJ060

International Standard Serial Number (ISSN)


Document Type

Article - Journal

Document Version

Final Version

File Type





© 2023 American Chemical Society, All rights reserved.

Creative Commons Licensing

This work is licensed under a Creative Commons Attribution 4.0 License.

Publication Date

07 Jun 2022