Data Conditioning and Forecasting Methodology using Machine Learning on Production Data for a Well Pad
Abstract
A new machine learning (ML)/statistical-based methodology for conditioning and predicting production data for a well pad has been developed. Typically, data conditioning involves outlier detection, handling of missing data, data imputation, and smoothing. Time-series production data prediction can be challenging because the target (wellbore oil production) depends on large-scale, high-dimensional data sets with unknown distributions and is influenced by missing data and outliers. Hence, data conditioning is key for accurate predictions. The current work is the first attempt at using ensemble ML and statistical techniques, such as multilayer perceptron (MLP), principal component analysis (PCA), and support vector regression (SVR), for well pad data conditioning using recently disclosed subsurface and production data from a field in the southern area of the Norwegian North Sea. The time-series forecasting based on large-scale, high-dimensional conditioned and cleaned data sets is also presented.
Only data with an oil production rate greater than 10 Sm3 have been retained for data cleansing, which reduced the size of the production well data set by 14.9%. Outliers are detected using the z-score method. The missing values are predicted using an ML model trained on all available nonmissing data. The procedure first predicts the missing downhole values from all the wells, including the available neighboring wells, and then uses these features to predict other missing values for the well pad. In this paper, the two approaches implemented and compared for prediction of missing data are MLP and SVR, and PCA is performed to extract the most important data features.
Production data with 12 related variables (i.e., dates, hours, temperature, pressure, etc.) are used to explore the complex nonlinearity of features and estimate wellbore oil production with ML and deep-learning models. Conventional SVR and MLP methods are implemented as the benchmark.
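The rate cutoff and z-score screening described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record field name `oil_rate`, the helper names, and the |z| > 3 cutoff are assumptions made for illustration, since the abstract does not state the threshold used.

```python
from statistics import mean, pstdev


def filter_by_rate(records, min_rate=10.0):
    """Keep records whose oil rate is strictly greater than min_rate.

    The field name 'oil_rate' is hypothetical; the 10 Sm3 cutoff comes
    from the abstract.
    """
    return [r for r in records if r["oil_rate"] > min_rate]


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute z-score exceeds threshold.

    threshold=3.0 is an assumed, commonly used cutoff, not a value
    reported in the abstract.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be flagged
    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) > threshold]


if __name__ == "__main__":
    records = [{"oil_rate": 5.0}, {"oil_rate": 10.0}, {"oil_rate": 250.0}]
    print(len(filter_by_rate(records)))

    pressures = [100.0] * 20 + [1000.0]
    print(zscore_outliers(pressures))
```

In a workflow like the one described, the flagged indices would be removed or treated as missing before the imputation models are trained.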
During this work, more than 60% of the missing and abnormal data from the field data set are detected and imputed using advanced ML methods, such as MLP and SVR with a radial basis function kernel. More than 6% of the data are outliers and are removed using the z-score method. The modified SVR with time-series data structure and long short-term memory (LSTM) algorithms are used for the comparisons. An R-squared (R2) of 98% is achieved for both algorithms; however, LSTM yields a lower root mean square error (RMSE) than SVR.
Data conditioning is conventionally performed using statistical techniques, but here, an ensemble of ML techniques is used depending on the available data. This paper presents a new methodology to perform data conditioning and production prediction for a well pad using ML and neighboring well data. The ML algorithms used are highly efficient, as demonstrated by the results.
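One common way to give a regressor such as SVR a time-series data structure is to turn the series into lagged input windows, then score predictions with RMSE and R2. The sketch below assumes that sliding-window interpretation; the abstract does not specify how the SVR was modified, and the function names and window length are hypothetical.

```python
import math


def make_lagged(series, n_lags):
    """Build (X, y) pairs where each row of X holds n_lags consecutive
    values and y is the value that immediately follows that window."""
    n_samples = len(series) - n_lags
    X = [series[i:i + n_lags] for i in range(n_samples)]
    y = [series[i + n_lags] for i in range(n_samples)]
    return X, y


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    rates = [1.0, 2.0, 3.0, 4.0, 5.0]
    X, y = make_lagged(rates, n_lags=2)
    print(X, y)
    print(rmse(y, y), r_squared(y, y))
```

The resulting `X` and `y` could be passed to any regressor; two models can report similar R2 while differing in RMSE, which is the kind of comparison the abstract draws between SVR and LSTM.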
Recommended Citation
M. Bagheri et al., "Data Conditioning and Forecasting Methodology using Machine Learning on Production Data for a Well Pad," Proceedings of the Annual Offshore Technology Conference, Offshore Technology Conference, Jan 2020.
Department(s)
Engineering Management and Systems Engineering
International Standard Book Number (ISBN)
978-161399707-9
International Standard Serial Number (ISSN)
0160-3663
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2025 Offshore Technology Conference, All rights reserved.
Publication Date
01 Jan 2020
