From Data Collection to Data Analytics: How to Successfully Extract Useful Information from Big Data in the Oil & Gas Industry?


Big data has become a major topic in many industries. More recently, the oil and gas industry has taken a particular interest in data science as public-domain and commercial databases have become increasingly available. Processing and using such data can support better decision-making. The aim of this work is to provide an example and demonstrate methodologies for collecting and using big data to support future decisions in the oil and gas industry.

This paper draws on a review of papers and books on the applications of data analysis in the oil and gas industry and in other industries, together with the authors' own expertise in data analysis, to demonstrate real examples of data processing and validation workflows. It is intended to fill a gap in the literature, where many publications only discuss the importance of data-driven analytics.

This paper provides an overview of the diverse, high-volume data-generating sources in the oil and gas industry, from the exploration phase to the end of the well's life cycle. It uses a public-domain database (FracFocus) as an example and demonstrates a step-by-step workflow for collecting and processing the data according to the objective of the analytics. Two real examples of descriptive and predictive analytics are also presented to show the value of combining diverse databases from multiple sources. A data validation and preparation framework is also shown, illustrating data quality checks combined with best practices for data cleansing and outlier detection.
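As a rough illustration of the kind of quality checks and outlier detection mentioned above, the following is a minimal Python sketch. It is not the authors' workflow. The record layout, the field name `water_volume_gal`, the sample values, and the choice of the interquartile-range (IQR) rule are all assumptions made for illustration, and none of them are taken from FracFocus or from the paper.

```python
from statistics import quantiles

# Hypothetical records: the field names and values are illustrative only.
records = [
    {"well_id": "A-1", "water_volume_gal": 5_200_000.0},
    {"well_id": "A-2", "water_volume_gal": 4_800_000.0},
    {"well_id": "A-3", "water_volume_gal": None},          # missing value
    {"well_id": "A-4", "water_volume_gal": -10.0},         # physically impossible
    {"well_id": "A-5", "water_volume_gal": 5_500_000.0},
    {"well_id": "A-6", "water_volume_gal": 95_000_000.0},  # suspiciously large
    {"well_id": "A-7", "water_volume_gal": 5_000_000.0},
    {"well_id": "A-2", "water_volume_gal": 4_800_000.0},   # exact duplicate
    {"well_id": "A-8", "water_volume_gal": 4_600_000.0},
    {"well_id": "A-9", "water_volume_gal": 4_900_000.0},
    {"well_id": "A-10", "water_volume_gal": 5_300_000.0},
]


def validate(rows, field):
    """Basic quality checks: drop missing, non-positive, and duplicate entries."""
    seen = set()
    clean = []
    for row in rows:
        value = row.get(field)
        key = (row["well_id"], value)
        if value is None or value <= 0 or key in seen:
            continue
        seen.add(key)
        clean.append(row)
    return clean


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the IQR rule with multiplier k."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(rows, field, k=1.5):
    """Separate rows inside the IQR fences from rows flagged as outliers."""
    low, high = iqr_bounds([r[field] for r in rows], k)
    kept = [r for r in rows if low <= r[field] <= high]
    flagged = [r for r in rows if not (low <= r[field] <= high)]
    return kept, flagged


clean = validate(records, "water_volume_gal")
kept, flagged = split_outliers(clean, "water_volume_gal")
print(len(clean), "valid rows;", [r["well_id"] for r in flagged], "flagged")
```

Flagged rows are returned rather than silently dropped, so that an analyst can decide whether an extreme value is an entry error or a genuine observation. The IQR rule is only one of several common choices and can be unreliable on very small samples.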

This paper provides a clear methodology for applying data analysis successfully, which can serve as a guide for future data analysis applications in the oil and gas industry.

Meeting Name

SPE/IATMI Asia Pacific Oil and Gas Conference and Exhibition 2019, APOG 2019 (2019: Oct. 29-31, Bali, Indonesia)


Geosciences and Geological and Petroleum Engineering

Research Center/Lab(s)

Center for Research in Energy and Environment (CREE)

Keywords and Phrases

Big data; Data Analytics; Database systems; Gas industry; Gas oils; Gases; Life cycle; Petroleum prospecting; Predictive analytics; Best practices; Data cleansing; Data collection; Data validation; Exploration phase; Multiple resources; Oil and Gas Industry; Public domains; Data acquisition

Document Type

Article - Conference proceedings

© 2019 Society of Petroleum Engineers (SPE), All rights reserved.

Publication Date

01 Oct 2019