Abstract
Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is trained on large-scale data, providing a solid parameter initialization for a wide range of downstream applications. In contrast to earlier methods that use convolution and recurrent modules for feature extraction, BERT learns bidirectional encoder representations from Transformers, trained on large datasets as a contextual language model. Similarly, the Generative Pretrained Transformer (GPT) method employs Transformers as feature extractors and is trained on large datasets using an autoregressive paradigm. More recently, ChatGPT has demonstrated the significant success of large language models, applying an autoregressive language model with zero-shot or few-shot prompting. The remarkable success of PFMs has driven significant breakthroughs in AI, leading to numerous studies proposing new methods, datasets, and evaluation metrics, which increases the demand for an updated survey. This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, and other data modalities. It covers the basic components and existing pretraining methods used in natural language processing, computer vision, and graph learning, while also exploring advanced PFMs for different data modalities and unified PFMs that address data quality and quantity. Additionally, the review discusses key aspects such as model efficiency, security, and privacy, and provides insights into future research directions and challenges in PFMs. Overall, this survey aims to shed light on research into PFMs with respect to scalability, security, logical reasoning ability, cross-domain learning ability, and user-friendly interactive ability, toward artificial general intelligence.
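As a rough illustration only (not drawn from the survey itself), the sketch below contrasts the two pretraining paradigms named in the abstract using the Hugging Face transformers library: masked (bidirectional) language modeling as in BERT, and autoregressive (causal) language modeling with prompting as in GPT-style models. The checkpoints "bert-base-uncased" and "gpt2" and the example prompts are illustrative assumptions, not choices made by the authors.

# Minimal sketch, assuming the Hugging Face `transformers` and `torch` packages are installed.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, AutoModelForCausalLM

# Bidirectional (masked) language modeling, as in BERT: the model sees the whole
# sentence and predicts the token hidden behind [MASK].
bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
inputs = bert_tok("Pretrained foundation models serve many [MASK] tasks.", return_tensors="pt")
with torch.no_grad():
    logits = bert(**inputs).logits
mask_pos = (inputs.input_ids == bert_tok.mask_token_id).nonzero(as_tuple=True)[1]
print(bert_tok.decode(logits[0, mask_pos].argmax(dim=-1)))  # model's guess for the masked word

# Autoregressive (causal) language modeling, as in GPT: the model predicts the next
# token left to right, which also enables zero-/few-shot use by continuing a prompt.
gpt_tok = AutoTokenizer.from_pretrained("gpt2")
gpt = AutoModelForCausalLM.from_pretrained("gpt2")
prompt = gpt_tok("Translate English to French: cat ->", return_tensors="pt")
out = gpt.generate(**prompt, max_new_tokens=5, do_sample=False)
print(gpt_tok.decode(out[0], skip_special_tokens=True))  # prompt plus generated continuation

The point of the contrast is only that the same Transformer backbone supports both objectives; larger instruction-tuned models such as ChatGPT build on the autoregressive setting.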
Recommended Citation
C. Zhou, Q. Li, C. Li, J. Yu, Y. Liu, G. Wang, K. Zhang, C. Ji, Q. Yan, L. He, H. Peng, J. Li, J. Wu, Z. Liu, P. Xie, C. Xiong, J. Pei, P. S. Yu, and L. Sun, "A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT," International Journal of Machine Learning and Cybernetics, Springer, Jan 2024.
The definitive version is available at https://doi.org/10.1007/s13042-024-02443-6
Department(s)
Computer Science
Keywords and Phrases
BERT; ChatGPT; Computer vision; GPT-4; Graph learning; Natural language processing; Pretrained foundation models
International Standard Serial Number (ISSN)
1868-808X; 1868-8071
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2025 Springer. All rights reserved.
Publication Date
01 Jan 2024

Comments
Fundação para a Ciência e a Tecnologia, Grant ARTEMIS/0003/2013