Abstract

The capacity to generalize to future unseen data stands as one of the most crucial attributes of deep neural networks. Sharpness-Aware Minimization (SAM) aims to enhance generalizability by minimizing the worst-case loss using one-step gradient ascent as an approximation. However, as training progresses, the non-linearity of the loss landscape increases, rendering one-step gradient ascent less effective. On the other hand, multi-step gradient ascent incurs higher training cost. In this paper, we introduce a normalized Hessian trace to accurately measure the curvature of the loss landscape on both training and test sets. In particular, to counter excessive non-linearity of the loss landscape, we propose Curvature Regularized SAM (CR-SAM), which integrates the normalized Hessian trace as a SAM regularizer. Additionally, we present an efficient way to compute the trace via finite differences with parallelism. Our theoretical analysis based on PAC-Bayes bounds establishes the regularizer's efficacy in reducing generalization error. Empirical evaluation on CIFAR and ImageNet datasets shows that CR-SAM consistently enhances classification performance for ResNet and Vision Transformer (ViT) models across various datasets. Our code is available at https://github.com/TrustAIoT/CR-SAM.
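The abstract mentions computing the Hessian trace efficiently via finite differences. As a minimal illustrative sketch (not the authors' implementation; see the repository linked above for that), one standard way to do this in PyTorch is Hutchinson's estimator, Tr(H) ≈ E_v[vᵀHv] with Rademacher probes v, where each Hessian-vector product Hv is approximated by a central finite difference of the gradient, so no second-order autograd is needed. The helper name `hutchinson_trace_fd`, the step size `eps`, and the omission of the paper's specific normalization are assumptions made here for illustration only.

```python
import torch

def hutchinson_trace_fd(model, loss_fn, data, target, eps=1e-3, n_samples=1):
    """Estimate Tr(H) of the loss Hessian via Hutchinson's method, with
    Hessian-vector products approximated by central finite differences of
    the gradient. Hypothetical helper, not the paper's implementation."""
    params = [p for p in model.parameters() if p.requires_grad]

    def grad_at(vs=None, scale=0.0):
        # Optionally perturb parameters by scale * v, take the gradient, restore.
        if vs is not None:
            with torch.no_grad():
                for p, v in zip(params, vs):
                    p.add_(v, alpha=scale)
        loss = loss_fn(model(data), target)
        grads = torch.autograd.grad(loss, params)
        if vs is not None:
            with torch.no_grad():
                for p, v in zip(params, vs):
                    p.add_(v, alpha=-scale)
        return grads

    trace_est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vector v with entries in {-1, +1}
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        g_plus = grad_at(vs, eps)    # gradient at theta + eps * v
        g_minus = grad_at(vs, -eps)  # gradient at theta - eps * v
        # Hv ~= (g(theta + eps v) - g(theta - eps v)) / (2 eps); accumulate v^T H v
        vHv = sum(((gp - gm) / (2 * eps) * v).sum()
                  for gp, gm, v in zip(g_plus, g_minus, vs))
        trace_est += vHv.item()
    return trace_est / n_samples

# Usage (hypothetical): trace = hutchinson_trace_fd(model, torch.nn.functional.cross_entropy, x, y)
```

Note that the two perturbed gradient evaluations for each probe are independent of one another, so they can be computed in parallel (e.g., on separate replicas), which is plausibly the kind of parallelism the abstract alludes to.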

Department(s)

Computer Science

Second Department

Electrical and Computer Engineering

Comments

National Science Foundation, Grant 2008878

International Standard Serial Number (ISSN)

2374-3468 (online); 2159-5399 (print)

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Association for the Advancement of Artificial Intelligence. All rights reserved.

Publication Date

25 Mar 2024
