Ensemble LUT Classification For Degraded Document Enhancement
Abstract
The fast evolution of scanning and computing technologies has led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations, such as legal litigation and security investigations, scanned collections are being used to facilitate systematic exploration of the data. Scanned documents almost always suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to estimate local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system, we have labeled a subset of the Frieder diaries collection [1]. This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with the approximated nearest neighbor algorithm. The resulting algorithm is highly efficient. Experimental evaluation results are provided using the Frieder diaries collection [1]. © 2008 SPIE-IS&T.
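The general idea of an ensemble of lookup-table classifiers can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features, the table indexing, the ensemble construction, or how the nearest neighbor search is used. The sketch assumes binary images stored as lists of 0/1 rows, a 3x3 pixel neighborhood packed into a 9-bit table key, component tables trained on random subsamples of the labeled pixels, and a brute-force Hamming-distance search as a stand-in for approximate nearest neighbor lookup of unseen patterns. All function names (patches, train_lut, train_ensemble, lookup, enhance) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def patches(img, r=1):
    """Yield (y, x, key); key packs the (2r+1)x(2r+1) neighbourhood into an int.

    Pixels outside the image are treated as background (0). Assumption: 3x3 window.
    """
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            key = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    bit = img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                    key = (key << 1) | bit
            yield y, x, key


def train_lut(pairs, rng, sample_rate=0.7):
    """Build one lookup table: neighbourhood key -> most frequent clean label.

    Each table sees a random subsample of the labeled pixels (an assumption,
    used here only to make the component classifiers differ).
    """
    votes = defaultdict(Counter)
    for degraded, clean in pairs:
        for y, x, key in patches(degraded):
            if rng.random() < sample_rate:
                votes[key][clean[y][x]] += 1
    return {k: c.most_common(1)[0][0] for k, c in votes.items()}


def train_ensemble(pairs, n=5, seed=0, sample_rate=0.7):
    """Train n lookup tables from (degraded, clean) image pairs."""
    rng = random.Random(seed)
    return [train_lut(pairs, rng, sample_rate) for _ in range(n)]


def lookup(lut, key):
    """Return the label for key; fall back to the closest stored pattern.

    The fallback is an exact brute-force Hamming search, standing in for an
    approximate nearest neighbor structure.
    """
    if not lut:
        return 0  # empty table: default to background
    if key in lut:
        return lut[key]
    nearest = min(lut, key=lambda k: bin(k ^ key).count("1"))
    return lut[nearest]


def enhance(img, luts):
    """Relabel every pixel by majority vote over the component tables."""
    out = [row[:] for row in img]
    for y, x, key in patches(img):
        ones = sum(lookup(lut, key) for lut in luts)
        out[y][x] = 1 if 2 * ones > len(luts) else 0
    return out
```

In this sketch, a call such as enhance(page, train_ensemble(labeled_pairs)) would relabel each pixel of page from its local neighborhood, where labeled_pairs is a hypothetical list of (degraded, clean) image pairs. Because classification reduces to table lookups plus a vote, the per-pixel cost stays small, which is consistent with the efficiency claim in the abstract, though the actual system may differ substantially.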
Recommended Citation
T. Obafemi-Ajayi et al., "Ensemble LUT Classification For Degraded Document Enhancement," Proceedings of SPIE - The International Society for Optical Engineering, vol. 6815, article no. 681509, Society of Photo-optical Instrumentation Engineers, Mar 2008.
The definitive version is available at https://doi.org/10.1117/12.767120
Department(s)
Electrical and Computer Engineering
Keywords and Phrases
Document degradation models; Document image analysis; Ensemble classification; Historical documents; Image enhancement
International Standard Book Number (ISBN)
978-081946987-8
International Standard Serial Number (ISSN)
0277-786X
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2023 Society of Photo-optical Instrumentation Engineers, All rights reserved.
Publication Date
31 Mar 2008