Dealing with Dangerous Data: Part-Whole Validation for Low Incident, High Risk Data

Abstract

In certain situations, syntactically valid, but incorrect, data entered into a database can result in near-immediate, catastrophic financial losses for an organization. Examples include: omitting zeros in prices of goods on e-commerce sites; and financial fraud where data is directly entered into databases, bypassing application-level financial checks. Such "dangerous data" can, and should, be detected, because it deviates substantially from the statistical properties of existing data. Detection of this kind of problem requires comparing individual data items to a large amount of existing data in the database at run-time. Furthermore, the identification of errors is probabilistic, rather than deterministic, in nature. This research proposes part-whole validation as an approach to addressing the dangerous data situation. Part-whole validation addresses fundamental issues in database management, for example, integrity maintenance. Illustrative and representative examples are first defined, and analyzed. Then, an architecture for part-whole validation is presented and implemented in a prototype to illustrate the feasibility of the research.
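The idea of checking one new value (the part) against the statistical properties of existing data (the whole) can be illustrated with a minimal sketch. This is not the architecture or prototype from the article, which the abstract does not detail. The function name `is_suspicious`, the log-scale modified z-score, the 3.5 threshold, and the sample prices are all assumptions chosen for illustration.

```python
import math
import statistics


def is_suspicious(new_value, existing_values, threshold=3.5):
    """Return True if new_value deviates strongly from existing_values.

    Hypothetical check: a modified z-score (median and MAD) computed on
    log10 values, so a dropped zero appears as a shift of about 1.0.
    The result is a probabilistic flag for review, not proof of an error.
    """
    logs = [math.log10(v) for v in existing_values if v > 0]
    if len(logs) < 5:
        raise ValueError("not enough existing data to judge the new value")
    if new_value <= 0:
        return True  # non-positive prices are always sent to review

    median = statistics.median(logs)
    mad = statistics.median([abs(x - median) for x in logs])
    deviation = abs(math.log10(new_value) - median)
    if mad == 0:
        # All existing values are (nearly) identical: flag any difference.
        return deviation > 0
    return 0.6745 * deviation / mad > threshold


# Assumed sample: existing prices for one product category.
existing_prices = [19.99, 21.50, 20.00, 18.75, 22.10, 20.49, 19.25]
print(is_suspicious(2.00, existing_prices))   # price entered with a zero omitted
print(is_suspicious(20.99, existing_prices))  # ordinary price change
```

With this sample, the price of 2.00 is flagged and 20.99 is not. A production check would have to run against the database at run-time and route flagged rows to a person, since a flagged value may still be legitimate.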

Department(s)

Business and Information Technology

Keywords and Phrases

Audit; Boyce-Codd normal form; Business rules; Dangerous data; Data management; Data quality; Database design; Part-whole validation; Relational databases

International Standard Serial Number (ISSN)

1063-8016; 1533-8010

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2016 IGI Global, All rights reserved.

Publication Date

01 Mar 2016
