Computer Science Faculty Research & Creative Works

Analysis of a Universal Class of Hash Functions

George Markowsky, Missouri University of Science and Technology
J. Lawrence Carter
Mark N. Wegman

Abstract

In this paper we use linear algebraic methods to analyze the performance of several classes of hash functions, including the class H₂ presented by Carter and Wegman. Suppose H is a suitable class, the hash functions in H map A to B, S is any subset of A whose size is equal to that of B, and x is any element of A. We show that the probability of choosing a function from H which maps x to the same value as more than t other elements of S is no greater than min(1/t², 11/t⁴).

Consider a database storage and retrieval system implemented using hashing and a linked list collision resolution strategy. A corollary of the main result is that the probability that the system would perform more than t times more slowly than expected is no greater than min(1/t², 11/t⁴). The "performance" being considered can be either the number of memory references required to process any individual request or the number required to process an arbitrary sequence of requests.

Notice that these results do not assume that the requests to the database are random or uniformly distributed. Instead, the averaging is done over the possible choices of the actual hash function from H. Since the system designer can be sure that this choice is made randomly, the probabilities given hold for any input. It is also shown that the bound on poor performance when balanced trees are used in place of linked lists is approximately min (1/(4^t), 11/(16^t)). The formulas are generalized to any size S.
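The main bound can be explored empirically. The Python sketch below is an illustration only, not the paper's construction: it assumes a class of random GF(2)-linear maps (random binary matrices) as a stand-in for a "suitable class", which may or may not coincide with H₂. The names random_linear_hash, collisions_with, stated_bound, and estimate_tail are hypothetical. S and x are fixed once and only the hash function is redrawn, so the averaging is over the choice of function rather than over the input.

```python
import random


def random_linear_hash(key_bits, out_bits, rng):
    """Draw one function from an assumed class of GF(2)-linear maps.

    The function is a random out_bits x key_bits binary matrix. Row i is
    stored as an integer bitmask; output bit i is the parity of (row_i AND key).
    """
    rows = [rng.getrandbits(key_bits) for _ in range(out_bits)]

    def h(key):
        value = 0
        for i, row in enumerate(rows):
            parity = bin(row & key).count("1") & 1
            value |= parity << i
        return value

    return h


def collisions_with(h, x, keys):
    """Count the elements of keys, other than x, that h maps to h(x)."""
    target = h(x)
    return sum(1 for y in keys if y != x and h(y) == target)


def stated_bound(t):
    """The bound quoted in the abstract: min(1/t^2, 11/t^4)."""
    return min(1 / t**2, 11 / t**4)


def estimate_tail(key_bits, out_bits, t, trials, seed=0):
    """Estimate how often x collides with more than t other elements of S."""
    rng = random.Random(seed)
    # Fix S (with |S| = |B| = 2**out_bits) and x once, before any hash is drawn.
    keys = rng.sample(range(1, 2**key_bits), 2**out_bits)
    x = keys[0]
    bad = 0
    for _ in range(trials):
        h = random_linear_hash(key_bits, out_bits, rng)
        if collisions_with(h, x, keys) > t:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    for t in (2, 3, 4, 5):
        freq = estimate_tail(key_bits=16, out_bits=6, t=t, trials=2000)
        print(t, freq, stated_bound(t))
```

In the bound itself, 11/t⁴ is the smaller term exactly when t² > 11, so for integer t the 1/t² term applies up to t = 3 and the 11/t⁴ term from t = 4 onward.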

Recommended Citation

G. Markowsky et al., "Analysis of a Universal Class of Hash Functions," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 64, pp. 345-354, Springer Verlag, Sep 1978.

The definitive version is available at https://doi.org/10.1007/3-540-08921-7_82

Meeting Name

7th Symposium on Mathematical Foundations of Computer Science, MFCS 1978 (1978: Sep. 4-8, Zakopane, Poland)

Department(s)

Computer Science

Keywords and Phrases

Balanced trees; Collision resolution; Database storage; Linear-algebraic; Memory references; Poor performance; System designers

International Standard Book Number (ISBN)

978-354008921-6

International Standard Serial Number (ISSN)

0302-9743

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

08 Sep 1978
