Mathematics and Statistics Faculty Research & Creative Works

Using Multiset Discrimination to Solve Language Processing Problems Without Hashing

Jiazhen Cai
Robert Paige L., Missouri University of Science and TechnologyFollow

Abstract

It is generally assumed that hashing is essential to solve many language processing problems efficiently; e.g. symbol table formation and maintenance, grammar manipulation, basic block optimization, and global optimization. This paper questions this assumption, and initiates development of an efficient alternative compiler methodology without hashing or sorting. The methodology rests on efficient solutions to the basic problem of detecting duplicate values in a multiset, which we call multiset discrimination. Paige and Tarjan (1987) gave an efficient solution to multiset discrimination for detecting duplicate elements occurring in a multiset of varying length strings. The technique was used to develop an improved algorithm for lexicographic sorting, whose importance stems largely from its use in solving a variety of isomorphism problems (Aho et al., 1974). The current paper and a related paper (Paige, 1994) show that full lexicographic sorting is not needed to solve these isomorphism problems, because they can be solved more efficiently using straightforward extensions to the simpler multiset discrimination technique. By reformulating language processing problems in terms of multiset discrimination, we also show how almost every subtask of compilation can be solved without hashing in worst case running time no worse (and frequently better) than the best previous expected time solution (under the assumption that one hash operation takes unit expected time). Because of their simplicity, our solutions may be of practical as well as theoretical interest.The various applications presented culminate with a new algorithm to solve iterated strength reduction folded with useless code elimination that runs in worst case asymptotic time and auxiliary space Θ(¦L¦ + ¦L'¦), where ¦L¦ and ¦L'¦ represent the lengths of the initial and optimized programs, respectively. The previous best solution due to Cocke and Kennedy (1977) takes Ω(¦L¦3¦L'¦) has operations in the worst case.

Recommended Citation

J. Cai and R. Paige, "Using Multiset Discrimination to Solve Language Processing Problems Without Hashing," Theoretical Computer Science, Elsevier, Jan 1995.

The definitive version is available at https://doi.org/10.1016/0304-3975(94)00183-J

Department(s)

Mathematics and Statistics

International Standard Serial Number (ISSN)

0304-3975

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Jan 1995

Link to Full Text

COinS

Mathematics and Statistics Faculty Research & Creative Works

Using Multiset Discrimination to Solve Language Processing Problems Without Hashing

Abstract

Recommended Citation

Department(s)

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Search

Browse

Faculty Gallery

Author Corner

Related Content

Useful Links

Article Locations

Mathematics and Statistics Faculty Research & Creative Works

Using Multiset Discrimination to Solve Language Processing Problems Without Hashing

Author

Abstract

Recommended Citation

Department(s)

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Share

Search

Browse

Faculty Gallery

Author Corner

Related Content

Useful Links

Article Locations