From Regular Expressions to DFA's Using Compressed NFA's

Chia-Hsiang Chang
Robert Paige L., Missouri University of Science and TechnologyFollow

Abstract

There are two principal methods for turning regular expressions into NFA's - one due to McNaughton and Yamada and another due to Thompson. Unfortunately, both have drawbacks. Given a regular expression R of length r and with s occurrences of alphabet symbols, Chang and Paige (1992) and Brüggemann-Klein (1993) gave Θ(m + r) time and O(r) space algorithms to produce a Θ(m) space representation of McNaughton and Yamada's NFA with s + 1 states and m transitions. The problem with this NFA is that m = Θ(s2) in the worst case. Thompson's method takes Θ(r) time and space to construct a Θ(r) space NFA with Θ(r) states and Θ(r) transitions. The problem with this NFA is that r can be arbitrarily larger than s. We overcome drawbacks of both methods with a Θ(r) time Θ(s) space algorithm to construct an O(s) space representation of McNaughton and Yamada's NFA. Given any set V of NFA states, our representation can be used to compute the set U of states one transition away from the states in V in optimal time O(¦V¦ + ¦U¦). McNaughton and Yamada's NFA requires Θ(¦V¦ × ¦U¦) time in the worst case. Using Thompson's NFA, the equivalent calculation requires Θ(r) time in the worst case. Comparative benchmarks show that an implementation of our method outperforms implementations of competing methods with respect to time for NFA construction, NFA accepting testing, and NFA-to-DFA conversion by subset construction. Throughout this paper program transformations are used to design algorithms and derive programs. A transformation of special importance is a form of finite differencing used previously by Douglas Smith to improve the efficiency of functional programs.

Recommended Citation

C. Chang and R. Paige, "From Regular Expressions to DFA's Using Compressed NFA's," Theoretical Computer Science, Elsevier, Jan 1997.

The definitive version is available at https://doi.org/10.1016/S0304-3975(96)00140-5

Department(s)

Mathematics and Statistics

International Standard Serial Number (ISSN)

0304-3975

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Jan 1997

Mathematics and Statistics Faculty Research & Creative Works

From Regular Expressions to DFA's Using Compressed NFA's

Abstract

Recommended Citation

Department(s)

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Search

Browse

Faculty Gallery

Author Corner

Related Content

Useful Links

Article Locations

Mathematics and Statistics Faculty Research & Creative Works

From Regular Expressions to DFA's Using Compressed NFA's

Author

Abstract

Recommended Citation

Department(s)

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Share

Search

Browse

Faculty Gallery

Author Corner

Related Content

Useful Links

Article Locations