Automata Versus Regular Expressions
A regular expression (regex) is a string of characters that defines a text search pattern. For example, the following regex searches for the characters 123 with an optional abc or xyz occurring between 1 and 2:
The strings 1abc23, 1xyz23, and 123 would all match this regex.
Figure 1 shows this regex converted into a graph with nodes representing the characters and arrows representing the paths for finding a matching text string:
Similarly, in the context of automata computing, nodes detect specific characters and arrows show paths. Figure 2 shows the automaton implementation of the example regex.
As discussed earlier, a state in an automaton is defined by the STE. Each STE is programmed to accept a specific set of symbols, and the STEs are chained together with transition arrows which represent activation connections.
In Figure 2, the STE with the 1 indicator is active on the very first input processing cycle. This is analogous to the start anchor (^) in regex. The STE with the indicator R means the STE will generate a report event if it matches the input symbol during the cycle in which it is active. This is analogous to the string matching the regular expression.
|Characters/character classes||STEs programmed with characters/character classes|
|Beginning anchor||STE start-of-data attribute|
|Unanchored expressions||STE all-input attribute|
|Tail anchor||End-of-data (EOD) signal|
|Repetitions||Loopback connections, possibly with counters|
|Match-any-character (.)||Match-any-character (*)|
|Quantifiers||Can be implemented with repeated STE chains or counters|
|Modifiers||Certain modifiers are supported|