In Automata Theory, an Automaton is
a self-propelled computing device that automatically follows predetermined (self-contained) instructions on an input.
As the point of Automata Theory is to model real-life computation machines, their formal definitions vary by author according to what exactly the author is trying to model.
Below is an overview of a basic, generic definition of an automaton which then specializes to those generally used to study Formal Languages & Grammar.
An Automaton is a 5-tuple M=⟨Σ,Γ,Q,δ,λ⟩:
Σ – Input Alphabet
a finite set of symbols
Γ – Output Alphabet
a finite set of symbols
Q – States
a set of states
δ – Transition Function δ:Q×Σ↦Q
λ – Output Function λ:Q×Σ↦Γ
An automaton is finite when Q is finite. λ and Γ are “optional”; an automaton is not required to “output symbols”.
Automata can be nicely visualized (and represented) as transition diagrams – directed graphs whose vertices are states and edges are transitions (labelled with the necessary symbol(s) to “traverse” them).
Input
An input (word/string) w is a string of symbols a1a2a3…an with each ai∈Σ. In Kleene Star notation, w∈Σ∗. (Recall a language L⊆Σ∗.)
An automaton reads (consumes) an input word one symbol at a time.
Note that ω-languages (infinite words) are not considered here, and as such the inputs to all automata are finite in length.
Run
A run is a sequence of states q0q1q2…qn where each qi∈Q and qi+1 = δ(qi, ai+1).
I.e., an automaton starts at state q0, then for each input symbol advances one state according to the transition function δ. After reading the entire input word, the automaton's run ends on state qn, the final state.
Should the automaton have a λ, it also emits an output symbol b∈Γ for each transition, where bi+1 = λ(qi, ai+1).
The inductive extension of δ into δˉ:Q×Σ∗↦Q maps entire input words to final states.
δˉ(q, w) := q if w = ε; δˉ(q, ua) := δ(δˉ(q, u), a) for u∈Σ∗, a∈Σ
where ε is the empty string. λˉ can be constructed in the same way.
This abstracts away the “process of computation” and models automata as input-output functions.
For automata that do not have a λ (i.e., that do not output symbols), their “output” is taken as qn, the final state.
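This fold view can be sketched in a few lines of Python; the dict-of-pairs encoding of δ is just one convenient representation (an assumption of this sketch), not canonical:

```python
# A minimal sketch of the inductive extension δˉ ("delta_hat"), assuming
# delta is a dict mapping (state, symbol) -> state.

def delta_hat(delta, q, w):
    """Fold the transition function over an entire input word."""
    for a in w:
        q = delta[(q, a)]
    return q

# Example: a two-state automaton that tracks input-length parity.
delta = {("even", "x"): "odd", ("odd", "x"): "even"}
delta_hat(delta, "even", "xxx")  # -> "odd"
```

Note how the base case w = ε falls out for free: an empty loop returns q unchanged.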
Acceptor
To ship the two lovebirds, automata and Formal Languages, we equip a (λ- and Γ-less) automaton with
q0∈Q, a distinguished state, the start state; and,
F⊆Q, a distinguished set of states, the accepting states,
to form an acceptor (the variant of automata used to study formal languages, and everything discussed hereafter).
Observe that acceptors are boolean-output functions.
Accepting Word
A word w∈Σ∗ is an accepting word for acceptor M if δˉM(q0,w)∈FM.
Recognized Language
The language recognized by (or the language of) an acceptor is the set of all its accepting words. I.e., M recognizes L⊆Σ∗ iff L={w∈ΣM∗∣δˉM(q0M,w)∈FM}.
I.e., M will tell you, for all w∈Σ∗, if w∈L – if w is a sentence of L.
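A minimal sketch of an acceptor as a boolean-output function, again assuming δ is encoded as a dict from (state, symbol) pairs to states; the even-number-of-a's DFA is an illustrative example, not from the text:

```python
# An acceptor: run δˉ from q0 over the word, then test membership in F.

def accepts(delta, q0, F, w):
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in F

# DFA recognizing strings over {a, b} with an even number of a's:
# state 0 = "even so far" (accepting), state 1 = "odd so far".
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
accepts(delta, 0, {0}, "abab")  # -> True ("abab" has two a's)
```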
Recognizable Languages
The (class of) recognizable languages for a class of automata is, well, the set of languages each of which is recognized by some automaton of said class.
(See below for two automata classes: Finite and Pushdown.)
In addition, automata are commonly modeled with forms of memory (e.g., pushdown automata, below) and non-functional transitions (one-to-many, aka. nondeterminism). Nondeterministic automata can be in more than one state at once, due to transitions with multiple output states; i.e., δ is a binary relation, and no longer a function.
Note that nondeterminism here solely refers to the ability for an automaton to be in more than one state at once; it does not mean the automaton produces nondeterministic results (runs are always deterministic).
| Automaton Class | Recognizable Languages |
| --- | --- |
| Deterministic/Nondeterministic Finite State Machine (FSM) | Regular (Type-3) |
| Deterministic Pushdown Automaton (DPDA) | (deterministic) Context-Free |
| Nondeterministic Pushdown Automaton (NPDA) | Context-Free (Type-2) |
| Linear-bounded Automaton (LBA) | Context-Sensitive (Type-1) |
| Turing Machine (TM) | Recursively Enumerable (Type-0) |
Finite Automata
Aka. Finite State Machine (FSM).
A Finite Automaton (FA) is a 5-tuple ⟨Q,Σ,δ,q0,F⟩ (see acceptor above):
Q – States
Σ – Alphabet
δ – Transition Function
q0 – Initial State
F – Accepting States
Note that an FA has no memory / auxiliary storage. Transitions depend purely on the current state and input symbol.
Nondeterministic FA (NFA) and Deterministic FA (DFA) are equivalent (always inter-convertible).
Recognizes all Regular (Type-3) Languages. L is regular iff there exists a DFA whose language is L.
Completeness:
A DFA is complete if δ is total.
Technically, all DFAs should be complete, which in practice means explicitly defining "dead"/"trash" states… but if not, undefined transitions are assumed to denote failure (immediate rejection), i.e., a transition into an implicit non-accepting dead state.
Nondeterministic Finite Automaton
δ is a binary relation (it can map one state to many states for the same input symbol); equivalently, δ:Q×Σε↦P(Q)
ε-transitions are allowed (transitions may consume no symbols)
no explicit “dead” state needed; NFAs need not be complete
ε-Closure
As we now have ε-transitions, for any current state, all other states accessible via ε-transition(s) are effectively also current states.
The set of ε-reachable states from a given state (which includes itself) is its ε-closure.
I.e., the reflexive-transitive closure under ε-transitions.
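ε-closure is just graph reachability. A sketch assuming ε-transitions are stored as a dict from state to its set of ε-successors:

```python
# ε-closure of a set of states via depth-first search over ε-edges.

def eps_closure(eps, states):
    closure = set(states)        # reflexive: every state reaches itself
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()): # follow each ε-transition out of q
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

eps = {0: {1}, 1: {2}}
eps_closure(eps, {0})  # -> {0, 1, 2}
```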
NFA → DFA
For NFA N=⟨Q,Σ,δ,q0,F⟩, its equivalent DFA is M=⟨Q′,Σ′,δ′,q0′,F′⟩:
Q′:=P(Q)
Σ′:=Σ
δ′(q′, a) := ⋃_{q∈q′} E(δ(q, a))
q0′:=E({q0})
F′ := { q′∈Q′ ∣ q′∩F ≠ ∅ }
where E:P(Q)↦P(Q) maps a set of states to the union of the ε-closures of its members.
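The definition above takes Q′ to be all of P(Q); in code it is common to build only the reachable subsets. A minimal sketch, assuming an ε-free NFA (so E is the identity) with delta mapping (state, symbol) to a set of states:

```python
# Subset construction: DFA states are frozensets of NFA states, and only
# subsets reachable from the start set are materialized.

def nfa_to_dfa(delta, sigma, q0, F):
    start = frozenset({q0})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for a in sigma:
            # δ′(S, a) = union of the NFA moves of every state in S
            T = frozenset(r for q in S for r in delta.get((q, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_F = {S for S in seen if S & F}  # F′: subsets meeting F
    return dfa_delta, start, dfa_F

# Example NFA over {a, b}: strings ending in "ab", with F = {2}.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa_delta, start, dfa_F = nfa_to_dfa(delta, "ab", 0, {2})
```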
For the concatenation M1∘M2 of two such machines, observe that the only nontrivial change is adding ε-transitions from M1's accepting states to M2's initial state (and making F1's states no longer accepting).
For DFA M1, M1∗ (Kleene star) has:
too lazy to write out formally, but:
add ε-transitions from all accepting states to q0
to also accept ε (as required by ∗), introduce a new state as the new q0, which is an accepting state and has an ε-transition to the old q0.
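The two bullets above can be sketched directly on an ε-NFA representation. A hedged sketch, assuming transitions are stored as a dict from (state, symbol-or-None) to sets of states, with None standing for ε and the fresh start-state name assumed unused:

```python
# Star construction on an ε-NFA: a fresh accepting start state with an
# ε-edge into the old machine, plus ε-edges from old accepting states
# back to the old start state for repetition.

def star(delta, q0, F):
    new_q0 = "start*"  # fresh state (assumed not already a state name)
    delta = {k: set(v) for k, v in delta.items()}  # avoid mutating input
    delta.setdefault((new_q0, None), set()).add(q0)  # ε into old machine
    for f in F:
        delta.setdefault((f, None), set()).add(q0)   # loop back: repeat
    return delta, new_q0, F | {new_q0}               # new start accepts ε

# Example: starring the one-word NFA for "a" (0 --a--> 1, F = {1}).
d, s, Fs = star({(0, "a"): {1}}, 0, {1})
```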
Pumping Lemma
All sufficiently long sentences (strings) of a regular language can be pumped – have a middle substring repeated (self-concat) an arbitrary number of times while still remaining within the language.
“Sufficiently long” is defined by a length threshold p.
Finite languages vacuously satisfy the lemma (take p greater than the length of the longest sentence).
∀L∈RL: ∃p∈Z+: ∀w∈L, ∣w∣≥p: ∃xyz=w, ∣y∣>0, ∣xy∣≤p: xy∗z ⊆ L
Note that the pumping lemma is a necessary but not sufficient condition for regularity. (Hence it alone can only be used to disprove, not prove.)
Alternatively, as three properties:
∣y∣≥1
∣xy∣≤p
∀n≥0:xynz∈L
For L⊆Σ∗: (∀p∈Z+: ∃s∈L, ∣s∣≥p: ∀xyz=s: ¬P(x,y,z)) ⟹ L∉RL,
where P is the pumping property (the three conditions above).
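The disproof pattern can be made concrete with brute force. This sketch (an illustration only, not a decision procedure) checks every split s=xyz with ∣y∣≥1 and ∣xy∣≤p; the helper in_anbn and the shortcut of testing only n∈{0,2} are my own assumptions:

```python
# Check whether some split of s survives pumping. Testing only n in
# {0, 2} is a necessary-condition shortcut: it suffices to falsify
# pumpability here, though a True result is not a full proof.

def pumpable(s, p, in_lang):
    for i in range(p + 1):              # |x| = i
        for j in range(i + 1, p + 1):   # |xy| = j <= p, so |y| >= 1
            x, y, z = s[:i], s[i:j], s[j:]
            if all(in_lang(x + y * n + z) for n in (0, 2)):
                return True             # this split pumps (for n tested)
    return False

def in_anbn(w):
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * n + "b" * n

p = 4
pumpable("a" * p + "b" * p, p, in_anbn)  # -> False: no split survives
```

Since ∣xy∣≤p forces y to lie inside the leading a's, pumping always unbalances the word, witnessing that {aⁿbⁿ} is not regular.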
Key Limitations
Inability to count; e.g., balanced parentheses. Only finitely many states; no auxiliary memory.
Failure at recursive structures; e.g., nested loops in programming languages.
Regular Expressions
Note that various “regex” languages used in programming have long exceeded the capabilities of a (formal) Regular Grammar. Here we speak of a strictly regular regex.
In CS-style notation, ∪ (union) is denoted by ∣ and ∘ (concatenation) by juxtaposition. Also, there's commonly + for "one or more" – i.e., a+ := aa∗.
Then, operator precedence is 1. parentheses (), 2. star ∗, 3. concat., 4. union ∣.
Generalized NFA (GNFA)
NFAs where transitions are regexes
just for the convenience of writing (and useful in manual DFA → Regex conversion)
Regex → NFA
Structural induction
Base cases
a is trivially an NFA (single a-transition to an accepting state)
ε is trivially an NFA (the only (initial) state is an accepting state)
∅ is trivially an NFA (one with no accepting states)
Induction Hypotheses
R1 and R2 are (arbitrary) regexes with equivalent NFAs
Inductive Steps
R1∪R2: union closure proof!!!
R1∘R2: concat closure proof!!!
R1∗: star closure proof!!!
Pushdown Automata
Nondeterministic PDAs (NPDA) are more powerful than Deterministic PDAs (DPDA).
NPDAs can recognize all Context-Free (Type-2) Languages, while DPDAs recognize only a subset.
Buuut we won’t talk about DPDAs :) .
A (nondeterministic) Pushdown Automaton ((N)PDA) is a 6-tuple ⟨Q,Σ,Γ,δ,q0,F⟩:
Γ – Stack Alphabet
δ – Transition Function δ:Q×Σε×Γε↦P(Q×Γε)
everything else – same as those of an NFA
Note that PDAs are FAs equipped with auxiliary storage in the form of a (LIFO) stack.
…And we’re smushing λ's responsibility of outputting Γ symbols into δ.1
In addition to being in a set of states, every point in a run also has an associated stack along with its contents. Every transition depends on the top of the stack and may manipulate it.
For DFAs, a point in a run is just a state (and remaining input).
For DPDAs, a point in a run is a ⟨state, stack, remaining input⟩ triple.
The nondeterministic variants have multiple nonlinear “sub-runs” per run.
A δ(…,…,a)→(…,b) transition replaces the top of the stack (which must be a) with b.
εs are also allowed for Γ in both δ's input and output, where ε→γ pushes γ onto the stack, γ→ε pops (γ from) the stack, and ε→ε leaves the stack unchanged.
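These conventions can be exercised with a small simulator that explores ⟨state, stack, remaining input⟩ configurations by search. The machine for {aⁿbⁿ ∣ n≥0} (states q1…q4, bottom-of-stack marker $) follows the standard textbook example; encoding ε as None is my own convention for this sketch:

```python
# NPDA simulation: transitions are (state, read-or-None, pop-or-None)
# -> set of (state, push-or-None), with None standing for ε.

def npda_accepts(delta, q0, F, w):
    todo, seen = [(q0, (), 0)], set()   # configs: (state, stack, pos)
    while todo:
        cfg = todo.pop()
        if cfg in seen:
            continue
        seen.add(cfg)
        q, stack, i = cfg
        if i == len(w) and q in F:
            return True                  # input consumed, accepting state
        for (p, a, g), targets in delta.items():
            if p != q:
                continue
            if a is not None and (i >= len(w) or w[i] != a):
                continue                 # input symbol doesn't match
            if g is not None and (not stack or stack[-1] != g):
                continue                 # stack top doesn't match
            rest = stack[:-1] if g is not None else stack
            for (r, push) in targets:
                new_stack = rest + ((push,) if push is not None else ())
                todo.append((r, new_stack, i + (a is not None)))
    return False

# Standard NPDA for { a^n b^n | n >= 0 }:
delta = {
    ("q1", None, None): {("q2", "$")},   # push bottom-of-stack marker
    ("q2", "a", None):  {("q2", "A")},   # push an A per a
    ("q2", None, None): {("q3", None)},  # guess the midpoint
    ("q3", "b", "A"):   {("q3", None)},  # pop an A per b
    ("q3", None, "$"):  {("q4", None)},  # stack empty -> accept state
}
npda_accepts(delta, "q1", {"q4"}, "aabb")  # -> True
```

The "guess the midpoint" ε-move is exactly where nondeterminism earns its keep: the search tries both staying in q2 and jumping to q3 at every point.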
Pumping Lemma
∀L∈CFL: ∃p∈Z+: ∀s∈L, ∣s∣≥p: ∃uvxyz=s, ∣vy∣>0, ∣vxy∣≤p: ∀n≥0: uv^n xy^n z ∈ L
Again, as three properties:
∣vy∣≥1
∣vxy∣≤p
∀n∈N: uv^n xy^n z ∈ L
Branching factor b is the max number of symbols on the RHS of a rule.
The pumping length of L is p = b^(∣N∣+1) (N is the set of non-terminals in the grammar).
p is a minimum length that guarantees a repeated non-terminal R in the parse tree; p is also an upper bound on the number of terminals generated by (the subtree under) the upper R, hence ∣vxy∣≤p.
Turing Machine
I typo this as turning machine way too often.
A Turing Machine (TM) is a 7-tuple ⟨Q,Σ,Γ,δ,q0,qaccept,qreject⟩:
Γ – Tape Alphabet: ϵ∈Γ, Γ∖{ϵ}⊇Σ (ϵ is the default "blank" tape symbol)
δ – Transition Function δ:Q×Γ↦Q×Γ×{L,R}
qaccept, qreject – Terminating States
these have immediate effect
everything else – same as those of FSM
Storage (tape) is infinite, like the stack of PDAs.
Except here the input is placed on the tape, and we can read/write arbitrary locations.
Configuration
A configuration is a point in a run which now includes (in addition to the state)
contents of tape and
location of read head.
I.e., a configuration C = wL qi wR has
wL – contents of tape left of read head
wR – contents of tape under and right of read head
qi – current state
TM M=⟨Q,Σ,Γ,δ,q0,qaccept,qreject⟩ accepts input string s if there exists a run C1C2…Cn where
C1 = q0s (start by placing the head on the first symbol of the input),
each Ci yields Ci+1 (one application of δ), and
Cn's state is qaccept.
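A configuration-based run can be sketched as a step loop. The sparse-tape dict, the "_" blank symbol, and the step limit (since TMs need not halt in general) are assumptions of this sketch:

```python
# Run a TM by stepping configurations (state, tape, head position),
# assuming delta maps (state, tape_symbol) -> (state, write, move)
# with move in {"L", "R"} and "_" as the blank symbol.

def run_tm(delta, q_acc, q_rej, s, q0="q0", limit=10_000):
    tape = dict(enumerate(s))          # sparse tape, blank by default
    q, head = q0, 0
    for _ in range(limit):             # guard: TMs need not halt
        if q in (q_acc, q_rej):        # terminating states act immediately
            return q == q_acc
        sym = tape.get(head, "_")
        q, write, move = delta[(q, sym)]
        tape[head] = write
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit exceeded")

# Decider for "contains at least one 1": scan right until a 1 or blank.
delta = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "1"): ("acc", "1", "R"),
    ("q0", "_"): ("rej", "_", "R"),
}
run_tm(delta, "acc", "rej", "0010")  # -> True
```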
Recognition & Decidability
A TM recognizes a language L if it accepts all s∈L (each in finitely many steps) and accepts no s∉L.
A TM decides a language L if it recognizes L and rejects all s∉L (in finitely many steps).
Recognition: partial function returning true for acceptance.
Decidability: total function always returning true/false.
Co-Turing-Recognizable: Lˉ is Turing-recognizable (non-membership in L is confirmed in finitely many steps).
Turing-decidable languages are a strict subset of recursively enumerable (Type-0) languages.
Some decidable languages:
L = { a^i b^j c^k ∣ i×j=k ∧ i,j,k≥1 } (arbitrary arithmetic)
L = { ww ∣ w∈Σ∗ } ((sub)string matching)
L = { ⟨D,w⟩ ∣ ⟨D⟩ is a DFA that accepts w } (DFA Acceptance)
L = { ⟨D⟩ ∣ ⟨D⟩ is a DFA ∧ L(D)=∅ } (DFA Emptiness)
L = { ⟨G,w⟩ ∣ G is a CFG that generates w } (CFG Acceptance)
Chomsky Normal Form (CNF)
Used to make the last (CFG) example above decidable.
Any CFG can be written in Chomsky Normal Form. In CNF, every derivation of a string w, ∣w∣=n, requires exactly 2n−1 steps (rule applications).
Specifically, the n characters are produced by n (corresponding) non-terminals, which are expanded from S in n−1 steps. Then, n more derivations convert the non-terminal string into w.
Constraints on Production Rules:
S (start symbol) cannot appear on the RHS of any rule.
ε may appear only in the rule S→ε.
All other rules are in one of two forms: A→a or A→BC,
where A,B,C∈N (non-terminals) and a∈Σ (terminals).
Note that this forces the expansion to be a binary tree, hence the exact bounding on derivation steps shown above.
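As a tiny check of the 2n−1 count, assume the CNF grammar S→AB, A→a, B→b (an illustrative example) and derive w=ab, so n=2:

```latex
S \Rightarrow AB \Rightarrow aB \Rightarrow ab \qquad (3 = 2n - 1 \text{ steps for } n = 2)
```

One step builds the n=2 non-terminals (n−1=1 step), then n=2 more steps replace each non-terminal with its terminal.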
CFG → CNF
eliminate RHS S's A→S…
if any, introduce new start symbol to proxy the old one
eliminate ε-Productions A→ε(∣…)
by collapsing into all super-rules (rules that produce A, like B→A…)
eliminate Unit Productions A→B
by expanding the RHS B
substitute Terminals
introduce new non-terminals to proxy each terminal
e.g., for A→aB, introduce C→a to write A→CB
chop down Long Productions (more than two symbols in RHS)
e.g., write A→BBB as A→CB, C→BB.
(this reduces the RHS by 1 symbol at a time, so it works for all lengths)
This algorithm finishes in finite steps (trust me bro).
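The chopping step can be sketched as below; it binarizes from the left (A→B X1, X1→…) rather than from the right as in the A→CB example above, and either direction works. The rule encoding and the fresh names X1, X2, … (assumed unused) are my own conventions:

```python
# Binarize long productions: each rule (head, rhs) with |rhs| > 2 is
# split into a chain of two-symbol rules via fresh non-terminals.

def binarize(rules):
    out, counter = [], 0
    for head, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            fresh = f"X{counter}"
            out.append((head, (rhs[0], fresh)))  # head -> first fresh
            head, rhs = fresh, rhs[1:]           # fresh covers the rest
        out.append((head, rhs))                  # final rule, |rhs| <= 2
    return out

binarize([("A", ("B", "B", "B", "B"))])
# -> [("A", ("B", "X1")), ("X1", ("B", "X2")), ("X2", ("B", "B"))]
```

Each pass shortens the RHS by one symbol, so termination is immediate for any finite grammar (the "trust me bro" above, made concrete for this one step).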
Other Computation Theory Stuffs
Universal Turing Machine: we can write a TM that simulates any other TM.
(They must share a finite alphabet)
L={⟨M,w⟩∣M is a TM that accepts w}, Turing Acceptance, is undecidable. (Because of the halting problem.)
Proof of TM Acceptance Undecidability (by contradiction): Assume some TM H decides L.
Let a TM D take in any TM N as string ⟨N⟩, and
accepts if [N rejects ⟨N⟩];
rejects if [N accepts ⟨N⟩].
I.e., D reverses the self-acceptance of any given input. D can do this using H; it feeds it ⟨N,⟨N⟩⟩ (and flips the result).
Consider when D is given w=⟨D⟩ as input:
if H accepts ⟨D,w⟩, it means D accepts w… but D will flip H's result and reject w… oops. ∴ No TM decides L.
And from this it follows that Lˉ is not Turing-recognizable (it is only co-Turing-recognizable: were Lˉ also recognizable, L would be decidable).
because this uni course is oddballs. I guess it makes some sense, as the stack is not exactly an output string of the automaton, and state transitions also depend on the (top of) the stack. ↩︎