- A language L is any subset of Σ*, which represents all sequences of elements from a finite set Σ
- An algorithm A decides a language L by answering if x ∈ L
- Languages that we can decide = languages that we can describe – all languages
- Decided languages and described languages are similar in size (#ℕ), but both are much smaller than the total available # of languages (#ℝ)
- Two sets have the same cardinality if there exists a one to one mapping with all elements from set A to set B without any remaining elements in either sets
- PCP – Post Correspondence Problem – Given tiles with a top sequence and a bottom sequence, find a sequence of such tiles (using any number of any tile) where the concatenated top sequence matches the concatenated bottom sequence
- PCP cannot be decided by any algorithm in a finite amount of time when there are no valid sequences
- Proof by reduction technique – if PCP was decidable, then another undecidable problem (the halting problem) would be decidable
- The Halting Problem – Given an algorithm and an input, will the algorithm halt on that input or will it loop forever?
- Algorithms can be fed an input matching their algorithm to return an incorrect response; no algorithm can be always correct
- Syracuse Conjecture – For any integer n > 0, where S1 = n, Si+1 = if (Si % 2 == 0) Si/2 else 3Si + 1, Syracuse(n) = lease i such that Si = 0 if Sk > 1 for all k in [1, i)
- Syracuse Conjecture – ∀n[n > 0 ⇒ Syracuse(n) > 0]
- Not all simple looking problems can be easily solved; for instance: x/(y + z) + y/(x + z) + z(y + z)
- NP-Complete Problems
- If any of them is easy, they are all easy
- In practice, some of them have special cases which may be more efficiently solved
- 3-colouring of Maps; hard to solve (no known efficient algorithm), but are easy to verify
- SAT – given boolean formula, is there an assignment to make the formula evaluate to true?
- Travelling salesman – given cities & distances between them, what is the shortest route to visit each city once?
- Knapsack – given items with various weights, is there a subset with total weight K?
- If any of them is easy, they are all easy
- P – Tractable problems – Solvable in polynomial time
- Some problems may be efficiently solvable, but algorithm may not be provable
- 2-colorability of maps
- Primality testing (probably not factoring)
- Solving N x N x N Rubik's cube
- Finding word in dictionary
- Sorting elements
- NP-hard – broader term for NP questions that do not fall in the category of a language
- PSpace Completeness – problems requiring reasonable (Poly) amount of space to be solved, but may use very long time
- Finite-State Automata – simple model where there can be exactly one of a finite number of states at any given time
- Example: swinging door, elevator
- DFA – deterministic finite automaton, 5-tuple (Q, Σ, δ, q0, F)
- States – finite set Q (circle)
- Alphabet – finite set Σ, defines a language (letters)
- Transition function – explicit representation mapping states & alphabets to states (δ = Q x Σ → Q)
- Start state – special state representing entry point, q0 ∈ Q (arrow to state)
- Accept states – decision making state, F ⊆ Q (double circle)
- Next tuesday's class is cancelled
- Prof. Crépeau's office hours are cancelled next week
- Regular Languages
- Given definition from last class, and let w = w1w2...wn (n ≥ 0) be a string where each symbol wi is from the alphabet Σ
M accepts w if states s0,s1,...sn exist s.t- s0 = q0
- si+1 = δ(si, wi+1) for i = 0, ... n – 1
- sn ∈ F
- (w is accepted if it starts at the start state and ends at an accept state)
- Note that the size of the state is typically one greater than the size of the string
- M recognizes language A if A = { w | M accepts w }
- A language is a regular language if some finite automaton recognizes it
- Given definition from last class, and let w = w1w2...wn (n ≥ 0) be a string where each symbol wi is from the alphabet Σ
- [Went through example proof by induction]
- ε represents an empty string
-
Regular Operations Union A ∪ B { x | x ∈ A or x ∈ B } Concatenation A ∘ B { xy | x ∈ A and y ∈ B } Star A* { x1x2 ... xk | K ≥ 0 and each xi ∈ A } - Non-Deterministic Finite Automata
- May jump from state to state without consuming input (eg when encountering the empty string ε)
- Definitions are similar to DFA, except that:
- Alphabet – also includes an empty string ε "--Transition function returns P(Q), which is a subset (partition) of Q that meets the requirements"
- (No class on 19th)
- Minimization of DFA
- Lumping (quotient by an equivalence relation) – if two states lead to the same state(s) at all times, and are the same 'state' themselves, they may be merged together as their difference is forgotten after the next step.
- ~ (equivalence relations) are
Reflexive ∀x x ~ x Symmetric ∀x,y x ~ y ⇒ y ~ x Transitive ∀x,y,z x ~ y, y ~ z ⇒ x ~ z - S/~ represents [s] = { x | x ~ s }
- δ(s, a) – state you went to after reading alphabet a at state s
- δ*(s, w) – state you went to after reading all letters in word w, starting at state s
- M = (s, s0, δ, F)
L(M) = { w | δ*(s0,w) ∈ F } - Def p, q ∈ S are equivalent (p ≈ q) if
∀w ∈ Σ* δ(p,w) ∈ F ⇔ δ(q,w) ∈ F - Lemma A
- p = q ⇒ ∀a ∈ Σ δ(p,a) ≈ δ(q,a)
- Note that p ≈ q can be written [p] = [q] (comparing equivalence classes)
- Therefore: [p] = [q] ⇒ [δ(p,a)] = [δ(q,a)]
- For a new machine M' = (s', s0', δ', F')
- s' = s/≈
- s0' = [s0]
- δ'([s],a) = [δ(s,a)]
- F' = { [s] | s ∈ F }
- Lemma B
- if p ∈ F & q ≈ p, then q ∈ F
- Lemma C
- ∀w ∈ Σ* δ'([p],w) = [δ(p,w)]
- Proof by induction
- Theorem: L(M') = L(M)
- Proof
- x ∈ L(M') ⇔ δ'*([s0], x) ∈ F'
- ⇔ [δ*(s0,x)] ∈ F'
- ⇔ δ*(s0,x) ∈ F
- ⇔ x ∈ L(M) ∎
- 41:29
- Reflexive, symmetric, transitive
- S~ represents an equivalence class, where S is the set & ~ represent equal
- δ(s, a) – state you went to after reading alphabet a at state s
- δ*(s, w) – state you went to after reading all letters in word w, starting at state s
- Every NFA can be done with a DFA
- E(R) = { q | q can be reached from R by traveling along 0 or more ε arrows }
- R is a regular expression if
- a for some a in alphabet Σ
- ε
- ∅
- Union, concatenation, and star of regular expressions are regular expressions
- Lemma – if a language is regular, then it is described by a regular expression
- Generalized NFA
- Start state has transition arrows to every other state, but no arrows coming in from any other state
- Single accept state, with arrows coming in from every other state and no arrows from any other state. The accept state cannot be the start state
- For all other states, one arrow goes from every state to every other state
- In other words, \delta;: (Q – {qaccept|) x (Q - {qstart}}) → R is the transition function
- GNFA → regex
- Basis – if GNFA has 2 states, the states are a start & accept state with a single transition to the accept state
- Reduction
- Given B = { 0n1n | n ≥ 0 }
- Given C = { w | w contains an equal number of 0s & 1s }
- If C is regular, then so is B
- If B is nonregular, then so is C
- Proof
- We can define A = L(01), which is obviously regular; if C was regular than so would C ∩ A = B
- Simple Reductions
- If A* is nonregular, then so is A
- If A is nonregular, then so is AC
- If A is nonregular, then so is AR
- Complex Reductions (let all Rs be regular)
- For A' = (A ∪ R) ∩ (AC ∪ R')
- For A' = ((AC ∩ R) ∪ (A* ∩ R')) ∘ R''
- For A' = (A ∘ R) ∩ (AC ∘ R')
- If A' is nonregular, then so is A
- Myhill-Nerode Theorem
- A set of strings (X) is pairwise distinguishable by language L is every two elements in X are distinguishable by L (∀x, x' in X, x ≢L x')
- Define an index of L to be the size of a maximum set X that is pairwise disinguishable by L. L is regular iff the index is finite.
- The smallest index is also the number of states of the smallest DFA recognizing L
- If DFA has more states than the index, then there must be some x, y in X such that δ(q0, x) = δ(q0, y). However, this is not distinguishable, hence contradiction
- Minimal DFAs can be discovered using the Myhill-Nerode Theorem by first discovering the pairwise distinguishable set, then by creating a DFA whose states matches the distinguished set values.
- Context-Free Grammar
- Derivation – conversion of word from start variable (typically first symbol in grammar)
- Eg. Let grammar G1 = A → 0A1, A → B, B → #
- Derivation of '000#111': A ⇒ 0A1 ⇒ 00a11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111
-
CFG Definition Variables A, B, C, <TERM>, <EXPR> (Angle brackets denote single variable rather than sequence of letters) Alphabet (of terminals) 0, 1, # Substitution Rules <EXPR> → <TERM> Start Variable A (LHS of first substitution rule) - u derives v (u ⇒* v if u = v or if u ⇒ u1 ⇒ u2 … uk = v, k ≥ 0.
- Chomsky Normal Form
- Context-free grammar notation for which every rule is of the form A → BC or A → ε
- B and C must not be start variables
- Context-free grammar notation for which every rule is of the form A → BC or A → ε
- Any context-free language is generated by a context-free grammar in Chomsky normal form
- Start by creating start variable S0 pointing to S, the first variable in the string (this avoids start variable of CFG from appearing on RHS)
- Remove all A → ε rules where A is not start variable
- This will change rules R → A to R → ε, which will be processed later
- S → ASA would become S → ASA | SA | AS | S
- Remove all unit rules such as A → B
- Convert seqeunce rules A → u1u2…uk (where k ≥ 2) to a series of rules A → u1A1, A1 → u2A2 etc, where each Ai is a new variable
- Pushdown Automata (stack type)
- Has alphabets – one for the inputs (lowercase) & one for the stack (uppercase)
- In most cases, the stack alphabet will be larger than that of the inputs
- b,B → A means that if input is b & stack top has B, digest b and replace top stack with A
- b,ε A means to push A to stack if input is b (popping ε, which is nothing)
- Note that stack usage means that we can only see the top item of the stack. There is no notion of moving through the stack
- Gave more examples on PDA
- Theorem – a language is context free iff some PDA recognizes it
- CFG to PDA
- Place marker symbol $ & start variable on stack
- Repeat following forever
- If top stack is A, nondeterministically select one of rules for A & substitute
- If top stack is terminal symbol α read next symbol from input & compare to α. If match, repeat, otherwise reject
- If top stack is symbol $, enter accept state. This will accept the input if it has been completely read
- If x can bring P from p with empty stack to q with empty stack, Apq generates x
- If computation has 0 steps, x is already the empty string
- Induction - assume true for length k, prove for k + 1
- Once that stack is non empty, it will become empty either at the very last step of at some intermediate point
- If never happened in between, symbol p i pushed at the first move and popped at the last step
- For any portion y = azb, with y transitioning from empty state to empty state, we may do the same for z without the need of passing through transitions a and b (given that there does not exist an empty state in between)
- If empty state exists in between, transitions can be split by the empty states. Note that the sum of those transitions must be less than the total number of steps, as it requires at least two steps to pop into an empty state and push back into a nonempty state
- If never happened in between, symbol p i pushed at the first move and popped at the last step
- Once that stack is non empty, it will become empty either at the very last step of at some intermediate point
- Pumping Lemma for CFLs
- One way only; not iff
- A context-free language L must have a number p where, if s is any string in L of length ≥ p, s may be divided into uvxyz such that:
- for each i ≥ 0, uvixyiz ∈ A
- |vy| > 0
- |vxy| ≤ p