First-order Logic

First-order Logic, in a strict sense, is a class of Formal Systems.

A first-order logic, by definition of a formal system, is made of a formal language together with a deductive system over it.

I.e., we speak of a specialization of Formal Systems, encompassing a First-order Language (a specialization of Formal Language) and a respective Deductive System, together capable of expressing first-order logic.

The Deductive System and a subset of the Language are (conventionally) fixed across all first-order logics. This leaves the possible variation between first-order logics within the language, namely in the signature.

The fixed definitions are what define the class (are what make “first-order” logic “first-order”).
Characteristically, first-order logic is Propositional Logic extended with:

- variables, which range over a domain of discourse
- (nonlogical) function and predicate symbols, applied to terms
- the quantifiers $\forall$ and $\exists$

Note that there is no definitive/authoritative specification of the “conventionally fixed” parts of first-order logics. Nevertheless, we hereafter discuss the more-or-less universally accepted ideas.

Alphabet

Recall what an alphabet, $\Sigma$, is in a Grammar.

For a first-order language, we further subdivide $\Sigma$ into $\braket{\lambda, \sigma}$:

- $\lambda$, the logical symbols
- $\sigma$, the nonlogical symbols (the signature)

This division facilitates the separation of semantics from syntax (formation rules).

Logical Symbols

These have constant, assumed, meaning (across all interpretations).
Part of Syntax

The exact contents of $\lambda$ are up to the author, although conventionally it contains:

- the logical connectives, e.g., $\lnot$, $\land$, $\lor$, $\rightarrow$, $\leftrightarrow$
- the quantifiers $\forall$ and $\exists$
- the constants $\top$ (true) and $\bot$ (false)
- variable symbols
- punctuation, i.e., parentheses and the comma

An equality symbol, e.g., $=$, is sometimes also included, or otherwise defined to be a nonlogical predicate.

Here, a frequent difference across authors is the usage of alternative symbols, especially for the logical connectives and the true and false constants. E.g., $+$ for $\lor$, $\cdot$ for $\land$, $\sim$ for $\lnot$, $T$ for $\top$, $F$ for $\bot$.

Nonlogical Symbols

These are meaningless placeholders until assigned to by interpretations.
Part of Semantics

The author of the language defines $\sigma$, two sets $\braket{\mathcal{F}, \mathcal{P}}$1:

- $\mathcal{F}$, the set of function symbols (each with an arity)
- $\mathcal{P}$, the set of predicate symbols (each with an arity)

“Constants”, should they be distinguished, are nullary functions. Similarly,
Propositions (from propositional logic) are nullary predicates.

Some texts speak of Relations instead of Predicates.
They are isomorphic; there is a bijection between the two forms.2

The same could also be said for Predicates and Functions.
Behaviorally, predicates are functions whose codomain is precisely $\{ \top, \bot \}$ (and functions are predicates of $(n+1)$-arity3).
But, functions and predicates face different syntactical rules, and are hence segregated.

Note that $\sigma$ contains only “symbols” in the sense of representations of objects with unspecified meaning/behavior.
This is unlike $\lambda$, whose contents do have their meaning/behavior specified.

To relate back to Formal Grammar, a first-order language $\mathcal{L}$, where $\Sigma = \sigma \cup \lambda$, has $\mathcal{L} \subseteq \Sigma^*$.

As the signature $\sigma$ is the only “variable” part between first-order languages, some consider it equivalent to / definitive of a first-order language.

Syntax (Language Formation Rules)

Here we restate the fixed grammar of first-order languages.

Everything discussed so far is only to establish the syntax of first-order languages.
Syntax allows the construction and manipulation of formulas without concern for what any of the nonlogical symbols represent.

Semi-formal definition

Terms $t \in \mathcal{T}$ are (inductively defined as) any

- variable symbol $v \in \mathcal{V}$
- function application $f(t_1, \dots, t_n)$, where $f \in \mathcal{F}$ is $n$-ary and $t_1, \dots, t_n$ are terms (constants being the nullary case)

Atomic Formulas / atomic statements are any

- predicate application $P(t_1, \dots, t_n)$, where $P \in \mathcal{P}$ is $n$-ary and $t_1, \dots, t_n$ are terms
- equality $t_1 = t_2$ between two terms4

(sentential) Formulas / statements are any

- atomic formula
- $\lnot \varphi$, $(\varphi \land \psi)$, $(\varphi \lor \psi)$, $(\varphi \rightarrow \psi)$, or $(\varphi \leftrightarrow \psi)$, where $\varphi$ and $\psi$ are formulas
- quantification $\mathcal{Q} x \, \varphi$, where $\mathcal{Q}$ is $\forall$ or $\exists$, $x \in \mathcal{V}$, and $\varphi$ is a formula

Relating back to Formal Language, for any $\varphi$ that is a sentential formula, $\varphi \in \mathcal{L}$.

If we consider the “datatypes”, from computer science, of the above:
Terms are of type Any;
(Atomic) formulas are of type boolean.

Every (sub-)formula is implicitly parenthesized.
For the final bullet point above, a . or : may be added, as per $\mathcal{Q} x .\, \varphi$ or $\mathcal{Q} x : \varphi$, to more explicitly delimit the quantification clause from the sub-formula.

In actual writing, to reduce parentheses, there is a conventional precedence for the logical connectives such that parentheses may be omitted as long as unique readability is maintained. In descending precedence:

  1. $\lnot$
  2. $\land$ and $\lor$
  3. $\forall$ and $\exists$
  4. $\rightarrow$ and $\leftrightarrow$
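
For example, under this precedence (an illustrative reading), the string $\lnot P(x) \land Q(x) \rightarrow \exists y \, R(y)$ is read as $\big( (\lnot P(x)) \land Q(x) \big) \rightarrow \big( \exists y \, R(y) \big)$.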

Formal definition

Observe that “Term”, “Atomic Formula”, and “Formula” are Nonterminal Symbols, while the actual formulas produced are composed of Terminal Symbols only.

See a formal representation in BNF on Wikipedia

It is otherwise too much work (and not exactly useful) to write up the formation rules in a first-order language. (Although this self-definition would be cool?)

Variable Binding

A variable is bound if it is quantified, and free otherwise.
Within a larger formula, a variable is unbound if it is free somewhere, and bound if bound everywhere.

A variable is quantified when it is used with a quantifier. E.g., $\chi = (\forall x \, \varphi) \lor \psi$ binds $x$ within the scope of $\varphi$, but $x$ is free in $\psi$ (assuming $x$ does appear there), and is therefore unbound in $\chi$.

I.e., bound variables are not “inputs”; they are not symbols to be assigned values during interpretation (like proposition letters). They are implicit and cannot be assigned to.

Imagine quantifiers as a for-each loop, e.g., for (item : list){...}. Then, item is bound. (And as such you wouldn’t redefine it, since the loop “manages” it “automatically”.)

A formula with no (free) variables is a sentence.
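
As a concrete illustration (a minimal sketch, not from any particular text), the free/bound distinction can be computed recursively over a formula’s syntax tree. The nested-tuple representation below is a hypothetical choice made only for these sketches:

```python
# Minimal sketch: compute the free variables of a formula, assuming a
# hypothetical nested-tuple representation:
#   ("var", "x")                      -- a variable term
#   ("fn", "f", t1, ..., tn)          -- a function application (a term)
#   ("pred", "P", t1, ..., tn)        -- an atomic formula
#   ("not", phi), ("and", phi, psi), ("or", phi, psi), ...
#   ("forall", "x", phi), ("exists", "x", phi)

def free_vars(node) -> set[str]:
    tag = node[0]
    if tag == "var":
        return {node[1]}
    if tag in ("fn", "pred"):
        return set().union(*[free_vars(arg) for arg in node[2:]])
    if tag in ("forall", "exists"):
        # the quantified variable is bound within the sub-formula
        return free_vars(node[2]) - {node[1]}
    # logical connectives: union over the sub-formulas
    return set().union(*[free_vars(sub) for sub in node[1:]])

# chi = (forall x. P(x)) or Q(x): x is bound inside P(x) but free in Q(x),
# hence unbound (free somewhere) in chi as a whole.
chi = ("or",
      ("forall", "x", ("pred", "P", ("var", "x"))),
      ("pred", "Q", ("var", "x")))
assert free_vars(chi) == {"x"}
```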

Semantics (Interpretation Rules)

Specifically, we are discussing Tarskian semantics.

A language does not encode any semantics within itself.
Therefore, to perform any semantic treatment of a formula, we must interpret it.

An interpretation assigns a denotation to each nonlogical symbol and is a pair $\braket{M, \rho}$:

- $M$, a first-order Structure (below)
- $\rho$, a Valuation (below)

First-order Structures

A “first-order” structure or an $\mathcal{L}$-structure (where $\mathcal{L}$ is a first-order language) is a structure for a first-order language.
I.e., the structure provides interpretation specifically for (the symbols in) $\mathcal{L}$.

An $\mathcal{L}$-structure is a pair $M = \braket{\mathbb{D}, I}$5:

- $\mathbb{D}$, the Domain of Discourse: a nonempty set of objects
- $I$, the interpretation function, mapping each symbol in $\mathcal{F}$ to a concrete function over $\mathbb{D}$ (written $f^M$) and each symbol in $\mathcal{P}$ to a concrete predicate/relation over $\mathbb{D}$ (written $P^M$)

Assignability (of a structure to a formula) is a retrospective assertion of whether the structure is for the language the formula is in.
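
As an illustrative example: for a signature with a binary function symbol $+$ and a binary predicate symbol $<$, one structure takes $\mathbb{D} = \mathbb{N}$ and lets $I$ interpret $+$ as ordinary addition and $<$ as the usual ordering; a different structure over the same signature could take $\mathbb{D}$ to be the set of strings, with $+$ as concatenation and $<$ as the strict-prefix relation. The symbols themselves carry no preference between the two.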

Valuations

Aka. (variable) assignment.

A valuation in an $\mathcal{L}$-structure $M$ is a mapping
$$\rho : \mathcal{V} \mapsto \mathbb{D}^M$$

such that by applying $\rho$ to every free variable in a formula, we obtain a sentence.

Then, the evaluation of a formula is defined inductively.
For an evaluator function $\mathfrak{I}: \mathcal{L} \mapsto \{ \top, \bot \}$; $v \in \mathcal{V}$; $t_1, \dots, t_n \in \mathcal{T}$; $\varphi \in \mathcal{L}$; etc.,

$$\begin{align*}
\mathfrak{I}(v) &\coloneqq \rho(v) \\
\mathfrak{I}\big( f(t_1, \dots, t_n) \big) &\coloneqq f^M\big( \mathfrak{I}(t_1), \dots, \mathfrak{I}(t_n) \big) \\
\mathfrak{I}\big( P(t_1, \dots, t_n) \big) &\coloneqq P^M\big( \mathfrak{I}(t_1), \dots, \mathfrak{I}(t_n) \big) \\
\mathfrak{I}(\varphi) &\coloneqq \cdots
\end{align*}$$

where the omitted definitions (for $\varphi$) are 1. those from Propositional Logic, 2. the rules for the first-order quantifiers, which are described as axioms below.

So, a formula can only be evaluated under a specified valuation in a specified structure (unless it is a tautology or contradiction, or a sentence – a form with no free variables that would need a valuation).
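
To make the induction concrete, here is a minimal sketch of such an evaluator (my own illustration, assuming a finite domain and the hypothetical nested-tuple representation used earlier; the structure supplies $f^M$ and $P^M$ as ordinary Python callables):

```python
# Minimal sketch of Tarskian evaluation over a *finite* domain.
# `structure` supplies the domain D and the interpretations f^M / P^M.

def eval_term(t, structure, rho):
    tag = t[0]
    if tag == "var":
        return rho[t[1]]                          # I(v) := rho(v)
    f = structure["functions"][t[1]]              # f^M
    return f(*(eval_term(arg, structure, rho) for arg in t[2:]))

def eval_formula(phi, structure, rho):
    tag = phi[0]
    if tag == "pred":
        P = structure["predicates"][phi[1]]       # P^M
        return P(*(eval_term(arg, structure, rho) for arg in phi[2:]))
    if tag == "not":
        return not eval_formula(phi[1], structure, rho)
    if tag == "and":
        return eval_formula(phi[1], structure, rho) and eval_formula(phi[2], structure, rho)
    if tag == "or":
        return eval_formula(phi[1], structure, rho) or eval_formula(phi[2], structure, rho)
    if tag == "forall":   # true under every valuation update rho[x -> a]
        return all(eval_formula(phi[2], structure, {**rho, phi[1]: a})
                   for a in structure["domain"])
    if tag == "exists":   # true under some valuation update rho[x -> a]
        return any(eval_formula(phi[2], structure, {**rho, phi[1]: a})
                   for a in structure["domain"])
    raise ValueError(f"unknown connective: {tag}")

# Example: D = {0, 1, 2} with the usual <; "forall x exists y (x < y)" fails
# here because 2 has no larger element in D.
M = {"domain": range(3), "functions": {},
     "predicates": {"<": lambda a, b: a < b}}
phi = ("forall", "x", ("exists", "y", ("pred", "<", ("var", "x"), ("var", "y"))))
print(eval_formula(phi, M, {}))   # False
```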

Valuation Update

Aka. assignment function modification

The idea is that we “update” a $\rho$ with an (overriding) rule to specifically map a variable $v$ to a value $a \in \mathbb{D}$.

I’ve seen two (three) notations for this:
$\rho[v \mapsto a]$ or $\rho[v \mid a]$, and $\rho\frac{a}{v}$

In any case, the mechanism is self-explanatory:
$$\rho[v \mapsto a](x) = \begin{cases} a & \text{if $x$ is $v$}\\ \rho(x) & \text{otherwise} \end{cases}$$
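
In code (continuing the dictionary-based valuations from the sketches above), the update is just an overriding copy:

```python
# Minimal sketch of rho[v -> a], with a valuation as a dict.
def update(rho: dict, v: str, a):
    return {**rho, v: a}   # a new valuation; rho itself is left unchanged

rho = {"x": 1, "y": 2}
assert update(rho, "x", 5) == {"x": 5, "y": 2}
assert rho == {"x": 1, "y": 2}
```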

Validity & Satisfiability

Satisfiable = could evaluate true
Valid = cannot evaluate false

$A$ evaluates true (under $\rho$) in $M$
    $\iff M \models_\rho A$

$A$ is valid in structure $M$
    $\iff M \models A$
    $\iff \forall \rho :\ M \models_\rho A$
where $\rho$ is a valuation for $M$ (and $M$ is assignable to $A$).

“$A$ is valid”
    $\iff\ \models A$
    $\iff \forall M : M \models A$
where $M$ is a structure assignable to $A$.

$A$ is satisfiable in $M$
    $\iff \exists \rho : M \models_\rho A$

“$A$ is satisfiable”
    $\iff \exists M : (\exists \rho : M \models_\rho A)$

“$A$ is valid” $\iff$ “$\lnot A$ is unsatisfiable”;
“$A$ is satisfiable” $\iff$ “$\lnot A$ is invalid”.
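
As a quick illustration: $\forall x \, (P(x) \lor \lnot P(x))$ is valid (true in every structure under every valuation); $P(x)$ is satisfiable but not valid (some choices of $M$ and $\rho$ make it true, others false); and $P(x) \land \lnot P(x)$ is unsatisfiable.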

Rules of Inference

Rules of Inference are syntactic transformations, and as such operate only on symbols in $\lambda$ (and may be performed irrespective of / without any interpretation).

A nice list is on Wikipedia. Note that most of the rules listed are not “primitive” and can be derived from the 2 axioms below (plus Modus Ponens from propositional logic).

Axioms (minimal)

(For our scope) Axioms and Rules of Inference are isomorphic.

Note that in first-order logic, all variables are symbolic denotations of objects to serve as arguments to functions and predicates.
In second-order logic, $\mathcal{P} \subseteq \mathbb{D}$ and variables may denote predicates (and as such quantification over predicates is possible).
We may write axioms for first-order logic using second-order logic (where the axioms are more precisely called axiom schemata).

The minimal set of axiom schemata for first-order logic are those of propositional logic, plus the following two:
$$\forall x \, A \implies A[t/x] \\[1em] A[t/x] \implies \exists x \, A$$

where $A$ is a Formula in which $x \in \mathcal{V}$ (exists and) is free, and $t \in \mathcal{T}$. See below for the square bracket syntax.

I.e.,
If $A$ is true for everything assigned to $x$, then $A$ remains true with any particular term $t$ in place of $x$.
If $A$ is true with some term $t$ in place of $x$, then there must exist something that, when assigned to $x$, makes $A$ true.
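
For a concrete (illustrative) instance with $A = (x \geq 0)$ and $t = f(y)$: the first schema licenses $\forall x \, (x \geq 0) \implies (f(y) \geq 0)$, and the second licenses $(f(y) \geq 0) \implies \exists x \, (x \geq 0)$.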

Substitution

$A[t/x]$ means formula $A$ with all (unbound) instances of variable $x$ replaced by term $t$. I.e., a refactoring.
Note this is not a semantic valuation, but a syntactic rewriting.

$t$ should not contain variable symbols that are bound in $A$, as that might cause variable capture (shadowing) and corrupt the formula.
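
A minimal sketch of this syntactic rewriting, again over the hypothetical nested-tuple representation from above (it does not rename bound variables, so the caveat about $t$ still applies):

```python
# Minimal sketch of A[t/x]: replace free occurrences of variable x by term t.
# No capture-avoiding renaming is done, so t must not contain variables that
# are bound in A.

def substitute(node, x: str, t):
    tag = node[0]
    if tag == "var":
        return t if node[1] == x else node
    if tag in ("forall", "exists"):
        if node[1] == x:
            return node            # x is bound here; nothing below is free
        return (tag, node[1], substitute(node[2], x, t))
    if tag in ("fn", "pred"):
        return (tag, node[1], *(substitute(arg, x, t) for arg in node[2:]))
    # logical connectives
    return (tag, *(substitute(sub, x, t) for sub in node[1:]))

A = ("or",
     ("forall", "x", ("pred", "P", ("var", "x"))),
     ("pred", "Q", ("var", "x")))
# A[f(y)/x] touches only the free occurrence of x (the one inside Q).
print(substitute(A, "x", ("fn", "f", ("var", "y"))))
```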

Equivalences

For $\equiv$ denoting syntactic equivalence between two formulas,
$$A \equiv B \iff \forall M\, \forall \rho : (M \models_\rho A) \Leftrightarrow (M \models_\rho B)$$

I.e., $A$ and $B$ are equivalent (syntactically) iff they are equivalent (semantically) under all interpretations.

Common (derived) rules of inference stated as equivalences:
$$\begin{align*}
\forall x \,\forall y A &\equiv \forall y \,\forall x A \\
\exists x \,\exists y A &\equiv \exists y \,\exists x A \\
\forall x A &\equiv \forall z \,(A[z/x]) \\
\exists x A &\equiv \exists z \,(A[z/x]) \\
\forall x A &\equiv A \ \text{ \footnotesize if (unbound) $x$ not in $A$} \\
\exists x A &\equiv A \ \text{ \footnotesize if (unbound) $x$ not in $A$} \\
\\
\lnot \forall x A &\equiv \exists x \,\lnot A \\
\lnot \exists x A &\equiv \forall x \,\lnot A \\
\forall x \,(A \land B) &\equiv (\forall x A) \land (\forall x B) \\
\exists x \,(A \lor B) &\equiv (\exists x A) \lor (\exists x B) \\
\forall x \,(A \lor B) &\equiv (\forall x A) \lor B \ \text{ \footnotesize if (unbound) $x$ not in $B$} \\
\exists x \,(A \land B) &\equiv (\exists x A) \land B \ \text{ \footnotesize if (unbound) $x$ not in $B$}
\end{align*}$$

Some notable non-equivalences:
$$\begin{align*}
\forall x \,\exists y A &\not\equiv \exists y \,\forall x A \\
\forall x \,(A \lor B) &\not\equiv (\forall x A) \lor (\forall x B) \\
\exists x \,(A \land B) &\not\equiv (\exists x A) \land (\exists x B)
\end{align*}$$
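
For instance, over $\mathbb{D} = \mathbb{N}$ with $A = (y > x)$: $\forall x \, \exists y \, (y > x)$ holds (every number has a larger one), but $\exists y \, \forall x \, (y > x)$ does not (no single number exceeds every number, itself included).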

Prenex Normal Form

Prenex Normal Form (PNF) is a normal form where all quantifiers of a formula are gathered at the outermost scope.
I.e., a formula is in PNF when it has a two-part structure: a prefix of all the quantification clauses, followed by a quantifier-free sub-formula (the matrix).

Every formula has a PNF.

PNF is useful in (automated) theorem proving.

Standard PNF conversion strategy:

  1. Eliminate $\rightarrow$ and $\leftrightarrow$ (substitute for $\lnot$, $\land$, $\lor$)
  2. Push $\lnot$ inwards (De Morgan’s laws)
  3. Rename variables quantified multiple times (substitution)
  4. Pull quantifiers outwards

Step 3 is necessary since all scopes will be “voided / made global” when the quantifiers are moved to the outermost grouping.
See Common Equivalences above for strategies to achieve each step.
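
A small worked example (illustrative), converting $\forall x \, P(x) \rightarrow \exists x \, Q(x)$ to PNF:
$$\begin{align*}
\forall x \, P(x) \rightarrow \exists x \, Q(x)
&\equiv \lnot \forall x \, P(x) \lor \exists x \, Q(x) && \text{eliminate } \rightarrow \\
&\equiv \exists x \, \lnot P(x) \lor \exists x \, Q(x) && \text{push } \lnot \text{ inwards} \\
&\equiv \exists x \, \lnot P(x) \lor \exists y \, Q(y) && \text{rename the second } x \\
&\equiv \exists x \, \exists y \, (\lnot P(x) \lor Q(y)) && \text{pull quantifiers outwards}
\end{align*}$$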


  1. technically there’s also a 3rd element in this tuple, $\text{ar}$, that is simply a mapping that gives the arity of each function symbol or relation symbol – $\text{ar}: \mathcal{F} \cup \mathcal{P} \mapsto \mathbb{N}$. ↩︎

  2. intuitively, a relation is the set of all inputs that make a predicate true.
    Formally, the equivalent relation of a predicate $P$ is $\{ (x_1, \dots, x_n) \mid P(x_1, \dots, x_n) \}$ ↩︎

  3. Every $n$-ary function is an $(n+1)$-ary predicate. For a function $f(x_1, \dots, x_n)$, we can define an equivalent predicate $P(x_1, \dots, x_n, y)$ where, for any given $X_1, \dots, X_n$ and $Y$, $P(X_1, \dots, X_n, Y) = \top \quad \text{iff} \quad f(X_1, \dots, X_n) = Y$ ↩︎

  4. if $\lambda$ contains an equality relation $=$ ↩︎

  5. Also written $M = \braket{\mathbb{D}, \mathcal{F}^M, \mathcal{P}^M}$, where instead of $I$ we explicitly have the sets of functions and predicates. ↩︎

  6. So, under an interpretation with the Domain of Discourse $\mathbb{D}$, $\forall x$ means “for every element $x \in \mathbb{D}$, …” and $\exists x$ means “for at least one element $x \in \mathbb{D}$, …”.
    In a textual writing context, $\mathbb{D}$ is often implicitly inferred. ↩︎