First-order Logic

First-order Logic, in a strict sense, is a class of Formal Systems.

A first-order logic, by definition of a formal system, is made of a formal language together with a deductive system over it.

I.e., we speak of a specialization of Formal Systems, encompassing a First-order Language (a specialization of Formal Language) and a respective Deductive System, together capable of expressing first-order logic.

The Deductive System and a subset of the Language are (conventionally) fixed across all first-order logics. This leaves the possible variation between first-order logics within the language, namely in the signature.

The fixed definitions are what define the class (are what make “first-order” logic “first-order”).
Characteristically, first-order logic is Propositional Logic extended with:

- variables, which range over a domain of discourse
- (nonlogical) function and predicate symbols, applied to terms
- the quantifiers $\forall$ and $\exists$

Note that there is no definitive/authoritative specification of the “conventionally fixed” parts of first-order logics. Nevertheless, we hereafter discuss the more-or-less universally accepted ideas.

Alphabet

Recall what an alphabet, $\Sigma$, is in a Grammar.

For a first-order language, we further subdivide $\Sigma$ into $\braket{\lambda, \sigma}$:

- $\lambda$, the logical symbols
- $\sigma$, the nonlogical symbols (the signature)

This division facilitates the separation of semantics from syntax (formation rules).

Logical Symbols

These have constant, assumed, meaning (across all interpretations).
Part of Syntax

The exact contents of $\lambda$ are up to the author, although conventionally it contains:

- the logical connectives, e.g., $\lnot$, $\land$, $\lor$, $\rightarrow$, $\leftrightarrow$
- the quantifiers $\forall$ and $\exists$
- the constants $\top$ (true) and $\bot$ (false)
- variable symbols
- punctuation, i.e., parentheses and the comma

An equality symbol, e.g., $=$, is sometimes also included, or otherwise defined to be a nonlogical predicate.

Here, a frequent difference across authors is the usage of alternative symbols, especially for the logical connectives and the true and false constants. E.g., $+$ for $\lor$, $\cdot$ for $\land$, $\sim$ for $\lnot$, $T$ for $\top$, $F$ for $\bot$.

Nonlogical Symbols

These are meaningless placeholders until assigned to by interpretations.
Part of Semantics

The author of the language defines $\sigma$, two sets $\braket{\mathcal{F}, \mathcal{P}}$1:

- $\mathcal{F}$, the set of function symbols (each with an arity)
- $\mathcal{P}$, the set of predicate symbols (each with an arity)

“Constants”, should they be distinguished, are nullary functions. Similarly,
Propositions (from propositional logic) are nullary predicates.

Some texts speak of Relations instead of Predicates.
They are isomorphic; there is a bijection between the two forms.2

The same could also be said for Predicates and Functions.
Behaviorally, predicates are functions whose codomain is precisely $\{ \top, \bot \}$ (and functions are predicates of $(n+1)$-arity3).
But, functions and predicates face different syntactical rules, and are hence segregated.

Note that $\sigma$ contains only “symbols” in the sense of representations of objects with unspecified meaning/behavior.
This is unlike $\lambda$, whose contents do have their meaning/behavior specified.

To relate back to Formal Grammar, a first-order language $\mathcal{L}$, where $\Sigma = \sigma \cup \lambda$, has $\mathcal{L} \subseteq \Sigma^*$.

As the signature $\sigma$ is the only “variable” part between first-order languages, some consider it equivalent to / definitive of a first-order language.

Syntax (Language Formation Rules)

Here we restate the fixed grammar of first-order languages.

Everything discussed so far is only to establish the syntax of first-order languages.
Syntax allows the construction and manipulation of formulas without concern for what any of the nonlogical symbols represent.

Semi-formal definition

Terms $t \in \mathcal{T}$ are (inductively defined as) any

- variable symbol $v \in \mathcal{V}$
- function application $f(t_1, \dots, t_n)$, where $f \in \mathcal{F}$ is $n$-ary and $t_1, \dots, t_n$ are terms (constants being the nullary case)

Atomic Formulas / atomic statements are any

- predicate application $P(t_1, \dots, t_n)$, where $P \in \mathcal{P}$ is $n$-ary and $t_1, \dots, t_n$ are terms
- equality $t_1 = t_2$ between two terms4

(sentential) Formulas / statements are any

- atomic formula
- $\lnot \varphi$, $(\varphi \land \psi)$, $(\varphi \lor \psi)$, $(\varphi \rightarrow \psi)$, or $(\varphi \leftrightarrow \psi)$, where $\varphi$ and $\psi$ are formulas
- quantification $\mathcal{Q} x \, \varphi$, where $\mathcal{Q}$ is $\forall$ or $\exists$, $x \in \mathcal{V}$, and $\varphi$ is a formula

Relating back to Formal Language, for any $\varphi$ that is a sentential formula, $\varphi \in \mathcal{L}$.

If we consider the “datatypes”, from computer science, of the above:
Terms are of type Any;
(Atomic) formulas are of type boolean.

Every (sub-)formula is implicitly parenthesized.
For the final bullet point above, a . or : may be added, as per $\mathcal{Q} x .\, \varphi$ or $\mathcal{Q} x : \varphi$, to more explicitly delimit the quantification clause from the sub-formula.

In actual writing, to reduce parentheses, there is a conventional precedence for the logical connectives such that parentheses may be omitted as long as unique readability is maintained. In descending precedence:

  1. $\lnot$
  2. $\land$ and $\lor$
  3. $\forall$ and $\exists$
  4. $\rightarrow$ and $\leftrightarrow$
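
For example, under this precedence (an illustrative reading), the string $\lnot P(x) \land Q(x) \rightarrow \exists y \, R(y)$ is read as $\big( (\lnot P(x)) \land Q(x) \big) \rightarrow \big( \exists y \, R(y) \big)$.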

Formal definition

Observe that “Term”, “Atomic Formula”, and “Formula” are Nonterminal Symbols, while the actual formulas produced are composed of Terminal Symbols only.

See a formal representation in BNF on Wikipedia

It is otherwise too much work (and not exactly useful) to write up the formation rules in a first-order language. (Although this self-definition would be cool?)

Variable Binding

A variable is bound if it is quantified, and free otherwise.
Within a larger formula, a variable is unbound if it is free somewhere, and bound if bound everywhere.

A variable is quantified when it is used with a quantifier. E.g., $\chi = (\forall x \, \varphi) \lor \psi$ binds $x$ within the scope of $\varphi$, but $x$ is free in $\psi$ (assuming $x$ does appear there), and is therefore unbound in $\chi$.

I.e., bound variables are not “inputs”; they are not symbols to be assigned values during interpretation (like proposition letters). They are implicit and cannot be assigned to.

Imagine quantifiers as a for-each loop, e.g., for (item : list){...}. Then, item is bound. (And as such you wouldn’t redefine it, since the loop “manages” it “automatically”.)

A formula with no (free) variables is a sentence.
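
As a concrete illustration (a minimal sketch, not from any particular text), the free/bound distinction can be computed recursively over a formula’s syntax tree. The nested-tuple representation below is a hypothetical choice made only for these sketches:

```python
# Minimal sketch: compute the free variables of a formula, assuming a
# hypothetical nested-tuple representation:
#   ("var", "x")                      -- a variable term
#   ("fn", "f", t1, ..., tn)          -- a function application (a term)
#   ("pred", "P", t1, ..., tn)        -- an atomic formula
#   ("not", phi), ("and", phi, psi), ("or", phi, psi), ...
#   ("forall", "x", phi), ("exists", "x", phi)

def free_vars(node) -> set[str]:
    tag = node[0]
    if tag == "var":
        return {node[1]}
    if tag in ("fn", "pred"):
        return set().union(*[free_vars(arg) for arg in node[2:]])
    if tag in ("forall", "exists"):
        # the quantified variable is bound within the sub-formula
        return free_vars(node[2]) - {node[1]}
    # logical connectives: union over the sub-formulas
    return set().union(*[free_vars(sub) for sub in node[1:]])

# chi = (forall x. P(x)) or Q(x): x is bound inside P(x) but free in Q(x),
# hence unbound (free somewhere) in chi as a whole.
chi = ("or",
      ("forall", "x", ("pred", "P", ("var", "x"))),
      ("pred", "Q", ("var", "x")))
assert free_vars(chi) == {"x"}
```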

Semantics (Interpretation Rules)

Specifically, we are discussing Tarskian semantics.

A language does not encode any semantics within itself.
Therefore, to perform any semantic treatment of a formula, we must interpret it.

An interpretation assigns a denotation to each nonlogical symbol and is a pair $\braket{M, \rho}$:

- $M$, a first-order Structure (below)
- $\rho$, a Valuation (below)

First-order Structures

A “first-order” structure or an $\mathcal{L}$-structure (where $\mathcal{L}$ is a first-order language) is a structure for a first-order language.
I.e., the structure provides interpretation specifically for (the symbols in) $\mathcal{L}$.

An $\mathcal{L}$-structure is a pair $M = \braket{\mathbb{D}, I}$5:

- $\mathbb{D}$, the Domain of Discourse: a nonempty set of objects
- $I$, the interpretation function, mapping each symbol in $\mathcal{F}$ to a concrete function over $\mathbb{D}$ (written $f^M$) and each symbol in $\mathcal{P}$ to a concrete predicate/relation over $\mathbb{D}$ (written $P^M$)

Assignability (of a structure to a formula) is a retrospective assertion of whether the structure is for the language the formula is in.
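
As an illustrative example: for a signature with a binary function symbol $+$ and a binary predicate symbol $<$, one structure takes $\mathbb{D} = \mathbb{N}$ and lets $I$ interpret $+$ as ordinary addition and $<$ as the usual ordering; a different structure over the same signature could take $\mathbb{D}$ to be the set of strings, with $+$ as concatenation and $<$ as the strict-prefix relation. The symbols themselves carry no preference between the two.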

Valuations

Aka. (variable) assignment.

A valuation in an $\mathcal{L}$-structure $M$ is a mapping
$$\rho : \mathcal{V} \mapsto \mathbb{D}^M$$

such that by applying $\rho$ to every free variable in a formula, we obtain a sentence.

Then, the evaluation of a formula is defined inductively.
For an evaluator function $\mathfrak{I}: \mathcal{L} \mapsto \{ \top, \bot \}$; $v \in \mathcal{V}$; $t_1, \dots, t_n \in \mathcal{T}$; $\varphi \in \mathcal{L}$; etc.,

$$\begin{align*}
\mathfrak{I}(v) &\coloneqq \rho(v) \\
\mathfrak{I}\big( f(t_1, \dots, t_n) \big) &\coloneqq f^M\big( \mathfrak{I}(t_1), \dots, \mathfrak{I}(t_n) \big) \\
\mathfrak{I}\big( P(t_1, \dots, t_n) \big) &\coloneqq P^M\big( \mathfrak{I}(t_1), \dots, \mathfrak{I}(t_n) \big) \\
\mathfrak{I}(\varphi) &\coloneqq \cdots
\end{align*}$$

where the omitted definitions (for $\varphi$) are 1. those from Propositional Logic, 2. the rules for the first-order quantifiers, which are described as axioms below.

So, a formula can only be evaluated under a specified valuation in a specified structure (unless it is a tautology or contradiction, or a sentence – a form with no free variables that would need a valuation).
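
To make the induction concrete, here is a minimal sketch of such an evaluator (my own illustration, assuming a finite domain and the hypothetical nested-tuple representation used earlier; the structure supplies $f^M$ and $P^M$ as ordinary Python callables):

```python
# Minimal sketch of Tarskian evaluation over a *finite* domain.
# `structure` supplies the domain D and the interpretations f^M / P^M.

def eval_term(t, structure, rho):
    tag = t[0]
    if tag == "var":
        return rho[t[1]]                          # I(v) := rho(v)
    f = structure["functions"][t[1]]              # f^M
    return f(*(eval_term(arg, structure, rho) for arg in t[2:]))

def eval_formula(phi, structure, rho):
    tag = phi[0]
    if tag == "pred":
        P = structure["predicates"][phi[1]]       # P^M
        return P(*(eval_term(arg, structure, rho) for arg in phi[2:]))
    if tag == "not":
        return not eval_formula(phi[1], structure, rho)
    if tag == "and":
        return eval_formula(phi[1], structure, rho) and eval_formula(phi[2], structure, rho)
    if tag == "or":
        return eval_formula(phi[1], structure, rho) or eval_formula(phi[2], structure, rho)
    if tag == "forall":   # true under every valuation update rho[x -> a]
        return all(eval_formula(phi[2], structure, {**rho, phi[1]: a})
                   for a in structure["domain"])
    if tag == "exists":   # true under some valuation update rho[x -> a]
        return any(eval_formula(phi[2], structure, {**rho, phi[1]: a})
                   for a in structure["domain"])
    raise ValueError(f"unknown connective: {tag}")

# Example: D = {0, 1, 2} with the usual <; "forall x exists y (x < y)" fails
# here because 2 has no larger element in D.
M = {"domain": range(3), "functions": {},
     "predicates": {"<": lambda a, b: a < b}}
phi = ("forall", "x", ("exists", "y", ("pred", "<", ("var", "x"), ("var", "y"))))
print(eval_formula(phi, M, {}))   # False
```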

Valuation Update

Aka. assignment function modification

The idea is that we “update” a $\rho$ with an (overriding) rule to specifically map a variable $v$ to a value $a \in \mathbb{D}$.

I’ve seen two (three) notations for this:
$\rho[v \mapsto a]$ or $\rho[v \mid a]$, and $\rho\frac{a}{v}$

In any case, the mechanism is self-explanatory:
$$\rho[v \mapsto a](x) = \begin{cases} a & \text{if $x$ is $v$}\\ \rho(x) & \text{otherwise} \end{cases}$$
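
In code (continuing the dictionary-based valuations from the sketches above), the update is just an overriding copy:

```python
# Minimal sketch of rho[v -> a], with a valuation as a dict.
def update(rho: dict, v: str, a):
    return {**rho, v: a}   # a new valuation; rho itself is left unchanged

rho = {"x": 1, "y": 2}
assert update(rho, "x", 5) == {"x": 5, "y": 2}
assert rho == {"x": 1, "y": 2}
```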

Validity & Satisfiability

Satisfiable = could evaluate true
Valid = cannot evaluate false

$A$ evaluates true (under $\rho$) in $M$
    $\iff M \models_\rho A$

$A$ is valid in structure $M$
    $\iff M \models A$
    $\iff \forall \rho :\ M \models_\rho A$
where $\rho$ is a valuation for $M$ (and $M$ is assignable to $A$).

“$A$ is valid”
    $\iff\ \models A$
    $\iff \forall M : M \models A$
where $M$ is a structure assignable to $A$.

$A$ is satisfiable in $M$
    $\iff \exists \rho : M \models_\rho A$

“$A$ is satisfiable”
    $\iff \exists M : (\exists \rho : M \models_\rho A)$

“$A$ is valid” $\iff$ “$\lnot A$ is unsatisfiable”;
“$A$ is satisfiable” $\iff$ “$\lnot A$ is invalid”.
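
As a quick illustration: $\forall x \, (P(x) \lor \lnot P(x))$ is valid (true in every structure under every valuation); $P(x)$ is satisfiable but not valid (some choices of $M$ and $\rho$ make it true, others false); and $P(x) \land \lnot P(x)$ is unsatisfiable.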

Rules of Inference

Rules of Inference are syntactic transformations, and as such operate only on symbols in $\lambda$ (and may be performed irrespective of / without any interpretation).

A nice list is on Wikipedia. Note that most of the rules listed are not “primitive” and can be derived from the 2 axioms below (plus Modus Ponens from propositional logic).

Axioms (minimal)

(For our scope) Axioms and Rules of Inference are isomorphic.

Note that in first-order logic, all variables are symbolic denotations of objects to serve as arguments to functions and predicates.
In second-order logic, $\mathcal{P} \subseteq \mathbb{D}$ and variables may denote predicates (and as such quantification over predicates is possible).
We may write axioms for first-order logic using second-order logic (where the axioms are more precisely called axiom schemata).

The minimal set of axiom schemata for first-order logic are those of propositional logic, plus the following two:
$$\forall x \, A \implies A[t/x] \\[1em] A[t/x] \implies \exists x \, A$$

where $A$ is a Formula in which $x \in \mathcal{V}$ (exists and) is free, and $t \in \mathcal{T}$. See below for the square bracket syntax.

I.e.,
If $A$ is true for everything assigned to $x$, then $A$ remains true with any particular term $t$ in place of $x$.
If $A$ is true with some term $t$ in place of $x$, then there must exist something that, when assigned to $x$, makes $A$ true.
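
For a concrete (illustrative) instance with $A = (x \geq 0)$ and $t = f(y)$: the first schema licenses $\forall x \, (x \geq 0) \implies (f(y) \geq 0)$, and the second licenses $(f(y) \geq 0) \implies \exists x \, (x \geq 0)$.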

Substitution

$A[t/x]$ means formula $A$ with all (unbound) instances of variable $x$ replaced by term $t$. I.e., a refactoring.
Note this is not a semantic valuation, but a syntactic rewriting.

$t$ should not contain variable symbols that are bound in $A$, as that might cause variable capture (shadowing) and corrupt the formula.
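
A minimal sketch of this syntactic rewriting, again over the hypothetical nested-tuple representation from above (it does not rename bound variables, so the caveat about $t$ still applies):

```python
# Minimal sketch of A[t/x]: replace free occurrences of variable x by term t.
# No capture-avoiding renaming is done, so t must not contain variables that
# are bound in A.

def substitute(node, x: str, t):
    tag = node[0]
    if tag == "var":
        return t if node[1] == x else node
    if tag in ("forall", "exists"):
        if node[1] == x:
            return node            # x is bound here; nothing below is free
        return (tag, node[1], substitute(node[2], x, t))
    if tag in ("fn", "pred"):
        return (tag, node[1], *(substitute(arg, x, t) for arg in node[2:]))
    # logical connectives
    return (tag, *(substitute(sub, x, t) for sub in node[1:]))

A = ("or",
     ("forall", "x", ("pred", "P", ("var", "x"))),
     ("pred", "Q", ("var", "x")))
# A[f(y)/x] touches only the free occurrence of x (the one inside Q).
print(substitute(A, "x", ("fn", "f", ("var", "y"))))
```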

Equivalences

For $\equiv$ denoting syntactic equivalence between two formulas,
$$A \equiv B \iff \forall M\, \forall \rho : (M \models_\rho A) \Leftrightarrow (M \models_\rho B)$$

I.e., $A$ and $B$ are equivalent (syntactically) iff they are equivalent (semantically) under all interpretations.

Common (derived) rules of inference stated as equivalences:
$$\begin{align*}
\forall x \,\forall y A &\equiv \forall y \,\forall x A \\
\exists x \,\exists y A &\equiv \exists y \,\exists x A \\
\forall x A &\equiv \forall z \,(A[z/x]) \\
\exists x A &\equiv \exists z \,(A[z/x]) \\
\forall x A &\equiv A \ \text{ \footnotesize if (unbound) $x$ not in $A$} \\
\exists x A &\equiv A \ \text{ \footnotesize if (unbound) $x$ not in $A$} \\
\\
\lnot \forall x A &\equiv \exists x \,\lnot A \\
\lnot \exists x A &\equiv \forall x \,\lnot A \\
\forall x \,(A \land B) &\equiv (\forall x A) \land (\forall x B) \\
\exists x \,(A \lor B) &\equiv (\exists x A) \lor (\exists x B) \\
\forall x \,(A \lor B) &\equiv (\forall x A) \lor B \ \text{ \footnotesize if (unbound) $x$ not in $B$} \\
\exists x \,(A \land B) &\equiv (\exists x A) \land B \ \text{ \footnotesize if (unbound) $x$ not in $B$}
\end{align*}$$

Some notable non-equivalences:
$$\begin{align*}
\forall x \,\exists y A &\not\equiv \exists y \,\forall x A \\
\forall x \,(A \lor B) &\not\equiv (\forall x A) \lor (\forall x B) \\
\exists x \,(A \land B) &\not\equiv (\exists x A) \land (\exists x B)
\end{align*}$$
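
For instance, over $\mathbb{D} = \mathbb{N}$ with $A = (y > x)$: $\forall x \, \exists y \, (y > x)$ holds (every number has a larger one), but $\exists y \, \forall x \, (y > x)$ does not (no single number exceeds every number, itself included).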

Prenex Normal Form

Prenex Normal Form (PNF) is a normal form where all quantifiers of a formula are gathered at the outermost scope.
I.e., a formula is in PNF when it has a two-part structure: a prefix of all the quantification clauses, followed by a quantifier-free sub-formula (the matrix).

Every formula has a PNF.

PNF is useful in (automated) theorem proving.

Standard PNF conversion strategy:

  1. Eliminate $\rightarrow$ and $\leftrightarrow$ (substitute for $\lnot$, $\land$, $\lor$)
  2. Push $\lnot$ inwards (De Morgan’s laws)
  3. Rename variables quantified multiple times (substitution)
  4. Pull quantifiers outwards

Step 3 is necessary since all scopes will be “voided / made global” when the quantifiers are moved to the outermost grouping.
See Common Equivalences above for strategies to achieve each step.
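
A small worked example (illustrative), converting $\forall x \, P(x) \rightarrow \exists x \, Q(x)$ to PNF:
$$\begin{align*}
\forall x \, P(x) \rightarrow \exists x \, Q(x)
&\equiv \lnot \forall x \, P(x) \lor \exists x \, Q(x) && \text{eliminate } \rightarrow \\
&\equiv \exists x \, \lnot P(x) \lor \exists x \, Q(x) && \text{push } \lnot \text{ inwards} \\
&\equiv \exists x \, \lnot P(x) \lor \exists y \, Q(y) && \text{rename the second } x \\
&\equiv \exists x \, \exists y \, (\lnot P(x) \lor Q(y)) && \text{pull quantifiers outwards}
\end{align*}$$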


  1. technically there’s also a 3rd element in this tuple, $\text{ar}$, that is simply a mapping that gives the arity of each function symbol or relation symbol – $\text{ar}: \mathcal{F} \cup \mathcal{P} \mapsto \mathbb{N}$. ↩︎

  2. intuitively, a relation is the set of all inputs that make a predicate true.
    Formally, the equivalent relation of a predicate $P$ is $\{ (x_1, \dots, x_n) \mid P(x_1, \dots, x_n) \}$ ↩︎

  3. Every $n$-ary function is an $(n+1)$-ary predicate. For a function $f(x_1, \dots, x_n)$, we can define an equivalent predicate $P(x_1, \dots, x_n, y)$ where, for any given $X_1, \dots, X_n$ and $Y$, $P(X_1, \dots, X_n, Y) = \top \quad \text{iff} \quad f(X_1, \dots, X_n) = Y$ ↩︎

  4. if $\lambda$ contains an equality relation $=$ ↩︎

  5. Also written $M = \braket{\mathbb{D}, \mathcal{F}^M, \mathcal{P}^M}$, where instead of $I$ we explicitly have the sets of functions and predicates. ↩︎

  6. So, under an interpretation with the Domain of Discourse $\mathbb{D}$, $\forall x$ means “for every element $x \in \mathbb{D}$, …” and $\exists x$ means “for at least one element $x \in \mathbb{D}$, …”.
    In a textual writing context, $\mathbb{D}$ is often implicitly inferred. ↩︎