Thanks again @Stephen for your comprehensive answer. This is much more detailed than I had expected. Thank you for your guidance regarding defining a closed (infinite) set of formulas, I will try it later.
I come from a (dynamically-typed) programming language implementation background and the closest analogy to what I am trying to do is a "meta-circular implementation" of propositional logic on top of TLA+. In other words, I am trying to make explicit some of the logic while avoid a complete reimplementation by reusing existing operators. This way I can force the provers to require explicit fact and definition declarations so that the required proofs of basic theorems of propositional logic closely follow what I wrote by hand. In particular, a proof step will fail if I make simple mistakes, such as using the wrong facts or failing to explicit the targets of substitution. But I would like to still obtain a mostly declarative format for the proof, as usual for TLAPS and the proof style advocated by Lamport, and not have to completely redefine and prove a propositional logic.
I think this is possible because the textbook I am using ("A logical approach to discrete mathematics" by Gries and Schneider (1993)), uses a syntactic approach to proving theorems based on sequences of equivalences and substitutions, until the formula is equivalent to one of the axioms. They do this so that you need very little conceptual background to start proving theorem and the proof steps mostly follow the intuition students already have from algebra. So I think I can get away with not having to define and prove a semantics (i.e. interpretation) for the formulas, at the cost of a little more work when defining the axioms and assumptions to supply how equivalence maps to them.
Concretely, here is the current version I have. I re-use:
1. the TLA+ '\equiv'
2. the TLA+ '=' to define variables that corresponds to sub-formulas
3. explicitly mention that substitution is possible between the left-hand side and right-hand side of '\equiv' by asserting that both sides are equal when equivalent, otherwise the provers don't use the equivalence relationship for substitution.
EquivImpliesEq == \A f1,f2 \in F : f1 \equiv f2 => f1 = f2
where F is the set of formulas.
Everything else is redefined with different syntactic operators and a representation based on sequences, so the provers cannot use their knowledge of propositional logic to implicitly perform the proof steps. Any feedback is welcome to help simplify things or maybe correct
some misunderstanding I may have!
------------------------ MODULE PropositionalLogic2 ------------------------
EXTENDS Sequences
(* Syntactic redefinition of basic operators of propositional *)
(* logic to hide them from the provers. Proofs then require *)
(* explicit facts, otherwise facts we supply that are inconsistent *)
(* with the proof step do not matter since the provers already know *)
(* all of propositional logic. *)
True == <<"T">>
False == <<"F">>
Neg(f) == <<"neg", f>>
Equiv(s) == <<"equiv">> \circ s
a \doteq b == Equiv(<<a,b>>)
a ## b == Neg(a \doteq b)
(* We define the set of valid formulas by an unconstrained set *)
(* on which we later assert some properties. This side-steps a few *)
(* bugs we ran into while using recursive operators or functions. *)
(* This set is however not closed so cannot be relied on to determine*)
(* when a syntactic formula is incorrect. I will try the approach *)
(* suggested by Stephen next. *)
CONSTANT F
Formula(f) ==
\/ Len(f) = 1 /\ Head(f) = Head(True)
\/ Len(f) = 1 /\ Head(f) = Head(False)
\/ /\ Head(f) = "neg"
/\ Len(f) = 2
/\ \E f2 \in F :
/\ f=Neg(f2)
\/ /\ Head(f) = "equiv"
/\ Len(f) = 3
/\ \E f1,f2 \in F :
/\ f=(f1 \doteq f2)
(* The FOK assumptions seemed useful in previous tests *)
(* but are not necessary for the first two theorems below. *)
FOK ==
/\ \A f : Formula(f) \equiv f \in F
/\ { True, False } \subseteq F
/\ \A f \in F : Neg(f) \in F
/\ \A f1,f2 \in F : (f1 \doteq f2) \in F
/\ \A f1,f2 \in F : (f1 ## f2) \in F
(* Tells the prover to allow substitutions
between f1 and f2 when they are asserted to be equivalent. *)
EquivImpliesEq == \A f1,f2 \in F : f1 \equiv f2 => f1 = f2
(* These global assumptions are only for the reader: *)
(* they are ignored by the proof manager so we need *)
(* to repeat them for every theorem when necessary. *)
ASSUME FOK
ASSUME EquivImpliesEq
(* Axioms are given, first with their strictly syntactic version, *)
(* then with a subset of corresponding equivalences that we used *)
(* in proofs.*)
AXIOM EqAssoc == \A p,q,r \in F :
/\ ((p \doteq q) \doteq r) \doteq (p \doteq (q \doteq r))
/\ ((p \doteq q) \doteq r) \equiv (p \doteq (q \doteq r))
AXIOM EqSym == \A p,q \in F :
/\ (p \doteq q) \doteq (q \doteq p)
/\ (p \doteq q) \equiv (q \doteq p)
AXIOM EqIden == \A p \in F :
/\ True \doteq (p \doteq p)
/\ True \equiv (p \doteq p)
/\ (True \doteq p) \equiv p
(* Following Gries and Schneider (1993), proofs are *)
(* written with sequences of equivalences until the *)
(* theorem reduces to one of the (purely syntactic) *)
(* axioms. *)
(* Equalities mentioned in proof steps are redundant, *)
(* since they are already part of the assumptions, *)
(* but they make the substitution explicit for the *)
(* reader. *)
THEOREM Thm3_4 ==
ASSUME EquivImpliesEq,
NEW p \in F,
NEW q \in F, q = True,
NEW r \in F, r = (p \doteq p)
PROVE True
<1>1 True \equiv (q \doteq q)
BY EqIden
<1>2 (q \doteq q) \equiv (q \doteq r)
BY EqIden, q = True, r = (p \doteq p)
<1>3 (True \doteq (p \doteq p))
BY EqIden (* Syntactic Axiom, done *)
<1>z QED
BY <1>1,<1>2,<1>3
THEOREM EqRefl ==
ASSUME EquivImpliesEq,
NEW p \in F,
NEW r \in F, r = True,
NEW s \in F, s = (p \doteq p),
NEW t \in F, t = p,
NEW u \in F, u = (p \doteq True),
\A f1,f2 \in F : f1 \equiv f2 => f1 = f2
PROVE p \doteq p
<1>1 (p \doteq p) \equiv (p \doteq (p \doteq True))
BY EqIden, t = p, u = (p \doteq True)
<1>2 (p \doteq (p \doteq True)) \equiv ((p \doteq p) \doteq True)
BY EqAssoc, r = True
<1>3 (True \doteq (p \doteq p)) \equiv ((p \doteq p) \doteq True)
BY EqSym, r = True, s = (p \doteq p)
<1>4 (True \doteq (p \doteq p))
BY EqIden (* Syntactic Axiom, done *)
<1>5 QED
BY <1>1, <1>2, <1>3, <1>4
=============================================================================