Adds experiments/test_conj_5_26_n_21_22.py, a clause-4 checker that re-uses find_all_36_witnesses + check_clause_4 from check_conj_final_scaled.py and runs them on n = 21, 22 with incremental JSONL output and a 10-minute PROGRESS heartbeat. Results (139 min wall, single thread): n=21: 192 tri, 392,370 colourings w/ clause-1–3 witness, all pass n=22: 651 tri, 1,786,314 colourings w/ clause-1–3 witness, all pass total at n ≤ 22: 2,321,496 / 2,321,496 (combined with the existing 142,812 at n ≤ 20 from check_conj_final_scaled.py) Paper edits: - Abstract: "|V(G)| ≤ 20 (142,812)" → "|V(G)| ≤ 22 (2,321,496)" for the strengthening; clauses-1–3 count unchanged at 535,182 / n ≤ 21. - Intro paragraph: matching update. - Remark rem:conj-3-8-empirical table: added n=21 and n=22 rows; new total ($n \le 22$) = 959 triangulations, 2,321,496 colourings. - Updated script reference in that remark to point at check_conj_final_scaled.py + test_conj_5_26_n_21_22.py. COMMENTARY.md summary table: Conjecture 5.26 row bumped to 2,321,496 / 2,321,496 (n ≤ 22). Also commits the test_*_results.jsonl artifacts (with per-tri records + n-summaries + grand summary) for reproducibility. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
14 KiB
Empirical state of the Conjecture-5.1 proof
This document is a snapshot of where the proof of Conjecture 5.1
(the face-monochromatic-pair conjecture) stands after the empirical
work in experiments/. It is meant as commentary for the reader who
has finished reading the paper and wants to know what's been verified
computationally vs what remains to be proven structurally.
Summary table
| claim | status | empirical evidence |
|---|---|---|
| Conjecture 5.1 (clauses 1–3) | conjecture | ✓ 535,182 / 535,182 (n ≤ 21, direct witness search) |
| Conjecture 5.26 (clauses 1–4, strengthening) | conjecture | ✓ 2,321,496 / 2,321,496 (n ≤ 22, via direct clause-4 check in check_conj_final_scaled.py for n ≤ 20 + test_conj_5_26_n_21_22.py for n ∈ {21, 22}) |
Non-constancy of h_φ on V(K_b) ∪ V(K_c) |
sufficient to prove 5.1 via Lemma 5.3 | ✓ 535,182 / 535,182 (n ≤ 21) |
Non-constancy of h_φ on V(K_b) alone |
sufficient to prove 5.1 via Corollary 5.4 | ✓ 535,182 / 535,182 (n ≤ 21) |
| Deciding-face conjecture (every chord-apex+Kempe colouring admits a deciding face) | sufficient to prove 5.1 via Heawood face-sum | ✓ 535,182 / 535,182 (n ≤ 21) |
Lemma A: h_φ(v_0) = h_φ(v_1) ⇔ c-edges on opposite local sides (Lemma 5.2 in the paper) |
proven (Lemma 5.2) | ✓ 625,200 / 625,200 consecutive pairs |
Identity s_b ⊕ s_c = i_b ⊕ i_c ⊕ 1 at shared vertex |
follows from the Heawood definitions + Lemma A | ✓ 263,004 / 263,004 shared vertices |
Parity-bucket symmetry n_{(0,0)} = n_{(1,1)} and n_{(0,1)} = n_{(1,0)} over shared vertices |
structural (likely provable) | ✓ universal |
At consecutive shared K_b-vertices, i_b parity strictly alternates |
structural | ✓ universal (transition matrix has 0 within-parity transitions) |
At odd-i_b shared vertices, i_c parity flip is forced on transition |
structural | ✓ universal |
What's actually open
The shortest path to a proof of Conjecture 5.1 is now:
Show structurally that for every chord-apex+Kempe colouring
φof every reduced dualĜ'_{v,i},h_φis not constant onV(K_b)(or equivalently onV(K_c)).
This is verified on 142,812 / 142,812 colourings up to n = 20. It is
captured in the paper by Corollary 5.4 (the per-cycle form of
Lemma 5.3) and Remark 5.5 (the empirical near-proof).
Why the proof is harder than it looks
Three empirical facts from the diagnostics in experiments/ rule out
the simplest proof strategies:
-
The obstruction has no slack. The minimum Heawood-flip count on
K_bobserved across the data is 2, attained on 12 colourings atn = 18. Those colourings have one single minority Heawood vertex onV(K_b)— flipping its sign would give constancy. So the proof cannot rest on bulk inequalities like "at least half ofV(K_b)has each sign"; it must rule out every single-minority configuration. -
The minority isn't anchored to a structural vertex. For low-flip-count colourings, the minority vertex(es) are distributed roughly evenly across
v_n,A_0, …, A_4, and "other" non-named vertices in the rest ofG':v_n 12.86% A_{i+1} 10.82% A_{i+2} 8.98% A_i 7.76% A_{i+4} 5.31% A_{i+3} 5.10% other 49.18%So no single named vertex is always the minority — the proof cannot fix on "vertex X must have Heawood opposite to the majority"; roughly half the time the minority lives outside the named six.
-
No single named-vertex-pair is always a Heawood mismatch. The most "reliable" same-cycle pair is
(A_i, A_{i+1})and(A_{i+1}, A_{i+2})at consecutive face-boundary positions, each with 75% mismatch rate — well short of universal. So the proof can neither identify a specific edge that always has differing Heawood at its endpoints, nor a specific vertex that's always minority.
Together (1)–(3) say the obstruction is global, not local: there
is always a Heawood mismatch on V(K_b), but where it sits varies by
colouring. A successful structural proof will need a global argument —
likely a topological / homological / parity-counting argument that
operates on all of V(K_b) simultaneously, rather than identifying a
specific forced flip.
Candidate mechanisms (none confirmed)
These were explored and the corresponding diagnostics ruled them out or revealed why they don't yield a contradiction on their own:
-
Heawood sum identity
∑_v h_φ(v) = 0. Holds only ~17.6% of the time on chord-apex+Kempe colourings; the sum can be anywhere in{-24, -20, …, 24}. So this classical identity is not available here. -
Heawood sum on a single Kempe cycle
∑_{V(K)} h_φ = 0. Holds only ~23% of the time per cycle. -
Cycle-side balance
|L_b| = |R_b|. Holds only 35.43% of the time. Constancy would force this exactly, but the empirical imbalance is large in most colourings. -
Specific named-vertex pair always mismatches. No such pair exists; closest is
(A_j, A_{j+1})at 75%.
Files
- Paper text:
paper.tex, sections 3 (Heawood number definition, Lemma 5.2), 5 (Lemma 5.3, Corollary 5.4, Remark 5.5). - Diagnostic scripts: see
experiments/check_heawood_*.py,experiments/check_kempe_intersection_and_alternation.py,experiments/check_shared_*.py,experiments/check_cw_parity_prediction.py,experiments/check_constancy_obstruction.py,experiments/check_min_flip_structure.py,experiments/check_minority_location.py.
Failed proof route via edge-sharing Kempe constancy
A natural-looking strategy to prove Conjecture 5.1 was:
Conjecture (now disproved): If
K_0is an ${a,b}$-Kempe cycle of\varphiandK_1is an ${a,c}$-Kempe cycle of\varphithat shares an edge withK_0, thenh_\varphicannot be constant on bothV(K_0)andV(K_1)simultaneously.
Combined with Lemma 5.3 (no clause-3 witness \Rightarrow constancy
on both V(K_b) and V(K_c)), this would have closed Conjecture 5.1.
It is false by a concrete counterexample (Figure in paper.tex at
\ref{fig:no-two-constant-kempe-counterexample}). A partial proof
attempt is preserved in the paper alongside the disproof:
- Step 1 (local CW structure) — unconditional.
- Step 2 (forced odd-crossing
\Rightarrow |E(K_0) \cap E(K_1)|even and\geq 2) — unconditional. - Step 3 (Heawood face-sum mod 3) — unconditional but does not yield a contradiction on its own.
- Step 4 (lune-face Case A) — closes the sub-case where two shared
a-edges are consecutive on \emph{both} cycles (automatic when
|E(K_0) \cap E(K_1)| = 2). - Step 5 (general case) — open / now known false via the counterexample.
The counterexample shows the general case of the conjecture is unsalvageable; the search for a structural proof of Conjecture 5.1 will need a different angle.
Lessons from the structural-proof attempts (commit fd4b89a onward)
After working through three structural-proof routes to the deciding-face conjecture and beyond, here's what's known and what's still open:
What worked
- Reduction (Theorem
thm:deciding-face-implies-conj-5-1): Conjecture 5.1 follows from the existence of a deciding face — a face of the reduced dual with boundary ⊆ V(K_b) ∪ V(K_c) and length not divisible by 3 — via Heawood's classical face-sum identity. This is clean and tight. - Tight covering for $n_k = 5$ (Lemmas
lem:flank-covering-baseandlem:outer-face-covering-base): if any of (i)n_i = 5, (ii)n_{i+1} = 5, or (iii)n_{i+2} = n_{i+4} = 5holds, the flank or outer face is a tight structural deciding face. Covers 94.97% of the 7,930 empirical (G, v, i) configurations up to|V(G)| \le 20. - Partial pigeonhole for the G'-pentagon fallback (Lemma
lem:gprime-pigeonhole): if at most one vertex is uncovered by V(K_b) ∪ V(K_c), at least one G'-pentagon is a deciding face. Combined with the tight cases, covers ~91% structurally.
What didn't work
- n_i = 6 flank-covering lemma — the lemma is empirically false.
9,228 / 142,812 chord-apex+Kempe colourings hit sub-case (ii.B) of
Case (b), and 1,314 of those have P_1 ∉ V(K_b) ∪ V(K_c), falsifying
the lemma. Retracted in commit
873c2cc. - Winding-number / topological invariant (commit
fd4b89a). Σ-of-turn-signs around K_b under Lemma 5.2 alternation = 0. This is consistent with K_b being a simple closed planar curve bounding a non-empty region — not a contradiction. The natural topological invariant doesn't distinguish chord-apex+Kempe colourings under constancy from generic K_b's. - Closing the case analysis past |S| ≤ 1 — would require finer graph-structural input, fragmenting the case count. This is exactly the discharging flavour of the traditional reducible-configurations approach (Appel-Haken / RSST / Gonthier) to the Four Colour Theorem itself; we stopped to avoid replaying that route in a different vocabulary.
Diagnostic observation
Across all attempts, Lemma 5.2's alternation (= constancy on V(K_b) implies c_1-edges alternate sides along K_b) is the natural local consequence of constancy, but each global aggregation we've tried produces a quantity that's consistent with constancy rather than contradicting it:
- Heawood face-sum mod 3 on a specific face: requires identifying which face — that's case analysis.
- Winding number of K_b: zero under alternation, which only says K_b isn't a face boundary (true, not contradictory).
- Identity s_b ⊕ s_c = i_b ⊕ i_c ⊕ 1: reduces under constancy to c_b ⊕ c_c = 1, true but not contradictory.
This suggests the contradiction in the chord-apex+Kempe setting is not captured by local-or-aggregable invariants of K_b alone.
What's left open
- Conjecture (Deciding face): 100% empirically verified (142,812 / 142,812), 94.97% structurally proven, ~5% reducible to Conjecture (G'-pentagon fallback), itself 100% empirical but structurally open.
- G'-pentagon fallback structural proof beyond |S| ≤ 1: needs a Kempe-cycle structural result describing which G'-pentagons the uncovered vertices of chord-apex+Kempe colourings can hit.
- Minimality of G: invoked to derive chord-apex (= colour equality spike = merged) then forgotten. Possibly minimality has a stronger consequence we haven't extracted.
Final status (commit 2d8c679)
What we have proved
-
Theorem (Conjecture 5.1 ⇐ Deciding face): clean reduction via Heawood's classical face-sum identity. Tight.
-
Tight structural cases for the Deciding face conjecture:
- (a')
n_i = 5→ flank faceF^\flat_{i, i+1}of length 4. - (b')
n_{i+1} = 5→ flank faceF^\flat_{i+1, i+2}. - (c)
n_{i+2} = n_{i+4} = 5→ outer faceF^\flat_{\text{outer}}of length 7.
Empirical coverage: 7,531 / 7,930 (G, v, i) configurations = 94.97%.
- (a')
-
Refined pigeonhole + S-cycle structure: for bad colourings (= where Lemma flank-covering-hex fails), the uncovered vertex set
Sis even, forms a 2-regular subgraph (= a single cycle), and is bounded so that # G'-pentagons hit by S <p_G. This closes most bad cases structurally. -
Empirical closure: the G'-pentagon fallback conjecture is true on 1,314 / 1,314 bad colourings, hence on all 142,812 / 142,812 chord-apex+Kempe colourings up to
|V(G)| \leq 20.
What's open structurally
-
G'-pentagon fallback in full generality — proven via pigeonhole for
|S| \leq 1, sketched for higher|S|via empirical bounds (max hit, minp_G), but no fully structural proof of the empirical bounds. -
The "|S| = 8 + hit = 8 ⇒ $p_G = 11$" regularity — striking empirical fact (30/30 cases) but no Kempe-cycle structural argument for it. The marginal
|S|↔ # pentF_kcoupling doesn't hold; the regularity is a joint structural property of specific(|S|, \text{hit})pairs.
What the path is
The structural proof has stabilized at the following form:
Conjecture 5.1 (face-monochromatic-pair)
⇐ Theorem (Conj 5.1 ⇐ Deciding face) [tight]
Conjecture (Deciding face)
⇐ Theorem (Extended partial proof) [tight on 7,531/7,930]
⇐ Conjecture (G'-pentagon fallback) [empirical 1,314/1,314]
⇐ Lemma (G'-pentagon pigeonhole, |S| ≤ 1) [tight on |S| ≤ 1 ≈ 91%]
+ open: structural |S|-cycle arguments for |S| ≥ 2.
So Conjecture 5.1 is reduced to two open structural conjectures:
(i) The G'-pentagon fallback conjecture (empirically 100%). (ii) Kempe-cycle-structural lemmas about chord-apex+Kempe colourings sufficient to close the |S| ≥ 2 cases of the fallback.
Closing (i) and (ii) structurally would give a full structural proof. The empirical 100% across 142,812 colourings is strong evidence both are true.
Why this isn't Appel-Haken in disguise
The structural proof has 4 main steps + ~8 case-style sub-lemmas (vs RSST's 633 cases). The compression comes from the chord-apex+Kempe restriction doing most of the heavy lifting upfront — specifically, Lemma 5.3 (no-witness ⇒ constancy on V(K_b) ∪ V(K_c)) and Lemma 5.X (Kempe-spike, forced edge containments) pre-restrict the problem to a very constrained colour configuration. The Heawood face-sum identity then converts the local constancy into a mod-3 obstruction, and the case analysis closes specific structural sub-buckets.
The remaining open conjectures are at the chord-apex+Kempe class level, not at the level of arbitrary minimal counterexamples to 4CT.
Snapshot date
This commentary is current as of commit 2d8c679 (2026-05-25).