Files
didericis 74030a5b8f face_mono: extend Conjecture 5.26 to n_G ≤ 22
Adds experiments/test_conj_5_26_n_21_22.py, a clause-4 checker that
re-uses find_all_36_witnesses + check_clause_4 from
check_conj_final_scaled.py and runs them on n = 21, 22 with
incremental JSONL output and a 10-minute PROGRESS heartbeat.

Results (139 min wall, single thread):
  n=21: 192 tri, 392,370 colourings w/ clause-1–3 witness, all pass
  n=22: 651 tri, 1,786,314 colourings w/ clause-1–3 witness, all pass
  total at n ≤ 22: 2,321,496 / 2,321,496 (combined with the existing
  142,812 at n ≤ 20 from check_conj_final_scaled.py)

Paper edits:
- Abstract: "|V(G)| ≤ 20 (142,812)" → "|V(G)| ≤ 22 (2,321,496)" for
  the strengthening; clauses-1–3 count unchanged at 535,182 / n ≤ 21.
- Intro paragraph: matching update.
- Remark rem:conj-3-8-empirical table: added n=21 and n=22 rows; new
  total ($n \le 22$) = 959 triangulations, 2,321,496 colourings.
- Updated script reference in that remark to point at
  check_conj_final_scaled.py + test_conj_5_26_n_21_22.py.

COMMENTARY.md summary table: Conjecture 5.26 row bumped to
2,321,496 / 2,321,496 (n ≤ 22).

Also commits the test_*_results.jsonl artifacts (with per-tri
records + n-summaries + grand summary) for reproducibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 12:27:58 -04:00

14 KiB
Raw Permalink Blame History

Empirical state of the Conjecture-5.1 proof

This document is a snapshot of where the proof of Conjecture 5.1 (the face-monochromatic-pair conjecture) stands after the empirical work in experiments/. It is meant as commentary for the reader who has finished reading the paper and wants to know what's been verified computationally vs what remains to be proven structurally.

Summary table

claim status empirical evidence
Conjecture 5.1 (clauses 13) conjecture ✓ 535,182 / 535,182 (n ≤ 21, direct witness search)
Conjecture 5.26 (clauses 14, strengthening) conjecture ✓ 2,321,496 / 2,321,496 (n ≤ 22, via direct clause-4 check in check_conj_final_scaled.py for n ≤ 20 + test_conj_5_26_n_21_22.py for n ∈ {21, 22})
Non-constancy of h_φ on V(K_b) V(K_c) sufficient to prove 5.1 via Lemma 5.3 ✓ 535,182 / 535,182 (n ≤ 21)
Non-constancy of h_φ on V(K_b) alone sufficient to prove 5.1 via Corollary 5.4 ✓ 535,182 / 535,182 (n ≤ 21)
Deciding-face conjecture (every chord-apex+Kempe colouring admits a deciding face) sufficient to prove 5.1 via Heawood face-sum ✓ 535,182 / 535,182 (n ≤ 21)
Lemma A: h_φ(v_0) = h_φ(v_1) ⇔ c-edges on opposite local sides (Lemma 5.2 in the paper) proven (Lemma 5.2) ✓ 625,200 / 625,200 consecutive pairs
Identity s_b ⊕ s_c = i_b ⊕ i_c ⊕ 1 at shared vertex follows from the Heawood definitions + Lemma A ✓ 263,004 / 263,004 shared vertices
Parity-bucket symmetry n_{(0,0)} = n_{(1,1)} and n_{(0,1)} = n_{(1,0)} over shared vertices structural (likely provable) ✓ universal
At consecutive shared K_b-vertices, i_b parity strictly alternates structural ✓ universal (transition matrix has 0 within-parity transitions)
At odd-i_b shared vertices, i_c parity flip is forced on transition structural ✓ universal

What's actually open

The shortest path to a proof of Conjecture 5.1 is now:

Show structurally that for every chord-apex+Kempe colouring φ of every reduced dual Ĝ'_{v,i}, h_φ is not constant on V(K_b) (or equivalently on V(K_c)).

This is verified on 142,812 / 142,812 colourings up to n = 20. It is captured in the paper by Corollary 5.4 (the per-cycle form of Lemma 5.3) and Remark 5.5 (the empirical near-proof).

Why the proof is harder than it looks

Three empirical facts from the diagnostics in experiments/ rule out the simplest proof strategies:

  1. The obstruction has no slack. The minimum Heawood-flip count on K_b observed across the data is 2, attained on 12 colourings at n = 18. Those colourings have one single minority Heawood vertex on V(K_b) — flipping its sign would give constancy. So the proof cannot rest on bulk inequalities like "at least half of V(K_b) has each sign"; it must rule out every single-minority configuration.

  2. The minority isn't anchored to a structural vertex. For low-flip-count colourings, the minority vertex(es) are distributed roughly evenly across v_n, A_0, …, A_4, and "other" non-named vertices in the rest of G':

    v_n        12.86%
    A_{i+1}    10.82%
    A_{i+2}     8.98%
    A_i         7.76%
    A_{i+4}     5.31%
    A_{i+3}     5.10%
    other      49.18%
    

    So no single named vertex is always the minority — the proof cannot fix on "vertex X must have Heawood opposite to the majority"; roughly half the time the minority lives outside the named six.

  3. No single named-vertex-pair is always a Heawood mismatch. The most "reliable" same-cycle pair is (A_i, A_{i+1}) and (A_{i+1}, A_{i+2}) at consecutive face-boundary positions, each with 75% mismatch rate — well short of universal. So the proof can neither identify a specific edge that always has differing Heawood at its endpoints, nor a specific vertex that's always minority.

Together (1)(3) say the obstruction is global, not local: there is always a Heawood mismatch on V(K_b), but where it sits varies by colouring. A successful structural proof will need a global argument — likely a topological / homological / parity-counting argument that operates on all of V(K_b) simultaneously, rather than identifying a specific forced flip.

Candidate mechanisms (none confirmed)

These were explored and the corresponding diagnostics ruled them out or revealed why they don't yield a contradiction on their own:

  • Heawood sum identity ∑_v h_φ(v) = 0. Holds only ~17.6% of the time on chord-apex+Kempe colourings; the sum can be anywhere in {-24, -20, …, 24}. So this classical identity is not available here.

  • Heawood sum on a single Kempe cycle ∑_{V(K)} h_φ = 0. Holds only ~23% of the time per cycle.

  • Cycle-side balance |L_b| = |R_b|. Holds only 35.43% of the time. Constancy would force this exactly, but the empirical imbalance is large in most colourings.

  • Specific named-vertex pair always mismatches. No such pair exists; closest is (A_j, A_{j+1}) at 75%.

Files

  • Paper text: paper.tex, sections 3 (Heawood number definition, Lemma 5.2), 5 (Lemma 5.3, Corollary 5.4, Remark 5.5).
  • Diagnostic scripts: see experiments/check_heawood_*.py, experiments/check_kempe_intersection_and_alternation.py, experiments/check_shared_*.py, experiments/check_cw_parity_prediction.py, experiments/check_constancy_obstruction.py, experiments/check_min_flip_structure.py, experiments/check_minority_location.py.

Failed proof route via edge-sharing Kempe constancy

A natural-looking strategy to prove Conjecture 5.1 was:

Conjecture (now disproved): If K_0 is an ${a,b}$-Kempe cycle of \varphi and K_1 is an ${a,c}$-Kempe cycle of \varphi that shares an edge with K_0, then h_\varphi cannot be constant on both V(K_0) and V(K_1) simultaneously.

Combined with Lemma 5.3 (no clause-3 witness \Rightarrow constancy on both V(K_b) and V(K_c)), this would have closed Conjecture 5.1. It is false by a concrete counterexample (Figure in paper.tex at \ref{fig:no-two-constant-kempe-counterexample}). A partial proof attempt is preserved in the paper alongside the disproof:

  • Step 1 (local CW structure) — unconditional.
  • Step 2 (forced odd-crossing \Rightarrow |E(K_0) \cap E(K_1)| even and \geq 2) — unconditional.
  • Step 3 (Heawood face-sum mod 3) — unconditional but does not yield a contradiction on its own.
  • Step 4 (lune-face Case A) — closes the sub-case where two shared a-edges are consecutive on \emph{both} cycles (automatic when |E(K_0) \cap E(K_1)| = 2).
  • Step 5 (general case) — open / now known false via the counterexample.

The counterexample shows the general case of the conjecture is unsalvageable; the search for a structural proof of Conjecture 5.1 will need a different angle.

Lessons from the structural-proof attempts (commit fd4b89a onward)

After working through three structural-proof routes to the deciding-face conjecture and beyond, here's what's known and what's still open:

What worked

  • Reduction (Theorem thm:deciding-face-implies-conj-5-1): Conjecture 5.1 follows from the existence of a deciding face — a face of the reduced dual with boundary ⊆ V(K_b) V(K_c) and length not divisible by 3 — via Heawood's classical face-sum identity. This is clean and tight.
  • Tight covering for $n_k = 5$ (Lemmas lem:flank-covering-base and lem:outer-face-covering-base): if any of (i) n_i = 5, (ii) n_{i+1} = 5, or (iii) n_{i+2} = n_{i+4} = 5 holds, the flank or outer face is a tight structural deciding face. Covers 94.97% of the 7,930 empirical (G, v, i) configurations up to |V(G)| \le 20.
  • Partial pigeonhole for the G'-pentagon fallback (Lemma lem:gprime-pigeonhole): if at most one vertex is uncovered by V(K_b) V(K_c), at least one G'-pentagon is a deciding face. Combined with the tight cases, covers ~91% structurally.

What didn't work

  • n_i = 6 flank-covering lemmathe lemma is empirically false. 9,228 / 142,812 chord-apex+Kempe colourings hit sub-case (ii.B) of Case (b), and 1,314 of those have P_1 ∉ V(K_b) V(K_c), falsifying the lemma. Retracted in commit 873c2cc.
  • Winding-number / topological invariant (commit fd4b89a). Σ-of-turn-signs around K_b under Lemma 5.2 alternation = 0. This is consistent with K_b being a simple closed planar curve bounding a non-empty region — not a contradiction. The natural topological invariant doesn't distinguish chord-apex+Kempe colourings under constancy from generic K_b's.
  • Closing the case analysis past |S| ≤ 1 — would require finer graph-structural input, fragmenting the case count. This is exactly the discharging flavour of the traditional reducible-configurations approach (Appel-Haken / RSST / Gonthier) to the Four Colour Theorem itself; we stopped to avoid replaying that route in a different vocabulary.

Diagnostic observation

Across all attempts, Lemma 5.2's alternation (= constancy on V(K_b) implies c_1-edges alternate sides along K_b) is the natural local consequence of constancy, but each global aggregation we've tried produces a quantity that's consistent with constancy rather than contradicting it:

  • Heawood face-sum mod 3 on a specific face: requires identifying which face — that's case analysis.
  • Winding number of K_b: zero under alternation, which only says K_b isn't a face boundary (true, not contradictory).
  • Identity s_b ⊕ s_c = i_b ⊕ i_c ⊕ 1: reduces under constancy to c_b ⊕ c_c = 1, true but not contradictory.

This suggests the contradiction in the chord-apex+Kempe setting is not captured by local-or-aggregable invariants of K_b alone.

What's left open

  • Conjecture (Deciding face): 100% empirically verified (142,812 / 142,812), 94.97% structurally proven, ~5% reducible to Conjecture (G'-pentagon fallback), itself 100% empirical but structurally open.
  • G'-pentagon fallback structural proof beyond |S| ≤ 1: needs a Kempe-cycle structural result describing which G'-pentagons the uncovered vertices of chord-apex+Kempe colourings can hit.
  • Minimality of G: invoked to derive chord-apex (= colour equality spike = merged) then forgotten. Possibly minimality has a stronger consequence we haven't extracted.

Final status (commit 2d8c679)

What we have proved

  1. Theorem (Conjecture 5.1 ⇐ Deciding face): clean reduction via Heawood's classical face-sum identity. Tight.

  2. Tight structural cases for the Deciding face conjecture:

    • (a') n_i = 5 → flank face F^\flat_{i, i+1} of length 4.
    • (b') n_{i+1} = 5 → flank face F^\flat_{i+1, i+2}.
    • (c) n_{i+2} = n_{i+4} = 5 → outer face F^\flat_{\text{outer}} of length 7.

    Empirical coverage: 7,531 / 7,930 (G, v, i) configurations = 94.97%.

  3. Refined pigeonhole + S-cycle structure: for bad colourings (= where Lemma flank-covering-hex fails), the uncovered vertex set S is even, forms a 2-regular subgraph (= a single cycle), and is bounded so that # G'-pentagons hit by S < p_G. This closes most bad cases structurally.

  4. Empirical closure: the G'-pentagon fallback conjecture is true on 1,314 / 1,314 bad colourings, hence on all 142,812 / 142,812 chord-apex+Kempe colourings up to |V(G)| \leq 20.

What's open structurally

  1. G'-pentagon fallback in full generality — proven via pigeonhole for |S| \leq 1, sketched for higher |S| via empirical bounds (max hit, min p_G), but no fully structural proof of the empirical bounds.

  2. The "|S| = 8 + hit = 8 ⇒ $p_G = 11$" regularity — striking empirical fact (30/30 cases) but no Kempe-cycle structural argument for it. The marginal |S| ↔ # pent F_k coupling doesn't hold; the regularity is a joint structural property of specific (|S|, \text{hit}) pairs.

What the path is

The structural proof has stabilized at the following form:

Conjecture 5.1 (face-monochromatic-pair)
  ⇐ Theorem (Conj 5.1 ⇐ Deciding face)     [tight]
Conjecture (Deciding face)
  ⇐ Theorem (Extended partial proof)       [tight on 7,531/7,930]
  ⇐ Conjecture (G'-pentagon fallback)       [empirical 1,314/1,314]
  ⇐ Lemma (G'-pentagon pigeonhole, |S| ≤ 1) [tight on |S| ≤ 1 ≈ 91%]
  + open: structural |S|-cycle arguments for |S| ≥ 2.

So Conjecture 5.1 is reduced to two open structural conjectures:

(i) The G'-pentagon fallback conjecture (empirically 100%). (ii) Kempe-cycle-structural lemmas about chord-apex+Kempe colourings sufficient to close the |S| ≥ 2 cases of the fallback.

Closing (i) and (ii) structurally would give a full structural proof. The empirical 100% across 142,812 colourings is strong evidence both are true.

Why this isn't Appel-Haken in disguise

The structural proof has 4 main steps + ~8 case-style sub-lemmas (vs RSST's 633 cases). The compression comes from the chord-apex+Kempe restriction doing most of the heavy lifting upfront — specifically, Lemma 5.3 (no-witness ⇒ constancy on V(K_b) V(K_c)) and Lemma 5.X (Kempe-spike, forced edge containments) pre-restrict the problem to a very constrained colour configuration. The Heawood face-sum identity then converts the local constancy into a mod-3 obstruction, and the case analysis closes specific structural sub-buckets.

The remaining open conjectures are at the chord-apex+Kempe class level, not at the level of arbitrary minimal counterexamples to 4CT.

Snapshot date

This commentary is current as of commit 2d8c679 (2026-05-25).