Topic 1: Overview & Principles
This overview orients you to ZeroProof’s goals, how it differs from conventional ML numerics, and where to look next.
What ZeroProof Solves
- Singularities and indeterminate forms appear in many domains (e.g., division by zero, ∞−∞, 0×∞, log of non-positive inputs, zero-variance normalization).
- Conventional stacks rely on ε-hacks or avoid these regions entirely, causing silent instability, NaN propagation, or brittle heuristics (see the sketch after this list).
- ZeroProof uses Transreal (TR) arithmetic to make operations total: every op returns a value with an explicit tag, never throwing.
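The failure mode is easy to reproduce. A minimal illustration in plain NumPy of the conventional behavior ZeroProof replaces (this is not ZeroProof code, and the ε value is an arbitrary example):

```python
# Plain IEEE-754/NumPy behavior at singular inputs: silent NaN/inf, plus an
# epsilon workaround that merely distorts values near the singularity.
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    print(np.float64(1.0) / np.float64(0.0))  # inf  (pole)
    print(np.float64(0.0) / np.float64(0.0))  # nan  (indeterminate 0/0)
print(float("inf") - float("inf"))            # nan  (indeterminate inf - inf)

eps = 1e-8                 # arbitrary epsilon "fix"
x = 1e-12
print(1.0 / (x + eps))     # ~1e8: finite, but 1/x is ~1e12 -- badly distorted
```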
Core Idea: Transreal Values
- Tags: REAL (finite), PINF (+∞), NINF (−∞), PHI (nullity for indeterminate forms). Wheel-mode adds BOTTOM (⊥).
- Total operations: +, −, ×, ÷, log, sqrt, pow_int are defined for all tag combinations with deterministic rules.
- Determinism: Tag decisions use exact predicates (e.g., denom == 0); no ε thresholds appear in core semantics. A minimal sketch follows this list.
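As a deliberately simplified sketch of these semantics, a total division could be written as follows; `TRValue`, `Tag`, and `tr_div` are illustrative names, not ZeroProof's actual API:

```python
# Illustrative sketch of TR semantics; not ZeroProof's actual API.
from dataclasses import dataclass
from enum import Enum

class Tag(Enum):
    REAL = "REAL"   # finite real
    PINF = "PINF"   # +infinity
    NINF = "NINF"   # -infinity
    PHI = "PHI"     # nullity: indeterminate forms

@dataclass(frozen=True)
class TRValue:
    value: float    # meaningful only when tag == Tag.REAL
    tag: Tag

def tr_div(a: TRValue, b: TRValue) -> TRValue:
    """Total division: defined for every input, never raises."""
    if a.tag == Tag.PHI or b.tag == Tag.PHI:
        return TRValue(0.0, Tag.PHI)            # nullity propagates
    if a.tag == Tag.REAL and b.tag == Tag.REAL:
        if b.value == 0.0:                      # exact predicate, no epsilon
            if a.value == 0.0:
                return TRValue(0.0, Tag.PHI)    # 0/0 is indeterminate
            return TRValue(0.0, Tag.PINF if a.value > 0 else Tag.NINF)
        return TRValue(a.value / b.value, Tag.REAL)
    # Remaining infinity cases (e.g., inf/inf -> PHI, real/inf -> 0) are
    # resolved by similar exact tag rules; elided here, PHI is a placeholder.
    return TRValue(0.0, Tag.PHI)
```

Note that the tag decision hinges on the exact predicate `b.value == 0.0`; no ε appears anywhere in the rule.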
Key Innovations
- Total operations: No exceptions; stable behavior at singularities.
- Mask-REAL gradients: Non-REAL forward tags produce zero parameter gradients, preventing gradient explosions at poles (see the first sketch after this list).
- Saturating gradients (optional): Bounded gradients near poles without ε, for continuous alternatives.
- Coverage control: An adaptive λ_rej penalty maintains a target REAL/non-REAL ratio during training (see the second sketch after this list).
- Pole learning: Layers and metrics explicitly detect, supervise, and localize poles.
- Stratified evaluation: Near‑pole analysis via |det(J)| buckets (B0–B4) with per‑bucket MSE and 2D pole metrics (PLE, sign consistency, slope error, residual consistency).
- Comparator parity: Unified driver runs MLP, ε‑rational, TR basic/full, and DLS on identical splits; quick profile performs stratified subsampling and aligns DLS to the same test subset.
- Bench transparency: Hybrid trainer records per‑epoch timings (avg_step_ms, data_time_ms, optim_time_ms, batches) in training summaries.
- Reproducibility: Global seeding across Python/NumPy/PyTorch; dataset JSON embeds bucket metadata and seed.
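To make the Mask-REAL rule concrete, here is a minimal sketch assuming per-sample gradients and forward tags are available; the function name and array layout are hypothetical, not ZeroProof's autodiff:

```python
# Sketch of the Mask-REAL rule (illustrative, not ZeroProof's implementation):
# samples whose forward pass produced a non-REAL tag contribute exactly zero
# to the parameter gradient, so a pole hit cannot inject an unbounded update.
import numpy as np

def masked_mean_grad(per_sample_grads: np.ndarray, tags: list) -> np.ndarray:
    """per_sample_grads: (n, d) gradients; tags: forward tag per sample."""
    mask = np.array([t == "REAL" for t in tags], dtype=per_sample_grads.dtype)
    return (per_sample_grads * mask[:, None]).mean(axis=0)

grads = np.array([[2.0], [1e12], [3.0]])   # huge gradient from a pole hit
tags = ["REAL", "PINF", "REAL"]            # forward tags at each sample
print(masked_mean_grad(grads, tags))       # pole sample masked -> [1.666...]
```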
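Coverage control can likewise be pictured as a small feedback loop. The proportional update below is an assumed, illustrative rule for how an adaptive λ_rej might track a coverage target, not the library's actual controller:

```python
# Illustrative lambda_rej controller: raise the rejection penalty when the
# fraction of REAL outputs (coverage) falls below target, lower it otherwise.
def update_lambda_rej(lambda_rej: float, coverage: float,
                      target: float = 0.95, lr: float = 0.1) -> float:
    # Proportional control on the coverage error; clamp to stay non-negative.
    return max(0.0, lambda_rej + lr * (target - coverage))

lam = 1.0
for cov in [0.80, 0.90, 0.97]:   # coverage measured each epoch
    lam = update_lambda_rej(lam, cov)
    print(f"coverage={cov:.2f} -> lambda_rej={lam:.3f}")
```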
Design Principles
- Totality first: All ops are total on TR; forward/backward never throw.
- Exactness over ε: No hidden thresholds in core; any evaluation-time τ is explicitly opt-in.
- Explicit tags: Values carry their semantics with them; reductions declare STRICT vs DROP_NULL behavior (sketched after this list).
- Gradient safety: Mask-REAL ensures bounded updates when encountering non-REAL tags.
- Mode isolation: TR and Wheel modes never mix within an op; bridging is explicit.
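A toy reduction makes the two modes concrete (hypothetical `tr_sum`; infinity handling is elided and only PHI treatment is shown):

```python
# Toy reduction showing the two declared modes (illustrative names): STRICT
# lets one nullity poison the aggregate; DROP_NULL sums the REAL slice only.
def tr_sum(values, mode="STRICT"):
    """values: list of (float, tag) pairs; returns a (float, tag) pair."""
    if mode == "STRICT" and any(tag == "PHI" for _, tag in values):
        return (0.0, "PHI")                # one nullity poisons the sum
    reals = [v for v, tag in values if tag == "REAL"]
    if not reals:
        return (0.0, "PHI")                # nothing REAL left to aggregate
    return (sum(reals), "REAL")

print(tr_sum([(1.0, "REAL"), (0.0, "PHI"), (2.0, "REAL")], "STRICT"))     # PHI
print(tr_sum([(1.0, "REAL"), (0.0, "PHI"), (2.0, "REAL")], "DROP_NULL"))  # (3.0, 'REAL')
```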
When to Use TR vs Wheel
- TR (default): Keeps arithmetic flowing with PHI for indeterminate forms; suitable for most ML tasks.
- Wheel (optional): Replaces certain indeterminate results with ⊥ for stricter algebraic control (e.g., 0×∞ = ⊥, ∞+∞ = ⊥); useful for audits and verification.
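The difference only surfaces at the indeterminate cases. A hedged sketch (hypothetical `tr_mul`, reusing the (value, tag) convention from the division sketch above):

```python
# Mode-dependent handling of the indeterminate case 0 * inf (illustrative,
# not the library's API; TR and Wheel modes never mix within one op).
def tr_mul(a, b, mode="TR"):
    """a, b: (value, tag) pairs; value is meaningful only for tag 'REAL'."""
    (av, at), (bv, bt) = a, b
    if at == "REAL" and bt == "REAL":
        return (av * bv, "REAL")
    if (at == "REAL" and av == 0.0 and bt in ("PINF", "NINF")) or \
       (bt == "REAL" and bv == 0.0 and at in ("PINF", "NINF")):
        # 0 * inf: PHI keeps TR arithmetic flowing; BOTTOM propagates strictly.
        return (0.0, "PHI" if mode == "TR" else "BOTTOM")
    # Remaining combinations (e.g., inf * inf) elided; PHI is a placeholder.
    return (0.0, "PHI")

print(tr_mul((0.0, "REAL"), (0.0, "PINF"), mode="TR"))     # (0.0, 'PHI')
print(tr_mul((0.0, "REAL"), (0.0, "PINF"), mode="WHEEL"))  # (0.0, 'BOTTOM')
```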
Where to Go Next
- Semantics and tables: See Spec Clarifications in complete_v2.md.
- Autodiff rules: See docs/autodiff_mask_real.md and docs/saturating_grad_guide.md.
- Layers overview: docs/layers.md (TR-Rational, TR-Norm, enhanced variants).
- Bridges and interop: docs/bridge_summary.md and docs/bridge_extended.md.
- Precision and float64 enforcement: docs/float64_enforcement.md.
- Adaptive loss & coverage: docs/adaptive_loss_guide.md and summaries in docs/adaptive_loss_summary.md.
- Robotics how‑tos & parity runner: docs/topics/08_howto_checklists.md (dataset flags, quick/full profiles, comparator driver, and bench metrics).
Glossary
- REAL: Finite real number; the standard slice of ℝ.
- PINF/NINF: +∞/−∞ as first-class values.
- PHI (Φ): Nullity; indeterminate forms (0/0, ∞−∞, 0×∞, log≤0, 0^0, (±∞)^0).
- BOTTOM (⊥): Wheel-mode’s error-like element; propagates strictly.
- STRICT/DROP_NULL: Reduction modes governing how non-REALs aggregate.
- Mask-REAL: Autodiff rule sending zero grads through non-REAL forward states.
For the full conceptual framework, see concept_250908.md.