Why Your Neural Network Can't Learn 1/x (And What We Did About It)


08 Sep 2025

Try this experiment: train a standard neural network to approximate $f(x) = 1/x$ on the interval $[-2, 2]$. Use whatever architecture you like. Dense layers, ReLU activations, plenty of capacity. Train until convergence. Now plot the results near $x = 0$.
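Here is a minimal, self-contained version of that experiment in PyTorch. The architecture, sampling density, and training budget below are illustrative choices, nothing about them is special; any comparable setup shows the same failure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Dense samples on [-2, 2]; drop points too close to zero so targets stay finite.
x = torch.linspace(-2.0, 2.0, 4001).unsqueeze(1)
x = x[x.abs() > 1e-2].unsqueeze(1)
y = 1.0 / x

mlp = nn.Sequential(
    nn.Linear(1, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)

for _ in range(5000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(mlp(x), y)
    loss.backward()
    opt.step()

# Probe toward the pole: the target diverges, the network plateaus.
with torch.no_grad():
    for xv in (0.5, 0.1, 0.05, 0.02):
        pred = mlp(torch.tensor([[xv]])).item()
        print(f"x={xv:5.2f}  net={pred:9.2f}  true={1.0 / xv:9.2f}")
```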

You will see something frustrating. While your network captures the behavior perfectly at $x = \pm 2, \pm 1, \pm 0.5$, it creates a smooth, incorrect plateau exactly where the function should shoot to infinity. The network has learned to give up. It is not a matter of training longer or adding more parameters. Standard architectures with continuous, bounded activation functions fundamentally cannot represent poles. They will always smooth the singularity away, creating what we call a "Soft Wall" or a "Clipped Peak" where the physics demands an infinite asymptote.

This isn't just a mathematical curiosity. In molecular dynamics, this "Soft Wall" allows atoms to fuse together ("Ghost Molecules") because the network underestimates the $1/r^{12}$ repulsion force. In RF electronics, it erases the sharp resonance peaks of 5G filters. In robotics, it causes the controller to panic or freeze near kinematic locks.

The solution: Signed Common Meadows (SCM)

Our earlier releases experimented with Transreal arithmetic. We now rely on Signed Common Meadows, the algebra described by Bergstra & Tucker where division is total and a bottom element $\bot$ propagates deterministically. ZeroProofML implements that algebra in three pieces:

  1. Rational SCM layers that learn $P(\mathbf{x})/Q(\mathbf{x})$ directly, giving the hypothesis space the capacity to represent poles and asymptotes.
  2. Projective training in homogeneous tuples $\langle N, D\rangle$ with detached renormalization. This “ghost gradient” trick keeps optimization smooth when $D \to 0$ without redefining the function being learned (see the sketch after this list).
  3. Strict inference with configurable thresholds $(\tau_{\text{train}}, \tau_{\text{infer}})$, fracterm flattening, and explicit bottom_mask / gap_mask outputs so every singular decode is surfaced instead of silently smoothed away.
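To make the first two pieces concrete, here is a condensed sketch of the idea. The names (RationalHead, projective_loss) and the layer sizes are ours for illustration; this is the shape of the mechanism, not ZeroProofML's actual API.

```python
import torch
import torch.nn as nn

class RationalHead(nn.Module):
    """Sketch of pieces 1 and 2: learn y = P(x)/Q(x) as a homogeneous tuple <N, D>."""

    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.P = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1))
        self.Q = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1))

    def forward(self, x: torch.Tensor):
        n, d = self.P(x), self.Q(x)
        # Detached renormalization ("ghost gradient"): the scale is computed
        # outside the autograd graph, so it controls magnitudes without
        # changing the represented value N/D or redefining the learned function.
        scale = (n.detach().pow(2) + d.detach().pow(2)).sqrt().clamp_min(1e-12)
        return n / scale, d / scale

def projective_loss(n, d, y_target):
    # Train in homogeneous coordinates: N - y*D = 0 exactly when N/D = y,
    # and the residual stays smooth even where the denominator crosses zero.
    return ((n - y_target * d) ** 2).mean()
```

The homogeneous loss shown here is one simple choice; the point is that nothing in this training path ever divides by the learned denominator, and the only division, by the detached scale, is clamped away from zero.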

These ingredients form the “Train on Smooth, Infer on Strict” protocol. During training we enjoy stable gradients; during inference we enforce true SCM semantics, meaning $1/0$ returns $\bot$ and carries weak sign information rather than causing NaNs.
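Here is a sketch of that strict decode at inference, with the same caveat that the threshold value and the masking convention are illustrative choices:

```python
import torch

def strict_decode(n: torch.Tensor, d: torch.Tensor, tau_infer: float = 1e-6):
    """Strict decode sketch: flag singular outputs instead of smoothing them."""
    bottom_mask = d.abs() < tau_infer           # |D| under threshold decodes to bottom
    weak_sign = torch.sign(n) * torch.sign(d)   # weak sign info carried alongside bottom
    safe_d = torch.where(bottom_mask, torch.ones_like(d), d)
    y = n / safe_d  # finite everywhere; entries under bottom_mask stand in for bottom
    return y, bottom_mask, weak_sign
```

The explicit mask is the point: downstream code has to acknowledge the singular decode, and a flagged $\bot$ is deterministic in a way that a silently propagating NaN is not.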

The "Physics Trinity" Benchmarks

We validated this architecture against "Steel-Man" MLP baselines (N=11 seeds) across three distinct physical domains. The results confirm that architecture is destiny.

  1. Pharma: The "Hard Wall" ($1/r^{12}$). When modeling atomic repulsion, standard MLPs create a "Soft Wall" that collapses under pressure. ZeroProofML (using an Improper SCM head) guarantees super-linear growth.
  • Result: >3,000x reduction in core extrapolation error.
  • Impact: While the MLP barrier broke at a force of 1250, ZeroProofML held firm at 3000, matching the analytic oracle perfectly.
  2. Electronics: Spectral Extrapolation ($1/Q$). Predicting high-Q resonance peaks from low-Q data is notoriously hard. MLPs suffer from "Phase Incoherence," smearing the Real and Imaginary parts and clipping the peak.
  • Result: 70% Yield vs 40% for MLPs.
  • Impact: We successfully extrapolated resonance poles 33x out-of-distribution (from $Q=30$ to $Q=1000$). In worst-case failures, SCM retained 50% of the signal energy, while MLPs lost 98%.
  3. Robotics: Geometric Consistency ($\det J \to 0$). Near a kinematic lock, standard controllers often jitter or spike.
  • Result: 31.8x lower variance across training seeds.
  • Impact: While MLPs achieved slightly lower mean error by overfitting the smooth workspace, they under-reacted to the singularity by ~2.4%. ZeroProofML recovered the rational damping peak and reduced worst-case control spikes by 60x. It trades average-case precision for deterministic safety.

What this means practically

If your problem involves smooth manifolds (like images or text), stick with Transformers and ResNets. But if your research involves functions that blow up, go to zero, or become undefined, you are fighting against the inductive bias of your neural network.

ZeroProofML provides a drop-in replacement layer that aligns the network's algebra with the physics of singularities. The code is open source. If you are working on power systems (voltage collapse), finance (correlation singularities), or fluid dynamics (shock waves), we would love to see if SCM solves your stability problems.

Modified on 06 Jan 2026, after the release of v0.4.0.

