Why Your Neural Network Can't Learn 1/x (And What We Did About It)


08 Sep 2025

Try this experiment: train a standard neural network to approximate f(x) = 1/x on the interval [-2, 2]. Use whatever architecture you like. Dense layers, ReLU activations, plenty of capacity. Train until convergence. Now plot the results near x = 0.

You will see something frustrating. While your network captures the behavior perfectly at x = ±2, ±1, ±0.5, it creates a smooth, incorrect plateau exactly where the function should shoot to infinity. The network has learned to give up. It is not a matter of training longer or adding more parameters. Standard architectures with continuous, bounded activation functions fundamentally cannot represent poles. They will always smooth the singularity away, creating what we call a "Soft Wall" or a "Clipped Peak" where the physics demands an infinite asymptote.
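The "Soft Wall" has a simple cause: a network built from bounded activations has a hard output ceiling determined by its final-layer weights, no matter what the earlier layers do. A minimal NumPy illustration with a one-hidden-layer tanh network (random weights, but the bound holds for any trained values):

```python
import numpy as np

rng = np.random.default_rng(0)
# One-hidden-layer tanh network: f(x) = w2 . tanh(w1*x + b1) + b2.
w1, b1 = rng.normal(size=64), rng.normal(size=64)
w2, b2 = rng.normal(size=64), 0.0

def mlp(x):
    return w2 @ np.tanh(w1 * x + b1) + b2

# Since |tanh| <= 1, the output satisfies |f(x)| <= sum|w2| + |b2|
# for every x — so the network cannot follow 1/x as x -> 0.
bound = np.abs(w2).sum() + abs(b2)
x = 1e-6
print(abs(mlp(x)) <= bound, abs(1 / x) > bound)  # → True True
```

No amount of training moves this ceiling; it can only be raised by growing the weights, while 1/x needs to exceed any finite ceiling arbitrarily close to the pole.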

This isn't just a mathematical curiosity. In molecular dynamics, this "Soft Wall" allows atoms to fuse together ("Ghost Molecules") because the network underestimates the 1/r^{12} repulsion force. In RF electronics, it erases the sharp resonance peaks of 5G filters. In robotics, it causes the controller to panic or freeze near kinematic locks.

The solution: Signed Common Meadows (SCM)

Our earlier releases experimented with Transreal arithmetic. We now rely on Signed Common Meadows, the algebra described by Bergstra & Tucker in which division is total and a bottom element ⊥ propagates deterministically. ZeroProofML implements that algebra in three pieces:

  1. Rational SCM layers that learn P(x)/Q(x) directly, giving the hypothesis space the capacity to represent poles and asymptotes.
  2. Projective training in homogeneous tuples ⟨N, D⟩ with detached renormalization. This “ghost gradient” trick keeps optimization smooth when D → 0 without redefining the function being learned.
  3. Strict inference with configurable thresholds (τ_train, τ_infer), fracterm flattening, and explicit bottom_mask / gap_mask outputs so every singular decode is surfaced instead of silently smoothed away.
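The second piece can be sketched in a few lines: rescale the homogeneous pair ⟨N, D⟩ by a factor that is treated as a constant during backpropagation, so magnitudes stay bounded while the represented value N/D is untouched. This is an illustrative NumPy helper (the name `renorm_detached` and the layout are our assumptions, not the ZeroProofML API):

```python
import numpy as np

def renorm_detached(N, D, tau=1e-6):
    """Rescale the homogeneous pair <N, D> so magnitudes stay bounded
    while the represented value N/D is unchanged. Hypothetical helper,
    not the ZeroProofML API."""
    r = np.maximum(np.maximum(np.abs(N), np.abs(D)), tau)
    # In an autograd framework r would be detached (stop_gradient), so
    # gradients flow through N and D but not through the rescaling —
    # the "ghost gradient" trick.
    return N / r, D / r

N, D = np.array([3.0, 1e9]), np.array([2.0, 1e-3])
Nn, Dn = renorm_detached(N, D)
print(np.allclose(Nn / Dn, N / D))  # value preserved → True
```

Because the optimizer only ever sees the bounded pair, gradients stay finite even as D → 0, while the function being learned is exactly the same rational map.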

These ingredients form the “Train on Smooth, Infer on Strict” protocol. During training we enjoy stable gradients; during inference we enforce true SCM semantics, meaning 1/0 returns ⊥ and carries weak sign information rather than causing NaNs.
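The strict-inference half of the protocol can be sketched as follows. Entries whose denominator falls below the inference threshold are flagged as ⊥ via an explicit mask instead of producing inf/NaN; the function name and mask layout here are our illustration, not the actual ZeroProofML API:

```python
import numpy as np

def strict_decode(N, D, tau_infer=1e-12):
    """Strict SCM decode: return N/D where the denominator is
    resolvable, and flag |D| <= tau_infer as bottom. Sketch only;
    names are assumptions, not the ZeroProofML API."""
    bottom_mask = np.abs(D) <= tau_infer
    y = np.zeros_like(N)
    safe = ~bottom_mask
    y[safe] = N[safe] / D[safe]
    # Weak sign information near the pole: sign(N)*sign(D) tells the
    # caller which side of the asymptote the decode approached from.
    sign = np.sign(N) * np.sign(D)
    return y, bottom_mask, sign

N = np.array([1.0, 1.0, -2.0])
D = np.array([2.0, 0.0, 0.0])
y, mask, sign = strict_decode(N, D)
print(y[0], mask.tolist())  # → 0.5 [False, True, True]
```

The point is that every singular decode is surfaced to the caller: downstream code can branch on the mask instead of discovering NaNs three layers later.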

The "Physics Trinity" Benchmarks

We validated the framework against “steel man” baselines in three domains that stress different singular behaviors:

  1. Pharma / Dose (informational singularity). Angular SCM keeps the censored/in-range gate strict. Across roughly two hundred thousand censored test samples the false-accept rate stayed near zero while still delivering usable regression on accepted points. ε-regularized rationals, by contrast, hallucinated finite answers on more than half of those censored inputs. Safety wins by making ⊥ a first-class output.
  2. Electronics / RF (spectral poles). A shared complex denominator nearly doubles the success yield compared to an over-parameterized MLP and restores phase coherence instead of letting real/imaginary heads drift independently. The ablation that removes the shared denominator collapses, which tells us the inductive bias—not raw capacity—is doing the work.
  3. Robotics / IK (optimization stability). The SCM parameterization trades a touch of mean error for an order-of-magnitude reduction in seed-to-seed variance and tighter tail behavior near kinematic locks. Rational+ε versions remained unstable, so if you need reproducibility or certifiable behavior, the bias pays for itself.
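For the RF case, the shared-denominator inductive bias is easy to picture: the real and imaginary parts of the frequency response are two numerators over one learned Q, so both channels are forced to agree on pole locations and the phase stays coherent. A toy evaluation under assumed coefficient layouts (this is not the benchmark model):

```python
import numpy as np

def shared_denominator_response(s, p_re, p_im, q):
    """Evaluate H(s) = (P_re(s) + j*P_im(s)) / Q(s) with one shared
    denominator Q, so both channels share the same poles. Illustrative
    sketch; the polynomial-coefficient layout is an assumption."""
    P = np.polyval(p_re, s) + 1j * np.polyval(p_im, s)
    Q = np.polyval(q, s)
    return P / Q

# Sample a pole-free band so the toy evaluation stays finite.
s = np.linspace(0.1, 2.0, 5)
H = shared_denominator_response(s, [1.0, 0.0], [0.5], [1.0, -3.0])
print(H.shape)  # → (5,)
```

Two independent real/imaginary heads would each fit their own denominator and could place poles at slightly different frequencies, which is exactly the drift the ablation in the benchmark exposes.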

What this means practically

If your problem involves smooth manifolds (like images or text), stick with Transformers and ResNets. But if your research involves functions that blow up, go to zero, or become undefined, you are fighting against the inductive bias of your neural network.

ZeroProofML provides a drop-in replacement layer that aligns the network's algebra with the physics of singularities. The code is open source. If you are working on power systems (voltage collapse), finance (correlation singularities), or fluid dynamics (shock waves), we would love to see if SCM solves your stability problems.

Modified on 06 Jan 2026, after the release of v0.4.0.

