Inference & Deployment
Inference in v0.5.1 is strict SCM inference. There are no stochastic training thresholds, no gradient policies, and no hidden guard branches. A deployment receives numeric payloads plus masks and must treat those masks as the contract.
Stable Output Contract
Strict inference returns:
decoded, bottom_mask, gap_mask = result
| Field | Meaning | Deployment use |
|---|---|---|
| decoded | Finite decoded payload when accepted; commonly NaN for bottom samples | Feed accepted samples to downstream code |
| bottom_mask | Authoritative fail-closed mask | Reject, abstain, or route to fallback |
| gap_mask | Finite samples where `tau_infer <= \|Q\| < tau_train` | Treat as a warning; monitor, or reject in conservative systems |
bottom_mask and gap_mask should be disjoint. Do not treat gap_mask as bottom; it is a warning that a finite prediction is close to the training/inference boundary.
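A consumer can check this contract cheaply before using a batch. The sketch below is plain Python over lists and is not part of zeroproofml; `check_mask_contract` is a hypothetical helper name, and it assumes one float and two booleans per sample.

```python
import math


def check_mask_contract(decoded, bottom_mask, gap_mask):
    """Return a list of contract violations for one batch (empty means OK).

    Hypothetical helper: assumes one float and two booleans per sample.
    """
    problems = []
    for i, (value, is_bottom, is_gap) in enumerate(zip(decoded, bottom_mask, gap_mask)):
        # The two masks describe different conditions and should never overlap.
        if is_bottom and is_gap:
            problems.append(f"sample {i}: bottom_mask and gap_mask overlap")
        # Accepted samples (not bottom) are expected to carry finite payloads.
        if not is_bottom and not math.isfinite(value):
            problems.append(f"sample {i}: accepted sample is not finite")
    return problems


batch_problems = check_mask_contract(
    decoded=[0.5, float("nan"), 2.0],
    bottom_mask=[False, True, False],
    gap_mask=[False, False, True],
)
print(batch_problems)
```

Running a check like this at the service boundary turns a silent contract break into an explicit error before any decoded value is consumed.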
Strict Decode
from zeroproofml.inference import InferenceConfig, strict_inference
cfg = InferenceConfig(tau_infer=1e-6, tau_train=1e-4)
decoded, bottom_mask, gap_mask = strict_inference(P, Q, config=cfg)
Runtime rule:
|Q| < tau_infer -> bottom_mask=True
tau_infer <= |Q| < tau_train -> gap_mask=True
If tau_train is not known or not relevant, omit it and monitor only bottoms.
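The runtime rule can be written out per sample as follows. This is an illustrative sketch in plain Python, not the library's implementation; `classify_denominator` is a hypothetical name, and the choice to treat a NaN denominator as bottom is an assumption of the sketch rather than documented behavior.

```python
def classify_denominator(q, tau_infer, tau_train=None):
    """Return (is_bottom, is_gap) for one denominator value.

    Hypothetical helper mirroring the runtime rule above.
    """
    q_abs = abs(q)
    # Written as "not >=" so that a NaN denominator fails closed.
    # That NaN behavior is an assumption of this sketch.
    is_bottom = not (q_abs >= tau_infer)
    # Without tau_train there is no gap band, so only bottoms are reported.
    is_gap = tau_train is not None and tau_infer <= q_abs < tau_train
    return is_bottom, is_gap


for q in (5e-7, 5e-5, 0.3):
    print(q, classify_denominator(q, tau_infer=1e-6, tau_train=1e-4))
```

Because the gap band starts exactly at `tau_infer`, a sample can never be both bottom and gap under this rule.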
Choosing tau_infer
Pick tau_infer from held-out denominator data, not from a training default.
from zeroproofml.metrics import tau_infer_sweep_from_q_abs, write_tau_infer_sweep
curves = tau_infer_sweep_from_q_abs(
    q_abs=q_abs_held_out,
    is_in_range=is_in_range,
    taus=[1e-6, 3e-6, 1e-5, 3e-5, 1e-4],
)
write_tau_infer_sweep(
    "results/tau_calibration",
    curves,
    provenance={"split": "held_out"},
)
Selection trade-off:
- Larger tau_infer: fewer unsafe finite accepts near singularities, more rejection.
- Smaller tau_infer: higher coverage, more risk near the denominator boundary.
Freeze the chosen value in InferenceConfig and ship it inside bundle metadata.
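The selection trade-off can be made concrete with a small hand-rolled sweep. The sketch below does not call the zeroproofml metrics functions; `sweep_tau` and the `should_reject` labels are hypothetical, and the numbers are toy values rather than measurements.

```python
def sweep_tau(q_abs, should_reject, taus):
    """For each tau, report coverage and the count of unsafe finite accepts.

    Hypothetical helper: should_reject marks held-out samples that a
    deployment would not want to accept.
    """
    rows = []
    n = len(q_abs)
    for tau in taus:
        # A sample is accepted when its denominator magnitude clears tau.
        accepted = [q >= tau for q in q_abs]
        coverage = sum(accepted) / n
        unsafe_accepts = sum(a and bad for a, bad in zip(accepted, should_reject))
        rows.append({"tau": tau, "coverage": coverage, "unsafe_accepts": unsafe_accepts})
    return rows


# Toy held-out values, for illustration only.
q_abs_held_out = [2e-7, 5e-6, 2e-5, 8e-5, 0.1, 0.4]
should_reject = [True, True, True, False, False, False]
rows = sweep_tau(q_abs_held_out, should_reject, taus=[1e-6, 1e-5, 1e-4])
for row in rows:
    print(row)
```

On this toy data, raising the threshold removes unsafe accepts at the cost of coverage, which is the same trade-off the calibration curves are meant to expose.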
Wrapping A Model
from zeroproofml.inference import InferenceConfig, SCMInferenceWrapper
wrapped = SCMInferenceWrapper(
    model,
    config=InferenceConfig(tau_infer=1e-6, tau_train=1e-4),
).eval()
decoded, bottom_mask, gap_mask = wrapped(x)
In training mode, wrappers can pass projective outputs through for loss computation. In eval mode, they decode strictly.
ONNX Bundles
ONNX bundles are the preferred deployment handoff artifact:
from zeroproofml.inference import (
    export_bundle,
    load_onnx_runtime_bundle,
    run_bundle_reference_smoke_test,
    validate_bundle,
)
export_bundle(wrapped, "bundle_dir", example_inputs=(x_example,))
validate_bundle("bundle_dir")
runtime = load_onnx_runtime_bundle(
    "bundle_dir",
    providers=["CPUExecutionProvider"],
)
decoded, bottom_mask, gap_mask = runtime.run(x_numpy)
run_bundle_reference_smoke_test("bundle_dir", wrapped, (x_smoke,))
A stable bundle contains:
- model.onnx
- metadata.json
Common report artifacts beside the bundle:
- VALIDATION_REPORT.md
- VALIDATION_REPORT.summary.json
- VALIDATION_REPORT.summary.svg
metadata.json records the schema version, tau_infer, optional tau_train, input/output signatures, batch-axis semantics, mask semantics, and the strict output order:
decoded, bottom_mask, gap_mask
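A consumer may want to confirm at load time that a bundle's metadata matches what it expects. The sketch below is only an illustration: the key names `tau_infer` and `output_order` are hypothetical stand-ins, since the actual metadata.json schema is not shown here, and `check_bundle_metadata` is not a zeroproofml function.

```python
import json

EXPECTED_OUTPUT_ORDER = ["decoded", "bottom_mask", "gap_mask"]


def check_bundle_metadata(metadata, expected_tau_infer):
    """Return mismatches between bundle metadata and consumer expectations.

    The key names "tau_infer" and "output_order" are hypothetical.
    """
    problems = []
    if metadata.get("tau_infer") != expected_tau_infer:
        problems.append("tau_infer differs from the calibrated value")
    if metadata.get("output_order") != EXPECTED_OUTPUT_ORDER:
        problems.append("output order is not decoded, bottom_mask, gap_mask")
    return problems


# Inline JSON stands in for the contents of a metadata file.
example_metadata = json.loads(
    '{"tau_infer": 1e-06, "output_order": ["decoded", "bottom_mask", "gap_mask"]}'
)
print(check_bundle_metadata(example_metadata, expected_tau_infer=1e-6))
```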
TorchScript support remains a legacy compatibility path. Prefer ONNX for new deployments.
Fallback Patterns
The simplest deployment action is reject-on-bottom:
from zeroproofml.inference import reject_on_bottom
decoded_safe, accept_mask = reject_on_bottom(decoded, bottom_mask)
For conservative systems, reject the gap region too:
from zeroproofml.inference import reject_on_gap
decoded_safe, accept_mask = reject_on_gap(decoded, bottom_mask, gap_mask)
For robotics or control systems, route bottom or risky samples to an analytic solver:
from zeroproofml.inference import route_to_analytic_solver
resolved, route_mask = route_to_analytic_solver(
    decoded,
    bottom_mask=bottom_mask,
    analytic_solver=solver,
    inputs=x,
)
Keep the mask and the selected route in logs. Silent replacement of rejected values makes audits much harder.
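One way to keep masks and routes auditable is to emit a record per sample. The sketch below uses only the standard library; `audit_records`, the route labels, and the record fields are hypothetical choices for illustration, not a format defined by zeroproofml.

```python
import json


def audit_records(decoded, bottom_mask, gap_mask, reject_gap=False):
    """Build one JSON-serializable record per sample with its masks and route.

    Hypothetical helper; route labels and field names are illustrative.
    """
    records = []
    for i, (value, is_bottom, is_gap) in enumerate(zip(decoded, bottom_mask, gap_mask)):
        if is_bottom:
            route = "fallback"
        elif is_gap and reject_gap:
            route = "rejected_gap"
        else:
            route = "accepted"
        records.append({
            "index": i,
            "bottom": bool(is_bottom),
            "gap": bool(is_gap),
            "route": route,
            # Non-accepted payloads are logged as null, never silently replaced.
            "value": value if route == "accepted" else None,
        })
    return records


records = audit_records(
    decoded=[0.5, float("nan"), 2.0],
    bottom_mask=[False, True, False],
    gap_mask=[False, False, True],
)
for record in records:
    print(json.dumps(record))
```

Logging the route next to the masks lets an auditor reconstruct why each value was used, withheld, or replaced.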
Monitoring
from zeroproofml.inference import StrictInferenceMonitor
from zeroproofml.utils.logging import JsonlLogger
events = JsonlLogger("strict_inference_events.jsonl")
monitor = StrictInferenceMonitor(
    bundle_id="robotics_rr_ik_v1",
    event_logger=events,
    acceptance_rate_drift_threshold=0.1,
)
monitor.update(bottom_mask, gap_mask)
rates = monitor.rates()
state = monitor.export_state(
    include_histograms=True,
    include_batch_summaries=True,
)
Monitor at least:
- bottom rate
- gap rate
- acceptance rate
- route/fallback rate
- denominator minimums when available
- drift against calibration or validation rates
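The first three rates and a simple drift check can be computed directly from the masks. This is a minimal sketch, not how StrictInferenceMonitor computes them; `mask_rates` and `acceptance_drift` are hypothetical helpers, and the sketch assumes a sample counts as accepted whenever it is not bottom.

```python
def mask_rates(bottom_mask, gap_mask):
    """Compute per-batch rates from boolean masks (hypothetical helper)."""
    n = len(bottom_mask)
    if n == 0:
        raise ValueError("empty batch")
    bottom_rate = sum(bottom_mask) / n
    gap_rate = sum(gap_mask) / n
    # Assumption: a sample counts as accepted whenever it is not bottom.
    return {
        "bottom_rate": bottom_rate,
        "gap_rate": gap_rate,
        "acceptance_rate": 1.0 - bottom_rate,
    }


def acceptance_drift(rates, baseline_acceptance_rate, threshold):
    """Return (drift, exceeded) relative to a calibration-time baseline."""
    drift = abs(rates["acceptance_rate"] - baseline_acceptance_rate)
    return drift, drift > threshold


rates = mask_rates(
    bottom_mask=[False, False, True, False, True, False, False, False],
    gap_mask=[False, True, False, False, False, False, False, False],
)
drift, exceeded = acceptance_drift(rates, baseline_acceptance_rate=0.9, threshold=0.1)
print(rates, exceeded)
```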
Optional Provenance Diagnostics
The stable output contract is still (decoded, bottom_mask, gap_mask). v0.5.1 also has opt-in experimental diagnostics that split bottom decisions into richer explanations.
from zeroproofml.inference import InferenceConfig, strict_inference
cfg = InferenceConfig(
    tau_infer=1e-6,
    tau_train=1e-4,
    provenance="experimental",
    provenance_representation="split_masks",
)
result = strict_inference(P, Q, config=cfg)
decoded, bottom_mask, gap_mask = result
fault_mask = getattr(result, "fault_mask", None)
semantic_bottom_mask = getattr(result, "semantic_bottom_mask", None)
bottom_provenance = getattr(result, "bottom_provenance", None)
Rules for consumers:
- Depend only on decoded, bottom_mask, and gap_mask for stable deployment.
- Treat provenance fields as optional diagnostics.
- Keep working when provenance fields are absent.
- Do not change stable ONNX output names to expose experimental provenance.
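A consumer that follows these rules might look like the sketch below. It is an assumption-laden illustration: `ResultWithProvenance` is a hypothetical stand-in for a result that unpacks as three items and may carry extra attributes, and `summarize_bottoms` is not a zeroproofml function.

```python
def summarize_bottoms(result):
    """Summarize a result using only the stable triple, plus extras if present."""
    decoded, bottom_mask, gap_mask = result
    summary = {"bottoms": sum(bottom_mask)}
    # Provenance is optional: read it defensively and carry on without it.
    fault_mask = getattr(result, "fault_mask", None)
    if fault_mask is not None:
        summary["faults"] = sum(fault_mask)
    return summary


class ResultWithProvenance(tuple):
    """Hypothetical stand-in: unpacks as three items, may carry extra attributes."""


plain = ([0.5, float("nan")], [False, True], [False, False])
rich = ResultWithProvenance(plain)
rich.fault_mask = [False, True]

print(summarize_bottoms(plain))
print(summarize_bottoms(rich))
```

The same function handles both results, so turning experimental provenance off does not break the consumer.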
The Q2 decision for v0.5.1 keeps provenance experimental until a review artifact and rerun show a clear downstream win without breaking the stable mask contract.
Direction-Aware Censoring
For three-way censoring problems, combine strict bottom gating with an optional direction head:
from zeroproofml.inference import decode_strict_censored_3way
decoded, bottom_mask, class_id = decode_strict_censored_3way(
    P.squeeze(-1),
    Q.squeeze(-1),
    tau_infer=1e-6,
    direction_logits=direction_logits,
)
Use this when a bottom output should still indicate below-limit versus above-limit, or a related task-specific direction.
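The idea can be sketched per sample in plain Python. Everything specific here is an assumption for illustration: the class ids, the two-logit layout, decoding as P divided by Q, and the name `decode_three_way`. The behavior of decode_strict_censored_3way itself may differ.

```python
# Hypothetical class ids; the library's own encoding may differ.
IN_RANGE, BELOW_LIMIT, ABOVE_LIMIT = 0, 1, 2


def decode_three_way(p, q, tau_infer, direction_logits):
    """Return (decoded, is_bottom, class_id) for one sample.

    Assumes direction_logits is a (below_logit, above_logit) pair and that
    an accepted sample decodes as p / q.
    """
    if not (abs(q) >= tau_infer):
        # Bottom: no finite value is returned, but the direction head
        # still says which side of the limit the sample falls on.
        below_logit, above_logit = direction_logits
        class_id = BELOW_LIMIT if below_logit >= above_logit else ABOVE_LIMIT
        return float("nan"), True, class_id
    return p / q, False, IN_RANGE


print(decode_three_way(1.0, 0.5, 1e-6, (2.0, -1.0)))
print(decode_three_way(1.0, 1e-9, 1e-6, (-0.3, 1.2)))
```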
Reference Deployment
The reference robotics deployment runs train -> bundle -> strict inference -> fallback -> report:
python scripts/reference_robotics_deployment.py --device cpu --epochs 2 --n-samples 6000
It writes a self-contained directory under results/reference_deploy_robotics/ with the ONNX bundle, validation report, inference summary, strict-inference audit, and output contract.
The same path is importable:
from zeroproofml.reference_robotics_deployment import (
    ReferenceRoboticsDeploymentConfig,
    run_reference_robotics_deployment,
)
artifacts = run_reference_robotics_deployment(
    ReferenceRoboticsDeploymentConfig(device="cpu", epochs=2, n_samples=6000)
)
print(artifacts.bundle_model_path)
Deployment Checklist
- Freeze InferenceConfig from held-out calibration data.
- Export an ONNX bundle from the eval wrapper.
- Run validate_bundle(...).
- Run run_bundle_reference_smoke_test(...) against saved smoke inputs.
- Regenerate the validation report with python -m zeroproofml.report bundle <bundle_dir>.
- Confirm downstream consumers use bottom_mask and gap_mask explicitly.
- Log route/fallback actions and acceptance-rate drift in production.