# NextStat — High-Performance Statistical Inference Engine

https://nextstat.io

> NextStat is a Rust-powered statistical inference engine with Python bindings. It covers
> 12+ statistical verticals: HistFactory (HEP), GLM regression, Bayesian NUTS/MAMS sampling,
> survival analysis, econometrics, causal inference, hierarchical models, ordinal regression,
> PK/PD, EVT, meta-analysis, insurance reserving, churn analytics, and more. GPU acceleration
> (CUDA/Metal) and browser-based WASM playground.

## Documentation

- [Docs home](https://nextstat.io/docs): Getting started, architecture, API references
- [Installation](https://nextstat.io/docs/installation): pip, cargo, source
- [Quickstart](https://nextstat.io/docs/quickstart): 5-minute guide with expected outputs and troubleshooting
- [HEP Full Workflow Tutorial](https://nextstat.io/docs/hep-tutorial): Comprehensive 1200-line tutorial — workspace construction, all modifier types, MLE fitting, CLs hypothesis testing, upper limits (Brazil band), NP ranking, pulls, correlation matrix, profile likelihood scans, workspace combination, mass scans, GPU acceleration
- [Python API](https://nextstat.io/docs/python-api): from_pyhf, fit, hypotest, upper_limit, ranking, scan (see the sketch after this section)
- [Rust API](https://nextstat.io/docs/rust-api): ns-translate, ns-inference, ns-ad, ns-compute, ns-root
- [R Bindings](https://nextstat.io/docs/r-bindings): R bindings via extendr — fit, hypotest, upper_limit, scan, ranking
- [CLI](https://nextstat.io/docs/cli): fit, hypotest, upper-limit, scan, report, import, export
- [HistFactory](https://nextstat.io/docs/histfactory): Workspace format, modifiers, pyhf compatibility
- [GPU](https://nextstat.io/docs/gpu): CUDA/Metal batch toys, differentiable NLL
- [Bayesian](https://nextstat.io/docs/bayesian): NUTS sampler, HMC diagnostics
- [WASM](https://nextstat.io/docs/wasm): Browser-based inference playground
- [Inference Server](https://nextstat.io/docs/server): REST API for shared GPU inference
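The Python API entry above lists from_pyhf, fit, hypotest, upper_limit, ranking, and scan. The sketch below strings the first four together for a pyhf-format workspace. It is illustrative only: the function names come from this index, but the top-level `nextstat` module path, argument conventions, and return shapes are assumptions — check the linked API reference for the real signatures.

```python
# Illustrative sketch only: function names come from the Python API entry above,
# but the `nextstat` module path, arguments, and return values are assumptions.
# See https://nextstat.io/docs/python-api for the actual signatures.
import json
import nextstat  # assumed top-level module exposing the listed functions

with open("workspace.json") as f:        # placeholder path to a pyhf-format workspace
    spec = json.load(f)

model = nextstat.from_pyhf(spec)         # build a HistFactory model from pyhf JSON
fit_result = nextstat.fit(model)         # MLE fit of the full model
cls = nextstat.hypotest(model, 1.0)      # CLs hypothesis test at signal strength 1.0
limit = nextstat.upper_limit(model)      # 95% CLs upper limit (Brazil band per the HEP tutorial)
print(fit_result, cls, limit)
```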
## ML / Data Science

- [ML Overview](https://nextstat.io/docs/ml-overview): Terminology bridge for ML engineers
- [ML Training](https://nextstat.io/docs/ml-training): SignificanceLoss, SoftHistogram, PyTorch
- [MLOps](https://nextstat.io/docs/ml-ops): W&B, MLflow, ranking, Optuna
- [Agentic Tools](https://nextstat.io/docs/agentic-tools): OpenAI, LangChain, MCP tool definitions
- [Surrogate Distillation](https://nextstat.io/docs/surrogate-distill): Neural surrogate training
- [Arrow / Polars](https://nextstat.io/docs/arrow-polars): Zero-copy columnar interchange
- [Differentiable](https://nextstat.io/docs/differentiable): Canonical spec — what is differentiable, CUDA zero-copy NLL, profiled q₀/qμ on GPU, envelope theorem gradients, DifferentiableSession native API, validation (FD error 2.07e-9), architecture decisions, Phase 2 implicit differentiation

## Statistical Models

- [Regression & GLM](https://nextstat.io/docs/regression): Linear, logistic, Poisson, NB, Gamma, Tweedie
- [Bayesian Sampling](https://nextstat.io/docs/bayesian): NUTS, MAMS, LAPS (GPU), unified sample() dispatcher, ArviZ integration, any model
- [Survival Analysis](https://nextstat.io/docs/survival): Cox PH, Weibull, Log-Normal AFT, Exponential, interval-censored, KM, log-rank
- [Hierarchical Models](https://nextstat.io/docs/hierarchical): Random intercepts/slopes, correlated RE, LMM
- [Ordinal Regression](https://nextstat.io/docs/ordinal): Ordered logit, ordered probit
- [Time Series](https://nextstat.io/docs/timeseries): Kalman filter/smoother, EM, forecasting, simulation
- [Econometrics](https://nextstat.io/docs/econometrics): Panel FE, DiD TWFE, Event Study, IV/2SLS
- [Causal Inference](https://nextstat.io/docs/causal): AIPW (ATE/ATT), propensity scores, Rosenbaum bounds, E-value
- [PK/PD](https://nextstat.io/docs/pkpd): 1-compartment oral PK, NLME
- [EVT](https://nextstat.io/docs/evt): GEV (block maxima), GPD (peaks over threshold)
- [Meta-Analysis](https://nextstat.io/docs/meta-analysis): Fixed effects, random effects (DL), I², Q, τ²
- [Insurance](https://nextstat.io/docs/insurance): Chain ladder, Mack chain ladder
- [Churn Analytics](https://nextstat.io/docs/churn): Retention, uplift, cohort matrix, bootstrap HR
- [Profile CI](https://nextstat.io/docs/python-api): Bisection-based profile likelihood CI with warm-start, any LogDensityModel (see the worked formula after this section)
- [Fault Tree Analysis](https://nextstat.io/docs/python-api): CE-IS (multi-level, p ~ 1e-16), vanilla MC (CPU/Metal/CUDA), all failure modes
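The Profile CI entry above refers to bisection-based profile likelihood intervals. As a reminder of the standard construction behind that entry (textbook form, not a NextStat-specific definition), the interval endpoints are the roots of the chi-square threshold condition below.

```latex
% Standard profile-likelihood confidence interval condition (textbook form,
% not a NextStat-specific definition). With \ell the log-likelihood, \hat\theta
% the global MLE, \eta the nuisance parameters, and confidence level 1-\alpha,
% the interval endpoints in the parameter of interest \mu are the two roots of
\[
  2\,\bigl[\ell(\hat\theta) - \ell_{\mathrm{prof}}(\mu)\bigr] \;=\; \chi^2_{1,\,1-\alpha},
  \qquad
  \ell_{\mathrm{prof}}(\mu) \;=\; \max_{\eta}\,\ell(\mu,\eta).
\]
% For a 95% interval, \chi^2_{1,0.95} \approx 3.84. A bisection solver locates one
% root on each side of \hat\mu, which is the step a warm-started profile scan speeds up.
```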
## Use Cases (Persona Guides)

- [For Data Scientists](https://nextstat.io/docs/for-data-scientists): sklearn → NextStat concept translation, Arrow/Polars ingest, SignificanceLoss training
- [For Quants & Risk](https://nextstat.io/docs/for-quants): Panel FE, DiD, IV/2SLS, Kalman, SR 11-7 validation
- [For Biologists & Pharma](https://nextstat.io/docs/for-biologists): Cox PH, Weibull, PK/PD NLME, GxP validation artifacts
- [Glossary](https://nextstat.io/docs/glossary): HEP ↔ DS ↔ Quant ↔ Bio terminology mapping table

## Benchmarks & Validation

- [Parity Contract](https://nextstat.io/docs/parity-contract): 7-tier numerical tolerance hierarchy (1e-12 per-bin to 0.05 toy stats), Parity vs Fast mode, Kahan summation, CI integration
- [ROOT Comparison](https://nextstat.io/docs/root-comparison): 3-way ROOT vs pyhf vs NextStat comparison — sub-1e-5 on q(μ), timing (37×–880× vs ROOT), root cause analysis of ROOT divergences
- [Optimizer Convergence](https://nextstat.io/docs/optimizer): L-BFGS-B vs SLSQP philosophy, best-NLL by default, warm-start for pyhf reproducibility, mismatch scale on large models
- [Public Benchmarks](https://nextstat.io/docs/public-benchmarks): Canonical spec for reproducible benchmarks: protocols, correctness gates, environment pinning, artifacts, and suite structure across 6 verticals
- [Benchmark Results](https://nextstat.io/docs/benchmark-results): Published GPU/CPU snapshots with DOI artifacts, replication bundles, and CI-produced archives across HEP, Pharma, Bayesian, ML suites
- [Snapshot Registry](https://nextstat.io/docs/snapshot-registry): All published snapshots and replication bundles — snapshot IDs, CI run links, archive SHA-256 hashes, DOIs, runner environments (see the sketch after this section)
- [Validation Report](https://nextstat.io/docs/validation-report): Unified validation artifact (JSON+PDF) with Apex2, dataset SHA-256, model spec, per-suite pass/fail
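The Snapshot Registry and Validation Report entries above reference archive and dataset SHA-256 hashes. A minimal integrity check for a downloaded replication bundle needs only the Python standard library; the archive name and expected digest below are placeholders, since real snapshot IDs, file names, and published hashes live in the Snapshot Registry.

```python
# Minimal integrity check for a downloaded benchmark snapshot archive, using only
# the Python standard library. The file name and expected digest are placeholders;
# real snapshot IDs, archive names, and SHA-256 values are listed in the
# Snapshot Registry page referenced above.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

archive = Path("snapshot-archive.tar.gz")         # placeholder archive name
expected = "<sha256 from the Snapshot Registry>"  # placeholder published digest

actual = sha256_of(archive)
print("OK" if actual == expected else f"MISMATCH: {actual}")
```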
## Blog

- [Numerical Accuracy](https://nextstat.io/blog/numerical-accuracy): 3-way ROOT vs pyhf vs NextStat comparison
- [Differentiable Layer](https://nextstat.io/blog/differentiable-layer): How NextStat makes HistFactory differentiable in PyTorch
- [Trust Offensive](https://nextstat.io/blog/trust-offensive): Why we publish reproducible benchmarks for scientific software
- [End of Scripting Era](https://nextstat.io/blog/end-of-scripting-era): From scripts to benchmark snapshots — abstract, artifact sets (manifest + correctness gates + index), concrete snapshot directory layout, replication as endgame
- [Third-Party Replication](https://nextstat.io/blog/third-party-replication): Replication artifact set (snapshot_index.json, replication_report.json, validation pack), schema references, minimal 6-step replication loop, signed reports (GPG/Sigstore)
- [Benchmark Snapshots](https://nextstat.io/blog/benchmark-snapshots-ci-artifacts): Snapshot = product artifact set — definitions, anatomy (validation pack + manifest + correctness gates), determinism (bit-identical hashing), CI as publisher, baselines + DOI/Zenodo
- [HEP Benchmark Harness](https://nextstat.io/blog/hep-benchmark-harness): Threat model (5 failure modes: model/optimizer/warm-start/environment/reporting mismatch), correctness gates, parameter mapping by name, convergence metadata, pyhf + ROOT/RooFit
- [Bayesian Benchmarks](https://nextstat.io/blog/bayesian-benchmarks-ess-per-sec): ESS/sec + health metrics (divergence rate, R̂, E-BFMI), correctness gates (NUTS quality smoke + SBC), runnable commands, cargo bench
- [Pharma Benchmarks](https://nextstat.io/blog/pharma-benchmarks-pk-nlme): Threat model (6 failure modes), Apex2 pharma reference gate, seed harness (run.py + suite.py), correctness gates, scaling protocols
- [JAX Compile vs Execution](https://nextstat.io/blog/jax-compile-vs-execution): Two-regime measurement (TTFR + warm throughput), component timings (t_import, t_first_call, t_second_call, t_steady_state), 3 cache policy modes
- [Unbinned Event-Level Analysis](https://nextstat.io/blog/unbinned-event-level-analysis): Extended unbinned likelihood, PDF catalog, EventStore, resonance search workflow
- [Compiler vs Hybrid GPU Fits](https://nextstat.io/blog/compiler-vs-hybrid-gpu-fits): MoreFit (symbolic JIT) vs NextStat (CUDA kernels + ONNX flows + reverse-mode AD)
- [NUTS Progressive Sampling](https://nextstat.io/blog/nuts-progressive-sampling): NUTS v10 — progressive sampling at top-level tree join, ESS/leapfrog diagnostic, reproducible multi-seed benchmarks vs CmdStan 2.38, 3.2× ESS/sec on hierarchical posteriors

## Demos

- [Physics Assistant](https://nextstat.io/docs/physics-assistant): End-to-end demo — ROOT ingest → anomaly scan → discovery p-values → CLs limits → plot artifacts. Local, server, and Docker transport modes.

## Project

- [White Paper](https://nextstat.io/docs/whitepaper): Technical overview — motivation, architecture, inference algorithms, validation methodology, scope, performance highlights
- [Changelog](https://nextstat.io/docs/changelog): Release history

## Key facts

- Version: 0.9.5
- Language: Rust core + Python (PyO3) bindings
- License: AGPL-3.0-or-later OR LicenseRef-Commercial
- Install: `pip install nextstat`
- GPU backends: CUDA, Metal
- Interpolation: Code4/Code4p (default), Code1/Code0 (HistFactory/pyhf compat)
- Optimizer: L-BFGS-B with reverse-mode AD
- Formats: pyhf JSON, HS3 v0.2 (ROOT 6.37+), HistFactory XML, Arrow IPC, Parquet
- pyhf parity: 7-tier tolerance contract (1e-12 per-bin to 0.05 toy stats)
- Performance: 37×–880× faster than ROOT/RooFit on profile scans
- Agentic tools: 21 tools (HEP + GLM + Bayesian + Survival + Econometrics + Causal + Kalman + Meta + Churn + Insurance)

## Optional: full docs

For full documentation content see: https://nextstat.io/llms-full.txt