This repository contains: (1) the SNIF architecture paper, (2) a working BEAM + WASM implementation, (3) integration tests and benchmarking, (4) formal verification artifacts.
SNIFs — Safer (sandboxed) NIFs — give the BEAM crash-isolated native interfaces.
A normal NIF runs native code inside the VM’s address space, so any fault in it
(out-of-bounds access, overflow, @panic, …) kills the entire BEAM VM. A SNIF keeps
the NIF’s interface (you call a native-implemented function and get a value) but
changes its implementation: the guest is compiled to WebAssembly (Zig and Rust today)
and run in a wasmtime sandbox via wasmex, so a guest fault becomes a catchable
{:error, reason} tuple and the calling BEAM process survives.
It is a like-for-like replacement for the dominant kind of NIF — pure native computation
over numbers and byte-buffers (crypto, compression, parsing, DSP, math kernels) — with two
deliberate interface changes (the result is fallible, and only flat numeric data crosses the
boundary; see Scope: what it replaces (and what it doesn’t)). It is not a replacement for the full erl_nif surface (BEAM
terms, resources, monitors, message-sending, dirty scheduling, zero-copy binaries, I/O).
A SNIF preserves the NIF interface and sandboxes the implementation. That makes it a genuine drop-in for one class of NIF, with clearly-stated edges:
Like-for-like (plus crash isolation): native compute/buffer functions — the bread-and-butter NIF. Same call site, same native-speed class, plus isolation.
Two deliberate interface changes (so it is not a 100% transparent swap):
-
The result is fallible:
{:ok, [values]} | {:error, reason}. A normal NIF returns the value directly or raises; here the isolation is made visible and the caller handles it. -
Only numbers and byte-buffers cross. A normal NIF receives actual BEAM terms and can read binaries zero-copy; a SNIF sees only
i32/i64/f32/f64and bytes you marshal into linear memory.
Out of scope (a SNIF does none of this — use a real NIF or a Port): BEAM-term manipulation,
enif_make_resource/monitors, sending messages into the BEAM, dirty-scheduler integration/
yielding, zero-copy term access, and any I/O (the guest is wasm32-freestanding, no WASI — ADR-002).
So "Safer NIFs" is precise for crash-isolated native compute functions, with the scope above; read unqualified as "a safer version of every NIF," it would overclaim.
Scope ceiling (by design). SNIF stops at NIF-parity + isolation — it adapts and perfects the NIF, it does not extend past it. Graduated/tunable integration, capability negotiation, transaction-gating and mode-dependent permissions are deliberately not SNIF features; they belong to the cleave (the implementation of graduated integration) and groove (the new inter-service protocol). If a good idea takes SNIF past the NIF goal, it is routed there and considered there, not added here.
A SNIF costs more than a raw NIF (a direct in-VM call) but far less than a Port (an OS-process round-trip). The cost decomposes into three independent parts; which one matters depends on the workload:
| Source | Shape & cost |
|---|---|
Dispatch (per call) |
Fixed ~29 µs (median, pooled): the BEAM→sandbox boundary crossing. Dominates tiny, high-frequency calls; amortizes to nothing once a call does real work. |
Compute |
Multiplicative ~1.1–1.5× native (estimate, not yet measured for our kernels): the JIT’d, bounds-checked WASM running the actual work. The only unamortizable tax; dominates long/heavy calls. Reducible via SIMD/AOT and, ultimately, proving bounds-safety to drop runtime checks. |
Marshalling |
∝ bytes: copying buffers into/out of linear memory. Dominates data-heavy calls; mitigated by keeping buffers resident across calls. |
Measured in-BEAM (OTP 25, fibonacci(20), n=2000; from demo/, mix run bench/snif_bench.exs):
case mean_µs p50_µs p99_µs per-call (compile every call) 3978.21 3571.00 10594.00 <- the naive path pooled (compile once) 70.46 29.00 500.00 <- 56x faster buffer round-trip (sum_f32) 20.14 15.00 71.00 Port (OS process, isolated) 1617.36 1445.00 4483.00 <- the other isolated option
So pooled SNIF is ~23× cheaper than a Port for the same crash isolation, and the "slow SNIF" worry (~4 ms) is purely the naive compile-every-call path — pooling removes it. For a NIF doing real work the bridge tax is a flat ~29 µs plus a modest compute multiplier; for a tiny op in a hot loop the ~29 µs floor dominates (batch such calls).
Formal verification: 10 proofs machine-checked and CI-gated (just proof-check-all) — 6 Idris2
ABI proofs (incl. the buffer-guest ABI), 1 Idris2 core-types proof, 1 Lean4 API proof, and 2 Agda
proofs: the SnifVerdict echo×epistemic safety bridge (crash-isolation dichotomy, non-forgery,
restricted deniability) and SnifIsolation / SEC-1 — the operational crash-isolation theorem
(host survives every outcome; deniability re-derived over the run; a 6-origin trap taxonomy),
proven-modulo-an-explicit-runtime-TCB. The ABI/verdict proofs verify a logical model of the
interface; SEC-1 adds the operational layer modulo that TCB — see Honesty: what is proven, what is tested, what is trusted.
In-BEAM tests: 21/21 ExUnit pass (verified on OTP 25 / Elixir 1.18.4 — the precompiled wasmex
nif-2.15 loads fine): crash isolation across every failure mode (OOB, @panic, overflow,
divide-by-zero → {:error, _}, caller process alive after each), the worker pool (isolation,
no-shared-state, concurrency, Store recycling), a fuel-based liveness/DoS guard, and Zig
Rust→wasm32 guests at parity. Every failure mode traps correctly under ReleaseSafe; ReleaseFast
turns the same faults into silent wrong answers (the negative control).
The safety story is layered, and the name "Safer" (not "Safe") encodes the top of it:
-
Proven (machine-checked): the 10 proofs verify a logical model — the ABI layout, the result-type algebra, and the verdict’s structural properties (ok⊕trap; a trap yields no value; the residue is deniable). A proof guarantees the model is internally consistent.
-
The model↔code gap (NOT proven): that the model faithfully mirrors the actual Zig/Rust/wasmex code is argued by construction and established empirically by the tests — it is not itself machine-verified. Closing it fully would need extraction or a verified toolchain.
-
Tested (empirical): the operational claim — guest trap ⇒
{:error,_}∧ the BEAM survives — is demonstrated by the in-BEAM tests, and is now mechanically proven (modulo an explicit, minimal runtime TCB) inverification/proofs/agda/SnifIsolation.agda(agda --safe --without-K, gated) — over a small-step host↔guest model, deriving whole-run isolation by induction from per-step primitives. The residual TCB (“wasmtime` ⊨ `FaithfulRuntime”) is the WasmCert-Coq discharge that remains; this is a gap-2 (runtime-trust) result and does not close gap 1 (that the Agda/Idris model faithfully mirrors the actual code). -
Trusted (TCB):
wasmtime/wasmexcorrectness, and that a returning NIF does not corrupt the BEAM scheduler.
Per AUDIT.adoc, claims here must not outrun the proofs: "Safer" means crash-isolated by a
sandbox whose isolation is tested and model-verified, trusting a stated runtime TCB — not
"mathematically proven safe end-to-end."
Always compile Zig WASM with -OReleaseSafe. ReleaseFast removes bounds
checks and converts undefined behaviour into silent wrong answers (empirically
verified — see paper).
zig/src/safe_nif.zig Zig source with five crash modes + fibonacci priv/*.wasm Pre-built WASM binaries (ReleaseSafe + ReleaseFast) demo/ Mix project with wasmex loader + ExUnit test suite docs/whitepapers/ LaTeX paper + compiled PDF
cd demo
mix deps.get
mix testRequires Elixir 1.15+, OTP 25+ (the precompiled wasmex 0.14 nif-2.15 artifact loads on OTP 25 through 28 — verified on OTP 25 / Elixir 1.18.4; the earlier "OTP 26+" floor was conservative). Pooled use is wired via the application supervisor (SnifDemo.Application); the legacy per-call Loader needs no pool.
-
EXPLAINME — claim-to-implementation evidence and honest caveats.
-
Proof status — what is proven, what is tested, what is trusted.
This project is licensed under the Mozilla Public License, v. 2.0. See the LICENSE file for details.
SPDX-License-Identifier: MPL-2.0