A Transaction Science Open Standard

Point-to-point media transport.
The decoder rides with the stream.

WAI is a container plus capability-dispatch envelope. One file declares the capability it needs; the sink either has it or falls back. Three mandatory floor codecs (PNG, FLAC, zstd) guarantee something opens everywhere. Modern and neural capabilities are optional and registered. Apache-2.0.

3 floor codecs · 14 registered capabilities · 1 envelope · 0 new codecs invented

The Pipeline

The sink used to be a toaster.

Every shipped codec since the 1990s was designed for a receiver that could afford an inverse transform fixed in silicon and not much else. That assumption stopped being true a decade ago. Every phone has an NPU. Every laptop has a GPU that idles ninety-five percent of the time. WAI is the envelope that takes the present day seriously: a sink with capabilities, a file that names the one it needs, and a deterministic dispatch in between.

Pre · sink as fixed-function decoder

One title becomes dozens.

A single title is encoded into every codec the device fleet might know — H.264, HEVC, VP9, AV1 — at every resolution rung, every audio track, every HDR variant. Caches push every variant near every viewer. ABR negotiates which one the sink can decode. The encode farm and the cache are the cost of pretending the sink is dumb.

format pretends the receiver is load-bearing
Post · sink as computer

One file, one capability, one decode.

The source picks the capability best fit for the content; the file declares it in a manifest; the sink either advertises that capability or falls back to a declared alternate. The payload bytes are the canonical bytes the codec already emits — PNG, JPEG-XL, Opus, AV1 — not a re-wrapping. Dispatch is six numbered steps long.

format dispatches by capability, not by container shape
01 · read-envelope · WAI1 magic, manifest length, payload length
02 · parse-manifest · UTF-8 JSON, capability, fallback
03 · resolve-capability · Primary if advertised, else fallback
04 · decode-payload · Canonical codec bytes for the resolved capability
Capability dispatch — the six-step algorithm (§4)
01 · Read manifest. Reject unknown major. (wai "1.0" only in v1.0)
02 · Let cap = model_requirement.capability. (the dispatch key)
03 · If sink advertises cap, decode and return. (happy path)
04 · Else let fb = model_requirement.fallback. (may be null)
05 · If sink advertises fb, decode via fb and return. (declarative fallback)
06 · Else file is inert at this sink. Return missing-capability. (explicit, not silent)

Every conforming sink MUST implement these six steps. The fallback is declarative in v1.0 (a string the sink can look for) and executable in v1.1 (a second payload carried in the same envelope, selectable by policy).
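The six steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the wai-rs code: the sink is modelled as a hypothetical dict from capability string to decoder callable, the exception names are invented for the example, and the manifest is assumed to nest capability and fallback under model_requirement as the step text suggests.

```python
from typing import Callable, Dict

Decoder = Callable[[bytes], bytes]  # hypothetical stand-in for a real codec binding


class UnsupportedVersion(Exception):
    """Step 01: the manifest names a MAJOR this sink does not know."""


class MissingCapability(Exception):
    """Step 06: neither the primary nor the fallback is advertised."""


def dispatch(manifest: dict, payload: bytes, sink: Dict[str, Decoder]) -> bytes:
    # 01: read the manifest and reject an unknown major version.
    major = str(manifest.get("wai", "")).partition(".")[0]
    if major != "1":
        raise UnsupportedVersion(manifest.get("wai"))

    # 02: the capability string is the dispatch key.
    requirement = manifest["model_requirement"]
    cap = requirement["capability"]

    # 03: happy path, the sink advertises the primary capability.
    if cap in sink:
        return sink[cap](payload)

    # 04: the fallback may be absent or null.
    fb = requirement.get("fallback")

    # 05: declarative fallback. How the fallback decoder obtains suitable
    # bytes in v1.0 is not spelled out in this text; the sketch simply
    # follows the step as written.
    if fb is not None and fb in sink:
        return sink[fb](payload)

    # 06: the file is inert at this sink; fail loudly.
    raise MissingCapability(f"need {cap!r} or fallback {fb!r}")
```

A sink that advertises only wai.image.png would take the step 05 branch for a manifest whose primary capability is wai.image.jxl; a sink that advertises neither raises MissingCapability rather than guessing.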

Two conditions, one envelope

Neural condition. Zeroth condition. Same dispatch.

The neural condition regenerates the media from a compact conditioning payload against a shared ambient prior the sink already holds. The model is a requirement of existence: never transported, never hash-pinned, exactly as a gzip file (RFC 1952) does not ship its decoder. The zeroth condition dispatches to a registered menu of SOTA royalty-free codecs: PNG (RFC 2083), FLAC (xiph.org), and zstd (RFC 8878, which obsoletes RFC 8478) at the mandatory floor; AVIF (ISO/IEC 23000-22), JPEG-XL (ISO/IEC 18181), JPEG (ITU-T T.81), Opus (RFC 6716), and AV1 in the recommended modern set.

WAI does not define new codecs. The value is the envelope, the manifest, the dispatch rules, and the shared-prior neural slot — not in re-implementing libjxl or libopus poorly. The capability owner owns its payload format; the envelope is unchanged when new capabilities register.

The Anatomy

Correctness in the types.

Most media transport bugs are not algorithmic. They are mis-stated assumptions about who decodes, what fidelity is contracted, whether the model the conditioning is shaped for is actually present. WAI puts the invariants in the envelope and the manifest so the assumptions cannot be stated wrong in the first place. Three sections in the file, seven fields in the manifest, one dispatch algorithm.

Container envelope — v1.0 (§2)
magic · ASCII WAI1 (4 B)
man_len · u32 little-endian
manifest · UTF-8 JSON
payload_len · u32 little-endian
payload · canonical codec bytes

Bytes 0..4 (the first four bytes) MUST be ASCII WAI1. A reader MUST reject any file whose first four bytes differ. All multi-byte integers are little-endian. The manifest is UTF-8 JSON with no insignificant whitespace, and it preserves key insertion order so files round-trip byte-identically.
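As a minimal sketch of the v1.0 layout listed above, the following Python writes and reads the five fields with the standard struct module. The function and exception names are hypothetical, and the choice to tolerate trailing bytes after the payload is an assumption, since this text does not say how a reader should treat them.

```python
import struct

MAGIC = b"WAI1"


class EnvelopeError(ValueError):
    """Raised when the bytes are not a well-formed v1.0 envelope."""


def write_envelope(manifest_json: bytes, payload: bytes) -> bytes:
    # Layout: magic | man_len (u32 LE) | manifest | payload_len (u32 LE) | payload
    return b"".join([
        MAGIC,
        struct.pack("<I", len(manifest_json)),
        manifest_json,
        struct.pack("<I", len(payload)),
        payload,
    ])


def _read_block(data: bytes, offset: int):
    # Read one u32 little-endian length followed by that many bytes.
    if offset + 4 > len(data):
        raise EnvelopeError("truncated length field")
    (length,) = struct.unpack_from("<I", data, offset)
    start = offset + 4
    end = start + length
    if end > len(data):
        raise EnvelopeError("truncated block")
    return data[start:end], end


def read_envelope(data: bytes):
    # A reader MUST reject any file whose first four bytes are not WAI1.
    if data[:4] != MAGIC:
        raise EnvelopeError("bad magic")
    manifest_json, offset = _read_block(data, 4)
    payload, _ = _read_block(data, offset)  # assumption: trailing bytes are ignored
    return manifest_json, payload
```

The payload is passed through untouched in both directions, which matches the rule that WAI does not re-frame codec bytes.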

Manifest schema (§3)
wai · "1.0"
media · image | audio | video | text
intent · replicate | create | improve
model_requirement.capability · wai.image.jxl
model_requirement.fallback · wai.image.png
conditioning.kind · jxl
target · { w, h, sr, dur, frames }

media is informational; the capability is authoritative. replicate is the only intent with normative reconstruction semantics in v1.0. target is informational; the payload is self-describing per the codec's own format.
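A manifest with the seven fields above might be serialized as follows. This is a sketch, assuming the dotted names (model_requirement.capability, conditioning.kind) denote nested JSON objects; the helper names and the example values are illustrative. Python dicts keep insertion order and compact separators drop the optional whitespace, so the encoder round-trips its own output byte for byte.

```python
import json


def encode_manifest(manifest: dict) -> bytes:
    # separators=(",", ":") removes the spaces json.dumps adds by default;
    # dict insertion order is preserved, so key order is stable.
    text = json.dumps(manifest, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def decode_manifest(raw: bytes) -> dict:
    return json.loads(raw.decode("utf-8"))


# Illustrative manifest; the nesting is an assumption drawn from the dotted names.
manifest = {
    "wai": "1.0",
    "media": "image",
    "intent": "replicate",
    "model_requirement": {
        "capability": "wai.image.jxl",
        "fallback": "wai.image.png",
    },
    "conditioning": {"kind": "jxl"},
    "target": {"w": 1920, "h": 1080},
}

raw = encode_manifest(manifest)
# Decoding and re-encoding yields the same bytes.
assert encode_manifest(decode_manifest(raw)) == raw
```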

Capability is the contract

Named, not supplied.

The capability string is what the sink dispatches on. wai.image.jxl means the payload is a JPEG-XL file as libjxl emits it; the sink either has libjxl wired up or falls back. WAI does not embed the decoder.

Payload is canonical

No re-framing.

The payload bytes are exactly what the named codec normally emits — a JXL file, an Opus stream, a PNG. WAI does not wrap or re-frame them. A bit-exact decode against the reference library is the conformance test.

Model is a precondition

Never transported.

A neural capability names a model the sink MAY have installed. WAI does not ship weights and does not hash-pin them — exactly as .mp3 does not ship its decoder. The capability declares what is required to exist at the sink.

Inert is loud

No silent fallthrough.

If the sink has neither the primary capability nor the declared fallback, the file is inert at this sink and the dispatch returns a clear missing-capability error. No silent reinterpretation, no first-payload assumption on major-version mismatch.

Versioning is strict

Reject unknown major.

The wai field is MAJOR.MINOR. A reader MUST reject an unknown MAJOR. A reader MUST accept an unknown MINOR by ignoring fields it does not understand. A registered capability's payload format is fixed forever.
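The versioning rule can be sketched as a small check. The function names, the exception, and the set of known top-level keys are assumptions made for the example; only the behaviour (reject unknown MAJOR, tolerate unknown MINOR by ignoring unfamiliar fields) comes from the text.

```python
SUPPORTED_MAJOR = 1

# Assumption: top-level keys a v1.0 reader understands.
KNOWN_FIELDS = {"wai", "media", "intent", "model_requirement", "conditioning", "target"}


class UnknownMajor(ValueError):
    """The manifest declares a MAJOR this reader must reject."""


def check_version(wai: str):
    # Expect exactly MAJOR.MINOR with ASCII decimal digits on both sides.
    major_s, _, minor_s = wai.partition(".")
    if not (wai.isascii() and major_s.isdecimal() and minor_s.isdecimal()):
        raise ValueError(f"malformed wai field: {wai!r}")
    major, minor = int(major_s), int(minor_s)
    if major != SUPPORTED_MAJOR:
        raise UnknownMajor(wai)
    # Any MINOR is accepted; newer fields are handled by ignoring them.
    return major, minor


def known_fields_only(manifest: dict) -> dict:
    # Drop fields this reader does not understand instead of failing on them.
    return {k: v for k, v in manifest.items() if k in KNOWN_FIELDS}
```

Under this sketch a manifest declaring "1.7" with an extra field is read as if the extra field were absent, while "2.0" is rejected outright.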

v1.1 multi-rendition

Fallback becomes executable.

v1.1 (WAI2) adds a rendition table that carries every declared capability's payload in one envelope. The sink picks by deployer policy — quality vs latency, native vs WebNN, bandwidth headroom — not by which capability happens to be installed.
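Policy-driven selection could look roughly like this. The rendition table's wire format is not described here, so the sketch models it as a hypothetical dict from capability string to payload bytes, and models deployer policy as an ordered preference list; both are assumptions for illustration only.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def pick_rendition(
    renditions: Dict[str, bytes],   # hypothetical: capability -> payload
    advertised: Set[str],           # capabilities this sink can decode
    preference: Iterable[str],      # deployer policy, most preferred first
) -> Optional[Tuple[str, bytes]]:
    # Walk the policy order and take the first rendition that is both
    # carried in the envelope and decodable at this sink.
    for cap in preference:
        if cap in renditions and cap in advertised:
            return cap, renditions[cap]
    return None  # nothing usable: the file is inert at this sink
```

The point of the design is visible in the signature: installation decides what is possible, and policy decides among the possible.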

Why the envelope is the standard

Re-implementing JPEG poorly is strictly worse than calling libjpeg.

Earlier WAI drafts defined custom transform and entropy stages for each medium. That direction was dropped. The field's mature libraries (libjxl, libavif, libopus, libFLAC, libzstd, libdav1d) are SOTA. WAI's value is the envelope, the capability dispatch, and the neural-shared-prior model — all of which are absent from existing standards. The zeroth condition's purpose is availability, not codec novelty.

The Spec

Three floor codecs. A modern menu. A neural slot.

Every conforming sink MUST implement the mandatory floor: PNG, FLAC, zstd. These three guarantee that any sink with the standard system codec libraries can open something in the image, audio, and text media classes; the floor lists no video codec. The recommended modern set is SHOULD-implement; neural capabilities are OPTIONAL and registered by their owners. Each capability owns its payload format. The envelope is unchanged when new capabilities are added.

Mandatory floor — MUST implement (§5)
wai.image.png

PNG (RFC 2083)

universal lossless image

wai.audio.flac

FLAC (xiph.org/flac)

universal lossless audio

wai.text.zstd

zstd (RFC 8878, which obsoletes RFC 8478)

universal compressed bytes

Recommended modern set — SHOULD implement
wai.image.jxl · JPEG-XL (ISO/IEC 18181) · lossless + lossy, modern
wai.image.avif · AVIF (ISO/IEC 23000-22) · lossy SOTA (AV1-based)
wai.image.jpeg · JPEG (ITU-T T.81) · legacy compatibility
wai.audio.opus · Opus (RFC 6716) · lossy SOTA
wai.video.av1 · AV1 (AOMedia) · lossy SOTA
wai.video.av1.lossless · AV1, quantizer=0 · mathematically lossless YUV
wai.text.xz · XZ/LZMA2 · maximum classical text ratio
Neural capabilities — OPTIONAL, registered by capability owner
wai.neural.encodec32 · Meta EnCodec 32 kHz · ~3 kbps, acceptable music
wai.neural.dac · Descript Audio Codec 44.1 kHz · 6 kbps, near-transparent music
wai.neural.mimi · Kyutai Mimi, 12.5 Hz frame · ~1.1 kbps, real-time speech
wai.neural.wavtokenizer · WavTokenizer single-codebook VQ · 0.9 kbps, beats DAC @ 9 kbps on UTMOS
wai.neural.bmshj2018 · bmshj2018-factorized + rANS · ~80x vs raw RGB at q=3
wai.neural.video_bmshj2018 · per-frame bmshj2018 · 1000x+ vs raw RGB, WebGPU-decodable
wai.neural.glc · Generative Latent Coding · <0.05 bpp, where JPEG-XL collapses
wai.neural.dcvc_rt · DCVC-RT (native sink only) · AV1 quality at 21% less bitrate

Neural capabilities are picked on best-quality-per-bit at each bitrate band — not on whichever model is famous or small. DCVC-RT and similar inter-frame neural video codecs require NVIDIA CUDA and custom kernels at decode time and do not deploy to browser sinks. Browser sinks should declare wai.neural.video_bmshj2018 for the video media class.

One payload, one capability

The standard's value is the dispatch — not a new codec.

A WAI v1.0 payload always corresponds to exactly one capability. The payload bytes are the canonical bytes the named codec normally emits. Conformance is verified by bit-exact decode-equivalence with the reference library named in §5 for each registered capability. New capabilities are added by registration; existing capabilities MUST NOT be redefined — once a capability string is registered for a codec, its payload format is fixed forever. Removing or repurposing a capability requires a new MAJOR.

The Reference Implementation

One reference implementation. Many expected.

Transaction Science ships wai-rs in Rust as the reference. It builds as lib, cdylib, and staticlib so the C ABI in src/ffi.rs drops in as libwai.dylib / .so / .dll and can be called from any language. The crate wraps the field's mature libraries — it does not re-implement them. Conformance is bit-exact decode-equivalence with each registered capability's named reference library.

wai-rs — at a glance (Appendix A)
language · Rust
crate types · lib, cdylib, staticlib
C ABI · src/ffi.rs (drops in as libwai.dylib / .so / .dll)
image stack · image crate, ravif, jpegxl-rs (libjxl 0.11.x)
audio stack · opus (libopus 1.x), claxon + flac-bound
video stack · rav1e (encode), dav1d (decode)
text stack · zstd (libzstd), xz2 (liblzma)
test · cargo test --lib

Run cargo test --lib with RUSTFLAGS="-L /opt/homebrew/opt/flac/lib" on macOS, or via the committed .cargo/config.toml. The harness exercises every registered capability through the full WAI envelope.

Wraps, does not re-implement

The reference impl is built on libjxl, libopus, libFLAC, libavif, libzstd, libdav1d, and liblzma. The standard's value is the envelope and the dispatch; the codec libraries that already exist are the SOTA the envelope routes to.

Bit-exact conformance

A conforming encoder MUST emit containers that any conforming sink can read, with the payload set to the canonical byte stream of the registered capability's reference library. A bit-exact decode-equivalence check is the conformance test.
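A decode-equivalence check reduces to comparing two decoders' output on the same payload. The sketch below shows the shape of such a check, not the actual harness: zlib from the Python standard library stands in for a registered capability's reference library, and a chunked streaming decoder stands in for a second implementation. All function names are hypothetical.

```python
import hashlib
import zlib
from typing import Callable

Decoder = Callable[[bytes], bytes]


def reference_decode(payload: bytes) -> bytes:
    # Stand-in for the named reference library (zlib here, not a WAI codec).
    return zlib.decompress(payload)


def candidate_decode(payload: bytes) -> bytes:
    # Stand-in for an independent implementation: same format, fed in chunks.
    d = zlib.decompressobj()
    out = bytearray()
    for i in range(0, len(payload), 16):
        out += d.decompress(payload[i:i + 16])
    out += d.flush()
    return bytes(out)


def check_vector(payload: bytes, reference: Decoder, candidate: Decoder) -> dict:
    # Bit-exact means the decoded bytes are identical; digests make the
    # result easy to log and compare across machines.
    ref_digest = hashlib.sha256(reference(payload)).hexdigest()
    cand_digest = hashlib.sha256(candidate(payload)).hexdigest()
    return {
        "pass": ref_digest == cand_digest,
        "reference_sha256": ref_digest,
        "candidate_sha256": cand_digest,
    }
```

A decoder that drops or alters even one output byte produces a different digest and fails the vector.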

Drop-in C ABI

cdylib / staticlib with a stable C FFI surface in src/ffi.rs. Any language with a C FFI can read and write WAI envelopes without re-implementing the manifest parser or the dispatch.

Apache-2.0

The specification text and the wai-rs reference implementation are published under the Apache License 2.0. Royalty-free implementation is a precondition for adoption — following the AV1, AVIF, JPEG-XL open-codec precedent and avoiding the H.264 / H.265 royalty trap.

A call for second implementations

A standard with one implementation has one failure mode.

The reference implementation is a starting point, not the standard. The conformance test vectors are bit-exact decode-equivalence checks against the named reference libraries; anyone can build a second implementation in another language, on other assumptions, with another team's review, and prove it interoperates by passing the vectors. Until that second implementation exists, the ecosystem's robustness is bounded by a monoculture. The spec, the vectors, and the harness are all the invitation we can offer.

Services

The standard is free. The operations are the offer.

WAI is Apache-2.0; the envelope, the manifest schema, the dispatch algorithm, and the reference implementation are public. Transaction Science writes wai-rs and runs the optional hosted services for parties who want the standard without operating the standard. The customers are media platforms, conferencing vendors, archives, device-OS teams, and anyone shipping media to sinks they don't control.

Capability registration

Get a capability minted

A vendor or research group with a new codec submits a capability string, a payload format spec, and a reference encoder + decoder. Transaction Science runs the bit-exact harness, publishes the entry in the registered-capabilities table, and freezes the payload format. The capability owner owns the format; the registry hosts the entry.

you get: capability string · harness pass · frozen payload format entry
Conformance testing

Bit-exact decode-equivalence

For teams building their own WAI sink or encoder, the conformance service runs the full vector suite — every registered capability through the full envelope — and returns a pass/fail report per capability with the failing envelope bytes attached. The standard's identity is the vectors; a pass is the membership test.

you get: vector run · per-capability pass/fail · failure traces
Hosted encode pipeline

Operate the encode side

A managed deployment of the encode pipeline for parties who want the standard without running their own encode farm. The code that runs is the published code; the customer keeps the right to migrate to a self-hosted encoder at any time without changing a wire format.

you get: managed encode · SLA · wire-format-stable migration path
Reference-impl support

Commercial support on wai-rs

Support contracts on the Rust reference implementation, integration help for teams adopting it via the C ABI, and conformance review for teams building their own. The implementation is open; the engineering time around it is the product.

you get: support · integration · conformance review
Sibling open standards

WAI is one of four Apache-2.0 standards Transaction Science stewards. The other three are OpenPay for payment acceptance, Smart Byte for the neutral value substrate, and EOC for energy-optimized AI compute. All four share the same governance pattern: permissive licence, ownerless protocol, services at the edges.

Steward, not owner

The standard has no owner by construction.

Publishing the spec, shipping the reference implementation, running the services that keep it healthy — that's stewardship, and it's a business. It isn't ownership. Anyone can implement WAI; the conformance vectors are the membership test. The moat is at the edges — the hosted encode, the conformance service, the registry operation, the support — never in the envelope, the manifest, or the dispatch algorithm.

Governance

Apache-2.0. Reference, not certified.

WAI is published as a reference: the architecture is complete and the reference implementation compiles, runs, and tests cleanly across every supported platform. No part of it has been formally adopted by a standards body, audited by a regulator, or deployed in regulated production. The standard's identity is its conformance vectors and its registered capability table — not any brand.

Apache-2.0, royalty-free

The specification text and wai-rs reference implementation are Apache-2.0. Royalty-free implementation is a precondition for adoption — following the AV1 / AVIF / JPEG-XL open-codec precedent, avoiding the H.264 / H.265 royalty trap.

Frozen wire format

The container (§2) and manifest schema (§3) are frozen for all of v1.x. The magic bytes WAI1 and WAI2 are stable; sinks dispatch by magic. Removing or repurposing a registered capability requires a new MAJOR version.

Capability registry

New capabilities MAY be added in any MINOR revision. Once a capability string is registered for a codec, its payload format is fixed forever. The registry is the ledger; the entries are append-only by design.
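The append-only rule is simple to express in code. The class below is a hypothetical in-memory model of such a ledger, not the registry's actual implementation: new capability strings can be added, existing ones can be neither redefined nor removed, and readers get a read-only view.

```python
from types import MappingProxyType


class CapabilityRegistry:
    """Illustrative append-only ledger: capability string -> payload format."""

    def __init__(self) -> None:
        self._entries = {}

    def register(self, capability: str, payload_format: str) -> None:
        # Once registered, a capability's payload format is fixed forever.
        if capability in self._entries:
            raise ValueError(f"{capability!r} is already registered")
        self._entries[capability] = payload_format

    @property
    def entries(self):
        # Read-only view: callers cannot mutate or delete entries through it.
        return MappingProxyType(self._entries)


registry = CapabilityRegistry()
registry.register("wai.image.png", "PNG")
registry.register("wai.audio.flac", "FLAC")
```

There is deliberately no method to remove or replace an entry; per the text, that requires a new MAJOR version.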

Forkable, but conformance-gated

Permissive licensing means anyone can fork. But a fork that diverges fails the bit-exact decode-equivalence harness, and the divergence is reviewable. Forking is allowed; silent drift isn't.

Ingest, not compete

New codec work from open and private efforts is ingested into the registry under permissive licences. The standard is a coordination surface for the field's best codec work — not a competitor to any single implementation.

The membership test

An implementation either passes the published conformance vectors or it doesn't. Conformance is the membership test — not a relationship with Transaction Science, not a certification body's stamp.

Why the envelope holds

The protocol's identity is its vectors and its registry.

"WAI" is a name on a set of documents and a Rust crate. The documents define the envelope, the manifest schema, the dispatch algorithm, and the registered capability table. The vectors verify any implementation. Strip the name off the whole stack and nothing about how it works changes. That is the point: an envelope whose identity lives in its vectors and registry cannot be captured by whoever holds the brand.