0002: the FFmpeg→WASI build pipeline
Status: SUPERSEDED 2026-06-28 by 0007 (kept as the record,
per spec-driven-development). This spec assumed compiling the ffmpeg CLI to wasm
(adapting go-ffmpreg). Research found that path forces either an EOL FFmpeg (n5.1) or,
on current FFmpeg, the multithreaded CLI which wazero cannot run without CGO. Spec 0007
pivots to a libav-direct engine (ffmpeg-wasi) that links the libraries and drives them
with our own thin C program — current FFmpeg, CGO-free, on wazero. The build moves to the
separate ffmpeg-wasi repo; afmpeg consumes its artifact. Read 0007 for the live design;
the build/licensing detail below remains useful background for that repo's own build spec.
Date: 2026-06-26 (superseded 2026-06-28)
Parent: 0001-afmpeg.md §6, §9, §10 (D-B, D-C)
Owns: R-AF-3 (the codec/filter set), R-AF-6 (reproducible build), R-AF-10 (LGPL
variant) — all carried forward to 0007
1. Purpose
Produce a reproducible ffmpeg.wasm — FFmpeg + x264 compiled to wasm32-wasi — that
contains exactly the codecs/filters afmpeg's consumers need, and nothing else. This is the
hard, separable sub-project (0001 §6): it has no Go dependency and can be built and
validated entirely on its own. The Go layers (0003/0004) develop against its output (or a
stand-in) in parallel.
Per D-B the pipeline is adapted from go-ffmpreg's build.sh (Codeberg —
https://codeberg.org/gruf/go-ffmpreg), the proven wasi-sdk start point, not built from
scratch. We extend its ./configure to add the filters/encoders go-ffmpreg's stock build
lacks (xfade, AAC), which is the precise gap the keryx spike identified.
2. Scope
In scope:
- A pinned, containerised toolchain (wasi-sdk/clang) and a build/ tree that produces
ffmpeg.wasm deterministically from source.
- The ./configure line enabling only the R-AF-3 set.
- Cross-compiling x264 (GPL build) to wasm as an FFmpeg dependency.
- Two build variants: full/GPL (default, with x264) and an LGPL variant
(openh264 for H.264 encode) — D-C / R-AF-10. (Shipped: ffmpeg-wasi n8.1.2-2.)
- A provenance manifest (versions, configure line, sha256, size) emitted alongside the wasm.
- A gated CI job (slow; not on every push).
Out of scope:
- The Go bridge, runtime, API (0003/0004): they only consume this artifact.
- wasm-threads / SIMD builds (0006 / R-AF-12); pin to a no-pthreads FFmpeg for now.
- Hardware accel (0001 non-goal).
3. The capability set: a general baseline (R-AF-3)
afmpeg is a general-purpose toolkit (spec 0001 D-F), so the build carries a curated
general baseline of codecs/filters/muxers covering common workflows — not one
customer's graph. The baseline is deliberately bounded (size + build time matter; this is
not "all of ffmpeg") and re-enabled explicitly from --disable-everything. A lean
variant (a smaller subset) and the full variant are selectable (R-AF-9); the table
below is the full baseline.
| Category | Baseline items | Licence |
|---|---|---|
| Demux | mp4/mov, matroska/webm, mp3, wav, image2, gif | LGPL |
| Video decode | h264, vp8/vp9, mjpeg, png, gif | LGPL |
| Audio decode | aac, mp3, opus, vorbis, pcm, flac | LGPL |
| Video filters | scale, crop, pad, fps, format, setsar, transpose, overlay, concat, xfade | LGPL |
| Audio filters | amix, adelay, volume, afade, aresample, aformat, alimiter | LGPL |
| Video encode | libx264 (H.264, full/GPL variant); mjpeg, png (thumbnails) | x264 GPL; rest LGPL |
| Audio encode | aac (native), libopus/opus, pcm, flac | LGPL |
| Mux | mp4/mov, matroska/webm, mp3, wav, image2 | LGPL |
| Probe | format=duration (and stream info) across the demux set (R-AF-5) | LGPL |
The list is a starting point to refine during the build (entries may move between the lean/full variants, or drop if they bloat the module disproportionately) — the principle is a general baseline validated by several unrelated workflows, not a single consumer's command. Record the final enabled set in the provenance manifest (§4).
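One way to keep the enabled set and the configure line in sync is to generate the flags from a single table. The sketch below is illustrative Python, not the pipeline's build.sh. The `BASELINE` dict is a hypothetical, abbreviated transcription of the table above, and the component names would need checking against the pinned FFmpeg (for example with `./configure --list-filters`).

```python
# Hypothetical, abbreviated transcription of the baseline table:
# FFmpeg component kind -> component names. Illustrative only; verify the
# names against the pinned FFmpeg before relying on them.
BASELINE = {
    "demuxer": ["mov", "matroska", "mp3", "wav"],
    "decoder": ["h264", "aac", "mp3"],
    "filter": ["scale", "overlay", "concat", "xfade", "amix", "alimiter"],
    "encoder": ["aac", "mjpeg", "png"],
    "muxer": ["mp4", "matroska"],
}


def configure_flags(baseline, gpl_x264=False):
    """Build the ./configure argument list: disable everything, then
    re-enable only what the baseline names. Sorting keeps the line stable,
    so it can be recorded verbatim in the provenance manifest."""
    flags = ["--disable-everything"]
    for kind in sorted(baseline):
        for name in sorted(baseline[kind]):
            flags.append(f"--enable-{kind}={name}")
    if gpl_x264:
        # Full/GPL variant only: x264 is an external GPL library.
        flags += ["--enable-gpl", "--enable-libx264", "--enable-encoder=libx264"]
    return flags


lgpl = configure_flags(BASELINE)
full = configure_flags(BASELINE, gpl_x264=True)
print(" ".join(full))
```

Generating the flags from data rather than editing the line by hand means a lean variant is just a smaller dict, and the recorded set cannot drift from what was actually passed to configure.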
Validation — the proof-of-capability bar. The artifact must run a spread of unrelated
invocations to a valid output inside the guest: e.g. a transcode (mkv→mp4 h264/aac), a
scale, an overlay (-filter_complex), a concat, a single-frame thumbnail, an audio
extract, and — as one example among them — keryx's crossfade reel (looped stills →
xfade chain + amix/alimiter → libx264/AAC mp4 +faststart). These become the
0004/0005 end-to-end tests over the vfs bridge. keryx's reel is a subset of the
baseline, not its definition.
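The validation spread could be written down as a table of argument vectors for a test harness to run. The sketch below is an illustration under assumptions: the file names are hypothetical placeholders, the exact arguments are not taken from this spec, and the keryx reel is left out because its filter graph is not given here.

```python
# Hypothetical validation matrix: workflow name -> ffmpeg arguments
# (without the program name). All paths are placeholders.
SPREAD = {
    "transcode": ["-i", "in.mkv", "-c:v", "libx264", "-c:a", "aac", "out.mp4"],
    "scale": ["-i", "in.mp4", "-vf", "scale=640:-2", "-c:a", "copy", "out.mp4"],
    "overlay": ["-i", "in.mp4", "-i", "logo.png",
                "-filter_complex", "[0:v][1:v]overlay=10:10", "out.mp4"],
    "concat": ["-i", "a.mp4", "-i", "b.mp4",
               "-filter_complex", "[0:v][0:a][1:v][1:a]concat=n=2:v=1:a=1[v][a]",
               "-map", "[v]", "-map", "[a]", "out.mp4"],
    "thumbnail": ["-i", "in.mp4", "-frames:v", "1", "thumb.png"],
    "audio_extract": ["-i", "in.mp4", "-vn", "-c:a", "pcm_s16le", "out.wav"],
}


def inputs(argv):
    """Return every path that directly follows an -i flag."""
    return [argv[i + 1] for i, arg in enumerate(argv[:-1]) if arg == "-i"]


def output(argv):
    """In these invocations the output path is always the final argument."""
    return argv[-1]


for name, argv in SPREAD.items():
    print(f"{name:14s} {inputs(argv)} -> {output(argv)}")
```

Every output container in this matrix (mp4, wav, png via image2) is in the baseline's mux set, which is a cheap sanity check to run before any wasm is built.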
4. Build approach (adapt go-ffmpreg)
- Toolchain, pinned. wasi-sdk (clang → wasm32-wasi) at a fixed version, in a Docker image pinned by digest. The image is the reproducibility boundary (R-AF-6).
- x264 → wasm first. Cross-compile x264 for wasm32 as a static lib FFmpeg links. (Full/GPL variant only; the LGPL variant skips it, see §6.)
- FFmpeg ./configure, adapted from go-ffmpreg's build.sh: --target-os=none --arch=wasm32 --enable-cross-compile --disable-everything, then --enable-* only the §3 set, --enable-gpl --enable-libx264 (full variant).
- Pin FFmpeg to a no-pthreads-requiring release (go-ffmpreg pins n5.1.x for exactly this; see 0001 §6/§9): wazero has no wasm-threads yet, so the encode is single-threaded. The pinned version is recorded in the provenance manifest.
- Emit ffmpeg.wasm + ffmpeg.wasm.json (provenance: ffmpeg/x264/wasi-sdk versions, full configure line, sha256, byte size, build date, variant).
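The final "emit" step can be sketched as follows. This is a minimal illustration in Python, not the pipeline's actual script: the JSON field names (`size_bytes`, `build_date`, `configure`, and so on) are assumptions, the `<pinned>` strings are placeholders rather than real versions, and the demo hashes a stand-in file instead of a real build output.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(wasm_path, *, variant, licence, versions, configure_line, build_date):
    """Collect the provenance fields for one built artifact."""
    data = Path(wasm_path).read_bytes()
    return {
        "variant": variant,
        "licence": licence,
        "versions": versions,
        "configure": configure_line,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "build_date": build_date,
    }


def write_manifest(wasm_path, manifest):
    """Write <artifact>.json next to the artifact, with stable key order."""
    out = Path(f"{wasm_path}.json")
    out.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
    return out


# Demo on a stand-in file; a real run would point at the built ffmpeg.wasm.
with tempfile.TemporaryDirectory() as tmp:
    wasm = Path(tmp) / "ffmpeg.wasm"
    wasm.write_bytes(b"\0asm\x01\0\0\0")  # just the 8-byte wasm header
    manifest = build_manifest(
        wasm,
        variant="full",
        licence="GPL-2.0-or-later",
        versions={"ffmpeg": "n5.1.x", "x264": "<pinned>", "wasi-sdk": "<pinned>"},
        configure_line="--disable-everything --enable-gpl --enable-libx264",
        build_date="2026-06-26",
    )
    manifest_path = write_manifest(wasm, manifest)
    print(manifest_path.read_text())
```

Note that the build date lives only in the manifest. If it were baked into the wasm itself, two builds from the same pins would no longer hash identically.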
build/ layout (proposed):
build/
  Dockerfile          # pinned wasi-sdk toolchain image
  build.sh            # adapted from go-ffmpreg; drives configure + make
  configure.full.sh   # the --enable-* set incl. x264 (GPL)
  configure.lgpl.sh   # the LGPL variant (openh264 / no x264)
  versions.lock       # ffmpeg, x264, openh264, wasi-sdk pins
  README.md           # how to build, how to bump a pin, provenance format
5. Licensing & distribution (D-C / R-AF-10)
Per D-C the build outputs are governed as separate artifacts, decoupled from the Go module's licence:
- The default full/GPL ffmpeg.wasm (with x264) is published as a release/download artifact, never committed raw and never //go:embed-ed into the Go package. .gitignore already excludes internal/wasm/*.wasm; keep it that way. 0004 fetches or loads it at runtime or via build-time wiring (its decision), so the copyleft obligation attaches only to a consumer who bundles it, not to the afmpeg library source.
- The Go module itself is permissively licensed (Apache-2.0 or MIT; confirm in the scaffolding task. The repo currently carries an MIT-style LICENSE).
- An LGPL variant (configure.lgpl.sh: openh264 or no H.264 encoder) is produced by the same pipeline and published alongside, for permissive consumers (R-AF-10). Its H.264 quality/patent caveats are documented (0001 §9, D-C rationale). It is tracked, not gating v1; keryx ships on the GPL/x264 build.
- The provenance manifest records the variant and its licence so a consumer can choose knowingly.
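To illustrate "choose knowingly", a consumer-side check might look like the sketch below. It is an assumption-laden example rather than 0004's design: the manifest keys (`sha256`, `size_bytes`, `licence`), the licence strings, and the `allow_gpl` policy switch are all hypothetical.

```python
import hashlib


def verify_artifact(wasm_bytes, manifest, allow_gpl):
    """Return a list of problems with a downloaded artifact; empty means OK.

    Checks integrity against the manifest and applies a simple consumer
    licence policy. All manifest keys used here are assumed names.
    """
    problems = []
    if hashlib.sha256(wasm_bytes).hexdigest() != manifest["sha256"]:
        problems.append("sha256 mismatch")
    if len(wasm_bytes) != manifest["size_bytes"]:
        problems.append("size mismatch")
    # "LGPL-..." does not start with "GPL", so only the full variant trips this.
    if manifest["licence"].startswith("GPL") and not allow_gpl:
        problems.append("GPL variant not permitted by consumer policy")
    return problems


# Demo with stand-in bytes in place of a real download.
blob = b"stand-in for ffmpeg.wasm"
gpl_manifest = {
    "sha256": hashlib.sha256(blob).hexdigest(),
    "size_bytes": len(blob),
    "licence": "GPL-2.0-or-later",
}
print(verify_artifact(blob, gpl_manifest, allow_gpl=False))
```

A permissive consumer would pass `allow_gpl=False` and fetch the LGPL artifact instead; the library source itself never needs to know which one was chosen.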
6. Requirements
- R-0002-1: The full/GPL build runs the §3 spread of unrelated workflows end-to-end to valid outputs (transcode, scale, overlay, concat, thumbnail, audio extract, and the keryx reel as one example), proving the general baseline, not a single command.
- R-0002-2: The build is reproducible: same pinned inputs → identical ffmpeg.wasm sha256 (R-AF-6). The Docker image and all source versions are digest/tag-pinned in versions.lock.
- R-0002-3: A provenance manifest (ffmpeg.wasm.json) is emitted with versions, the exact configure line, sha256, size, and variant.
- R-0002-4: CGO_ENABLED is irrelevant here (no Go), but the artifact MUST load under a pure-Go wazero runtime (validated by 0004), i.e. plain wasm32-wasi, no host imports beyond WASI.
- R-0002-5: The GPL ffmpeg.wasm is not embedded in or committed to the Go module source (D-C); it is a published artifact.
- R-0002-6: ✅ An LGPL variant builds from the same pipeline (R-AF-10) with openh264 H.264 encode; the AVC patent caveat is documented (ffmpeg-wasi docs/explanation/licensing.md).
- R-0002-7: Module size is recorded; a target ceiling is noted (go-ffmpreg ≈ 7.5 MB gzip as the reference; see 0001 §9).
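The "no host imports beyond WASI" condition in R-0002-4 can be checked statically by reading the module's import section. The sketch below is a small hand-written parser for illustration only: it assumes the only acceptable import module is `wasi_snapshot_preview1` (what a wasm32-wasi build is expected to use), handles just the four core import kinds, and is exercised here on a tiny hand-assembled module rather than a real build.

```python
WASI_MODULES = {"wasi_snapshot_preview1"}  # assumption: WASI preview1 only


def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_name(buf, pos):
    """Decode a length-prefixed UTF-8 name; return (name, next_pos)."""
    length, pos = read_uleb(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(buf, pos):
    """Skip a limits record: flags, min, and max when flag bit 0 is set."""
    flags, pos = read_uleb(buf, pos)
    _, pos = read_uleb(buf, pos)
    if flags & 0x01:
        _, pos = read_uleb(buf, pos)
    return pos


def import_modules(wasm):
    """Return the set of module names that appear in the import section."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        if section_id != 2:  # 2 is the import section
            pos += size
            continue
        count, p = read_uleb(wasm, pos)
        modules = set()
        for _ in range(count):
            module, p = read_name(wasm, p)
            _field, p = read_name(wasm, p)
            kind = wasm[p]
            p += 1
            if kind == 0:    # function: type index
                _, p = read_uleb(wasm, p)
            elif kind == 1:  # table: reference type byte + limits
                p = skip_limits(wasm, p + 1)
            elif kind == 2:  # memory: limits
                p = skip_limits(wasm, p)
            elif kind == 3:  # global: value type byte + mutability byte
                p += 2
            else:
                raise ValueError(f"unsupported import kind {kind}")
            modules.add(module)
        return modules
    return set()


def non_wasi_imports(wasm):
    """Import modules that a plain WASI host would not provide."""
    return import_modules(wasm) - WASI_MODULES


def tiny_module(import_module):
    """Hand-assemble a module with one function import (demo only)."""
    def name(text):
        raw = text.encode("utf-8")
        return bytes([len(raw)]) + raw
    types = bytes([1, 0x60, 0, 0])  # one function type: () -> ()
    imports = bytes([1]) + name(import_module) + name("fd_write") + bytes([0, 0])
    return (b"\0asm\x01\0\0\0"
            + bytes([1, len(types)]) + types
            + bytes([2, len(imports)]) + imports)


print(non_wasi_imports(tiny_module("env")))
```

An empty result is necessary but not sufficient: the module must still instantiate and run under wazero, which 0004 validates.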
7. Definition of done
- just (or build/build.sh) produces ffmpeg.wasm + manifest from a clean checkout.
- The §3 validation spread runs inside a throwaway WASI host (e.g. wasmtime with a temp preopen) to prove capability independently of the Go layers; this is the Phase-1 gate.
- Two consecutive builds yield identical sha256 (R-0002-2).
- A gated, slow CI job builds + verifies the wasm; it does not run on every push.
- build/README.md documents the build, a pin bump, and the provenance format.
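The "two consecutive builds" check could compare the two emitted manifests rather than only the raw digests, which also shows which pin or flag drifted when the hashes differ. A minimal sketch, assuming hypothetical manifest keys and assuming the build date is the only field allowed to vary between runs:

```python
# Fields expected to differ between two otherwise identical builds.
# Assumption: only the build date is volatile.
VOLATILE = {"build_date"}


def manifest_diff(first, second, ignore=VOLATILE):
    """Return the sorted keys whose values differ, skipping volatile ones."""
    keys = (set(first) | set(second)) - set(ignore)
    return sorted(key for key in keys if first.get(key) != second.get(key))


def is_reproducible(first, second):
    """True when the artifact digest and every non-volatile field match."""
    return first["sha256"] == second["sha256"] and not manifest_diff(first, second)


# Demo manifests with placeholder values, not real build outputs.
run_a = {"sha256": "aa" * 32, "size_bytes": 100, "variant": "full",
         "build_date": "2026-06-26"}
run_b = dict(run_a, build_date="2026-06-27")
run_c = dict(run_a, sha256="bb" * 32, size_bytes=104)

print(is_reproducible(run_a, run_b), manifest_diff(run_a, run_c))
```

In CI this would run after the second build and fail the gated job when `is_reproducible` is false, printing the differing keys as the first debugging hint.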
8. Risks (carried from 0001 §9)
- Build maintenance burden — owning an FFmpeg-WASI build is ongoing (version/toolchain drift). Mitigated by pinning + adapting (not forking-and-diverging) go-ffmpreg's build.
- Single-threaded encode — pinned no-pthreads FFmpeg is slow; acceptable for the in-memory edge case (0001 §9). Revisit under 0006 / R-AF-12.
- wazero writable-fs needs (moov atom seek-on-write) surface here as a muxer concern but are validated in 0003; flagged so the configure keeps +faststart workable.
9. Sequencing
Independent — can start immediately, in isolation, before any Go code. 0003/0004 develop
against a stand-in module until this lands, then swap to the real artifact for the R-AF-3
end-to-end test. Reference: go-ffmpreg build.sh.