0001 - afmpeg: pure-Go FFmpeg on a virtual filesystem
Status: APPROVED / INTENT (scaffold: the build has not started). This is the design
and requirements document. The §10 decisions were resolved on 2026-06-26 (see §10) and
the work is decomposed into component specs 0002–0006; those are implemented
test-first, each citing this spec. This remains the source-of-truth thesis.
Date: 2026-06-26 (§10 resolved 2026-06-26)
Provenance: extracted from the keryx ffmpeg-render-binding spike
(keryx/docs/development/spikes/ffmpeg-render-binding.md), which found no existing
option delivered "in-memory + pure-Go + the filters/codecs we need".
1. What & why
afmpeg is a pure-Go FFmpeg binding whose I/O runs on an afero.Fs — including a
fully in-memory filesystem — with no CGO, no host FFmpeg install, and no temp files.
FFmpeg is embedded as a WebAssembly (WASI) module and executed via wazero (a
zero-dependency, pure-Go WASM runtime). Its filesystem syscalls are bridged to an
afero.Fs, so a caller can hand afmpeg in-memory inputs and read in-memory outputs —
the whole pipeline staying in RAM and cross-compiling to a single static binary.
The driving need (keryx): render short reels (xfade crossfade chain + audio mix → H.264/AAC mp4) for in-memory projects (a git remote cloned into RAM, no local checkout) — which today is impossible because keryx shells out to the ffmpeg binary, which needs real files on disk. The spike rejected every existing path; afmpeg is the "wazero + WASM done right" option that fills the gap and is generally reusable.
2. Goals / non-goals
Goals
- Pure Go: CGO_ENABLED=0, cross-compiles, single static binary, no host FFmpeg.
- I/O over afero.Fs — in-memory (MemMapFs), OS (OsFs), or any backend.
- Customer-agnostic and general-purpose. afmpeg is a general ffmpeg toolkit for
any workflow (transcode, scale/crop/pad, overlay, concat, crossfade, thumbnail,
audio extract/mix, mux, …). afmpeg defines the abstraction; consumers adapt to it —
keryx is the first reference customer, not the API author. No consumer's
opinionated structure (e.g. keryx's "reel") is baked into afmpeg's surface.
- A general baseline codec/filter/muxer set covering common workflows (§5 R-AF-3), of
which keryx's reel render is one validating example, not the definition.
- A clean, idiomatic Go API: the universal Run(fs, args…) primitive plus a general,
use-case-agnostic command builder (R-AF-7).
- A reproducible FFmpeg-WASM build pipeline (the hard, separable sub-project).
Non-goals (initially)
- Hardware acceleration (NVENC/VAAPI/VideoToolbox): WASM is CPU-only.
- Real-time / streaming / live capture.
- A full typed libav* object API (à la go-astiav): afmpeg is invocation-first; a typed API is a possible later layer.
- Windows is best-effort, not a launch target (Linux + macOS first).
3. Approach
Three layers; the middle one is the novel engineering.
- Embedded FFmpeg-WASM module (internal/wasm/ffmpeg.wasm). FFmpeg + its deps (x264, etc.) compiled to wasm32-wasi via the wasi-sdk/clang toolchain, with a configure that includes only the needed codecs/filters/muxers (size + build time). Produced by the build pipeline (§6); embedded with //go:embed (or downloaded on first use; §9 open question). go-ffmpreg's build.sh is the reference start point.
- The afero ↔ wazero sys.FS bridge (internal/vfs, the heart of afmpeg). ffmpeg-in-the-guest issues WASI filesystem syscalls (path_open, fd_read, fd_write, fd_seek, …); wazero routes them to a mounted experimental/sys.FS. afmpeg implements that sys.FS backed by an afero.Fs, so the guest's reads and writes hit the caller's afero (e.g. an in-memory MemMapFs) with no host disk. It also provides the /tmp (writable memfs) and /dev/null that the guest may need.
- The Go API (pkg/afmpeg). Instantiate the runtime once (compile the module), then Run(ctx, fs, args…) mounts fs, runs the ffmpeg command, and returns exit/stderr. Higher-level helpers (a timeline/render builder mirroring keryx's provider.Timeline) layer on top.
pkg/afmpeg/ public API (Runtime, Run, options, errors; later: render helpers)
internal/vfs/ afero.Fs → wazero experimental/sys.FS adapter (THE core)
internal/wasm/ embedded ffmpeg.wasm + the wazero module wiring
build/ the FFmpeg→WASI build pipeline (wasi-sdk, configure flags, Docker)
docs/development/specs/
4. The Go API (sketch, not final)
package afmpeg
// Runtime holds the compiled wazero module + runtime; build once, reuse (compilation
// is the expensive step). Safe for sequential reuse; concurrency model is §10 D-D.
type Runtime struct { /* ... */ }
func New(ctx context.Context, opts ...Option) (*Runtime, error) // compiles the module
func (r *Runtime) Close(ctx context.Context) error
// Run executes one ffmpeg invocation with its filesystem bridged to fs (paths in
// args resolve against fs — e.g. "in/cover.png", "out/reel.mp4"). Returns the exit
// code + captured stderr; no host disk is touched.
func (r *Runtime) Run(ctx context.Context, fs afero.Fs, args ...string) (Result, error)
type Result struct { ExitCode int; Stderr string }
// Probe mirrors ffprobe (duration etc.) over the same fs bridge.
func (r *Runtime) Probe(ctx context.Context, fs afero.Fs, path string) (Probe, error)
Usage (keryx-shaped, fully in-memory):
rt, err := afmpeg.New(ctx)
if err != nil {
	return nil, err
}
defer rt.Close(ctx)
fs := afero.NewMemMapFs() // or the caller's in-memory worktree
// ... write cover.png, cards/NN.png, vo/*.mp3, music.mp3 into fs ...
// The xfade filtergraph arguments are elided between the input and the output.
if _, err := rt.Run(ctx, fs, "-i", "in.txt" /* xfade graph */, "out/reel.mp4"); err != nil {
	return nil, err
}
return afero.ReadFile(fs, "out/reel.mp4") // the result, in memory
5. Requirements
Core (MUST)
- R-AF-1 Pure Go: builds with CGO_ENABLED=0 and cross-compiles (linux/macOS,
amd64/arm64) without a C toolchain or host FFmpeg.
- R-AF-2 I/O is afero.Fs-backed: a MemMapFs runs end-to-end with no host
filesystem access (verifiable — deny OS fs in tests).
- R-AF-3 The WASM build carries a general baseline codec/filter/muxer set covering
common workflows — not one customer's graph. Decode of common containers/codecs
(mp4/mov, mkv/webm, mp3, image2) and encode of at least H.264 (libx264) + AAC, plus a
general filter set (scale, crop, pad, fps, format, setsar, overlay,
concat, xfade, and the audio filters amix/adelay/volume/afade/alimiter/
aresample). The exact curated baseline (and a lean vs full variant, R-AF-9) is owned
by spec 0002. keryx's reel set (xfade + libx264 + AAC + mp4 + the audio filters) is a
subset of this baseline — one validating example, not the definition.
- R-AF-4 Run returns the exit code + stderr; a non-zero exit surfaces ffmpeg's
error tail (no silent failures).
- R-AF-5 ffprobe-equivalent duration probing over the same fs bridge (keryx needs
VO clip durations to drive card timing).
- R-AF-6 Deterministic, reproducible WASM build (pinned ffmpeg + deps + toolchain
versions; documented build/); the artifact's provenance is recorded.
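R-AF-4 asks that a non-zero exit surface the tail of ffmpeg's stderr rather than fail silently. A minimal sketch of one way to do that follows. The `Result` type mirrors the §4 sketch; `ExitError`, `CheckResult`, `stderrTail` and the three-line tail length are hypothetical names and choices made for illustration, not part of the spec.

```go
package main

import (
	"fmt"
	"strings"
)

// Result mirrors the §4 sketch: the exit code plus captured stderr.
type Result struct {
	ExitCode int
	Stderr   string
}

// ExitError is a hypothetical error type (an assumption, not the spec's API)
// carrying the exit code and the last lines of ffmpeg's stderr.
type ExitError struct {
	Code int
	Tail string
}

func (e *ExitError) Error() string {
	return fmt.Sprintf("ffmpeg exited with code %d: %s", e.Code, e.Tail)
}

// stderrTail returns the last n non-empty lines of stderr.
func stderrTail(stderr string, n int) string {
	var lines []string
	for _, l := range strings.Split(stderr, "\n") {
		if strings.TrimSpace(l) != "" {
			lines = append(lines, l)
		}
	}
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return strings.Join(lines, "\n")
}

// CheckResult returns nil for a zero exit and an *ExitError otherwise.
func CheckResult(res Result) error {
	if res.ExitCode == 0 {
		return nil
	}
	return &ExitError{Code: res.ExitCode, Tail: stderrTail(res.Stderr, 3)}
}

func main() {
	res := Result{ExitCode: 1, Stderr: "banner\n\nInput #0\nout/reel.mp4: No such file or directory\n"}
	fmt.Println(CheckResult(res))
}
```

Keeping the tail short keeps the error readable in logs, while the full stderr stays available on `Result` for callers that want it.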
SHOULD
- R-AF-7 A general, use-case-agnostic command builder (typed inputs/filtergraph/
outputs/options, with a raw escape hatch on every scope) so callers compose any
ffmpeg invocation without hand-assembling arg slices. Not tied to any workflow; a
consumer's reel/timeline is built on it, in the consumer's code (spec 0005).
- R-AF-8 Context cancellation aborts a running invocation promptly.
- R-AF-9 Pluggable codec/filter sets via alternate WASM builds (a lean build vs a
full build) selected at construction.
- R-AF-10 Licensing-clean option: an LGPL build variant (no GPL components like
x264) alongside the default full/GPL build (§9).
MAY (later), dispatched 2026-06-29 (see 0006 §2):
- R-AF-11 A native (purego/CGO libav) backend for speed: dropped. The libav-direct
engine removes the need, and CGO re-introduces the posture §2 avoids; revisit only if a
concrete capability ffmpeg-wasi can't deliver forces a separate opt-in backend.
- R-AF-12 Performance: moved to spec 0008 (a measurement-first spike;
wasm-threads/SIMD turned out to be unavailable, so the lever is instance-level parallelism /
build tuning, not threads).
- R-AF-13 A thin cmd/afmpeg CLI: moved to spec 0009 (deferred, value-unproven;
job-spec-native only, not the drop-in ffmpeg arg path, which v0.4.0 removed).
6. The FFmpeg→WASI build pipeline (the hard sub-project)
Compiling FFmpeg + x264 to wasm32-wasi is the bulk of the effort and is separable
from the Go binding. Approach:
- Toolchain: wasi-sdk (clang targeting wasm32-wasi); a pinned Docker build for
reproducibility (R-AF-6).
- Build x264 (and any other deps) for wasm first, then FFmpeg ./configure with
--target-os=none --arch=wasm32 --enable-cross-compile, --disable-everything then
--enable-* only the §5 R-AF-3 set (keeps the module small + the build tractable).
- Pin to an FFmpeg version that doesn't require pthreads unless/until wazero
supports wasm-threads (go-ffmpreg pins n5.1.x for this reason — §9 perf).
- Output ffmpeg.wasm; record size + the exact configure line + versions.
- Reference: go-ffmpreg's build.sh (Codeberg) — adapt, extend the codec/filter set.
This sub-project can be spiked/built independently; the Go layers (§3.2/§3.3) can be developed against a stand-in WASM module meanwhile.
7. keryx integration (the first consumer)
keryx is the reference customer — it validates afmpeg, it does not define it.
keryx renders behind a provider.Renderer seam (internal/render/ffmpeg, the default,
shells out to the ffmpeg binary). afmpeg lands as an alternate Renderer
implementation selected by config (providers.render: afmpeg) — no call-site changes
(keryx's pluggable-provider pattern). keryx adapts to afmpeg, not the reverse: keryx
keeps owning its reel decisions (segments, crossfade, encode profile) and builds an
afmpeg command/arg slice for them in keryx's repo (on the R-AF-7 builder or raw
Run); afmpeg carries no reel/timeline types. When afmpeg is usable, keryx's in-memory
render lock-out (spec 0015 D1) lifts: the in-memory worktree's afero fs is handed
straight to afmpeg. Until then keryx stays local-only + native ffmpeg.
8. Phased roadmap
- Phase 1, the WASM build (§6): produce a reproducible ffmpeg.wasm with the R-AF-3 set. De-risks everything; can run as its own spike.
- Phase 2, the vfs bridge + Run (§3.2/§3.3): the afero↔sys.FS adapter, the wazero wiring, Run/Probe, R-AF-1/2/4/5. End-to-end in-memory transcode test.
- Phase 3, render helper + keryx backend (R-AF-7, §7): the timeline helper; wire as a keryx Renderer; validate parity with native ffmpeg output.
- Phase 4, hardening: LGPL build (R-AF-10), perf (R-AF-12), pluggable backends (R-AF-11), CLI (R-AF-13).
9. Risks & open questions
- FFmpeg licensing (the big one). FFmpeg + x264 is GPL; embedding a GPL ffmpeg.wasm makes the combined binary GPL. Fine for a private tool (keryx), a problem for a permissively-licensed library. Options: ship a GPL build and an LGPL build (drop x264; H.264 via openh264 or none), and/or distribute the wasm as a separate artifact rather than //go:embed. Decide before public release.
- Performance. WASM ffmpeg without wasm-threads is single-threaded and therefore materially slower than native encode (the keryx reel is ~30–45 s of H.264; a multi-minute render could be several× slower). Acceptable for the in-memory edge case; keep native ffmpeg as keryx's fast path for local projects. Re-evaluate with R-AF-12.
- Binary size. The embedded wasm is sizeable (go-ffmpreg's ~7.5 MB gzip). Embed vs download-on-first-use vs separate artifact is an R-AF-6/§9 decision.
- wazero writable-fs maturity. The experimental/sys.FS write path must support everything ffmpeg's muxer needs (seek-on-write for the mp4 moov atom, /tmp). Validate early in Phase 2.
- Build maintenance. Owning an FFmpeg-WASI build is ongoing work (version bumps, toolchain drift). Weigh vs the materialise↔readback bridge keryx can use meanwhile.
10. Decisions: RESOLVED 2026-06-26
All five gating decisions were walked through with Matt and resolved before any component spec was drafted. The resolutions below are binding on specs 0002–0006.
- D-A, name. RESOLVED: keep afmpeg (afero + ffmpeg). Confirmed unclaimed: no GitHub repo, no Go module of that name; the afero root is the distinguishing signal vs the existing Go ffmpeg bindings. go.mod (gitlab.com/phpboyscout/afmpeg) stands.
- D-B, WASM source. RESOLVED: adapt go-ffmpreg's build.sh as the proven wasi-sdk start point, extending its ./configure to add the R-AF-3 set (xfade, AAC, the audio filters). Not from scratch; not interim-stock-only. Owned by spec 0002.
- D-C, licensing posture. RESOLVED: permissive Go package + separable GPL wasm. The Go code is permissively licensed (Apache-2.0/MIT; confirm in 0002); the GPL/x264 ffmpeg.wasm ships as a separate downloadable artifact, never //go:embed-ed, so the copyleft obligation only attaches to a consumer who fetches and bundles it. keryx uses the GPL/x264 build now (private; top quality). An LGPL/openh264 variant is tracked (R-AF-10) for public/permissive consumers. Rationale: x264 is the single GPL piece in keryx's render. AAC (native LGPL encoder), xfade, scale/fps, amix/adelay/volume/afade/alimiter and the mp4 mux are all already LGPL-clean, so the GPL surface is one separable encoder, not the whole pipeline. Owned by spec 0002.
- D-D, concurrency model. RESOLVED (provisional): one invocation at a time per Runtime, with a pooled/parallel mode as a documented follow-up. Final shape and the instance-pool design are owned by spec 0004 (its local decision against the wazero module wiring).
- D-E, scope of v1. RESOLVED: raw Run(ctx, fs, args…) + Probe first (the novel bridge + invocation core; specs 0003/0004). The general command builder (R-AF-7) follows as spec 0005, once Run is proven end-to-end.
- D-F, customer-agnostic surface. RESOLVED 2026-06-27 (Matt): afmpeg is a general-purpose ffmpeg toolkit; no consumer's opinionated structure is baked into its API. The first reel-shaped render helper (a verbatim port of keryx's buildArgs) was rejected and reverted before merge; R-AF-7 is reframed as a general command builder and keryx's reel moves into keryx's repo, built on afmpeg. The §5 R-AF-3 codec set is a general baseline, not "keryx's set". keryx adapts to afmpeg (§7).
Component spec map (the compartmentalised work)
| Spec | Phase (§8) | Scope | Requirements owned |
|---|---|---|---|
| 0002 wasm-build-pipeline | 1 | FFmpeg+x264 → wasm32-wasi (adapt go-ffmpreg); reproducible Docker build; licence variants | R-AF-3, R-AF-6, R-AF-10; D-B, D-C |
| 0003 vfs-bridge | 2 | afero.Fs → wazero experimental/sys.FS adapter (the core); /tmp, /dev/null; seek-on-write | R-AF-2 |
| 0004 runtime-and-api | 2 | New/Run/Probe/Result/Close; module wiring; stderr/exit; ctx-cancel | R-AF-1, R-AF-4, R-AF-5, R-AF-8; D-D, D-E |
| 0005 command-builder | 3 | R-AF-7 general, use-case-agnostic ffmpeg command builder (inputs/filtergraph/outputs/options + raw escape hatch); RunCommand. keryx's reel is built on it, in keryx's repo | R-AF-7 |
| 0006 hardening-roadmap | 4 | dispatched: LGPL build-out + download-cache done; perf → 0008; CLI → 0009; native backend dropped | R-AF-10 (done) |
| 0007 libav-direct-engine | - | the pivot: ffmpeg-wasi libav-direct engine + job-spec vocabulary (supersedes 0002) | R-AF-3, R-AF-6, R-AF-10 |
| 0008 performance-strategy | 4 | spike: measure Wasm-encode perf vs native; non-threaded levers (RuntimePool, build tuning) | R-AF-12 |
| 0009 afmpeg-cli | 4 | deferred (value-unproven): job-spec-native cmd/afmpeg, never ffmpeg-arg-compatible | R-AF-13 |
| 0010 signed-release-acquisition | 4 | certified WithModuleRelease (KMS-signed checksum + provenance); BYO WithModuleURL stays uncertified | R-AF-14 |
| 0011 wkd-attestation | 4 | fast-follow: domain-rooted second attestation layer vs the GitLab-compromise / poisoned-well gap | R-AF-15 |
11. Alternatives considered
See the keryx spike (keryx/docs/development/spikes/ffmpeg-render-binding.md): ffgo
(purego — immature), go-astiav (CGO — posture-breaking), go-ffmpreg (wazero — missing
filters/AAC, not vfs-integrated), materialise↔readback bridge (keryx's interim, no new
dep). afmpeg is the maintained "wazero + WASM + afero-vfs" synthesis those lack.
12. Dev method / DoD
Mirror keryx: TDD (failing test → code → green CI), tests assert no host-fs access
for the in-memory path (the core guarantee), reproducible-build check, docs per
component. The WASM build pipeline gets its own CI (slow; gated). CGO_ENABLED=0 is a
hard CI gate.