01 Motivation
Why move beyond containers for serverless?
Container-based FaaS platforms suffer predictable cold-start penalties: a cold invocation can incur 100–900 ms of container image pull, image extraction, and runtime initialization before a single line of user code executes.
WebAssembly's ahead-of-time (AOT) compilation and module-level isolation eliminate this overhead class entirely. A Wasm module starts in microseconds, with a security sandbox as strong as a container but orders of magnitude lighter.
This work validates that claim under production-realistic, bursty camera-event workloads on Kubernetes.
02 System Architecture
Event pipeline with dual-runtime dispatch
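Every event crossing the pipeline is wrapped in the runtime-neutral CloudEvents envelope, so both the SpinKube and Knative paths consume the same payload shape. A sketch of what a camera event might look like in CloudEvents 1.0 JSON (the `type`, `source`, and `data` fields are illustrative assumptions, not taken from the system):

```json
{
  "specversion": "1.0",
  "type": "com.example.camera.motion",
  "source": "/edge/cameras/cam-17",
  "id": "9c3a5f0e-1b2d-4e6f-8a90-123456789abc",
  "time": "2025-01-01T12:00:00Z",
  "datacontenttype": "application/json",
  "data": {
    "confidence": 0.93,
    "bbox": [120, 80, 310, 240]
  }
}
```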
03 Latency Results
End-to-end p99 comparison
2× latency reduction — WaaS vs FaaS at edge inference

WaaS (SpinKube) · p50: ~22ms · p95: ~55ms · cold-start overhead: 22% of p99
FaaS (Knative) · p50: ~110ms · p95: ~230ms · cold-start overhead: 58% of p99
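The p50/p95/p99 figures above are computed from sampled end-to-end request latencies. A minimal nearest-rank percentile sketch in Rust (one of several standard estimators; the sample values are illustrative, not measured data):

```rust
/// Nearest-rank percentile over an ascending-sorted sample set.
fn percentile(sorted_ms: &[f64], p: f64) -> f64 {
    // Index of the sample closest to the requested rank.
    let idx = ((p / 100.0) * (sorted_ms.len() as f64 - 1.0)).round() as usize;
    sorted_ms[idx]
}

fn main() {
    // Illustrative latency samples in milliseconds (not measured data).
    let mut samples = vec![6.0, 9.0, 12.0, 15.0, 18.0, 20.0, 22.0, 24.0, 55.0, 480.0];
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    println!("p50 = {}ms", percentile(&samples, 50.0));
    println!("p95 = {}ms", percentile(&samples, 95.0));
    println!("p99 = {}ms", percentile(&samples, 99.0));
}
```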
04 Cold Start Distribution
Startup latency histogram
WaaS — tight distribution (5–20ms)
FaaS — fat tail (outliers beyond 500ms)
AOT-compiled Wasm modules load and instantiate in microseconds. Container-based FaaS exhibits a long tail driven by image pulls, namespace creation, and JIT warm-up — overhead that is unavoidable when scaling from zero.
05 WaaS vs FaaS
Design trade-off comparison
06 Key Insights
Engineering findings at scale
01
AOT compilation is the decisive advantage
Wasm modules compiled ahead-of-time avoid JIT warm-up entirely. Under sustained burst loads of 50+ concurrent devices, this cuts cold-start overhead from 58% of total p99 latency (FaaS) to 22% (WaaS).
02
KEDA is runtime-agnostic — a genuine shared primitive
Both SpinKube and Knative consume the same KEDA ScaledObject targeting the camera.enriched Kafka consumer group lag. Autoscaling policy is identical; the runtime is the only variable.
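A sketch of such a ScaledObject with KEDA's Kafka lag trigger — the `camera.enriched` consumer group comes from the pipeline above, while the object name, target deployment, topic, broker address, and thresholds are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: camera-events-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: inference-service           # swap in the SpinKube or Knative workload
  minReplicaCount: 0                  # scale-from-zero
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.kafka.svc:9092   # assumed broker address
        consumerGroup: camera.enriched           # lag source shared by both runtimes
        topic: camera-events                     # hypothetical topic name
        lagThreshold: "10"                       # assumed scaling threshold
```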
03
Cilium eBPF closes the observability gap
Traditional container-centric tracing misses Wasm execution boundaries. Cilium network-level eBPF tracing captures request flows across both runtimes uniformly, without modifying application code.
04
Workload placement: Wasm wins at the edge, containers win for heavy I/O
Wasm's lightweight isolation excels in short-lived, latency-sensitive inference tasks. Stateful, I/O-heavy operations (ML training, large file processing) remain better suited to full containers.
07 Technology Stack
Kubernetes-native primitives
SpinKube
Wasm runtime operator
Knative
FaaS eventing layer
KEDA
Event-driven autoscaler
Rust
Programming language · safe concurrency
Kafka
Event transport layer
CloudEvents
Neutral event format
Wasm
Portable bytecode · sandboxed exec
Next.js
Live dashboard UI
08 Future Work
· Multi-tenancy sandboxing benchmarks (component model)
· WasmEdge GPU passthrough for inference acceleration
· Formal cold start model: M/G/1 queue w/ Wasm arrival process
· Wasm OCI artifact registry integration (wasm32-wasi targets)