Towards Software Containers with Guaranteed Reproducibility
December 04, 2018 at 12:00pm (lunch talk)
Deterministic execution is a capability with many upsides. It avoids heisenbugs and flaky tests; mitigates data races; prevents side-channel attacks; allows seamless fault-tolerance via active/active replication of stateful computations; and even aids in provenance tracking – such as ensuring binary packages contain only legitimate source code. Yet, for all this, determinism is rarely enforced with any rigor. (Blockchain smart contracts are the limited exception that proves the rule.) Applications that require determinism leave it as an informal requirement on the programmer. We don’t trust developers for safety properties like memory isolation, and neither should we for determinism.
In this talk, I will show how system software can enforce determinism at the container level, without changing the existing Linux system call interface, the kernel, or the x86-64 ISA. We call these reproducible containers, because the output of any computation run in the container is a function only of its input state, and outputs can be replicated and validated on another machine (even one that differs in core count, kernel version, etc.). This addresses reproducibility problems we observe when running the Artifact Evaluation process for ACM conferences, and in the Debian Reproducible Builds project.
The technical challenge in building a reproducible container is twofold: (1) define a deterministic semantics for system resources that respects existing specifications and admits parallelism while minimizing synchronization, and (2) enable efficient interception and modification of application behavior in userland. I will describe two corresponding improvements to the state of the art: first, our instruction punning technique for in-place modification of running x86-64 code (without pausing threads); and, second, an extension to the traditional Kendo algorithm for deterministic logical clocks, which enables scalable parallel file-system access within a deterministic container.
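To make the idea of deterministic logical clocks concrete, here is a minimal sketch of the basic Kendo-style turn rule, not the extension presented in the talk. It assumes a simplified model: each thread advances its own logical clock by a fixed amount per unit of work, and may take a shared lock only when its (clock, thread id) pair is the smallest among all live threads. The function name `run`, the `("work", n)` and `("acquire",)` operations, and the seeded random scheduler standing in for physical timing are all illustrative assumptions.

```python
import random

def run(programs, schedule_seed):
    """Simulate threads under a random 'physical' schedule.

    programs: one list of ops per thread; an op is ("work", n) or ("acquire",).
    Returns the lock acquisition order as (thread_id, logical_clock) pairs.
    """
    rng = random.Random(schedule_seed)   # stands in for nondeterministic timing
    clocks = [0] * len(programs)         # one logical clock per thread
    pcs = [0] * len(programs)            # index of each thread's next op
    live = {t for t, ops in enumerate(programs) if ops}
    order = []
    while live:
        tid = rng.choice(sorted(live))   # the scheduler picks an arbitrary thread
        op = programs[tid][pcs[tid]]
        if op[0] == "work":
            clocks[tid] += op[1]         # clock depends only on the thread's own work
        else:
            # Turn rule: proceed only if this thread holds the minimum
            # (clock, id) pair among all live threads; otherwise spin.
            my_turn = all((clocks[tid], tid) < (clocks[o], o)
                          for o in live if o != tid)
            if not my_turn:
                continue
            order.append((tid, clocks[tid]))
            clocks[tid] += 1             # acquiring also advances the clock
        pcs[tid] += 1
        if pcs[tid] == len(programs[tid]):
            live.discard(tid)            # finished threads no longer block others
    return order

programs = [
    [("work", 5), ("acquire",), ("work", 2), ("acquire",)],
    [("work", 3), ("acquire",), ("work", 10), ("acquire",)],
    [("acquire",), ("work", 4), ("acquire",)],
]
# The acquisition order is the same regardless of the physical schedule.
orders = {tuple(run(programs, seed)) for seed in range(20)}
print(len(orders))
```

Because each clock is driven only by the thread's own progress, the order in which threads win the lock is fixed by the program rather than by timing. The cost is that a thread may have to wait for slower threads to catch up, which is the kind of synchronization overhead that a scalable design tries to minimize.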
Ryan Newton comes from South Florida and received his Ph.D. in computer science from MIT in 2009, advised by Arvind and Samuel Madden. His thesis introduced techniques for efficiently distributing a single logical program over a sensor network. From 2009 through 2011, Ryan was an engineer in the developer products division at Intel, where he worked on parallel programming tools including CnC and Cilk Plus. Since 2011, Ryan has led a group of PL and systems researchers chasing the dream of programming against strong abstractions while also achieving high performance on today’s hardware.