Saving the World from Spreadsheets
April 2, 2019 at 12:00pm (lunch talk)
Spreadsheets are one of the most widely used programming environments, with roughly 1 billion users of Microsoft Excel alone. Unfortunately, spreadsheets make it all too easy to make errors that go unnoticed. These errors can have catastrophic consequences because spreadsheets are widely deployed in domains like finance and government. For instance, the infamous “London Whale” incident in 2012 cost JP Morgan approximately $2 billion; this was due in part to a spreadsheet programming error. In another case, a Harvard economic analysis was used to support austerity measures imposed on Greece after the 2008 worldwide financial crisis. These austerity measures led to widespread protests and economic dislocation. The analysis was based on a single large spreadsheet, which was later found to contain numerous errors; when fixed, its conclusions were reversed.
Our research aims to dramatically reduce the risk of spreadsheet errors by developing algorithms that can effectively and accurately find them. This is challenging because traditional analyses for conventional programming languages do not apply in the spreadsheet domain (for example, spreadsheets don’t segfault). In this talk, I will present two systems we have developed that effectively find errors in spreadsheets: CheckCell uses a combination of program analysis and statistical analysis to automatically find likely data errors, while ExceLint combines program analysis with an information-theoretic approach to find likely formula errors. We implemented both of these as plugins for Microsoft Excel; both are principled, fast, and accurate (e.g., ExceLint’s median precision and recall are 1).
This work is joint with Dan Barowy (now a professor at Williams College) and Ben Zorn (Microsoft Research).
Emery Berger is a Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst, the flagship campus of the UMass system. He graduated with a Ph.D. in Computer Science from the University of Texas at Austin in 2002. Professor Berger has been a Visiting Scientist at Microsoft Research and at the Universitat Politècnica de Catalunya (UPC) / Barcelona Supercomputing Center (BSC). Professor Berger’s research spans programming languages, runtime systems, and operating systems, with a particular focus on systems that transparently improve reliability, security, and performance. He and his collaborators have created a number of influential software systems including Hoard, a fast and scalable memory manager that accelerates multithreaded applications (used by companies including British Telecom, Cisco, Crédit Suisse, Reuters, Royal Bank of Canada, SAP, and Tata, and on which the Mac OS X memory manager is based); DieHard, an error-avoiding memory manager that directly influenced the design of the Windows 7 Fault-Tolerant Heap; and DieHarder, a secure memory manager that was an inspiration for hardening changes made to the Windows 8 heap. His honors include a Microsoft Research Fellowship, an NSF CAREER Award, a Lilly Teaching Fellowship, the Distinguished Artifact Award for PLDI 2014, the Most Influential Paper Award at OOPSLA 2012, the Most Influential Paper Award at PLDI 2016, three CACM Research Highlights, a Google Research Award, a Microsoft SEIF Award, and Best Paper Awards at FAST, OOPSLA, and SOSP; he was named an ACM Senior Member in 2010. Professor Berger is currently serving as an elected member of the SIGPLAN Executive Committee; he served for a decade (2007-2017) as Associate Editor of the ACM Transactions on Programming Languages and Systems, and was Program Chair for PLDI 2016.