Ramen, Beans, and Potatoes, Oh My!
Unparalleled work-life balance, frequent travel to exotic locales for conferences, and plentiful free food.
E-graphs are a data structure that allows us to represent many equivalent programs at once. This post won’t get into background on e-graphs, so check out previous blog posts or the original egg talk. A variety of research projects at PLSE use e-graphs because they are a powerful and flexible tool for performing program optimization. Unfortunately, e-graphs suffer from a serious problem: their run-time performance is unpredictable.
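As a toy illustration of the core idea, here is a minimal sketch of how a union-find over e-class ids lets one structure stand for many equivalent programs. This is nothing like egg’s real implementation (which also maintains congruence via rebuilding), and the mini-language and operator names are made up for the example:

```python
# Toy e-graph: an e-node is (op, child_eclass_ids); a union-find over
# e-class ids records which e-classes have been declared equal, so one
# e-class can stand for many syntactically different programs.

class EGraph:
    def __init__(self):
        self.parent = {}    # union-find parent pointers over e-class ids
        self.hashcons = {}  # canonical e-node -> e-class id
        self.next_id = 0

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def add(self, op, children=()):
        node = (op, tuple(self.find(c) for c in children))
        if node not in self.hashcons:
            self.parent[self.next_id] = self.next_id
            self.hashcons[node] = self.next_id
            self.next_id += 1
        return self.find(self.hashcons[node])

    def merge(self, a, b):
        # Declare two e-classes equal. A real e-graph (e.g. egg) would
        # also restore congruence afterward; this sketch omits that.
        self.parent[self.find(a)] = self.find(b)

# Record that (a * 2) and (a << 1) are equivalent without picking one:
eg = EGraph()
a = eg.add("a")
times = eg.add("*", (a, eg.add("2")))
shift = eg.add("<<", (a, eg.add("1")))
eg.merge(times, shift)
assert eg.find(times) == eg.find(shift)
```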
I like compilers. I like how they whittle away inefficiencies into a tight core of NEON instructions. There’s something satisfying about opening up Compiler Explorer and seeing your code snippet turned into unfathomably fast vectorized code.
Every project needs to store and visualize data. Most of us have generated a static plot with matplotlib from data in a CSV file. However, when the data format becomes slightly more complicated and the data size increases, both correctness and performance issues arise. This blog post will try to convince you that databases and interactive visualization provide a better pipeline: it is less error-prone, makes processing your data easier and faster, and shortens the feedback loop, all without necessarily adding to your workload.
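As a concrete sketch of the database half of that pipeline, the snippet below loads a CSV into SQLite and aggregates it with one declarative query, using only the Python standard library. The file name and columns (runs.csv with benchmark, config, runtime_ms) are hypothetical:

```python
# Load a CSV into SQLite, then aggregate with a declarative query
# instead of hand-rolled list reshaping. Standard library only.
import csv
import sqlite3

conn = sqlite3.connect("runs.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS runs (benchmark TEXT, config TEXT, runtime_ms REAL)"
)

with open("runs.csv", newline="") as f:
    rows = [
        (r["benchmark"], r["config"], float(r["runtime_ms"]))
        for r in csv.DictReader(f)
    ]
conn.executemany("INSERT INTO runs VALUES (?, ?, ?)", rows)
conn.commit()

# One query replaces an error-prone group-then-average loop:
for benchmark, mean_ms in conn.execute(
    "SELECT benchmark, AVG(runtime_ms) FROM runs GROUP BY benchmark"
):
    print(f"{benchmark}: {mean_ms:.2f} ms")
```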
Although most programming tasks today involve reasoning about concurrency at a higher level of abstraction, such as transactions or managed thread pools, low-level concurrent programming maneuvers that fiddle with atomic instructions and locks never go out of fashion. In my humble opinion, the challenge of designing and debugging those programs perfectly resembles that of solving a mathematical puzzle: the description of the problem is deceptively simple, since most of them are trivial if done sequentially, but the solution that enables concurrency is often subtle, and worse, reasoning about why a solution works (or does not work) can be terribly complicated.
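A minimal taste of that subtlety, sketched in Python: incrementing a shared counter is trivial sequentially, but concurrently the read-modify-write can interleave and lose updates. Whether you actually observe lost updates depends on the interpreter version and timing; the point is that nothing in the language guarantees atomicity, and a lock restores the sequential reasoning:

```python
# Incrementing a shared counter: trivial sequentially, subtle concurrently.
# `counter += 1` is a read-modify-write, and Python makes no promise that
# it is atomic, so two threads can both read the same old value and one
# update gets lost.
import threading

counter = 0
lock = threading.Lock()

def bump_racy(n):
    global counter
    for _ in range(n):
        counter += 1  # load, add, store: a thread switch can land in between

def bump_locked(n):
    global counter
    for _ in range(n):
        with lock:  # the whole read-modify-write happens under the lock
            counter += 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run(bump_racy))    # may fall short of 400000, depending on interpreter/timing
print(run(bump_locked))  # always 400000
```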
PyTorch is a popular open-source tensor library for machine learning (ML) and scientific computing in Python.
It’s especially popular among the research community because of its active open-source community and its flexibility for experimenting with new ML architectures.
For all of its benefits, it has a clear drawback compared to other ML frameworks like TensorFlow.
It’s slow!
Recent work from the PyTorch team at Meta attempts to bridge the flexibility-performance gap with torch.compile, a feature that speeds up PyTorch code with compilation.
In this blog post, I’ll discuss the motivation for torch.compile and its implementation as a Python-level just-in-time (JIT) compiler called TorchDynamo.
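Before diving into the internals, here is a minimal usage sketch (it requires PyTorch 2.x, and the function f is a made-up example):

```python
# Speed up an ordinary PyTorch function with torch.compile (PyTorch 2.x).
# TorchDynamo captures the tensor operations at runtime and hands them to
# a compiler backend (TorchInductor by default).
import torch

def f(x, y):
    return torch.sin(x) + torch.cos(y)

compiled_f = torch.compile(f)

x = torch.randn(10_000)
y = torch.randn(10_000)

# The first call triggers JIT compilation; subsequent calls with
# similar inputs reuse the compiled code.
out = compiled_f(x, y)
assert torch.allclose(out, f(x, y))
```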
This is the first part of a two-part blog post. In this post, I’ll set up the problem we’re trying to solve, and a future post will go into more detail about how our solution works.
As programming languages researchers, we’re entitled to a certain level of mathematical rigor behind the languages we write and analyze. Programming languages have semantics, which are definitions of what statements in the language mean. We can use those semantics to do all sorts of useful things, like error checking, compiling for efficiency, code transformation, and so on.
This post is also available as a GitHub repo.
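To make “semantics” concrete, here is a toy sketch: a definitional interpreter for a made-up three-construct expression language, where the meaning function itself serves as the semantics (Python 3.10+ for match):

```python
# The meaning of each expression is defined by structural recursion over
# its syntax. The three-construct language here is made up; real semantics
# also cover state, control flow, errors, and more.

def meaning(expr):
    match expr:
        case ("num", n):
            return n
        case ("add", e1, e2):
            return meaning(e1) + meaning(e2)
        case ("mul", e1, e2):
            return meaning(e1) * meaning(e2)
        case _:
            raise ValueError(f"ill-formed expression: {expr!r}")

# With meanings pinned down, "this rewrite is correct" becomes a claim we
# can state precisely: x + x and 2 * x denote the same number.
lhs = ("add", ("num", 7), ("num", 7))
rhs = ("mul", ("num", 2), ("num", 7))
assert meaning(lhs) == meaning(rhs) == 14
```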
As a follow-up to Alexandra’s recent post about speculative execution and how your mental model of a computer is probably wrong, I thought I’d give another example of how our programming abstraction has drifted from the underlying hardware implementation, this time with respect to when shared memory operations are visible to another program. These specifications, called memory consistency models (also just memory models or MCMs), are unfortunately tricky to understand, and their official specifications are famously ambiguous.
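For a preview of why these models are tricky, here is the classic store-buffering litmus test, sketched as a runnable Python program. Be warned that CPython threads are very unlikely to exhibit the weak outcome; the equivalent C or assembly on real hardware can, and the sketch only shows the shape of the test:

```python
# Store buffering, the classic litmus test. Under sequential consistency,
# at least one thread must observe the other's store, so r1 == r2 == 0 is
# impossible. Weaker models (e.g. x86-TSO, where stores wait in a store
# buffer past later loads) additionally allow (0, 0). CPython threads are
# unlikely to ever show that outcome; C on real hardware can.
import threading

x = y = 0
r1 = r2 = None

def thread_a():
    global x, r1
    x = 1   # store x
    r1 = y  # load y

def thread_b():
    global y, r2
    y = 1   # store y
    r2 = x  # load x

ta = threading.Thread(target=thread_a)
tb = threading.Thread(target=thread_b)
ta.start()
tb.start()
ta.join()
tb.join()

# Sequentially consistent outcomes: (0, 1), (1, 0), (1, 1).
# A weak memory model also permits (0, 0).
print((r1, r2))
```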