Programming with Initial Value Problems
I like compilers. I like how they whittle away inefficiencies into a tight core of NEON instructions. There’s something satisfying about opening up Compiler Explorer and seeing your code snippet turned into unfathomably fast vectorized code.
I also really enjoy simulations. Seriously, just give me some popcorn and a fluid simulation and I’ll be content. I’ve recently enjoyed the series of videos: Simulating Fluids, then Rendering Fluids, and finally Planetary Fluid Sim.
In graduate school, I’ve been fortunate enough to work on blending those two interests in rich and surprising ways. I’m not literally working on simulation code, but the same fundamental mathematics pops up in unexpected places.
Typically the “language” of simulations is that of differential equations.
For instance, math libraries have lots of mini differential equations lurking around.
They are at the heart of not only functions like sin and exp but also less friendly ones like asin or ln.
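To make that concrete: exp, for example, is pinned down by the tiny initial value problem
\[y' = y, \qquad y(0) = 1\]
and sin and cos satisfy a similarly small coupled system. That’s exactly the kind of structure the rest of this post builds on.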
My collaborators and I are developing a language that includes a particular kind of differential equation called an initial value problem (IVP). Our goal is to demonstrate how compilers and other domain-specific languages can leverage IVPs to automatically optimize and reason about user-defined transcendental functions.
In this post, I’m gonna share some of the fundamentals that we’re building on. I recommend checking out my previous post for some additional context.
Polynomial IVPs
First we need to build our language using a fundamental structure: the polynomial. We’ll show later how limiting ourselves to polynomials and IVPs gives us access to powerful algorithms. We can define polynomials syntactically with the following grammar:
\[e ::= c ~|~ x ~|~ e + e ~|~ e * e\]which defines c to represent rational constants, x to represent variables, and then addition and multiplication recursively.
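To make that grammar concrete, here’s a minimal sketch of how it might look as a plain Python AST. The class names and representation are my own illustration, not anything from our actual implementation:
from dataclasses import dataclass
from fractions import Fraction

# One class per production in e ::= c | x | e + e | e * e
@dataclass(frozen=True)
class Const:            # rational constant c
    value: Fraction

@dataclass(frozen=True)
class Var:              # variable x
    name: str

@dataclass(frozen=True)
class Add:              # e + e
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Mul:              # e * e
    left: "Expr"
    right: "Expr"

Expr = Const | Var | Add | Mul   # requires Python 3.10+

# Example: the polynomial 2*x + x*x
example = Add(Mul(Const(Fraction(2)), Var("x")), Mul(Var("x"), Var("x")))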
Here we add in our special ingredient:
the initial value problem.
The idea is to include a new expression that
represents a function defined by an IVP.
Fortunately, we only need two ingredients:
the value of the function at zero and an expression for the derivative.
It looks like this:
\[\text{ivp}\left(y, c, e_d\right)^+\]where y is a newly introduced variable and c is a number representing the value of the function at zero. This is often called the initial condition. The subexpression e_d is the derivative of the function we want to define.
The + syntax means “one or more of”. In this case, we mean multiple tuples and not the entire IVP statement.
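Continuing the illustrative AST from above, the IVP expression might be represented as a list of (variable, initial value, derivative expression) tuples. Again, this is just my sketch of one possible encoding:
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Ivp:
    # One or more (y, c, e_d) tuples: the bound variable, its value at zero,
    # and an expression for its derivative. The derivative expression may
    # refer to any of the variables bound in this same Ivp node.
    defs: tuple[tuple[str, Fraction, "Expr"], ...]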
This definition is a totally super clear and obvious way to define functions, right?
Just kidding!
Let’s break down this new expression with an example.
How about we encode functions we know and love, like sin?
We have three pieces we need to fill out.
The first placeholder is the variable y. Our function is sin so let’s bind the variable s. The second is the value of sin at 0, which is 0. Third, we need the derivative of sin, which is cos.
Here’s the funny part about cos: we don’t have it, but we can encode it simultaneously. This is where the + syntax comes in handy. It allows us to define multiple functions at once. We just need a variable, c, the value cos(0) = 1, and the derivative -sin.
Finally, we encode sin as:
\[\text{ivp}\left(s, 0, c\right)\left(c, 1, -s\right)\]notice how we used -s as the derivative of cos. We don’t need any external functions so the entire IVP is self-contained.
Incredibly, this one-line definition is all we need to completely define sin and cos. That is the expressive power of IVPs.
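If you want to convince yourself of that claim numerically, you can integrate the coupled system straight out of the IVP, s' = c and c' = -s starting from (0, 1), and compare against the library functions. Here’s a quick sanity check in plain Python (a classic RK4 step, nothing from the actual language or tooling):
import math

def deriv(s, c):
    # The derivative tuples from the IVP: s' = c, c' = -s
    return c, -s

def rk4_step(s, c, h):
    # One fourth-order Runge-Kutta step for the coupled system
    k1s, k1c = deriv(s, c)
    k2s, k2c = deriv(s + h/2*k1s, c + h/2*k1c)
    k3s, k3c = deriv(s + h/2*k2s, c + h/2*k2c)
    k4s, k4c = deriv(s + h*k3s, c + h*k3c)
    return (s + h/6*(k1s + 2*k2s + 2*k3s + k4s),
            c + h/6*(k1c + 2*k2c + 2*k3c + k4c))

# Initial conditions from the IVP: sin(0) = 0, cos(0) = 1
s, c, h = 0.0, 1.0, 1e-3
for _ in range(1000):            # integrate out to x = 1
    s, c = rk4_step(s, c, h)

assert abs(s - math.sin(1.0)) < 1e-9
assert abs(c - math.cos(1.0)) < 1e-9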
Automatic Theorem Proving with Polynomial IVPs
As much fun as it is just building trees, we want to actually do something with them. In my previous post I talked about proving mathematical identities using ODEs. We can apply those same ideas to the language we just defined in this post.
The rough idea is this:
Imagine we want to prove that sin(x)^2 + cos(x)^2 = 1. We can do so by first encoding sin and cos as IVPs as above and then checking two additional things.
First, that the left- and right-hand sides evaluate to the same value at their initial condition.
As a technical note, we’re using s and c below in a different way from above. Above, they are bound variables inside the IVP expressions. Below, they are shorthand for the full IVP definition. Roughly speaking, read c below as “get me the corresponding c tuple from the IVP definition above”.
Second, that the derivatives form the same “cycle”.
\[\begin{aligned} &\partial_x \left(s^2 + c^2 \right) \\ &= \partial_x(s^2) + \partial_x(c^2) \\ &= 2*s*c + 2*c*(-1*s) \\ &= 2*(s*c - c*s) \\ &= 2*(0) \\ &= 0 \\ \end{aligned}\]and d/dx(1) = 0. In this case the “cycle” is that both the left and right sides have literally the same derivative, 0.
We strategically defined IVPs to work nicely with this procedure.
Any expression built using the grammar e can work with this equality test.
However, this is still manual and it turns out that we can do better. When we restrict ourselves to expressions built using e, we can automatically prove equality.
In particular, the “find the same cycle” problem reduces to a polynomial division. This is described in great detail in this paper. I’ll give the gist here.
Imagine we have two polynomials g, h and we want to prove g(x) = h(x). First, we let f(x) be g(x) - h(x) and instead prove f(x) = 0.
This is just a more standard representation of the equality.
Next, we want to find some cycle in the derivatives of f. In particular, we want to find another polynomial P such that the n-th derivative of f is a polynomial in the lower-order derivatives. In math:
\[f^{(n)} = P\left(f^{(0)}, f^{(1)}, \ldots, f^{(n-1)}\right)\]where the superscript n is terse notation for the n-th derivative of f.
We can find P using a technique from algebraic geometry called an Ideal membership test. This, in conjunction with n point evaluations:
\[f^{(0)}(0) = f^{(1)}(0) = \cdots = f^{(n-1)}(0) = 0\]will, amazingly, either find a P or find a counterexample to equality.
However, it’s unclear up front how many derivatives we need to take to find P or a counterexample.
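To make the whole procedure a bit more tangible, here’s a rough sketch of that loop in sympy. The derivative rules, helper names, and the max_derivatives cutoff are all my own stand-ins; the real algorithm (and its termination story) is in the paper linked above:
from sympy import symbols, groebner, expand

s, c = symbols("s c")
rules = {s: c, c: -s}            # derivative of each IVP-bound variable

def d(poly):
    # Differentiate a polynomial in the IVP variables via the chain rule
    return expand(sum(poly.diff(v) * dv for v, dv in rules.items()))

def prove_zero(f, init, max_derivatives=10):
    # Try to show f = 0 by finding a derivative that lies in the ideal
    # generated by the earlier derivatives, plus point checks at x = 0.
    derivs = [expand(f)]
    for _ in range(max_derivatives):
        if any(der.subs(init) != 0 for der in derivs):
            return False                       # counterexample at x = 0
        nxt = d(derivs[-1])
        if nxt == 0 or groebner(derivs).contains(nxt):
            return True                        # found the "cycle"
        derivs.append(nxt)
    return None                                # gave up: inconclusive

# sin(x)^2 + cos(x)^2 - 1 should be identically zero
print(prove_zero(s**2 + c**2 - 1, {s: 0, c: 1}))    # True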
OK, that was super abstract, so how about an example: Let’s define a new function using an IVP:
\[\text{ivp}(g,0,2*g')(g',1,-2*g)\]which should appear similar to the encoding for sin because it encodes sin(2x). Let’s prove the double angle formula sin(2x) = 2*sin(x)*cos(x). Encoding this into standard polynomial form gives the predicate g - 2*s*c = 0.
We can algorithmically take derivatives (reminder that g and g' are used to look up their corresponding tuples in the IVP definition):
\[\begin{aligned} f^{(0)} &= g - 2*s*c \\ f^{(1)} &= 2*g' - 2*(c^2 - s^2) \\ f^{(2)} &= -4*g - 2*(2*c*(-1*s) - 2*s*c) \end{aligned}\]From here we can perform the Ideal membership test with some sympy code.
from sympy import groebner, expand, symbols
from sympy.abc import g,s,c
g_ = symbols("g_")  # stands in for g' from the IVP
# copy over polynomials
f0 = g - 2*s*c
f1 = 2*g_ - 2*(c**2 - s**2)
f2 = -4*g - 2*(2*c*(-s) - 2*s*c)
# Setup Ideal Membership test
gb = groebner([f0, f1])
assert gb.contains(f2) # Success!
The groebner function computes a special set of polynomials called a Gröbner Basis. These basis polynomials have a deep relationship with f0 and f1 and essentially give us a robust way to perform the Ideal membership test. Because of the automation, the mechanics can be a bit opaque. Fortunately, we can show how f2 was factored as follows:
# show the coefficients for factoring
Q, r = gb.reduce(f2)
# Q = [0, 4, 0, 0]
# r = 0
assert gb[1] == -1 * f0
assert f2 == Q[1] * gb[1]
assert f2 == -4 * f0
We can check manually as well:
\[\begin{aligned} f^{(2)} &= -4*g - 2*(2*c*(-s) - 2*s*c) \\ &= -4*g - 2*(-2*c*s - 2*s*c) \\ &= -4*g - 2*(-4*c*s) \\ &= -4*g + 4*(2*c*s) \\ &= -4*\left(g - 2*c*s\right) \\ &= -4*f \end{aligned}\]Although we ignored the point evaluations here, all we needed was to compute some derivatives and use standard computer algebra routines to automatically prove identities of trig functions.
There are some downsides to this method, namely that the groebner step runs in doubly exponential time in the number of variables used. The gb.contains check is quite expensive as well. That is to say, none of this is particularly “fast”.
So how does this all fit into my research?
Currently my collaborators and I are exploring extensions to this polynomial core, including function composition and division. Surprise, surprise, these are both fraught with unsoundness. They enable non-differentiable functions (like abs(x)) to be defined from differentiable pieces.
We’re also developing applications like generating fast floating-point approximations of IVPs. Hopefully compiler optimizations to come!!