Programming with Initial Value Problems
I like compilers. I like how they whittle away inefficiencies into a tight core of NEON instructions. There’s something satisfying about opening up Compiler Explorer and seeing your code snippet turned into unfathomably fast vectorized code.
I also really enjoy simulations. Seriously, just give me some popcorn and a fluid simulation and I’ll be content. I’ve recently enjoyed the series of videos: Simulating Fluids, then Rendering Fluids, and finally Planetary Fluid Sim.
In graduate school, I’ve been fortunate enough to work on blending those two interests in rich and surprising ways. I’m not literally working on simulation code, but the same fundamental mathematics pops up in unexpected places.
Typically the “language” of simulations is that of differential equations.
For instance, math libraries have lots of mini differential equations lurking around.
They are at the heart of not only functions like sin and exp but also less friendly ones like asin or ln.
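To make that concrete: exp, for example, is pinned down by the tiny initial value problem
\[y' = y, \qquad y(0) = 1\]
and sin and cos satisfy a similarly small coupled system. That’s exactly the kind of structure the rest of this post builds on.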
My collaborators and I are developing a language that includes a particular kind of differential equation called an initial value problem (IVP). Our goal is to demonstrate how compilers and other domain-specific languages can leverage IVPs to automatically optimize and reason about user-defined transcendental functions.
In this post, I’m gonna share some of the fundamentals that we’re building on. I recommend checking out my previous post for some additional context.
Polynomial IVPs
First we need to build our language using a fundamental structure: the polynomial. We’ll show later how limiting ourselves to polynomials and IVPs gives us access to powerful algorithms. We can define polynomials syntactically with the following grammar:
\[e ::= c ~|~ x ~|~ e + e ~|~ e * e\]which defines c to represent rational constants, x to represent variables, and then addition and multiplication recursively.
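To make that grammar concrete, here’s a minimal sketch of how it might look as a plain Python AST. The class names and representation are my own illustration, not anything from our actual implementation:
from dataclasses import dataclass
from fractions import Fraction

# One class per production in e ::= c | x | e + e | e * e
@dataclass(frozen=True)
class Const:            # rational constant c
    value: Fraction

@dataclass(frozen=True)
class Var:              # variable x
    name: str

@dataclass(frozen=True)
class Add:              # e + e
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Mul:              # e * e
    left: "Expr"
    right: "Expr"

Expr = Const | Var | Add | Mul   # requires Python 3.10+

# Example: the polynomial 2*x + x*x
example = Add(Mul(Const(Fraction(2)), Var("x")), Mul(Var("x"), Var("x")))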
Here we add in our special ingredient:
the initial value problem.
The idea is to include a new expression that
represents a function defined by an IVP.
Fortunately, we only need two ingredients:
the value of the function at zero and an expression for the derivative.
It looks like this:
\[\text{ivp}\left(y, c, e_d\right)^+\]where y is a newly introduced variable and c is a number representing the value of the function at zero. This is often called the initial condition. The subexpression e_d is the derivative of the function we want to define.
The + syntax means “one or more of”. In this case, we mean multiple tuples and not the entire IVP statement.
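Continuing the illustrative AST from above, the IVP expression might be represented as a list of (variable, initial value, derivative expression) tuples. Again, this is just my sketch of one possible encoding:
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Ivp:
    # One or more (y, c, e_d) tuples: the bound variable, its value at zero,
    # and an expression for its derivative. The derivative expression may
    # refer to any of the variables bound in this same Ivp node.
    defs: tuple[tuple[str, Fraction, "Expr"], ...]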
This definition is a totally super clear and obvious way to define functions, right?
Just kidding!
Let’s break down this new expression with an example.
How about we encode functions we know and love, like sin?
We have three pieces we need to fill out.
The first placeholder is the variable y. Our function is sin so let’s bind the variable s. The second is the value of sin at 0, which is 0. Third, we need the derivative of sin, which is cos.
Here’s the funny part about cos: we don’t have it, but we can encode it simultaneously. This is where the + syntax comes in handy. It allows us to define multiple functions at once. We just need a variable, c, the value cos(0) = 1, and the derivative -sin.
Finally, we encode sin as:
\[\text{ivp}\left(s, 0, c\right)\left(c, 1, -s\right)\]notice how we used -s as the derivative of cos. We don’t need any external functions so the entire IVP is self-contained.
Incredibly, this one-line definition is all we need to completely define sin and cos. That is the expressive power of IVPs.
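If you want to convince yourself of that claim numerically, you can integrate the coupled system straight out of the IVP, s' = c and c' = -s starting from (0, 1), and compare against the library functions. Here’s a quick sanity check in plain Python (a classic RK4 step, nothing from the actual language or tooling):
import math

def deriv(s, c):
    # The derivative tuples from the IVP: s' = c, c' = -s
    return c, -s

def rk4_step(s, c, h):
    # One fourth-order Runge-Kutta step for the coupled system
    k1s, k1c = deriv(s, c)
    k2s, k2c = deriv(s + h/2*k1s, c + h/2*k1c)
    k3s, k3c = deriv(s + h/2*k2s, c + h/2*k2c)
    k4s, k4c = deriv(s + h*k3s, c + h*k3c)
    return (s + h/6*(k1s + 2*k2s + 2*k3s + k4s),
            c + h/6*(k1c + 2*k2c + 2*k3c + k4c))

# Initial conditions from the IVP: sin(0) = 0, cos(0) = 1
s, c, h = 0.0, 1.0, 1e-3
for _ in range(1000):            # integrate out to x = 1
    s, c = rk4_step(s, c, h)

assert abs(s - math.sin(1.0)) < 1e-9
assert abs(c - math.cos(1.0)) < 1e-9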
Automatic Theorem Proving with Polynomial IVPs
As much fun as it is just building trees, we want to actually do something with them. In my previous post I talked about proving mathematical identities using ODEs. We can apply those same ideas to the language we just defined in this post.
The rough idea is this:
Imagine we want to prove that sin(x)^2 + cos(x)^2 = 1. We can do so by first encoding sin and cos as IVPs as above and then checking two additional things.
First, that the left- and right-hand sides evaluate to the same value at their initial condition.
As a technical note, we’re using s and c below in a different way from above. Above, they are bound variables inside the IVP expressions. Below, they are shorthand for the full IVP definition. Roughly speaking, read c below as “get me the corresponding c tuple from the IVP definition above”.
Second, that the derivatives form the same “cycle”.
\[\begin{aligned} &\partial_x \left(s^2 + c^2 \right) \\ &= \partial_x(s^2) + \partial_x(c^2) \\ &= 2*s*c + 2*c*(-1*s) \\ &= 2*(s*c - c*s) \\ &= 2*(0) \\ &= 0 \\ \end{aligned}\]and d/dx(1) = 0. In this case the “cycle” is that both the left and right sides have literally the same derivative, 0.
We strategically defined IVPs to work nicely with this procedure.
Any expression built using the grammar e can work with this equality test.
However, this is still manual and it turns out that we can do better. When we restrict ourselves to expressions built using e, we can automatically prove equality.
In particular, the “find the same cycle” problem reduces to a polynomial division. This is described in great detail in this paper. I’ll give the gist here.
Imagine we have two polynomials g, h and we want to prove g(x) = h(x). First, we let f(x) be g(x) - h(x) and instead prove f(x) = 0.
This is just a more standard representation of the equality.
Next, we want to find some cycle in the derivatives of f. In particular, we want to find another polynomial P such that the n-th derivative of f is a polynomial in the lower-order derivatives. In math:
\[f^{(n)} = P\left(f^{(0)}, f^{(1)}, \ldots, f^{(n-1)}\right)\]where the superscript n is terse notation for the n-th derivative of f.
We can find P using a technique from algebraic geometry called an Ideal membership test. This, in conjunction with n point evaluations:
\[f^{(0)}(0) = f^{(1)}(0) = \cdots = f^{(n-1)}(0) = 0\]will, amazingly, either find a P or find a counterexample to equality.
However, it’s unclear up front how many derivatives we need to take to find P or a counterexample.
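To make the whole procedure a bit more tangible, here’s a rough sketch of that loop in sympy. The derivative rules, helper names, and the max_derivatives cutoff are all my own stand-ins; the real algorithm (and its termination story) is in the paper linked above:
from sympy import symbols, groebner, expand

s, c = symbols("s c")
rules = {s: c, c: -s}            # derivative of each IVP-bound variable

def d(poly):
    # Differentiate a polynomial in the IVP variables via the chain rule
    return expand(sum(poly.diff(v) * dv for v, dv in rules.items()))

def prove_zero(f, init, max_derivatives=10):
    # Try to show f = 0 by finding a derivative that lies in the ideal
    # generated by the earlier derivatives, plus point checks at x = 0.
    derivs = [expand(f)]
    for _ in range(max_derivatives):
        if any(der.subs(init) != 0 for der in derivs):
            return False                       # counterexample at x = 0
        nxt = d(derivs[-1])
        if nxt == 0 or groebner(derivs).contains(nxt):
            return True                        # found the "cycle"
        derivs.append(nxt)
    return None                                # gave up: inconclusive

# sin(x)^2 + cos(x)^2 - 1 should be identically zero
print(prove_zero(s**2 + c**2 - 1, {s: 0, c: 1}))    # True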
OK, that was super abstract, so how about an example: Let’s define a new function using an IVP:
\[\text{ivp}(g,0,2*g')(g',1,-2*g)\]which should appear similar to the encoding for sin because it encodes sin(2x). Let’s prove the double angle formula sin(2x) = 2*sin(x)*cos(x). Encoding this into standard polynomial form gives the predicate g - 2*s*c = 0.
We can algorithmically take derivatives (reminder that g and g' are used to look up their corresponding tuples in the IVP definition):
\[\begin{aligned} f^{(0)} &= g - 2*s*c \\ f^{(1)} &= 2*g' - 2*(c^2 - s^2) \\ f^{(2)} &= -4*g - 2*(2*c*(-1*s) - 2*s*c) \end{aligned}\]From here we can perform the Ideal membership test with some sympy code.
from sympy import groebner, expand, symbols
from sympy.abc import g,s,c
g_ = symbols("g_")  # stands in for g' from the IVP
# copy over polynomials
f0 = g - 2*s*c
f1 = 2*g_ - 2*(c**2 - s**2)
f2 = -4*g - 2*(2*c*(-s) - 2*s*c)
# Setup Ideal Membership test
gb = groebner([f0, f1])
assert gb.contains(f2) # Success!
The groebner function computes a special set of polynomials called a Gröbner Basis. These basis polynomials have a deep relationship with f0 and f1 and essentially give us a robust way to perform the Ideal membership test. Because of the automation, the mechanics can be a bit opaque. Fortunately, we can show how f2 was factored as follows:
# show the coefficients for factoring
Q, r = gb.reduce(f2)
# Q = [0, 4, 0, 0]
# r = 0
assert gb[1] == -1 * f0
assert f2 == Q[1] * gb[1]
assert f2 == -4 * f0
We can check manually as well:
\[\begin{aligned} f^{(2)} &= -4*g - 2*(2*c*(-s) - 2*s*c) \\ &= -4*g - 2*(-2*c*s - 2*s*c) \\ &= -4*g - 2*(-4*c*s) \\ &= -4*g + 4*(2*c*s) \\ &= -4*\left(g - 2*c*s\right) \\ &= -4*f \end{aligned}\]Although we ignored the point evaluations here, all we needed was to compute some derivatives and use standard computer algebra routines to automatically prove identities of trig functions.
There are some downsides to this method, namely that the groebner step runs in doubly exponential time in the number of variables used. The gb.contains check is quite expensive as well. That is to say, none of this is particularly “fast”.
So how does this all fit into my research?
Currently my collaborators and I are exploring extensions to this polynomial core, including function composition and division. Surprise, surprise, these are both fraught with unsoundness. They enable non-differentiable functions (like abs(x)) to be defined from differentiable pieces.
We’re also developing applications like generating fast floating-point approximations of IVPs. Hopefully compiler optimizations to come!!