Algebraic Semantics for Machine Knitting

As programming languages researchers, we’re entitled to a certain level of mathematical rigor behind the languages we write and analyze. Programming languages have semantics, which are definitions of what statements in the language mean. We can use those semantics to do all sorts of useful things, like error checking, compiling for efficiency, code transformation, and so on.

This blog post is about a programming domain that doesn’t yet enjoy the same level of rigor in its semantics: machine knitting. People write programs to control massive arrays of needles that manipulate yarn into useful 3D objects. In this blog post, I’ll run through the process of finding “the right” semantics for machine knitting, touching on why we want semantics, connections to traditional programming languages, and what we might use these semantics for in the future. In our search, there are a surprising number of guest appearances by fields of study outside of programming languages: algebraic topology, group theory, knot theory, category theory, and even quantum computing!

I’ll motivate semantics with a toy problem: can two given statements commute with each other? Here’s an example where they can:

x = a + b;
y = c + d;

By “commute”, I mean that if I swapped the order of the lines, the program would do the same thing. The commuting problem is important: a compiler might move around the order of instructions to optimize memory usage by batching related instructions together, an analyzer might want to see if two programs are equivalent, and we might also want to know whether we can run two instructions in parallel. In the above example, we know that the statements commute because of the implied semantics of the statements – they don’t have anything in common, so they shouldn’t affect each other!

Here are some more examples. These statements don’t commute because there’s a data dependency:

x = y + 4;
y = c + d;

If we swapped those two statements, the statement assigning x’s value would use a potentially different value for y.
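This data-dependency check is easy to mechanize. Here's a toy sketch in Python (the statement representation is my own invention, just for illustration) using Bernstein's conditions: two statements commute if neither one writes a variable that the other reads or writes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stmt:
    writes: frozenset  # variables this statement assigns
    reads: frozenset   # variables this statement uses

def commute(a: Stmt, b: Stmt) -> bool:
    """Bernstein's conditions: no write of one statement touches
    a variable the other statement reads or writes."""
    return (a.writes.isdisjoint(b.reads | b.writes)
            and b.writes.isdisjoint(a.reads))

# x = a + b;  y = c + d;  -> disjoint variables, so they commute
s1 = Stmt(frozenset({"x"}), frozenset({"a", "b"}))
s2 = Stmt(frozenset({"y"}), frozenset({"c", "d"}))

# x = y + 4;  y = c + d;  -> data dependency on y, so they don't
s3 = Stmt(frozenset({"x"}), frozenset({"y"}))
s4 = Stmt(frozenset({"y"}), frozenset({"c", "d"}))
```

Of course, this check is only as good as the read/write sets it's given, and hidden mutable state and pointer aliasing are exactly the situations where those sets are hard to compute.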

Non-commuting statements might happen for reasons other than direct data dependencies. Imagine we’re in C – do these function calls commute?

x = f(a, b);
y = f(c, d);

We can’t be sure that they do – perhaps the f function mutates some counter variable every time it’s called.

Apart from explicit mutable state, there’s one more C-like thing that commonly prevents commuting:

*x = *x + 3;
*y = *y * 2;

If the x and y pointers are the same, these statements don’t commute.

It’s notable that in a pure language, there’s no hidden mutable state or pointer aliasing. Haskell is a popular language that comes very close to purity: side effects like printing to the console and reading/writing files are quarantined inside the IO monad. In Haskell, these calls do commute, as long as there’s no IO monad malarkey:

x <- f a b
y <- f c d

The reason that Haskell’s commuting result is different from C’s is that its semantics are different – any time we want to prove a property of a language’s behavior, we’ll have to turn to that language’s semantics. We know that proving properties of machine knitting programs could be a powerful tool: a 2024 paper showed massive speedups from an optimizing compiler that applies only a few program transformations that are intuitively correct. To scale that result and prove the compiler correct, we’ll need to develop semantics for machine knitting.

Machine knitting background

This introduction is only for the details of machine knitting that are relevant to this blog post. Unfortunately, the mechanisms behind a machine that automatically creates useful objects from spools of yarn are pretty complicated! If you’d like to read more about machine knitting, this paper has a good introduction.

Knitting machines have an array of hundreds of needles, called a bed. This is analogous to the memory of a traditional computer – registers hold values, and needles hold loops of yarn. There are also carrier strands that move throughout the machine, winding through, around, and past the loops of needles to create stitches, which I’ll later compare to basic operations in traditional programming languages. Here’s how a basic stitch is formed in knitting:

A cyan carrier strand to the left of a loop. The carrier strand has been pulled through the loop, creating a new cyan loop. A box has been drawn around part of the previous image; the stitch takes a cyan carrier strand and gold loop in and returns a cyan loop and a cyan carrier strand.

A carrier strand (in cyan) is pulled through a loop (in gold) to create a new loop. Note how the bottom gold loop is held in place as long as the top cyan loop and carrier strand are held up. After the stitch, the knitting machine drops the bottom loop, but the bottom loop stays connected and stable. This is like a value falling out of scope, but since some in-scope value points to it, it’s not garbage-collected (gravity is the garbage collector of machine knitting!).

There are many variants of stitches, but they all follow the same input-output pattern: loops and carrier strands in, loops and carrier strands out. In that way, they’re kind of like basic operations we’re used to in computer science, like addition or bitwise AND: values in, values out. The third image above draws a box around the stitch to show its inputs and outputs: carrier strand and loop in, loop and carrier strand out.

There’s one more technicality in machine knitting: in order to do a stitch involving some values, those values must all be adjacent to one another. In the example below, we can’t immediately knit the cyan carrier on the far left with the gold loop on the far right – instead, we have to move the values next to each other before creating a stitch.

A cyan carrier strand, red loop, and gold loop The cyan carrier strand crosses over the red loop, and makes a stitch with the gold loop

This is also true for traditional computing – to compute the boolean AND of two bits, we need to connect those two bits to an AND gate, getting them right next to each other. Computer architects use complicated routing mechanisms like multiplexers to do this; when we code in traditional programming languages, we don’t have to worry about those constraints because the computer architects have generously handled it for us.

The languages used for machine knitting don’t include many of the traditional programming language features that make code hard to analyze: no if statements (branching), no for loops, and no functions. Machine knitting code is just a series of operations that perform stitches and move carrier strands and loops around. This should make analyzing machine knitting easier: there’s far less complexity than in traditional programming languages. In fact, for our specific question of whether two operations commute, the problem seems almost trivial: as in pure functional programming languages, there’s no global state or aliasing in machine knitting. However, there’s something tricky hiding in machine knitting that isn’t a worry in traditional computing contexts: since knitting is done in three dimensions, when strands cross, one goes over and the other goes under. This can cause operations to snag on each other, even if no strand directly connects them. I’ll illustrate this with some diagrams.

Diagrams

Let’s start by carefully analyzing something core to traditional computing: combinatorial boolean circuits. Here’s a simple example myfunc that maps 3 bits to 3 bits.

fn myfunc(x1, x2, x3) {
    y1, y2 = dup(x2); // split the wire in two
    y3 = and(x3, x1);

    return y1, y2, y3;
}

It’s usually easier to reason about circuits when they’re drawn (forgive my very strange choices when drawing the diagram; all will be clear once we get to knitting):

Circuit diagram for the above code

I pose the same question: can the dup and and operations commute? Here, the answer is a definite yes: global state and aliasing aren’t present, and there’s no data dependency. Here is the same function with the operations commuted:

fn myfunc(x1, x2, x3) {
    y3 = and(x3, x1);
    y1, y2 = dup(x2);

    return y1, y2, y3;
}
Circuit diagram for the above code

Deciding whether two operations commute in a combinatorial circuit is easy – they can commute if and only if there’s no directed path connecting them, just like how two functions can commute in Haskell as long as there’s no data dependency between them and no IO trickery.
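This criterion is simple enough to code up directly. Here's a short Python sketch (the graph encoding is my own choice, just for illustration): represent the circuit as a directed graph from each operation to the operations that consume its outputs, then check for a directed path in either direction.

```python
def has_path(graph, src, dst):
    """Depth-first search for a directed path from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False

def can_commute(graph, a, b):
    """Two operations commute iff no directed path connects them."""
    return not has_path(graph, a, b) and not has_path(graph, b, a)

# myfunc: dup consumes x2, and consumes x1 and x3 -- no path either way.
circuit = {"dup": [], "and": []}

# A circuit where f's output feeds g: a data dependency.
dependent = {"f": ["g"], "g": []}
```

For combinatorial circuits this check is the whole story; the rest of this post is about why it stops being the whole story once strands can cross over and under each other.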

Knitting diagrams

Now, let’s try something similar for knitting. For simplicity, we’ll work in a made-up machine knitting language that hides some of the technicalities that aren’t important for this post. Our knitting function takes three strands as input, does two stitches, and returns three new strands. I’ve taken the liberty to simplify the diagrams by not drawing the “internals” of the stitches (now they’re just boxes) and I’m drawing both loops and carrier strands as single strands.

fn myknit(s1, s2, s3) {
    t1, t2 = stitch1(s2);
    cross t1 over s1;
    cross s3 over t2;
    cross s1 over s3;
    t3 = stitch2(s3, s1);
    cross t2 over t3;

    return t1, t2, t3;
}
Diagram for the above knitting code. The stitch1 and stitch2 boxes aren't directly connected by a
  strand, but due to the way the strands have crossed, there is no way to move stitch2 below stitch1

Well, the diagram looks almost the same as the circuit diagram, but now every time two strands cross, we have to specify which one goes on top of the other. This is extra noticeable in the code: all crossings between strands are listed and annotated. We’re careful to record and display these over/under crossings in machine knitting because they’re a feature of working in a physical medium: the way strands cross affects how a knitted object looks and feels, and it can drastically change the shape of the final object!

These over/under crossings are what make the commuting problem difficult for machine knitting. In the above example, there’s no data dependency (i.e., no connecting strand) between the two operations. However, because of how their strands cross, the operations “snag” on each other, so they can’t commute. It’s easy enough for humans to look at the above diagram and decide whether two operations snag on each other, but how can we write an algorithm that performs this analysis automatically? We need some way of formalizing these ideas so a computer can reason about them… ah yes, those semantics we mentioned earlier.

Now is a good time to mention that there actually ARE semantics for machine knitting: a 2023 paper set up rigorous mathematical semantics for all of machine knitting. However, these semantics are defined in terms of tangles from knot theory. This is a natural way to define knitted objects – they really are a tangle of strands in 3D space! However, the equivalences of knot theory are defined by continuous deformations – two knots are equivalent whenever we can stretch, move, expand, and contract one knot into the other without tearing strands. Since the goal is to write computer programs that analyze machine knitting, the proposed semantics aren’t directly useful to computers. Computers don’t have any notion of what a “continuous deformation” is, and they’re particularly bad at doing anything involving continuous quantities. The semantics are useful for humans to do basic hand-written proofs and are a great starting point, but we’d like to extend them so we can use computers to perform automatic analysis.

Aside: quantum computing

It should be noted that if we actually constructed the combinatorial circuit from earlier by plugging wires into tangible logic gates, we would also have to make some decision for each crossing as to which wire goes on top. However, this is simply not a concern for computer scientists: it doesn’t matter which wire goes on top, because electrons don’t care whether they go above or below other electrons – it doesn’t change the resulting computation.

Models in quantum physics allow for particles that do remember how they pass over and under each other. This could have big effects in quantum computing, where a so-called topological quantum computer using these particles could be far more resistant to decoherence than conventional quantum computing. A team at Microsoft has recently published some experimental results in topological quantum computing, using a setup that braids quasiparticles together in 2+1 dimensions (two of space, one of time). The methods, setup, and goals are certainly different from machine knitting’s, but it’s quite satisfying to see the two seemingly unrelated topics of machine knitting and quantum computing bound by similar mathematical ideas.

Algebraicizing our topology

As I hinted earlier, we’d like to formalize the previous diagrams so we can study their properties, using something more computer-friendly than knot theory. Really, this is an exercise in algebraic topology – representing a topology (like our deformations of 3D space) with algebra, which is a lot easier to work with. The braid group is a great starter example of how mathematicians represent a topological object with algebra.

The braid group

For folks familiar with group theory, the braid group on \(n\) strands is

\[B_n = \langle \sigma_1, \ldots, \sigma_{n-1} \;|\; \sigma_i ; \sigma_{i+1} ; \sigma_i = \sigma_{i+1} ; \sigma_i ; \sigma_{i+1}, \quad \sigma_i ; \sigma_j = \sigma_j ; \sigma_i \text{ for } |i - j| \geq 2 \rangle,\]

and it represents the equivalence classes of \(n\) strands under ambient isotopy. For the folks confused by this deluge of notation, read on!

A group is a set of elements \(G = \{x_1, x_2, \ldots \}\) with

  1. Some way to combine or compose elements together, which I’ll represent using a semicolon. For any \(x, y \in G,\) we have \(x ; y \in G.\)
  2. Some identity element that represents “nothing”. We’ll call that element \(\text{id}\in G,\) and it has the property \(\text{id} ; x = x = x ; \text{id}.\)
  3. Some way to invert elements: for any \(x \in G,\) there’s some \(x^{-1} \in G\) where \(x;x^{-1} = \text{id} = x^{-1} ; x.\)

One group we’re all familiar with is the integers under addition: the composition of integers is addition, the identity is \(0,\) and inverting is negation.
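To ground the notation, here's that familiar example spelled out as a quick Python spot-check (a toy encoding of my own, just to make the three axioms concrete):

```python
# The integers under addition as a group.
compose = lambda x, y: x + y   # ';' is addition
identity = 0                   # 'id' is zero
inverse = lambda x: -x         # inversion is negation

# Spot-check the axioms on a few elements:
elems = [-3, 0, 7]
closure = all(isinstance(compose(x, y), int)
              for x in elems for y in elems)
identity_law = all(compose(identity, x) == x == compose(x, identity)
                   for x in elems)
inverse_law = all(compose(x, inverse(x)) == identity == compose(inverse(x), x)
                  for x in elems)
```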

For any natural number \(n\) (let’s say \(n=5\)), the braid group on \(n\) strands is one of these groups. The group has infinitely many elements, and every element is 5 strands, oriented bottom-to-top, passing over/under each other. It’s good to think of braids as nailed or glued down at either end; two braids are equivalent whenever one can be transformed into the other by shifting around strands, keeping the nailed-down ends fixed. Here are two examples of braids on 5 strands; I’ll call the left one \(x\) and the right one \(y\):

Braid x = sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1} in B_5 Braid y = sigma_2^{-1} ; sigma_1 ; sigma_3^{-1} ; sigma_4 in B_5

Next, we’ll define what composition, the identity, and inverses are:

Composition is vertical concatenation, read bottom to top. So \(x ; y\) is \(x\) pasted below \(y,\) with the strands connected in order:

Braid x = sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1} ; sigma_2^{-1} ; sigma_1 ; sigma_3^{-1} ; sigma_4 in B_5

The identity in the braid group is just \(n\) strands going straight:

5 vertical strands, none of them crossing

With the definition of composition being vertical concatenation, hopefully we can agree that pasting on the identity at either end doesn’t really change anything, so that braid is indeed the identity. Finally, inverses are mirror images about the horizontal line through the middle of the braid – below is \(x\) on the left and \(x^{-1}\) on the right:

Braid x = sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1} in B_5 Braid x^{-1} = sigma_3 ; sigma_1^{-1} ; sigma_2^{-1} ; sigma_1 in B_5

When we compose \(x ; x^{-1},\) we get a braid that untangles to be the identity:

Braid x = sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1} ; sigma_3 ; sigma_1^{-1} ; sigma_2^{-1} ; sigma_1 in B_5

This is all well and good, but we need some way to write braids down so computers can use them. We’ll do this by listing their crossings in order, from bottom to top. We’ll use \(\sigma_i\) to refer to the \(i\)th strand from the left crossing over the \((i+1)\)th strand, and \(\sigma_i^{-1}\) for under. Then we can write the braid \(x\) as \(x = \sigma_1^{-1} ; \sigma_2 ; \sigma_1 ; \sigma_3^{-1}.\)
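This notation maps directly onto a data structure. Below is a small Python sketch (the encoding is my own choice): a braid word is a list of signed integers, where \(i\) stands for \(\sigma_i\) and \(-i\) for \(\sigma_i^{-1}\); composition is concatenation, and inversion mirrors the word by reversing the order and flipping every crossing.

```python
# [-1, 2, 1, -3] encodes sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1}
def compose(u, v):
    """x ; y is x pasted below y: concatenate the crossing lists."""
    return u + v

def invert(word):
    """Mirror about the horizontal axis: reverse the order of the
    crossings and flip each one from over to under (and vice versa)."""
    return [-g for g in reversed(word)]

x = [-1, 2, 1, -3]
```

Running `invert(x)` gives `[3, -1, -2, 1]`, i.e., \(\sigma_3 ; \sigma_1^{-1} ; \sigma_2^{-1} ; \sigma_1\), matching the mirror-image diagram of \(x^{-1}\) above.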

There’s one more complication: there’s more than one way to write down braids. For example, we could write \(x\) as \(x = \sigma_1^{-1} ; \sigma_2 ; \sigma_3^{-1} ; \sigma_1\) instead – here’s the original \(x\) diagram on the left and the new one on the right:

Braid x = sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1} in B_5 Braid x = sigma_1^{-1} ; sigma_2 ; sigma_3^{-1} ; sigma_1 in B_5

This diagram of \(x\) and the original are equivalent – by shifting the crossings up/down, we transform one into the other. So the braid group has extra relations to account for this: \(\sigma_i ; \sigma_{i+1} ; \sigma_i = \sigma_{i+1} ; \sigma_i ; \sigma_{i+1}\) and \(\sigma_i ; \sigma_j = \sigma_j ; \sigma_i.\) Here are those relations drawn out, with the left and right sides being equal in each row:

A red strand passes over a cyan and green strand, then the cyan passes over the green Cyan passes over green, then red passes over green and cyan
Red over cyan, then green over purple Green over purple, then red over cyan

Physics note: the first equation is called the Yang-Baxter equation, and it appears in many more places than just the braid group!

It should be noted that the relation \(\sigma_i ; \sigma_i^{-1} = \text{id}\) is implicit in all groups:

A red strand passes over a cyan strand, then over the cyan strand from the other side The red and cyan strands don't cross
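This implicit relation is also easy to implement. Continuing with the idea of encoding a braid word as a list of signed integers (\(i\) for \(\sigma_i\), \(-i\) for \(\sigma_i^{-1}\); a representation of my own choosing), a stack-based scan cancels adjacent \(\sigma_i ; \sigma_i^{-1}\) pairs, including cancellations that cascade:

```python
def invert(word):
    """Mirror the braid: reverse the crossings and flip over/under."""
    return [-g for g in reversed(word)]

def free_reduce(word):
    """Cancel adjacent sigma_i ; sigma_i^{-1} pairs (and any
    cancellations that those cancellations expose)."""
    stack = []
    for g in word:
        if stack and stack[-1] == -g:
            stack.pop()          # sigma_i ; sigma_i^{-1} = id
        else:
            stack.append(g)
    return stack

x = [-1, 2, 1, -3]   # sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1}
```

Here `free_reduce(x + invert(x))` returns `[]`, the identity braid, as promised. Note that free reduction alone can't detect every equivalence, because the braid relations can shuffle crossings around to expose cancellations that aren't adjacent yet.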

Now, we should all be on the same page for the braid group: the notation

\[B_n = \langle \sigma_1, \ldots, \sigma_{n-1} \;|\; \sigma_i ; \sigma_{i+1} ; \sigma_i = \sigma_{i+1} ; \sigma_i ; \sigma_{i+1}, \quad \sigma_i ; \sigma_j = \sigma_j ; \sigma_i \text{ for } |i - j| \geq 2 \rangle\]

tells us how to write braids down (with \(\sigma_1, \ldots, \sigma_{n-1}\)) and which words mean the same thing.

The braid group is well-studied and comes with some powerful algorithms. Braid words can be put into a canonical form in polynomial time, meaning there are algorithms that we can run to efficiently tell whether two braids written down using \(\sigma_i\) are equivalent. This is a great sign for our goal of computable semantics for machine knitting – mathematicians have taken a topological object with a lot of the structure we want in machine knitting, boiled it down to something computer-friendly, and even authored some useful algorithms! However, the braid group can only represent the pieces of machine knitting programs without stitches. Stitches operate like functions on strands, with input and output strands. The number of inputs can differ from the number of outputs, changing the number of strands as we perform the stitch. The braid group is fixed at some \(n\) strands, so it can’t represent stitches. To get the complete picture of machine knitting, we’ll need to generalize our mathematical machinery to include the boxes that represent stitches.
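Full canonicalization is beyond the scope of this post, but two cheap invariants already give a useful necessary condition for braid equivalence, sketched below in Python (again encoding a braid word as a list of signed integers, \(i\) for \(\sigma_i\) and \(-i\) for its inverse; my own choice of representation): the underlying permutation (which slot each strand ends up in) and the sum of crossing signs (the writhe). Both are preserved by every relation of the braid group, so if two words disagree on either, the braids are definitely inequivalent.

```python
def permutation(word, n):
    """Track which of the n input strands ends up in each position
    (0-indexed): sigma_k swaps the strands in positions k-1 and k."""
    perm = list(range(n))
    for g in word:
        i = abs(g) - 1
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
    return perm

def writhe(word):
    """Sum of crossing signs: +1 for each over, -1 for each under."""
    return sum(1 if g > 0 else -1 for g in word)

# The two ways of writing x from earlier:
x_a = [-1, 2, 1, -3]    # sigma_1^{-1} ; sigma_2 ; sigma_1 ; sigma_3^{-1}
x_b = [-1, 2, -3, 1]    # sigma_1^{-1} ; sigma_2 ; sigma_3^{-1} ; sigma_1
```

The two words for \(x\) agree on both invariants, as they must. Agreement isn't sufficient, though, which is exactly why the full canonicalization algorithms matter.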

Monoidal categories and their diagrams

In programming languages research, category theory is best known for research in type theory. I’ve spent all my math allowance for this blog post explaining the braid group so I won’t get into the nitty-gritty of what categories are – in short, it’s an algebraic object, like a group, but more general: composition doesn’t always have to be defined, and inverses don’t always have to exist.

Of particular note are the ideas of internal logics and internal languages of categories: correspondences between a category, some logic, and some theoretical programming language model. The Curry-Howard correspondence describes the connection between logic and programming languages, and these ideas add category theory to the party.

One such correspondence lies between linear logic, where each value must be used exactly once, and symmetric monoidal categories. All categories have composition, which I’m interpreting as vertical concatenation like in the braid group; monoidal categories additionally have horizontal concatenation (like multiple values in scope, running two circuits side by side, or an array of needles holding loops). Symmetric monoidal categories can be drawn as boxes connected by strands where the over/under-ness of strand crossings isn’t recorded or drawn: exactly the diagrams we draw for combinatorial circuits! The symmetric monoidal category gives notation and axioms for us to rigorously study those circuits, just like the braid group gives notation and axioms for studying braids. Symmetric monoidal category is a great name because it’s closely related to the symmetric group.

Following the same idea, braided monoidal categories generalize symmetric monoidal categories by recording over/under crossings, and we draw our diagrams as such – just like our diagram for machine knitting! Back in 1991 when braided monoidal categories were still young, Joyal and Street showed that the axioms of braided monoidal categories correspond with the topology of the diagrams we use for them. Since our diagrams really do represent physical topological objects, this means that braided monoidal categories are perfect for machine knitting!

Actually using these semantics

Now that we’ve identified an algebraic structure for machine knitting, we’d like to use that formalism to perform useful analysis of machine knitting programs. One immediate goal is program equivalence: given two machine knitting programs, will they produce the same object up to topological equivalence? We can reduce those programs to their braided monoidal category representations and work with those. This is closely related to the braid group’s canonicalization I mentioned earlier – can we extend that to the braided monoidal category? I’ve developed an algorithm that does just that in polynomial time, but it’s too complicated to fit in a blog post. To borrow some language from Fermat, I’ve discovered a truly marvelous algorithm and proof of correctness, which this blog post’s margin is too narrow to contain. The algorithm works by using some new ideas to canonicalize the positions and order of stitches, and then uses the braid group canonicalization to canonicalize the crossings between stitches.

We could use the canonicalization algorithm to compile and optimize machine knitting programs. Normal forms are important to compilers because they can greatly simplify the language to be compiled – a canonical form builds on that by additionally providing a uniqueness guarantee. The axioms of braided monoidal categories lay out exactly all the program transformations we should consider. Finally, I also have an interest in developing a user-facing programming language for machine knitting that’s closer to the abstraction provided by category theory. The current machine knitting languages are closely tied to controlling knitting machines, so they require the user to specify which needles on the machine hold strands (like how assembly requires users to specify which registers hold values). On top of usability, a new programming language might also come with features for performance, fabrication reliability, or modularity of programs.

Big thanks to my advisors at UW, Gilbert Bernstein and Adriana Schulz, for being flexible as a first-year PhD student learns category theory and topology through the lens of machine knitting. This work is in part a collaboration with folks currently and previously at CMU, including Jenny Lin, Tom Price, Jim McCann, and Hannah Fechtner.

Further reading

  1. KODA, the optimizing knitting compiler
  2. Topological machine knitting semantics
  3. Microsoft’s experimental progress in topological quantum computing
  4. Wikipedia article for the braid group
  5. nLab’s page on internal logics
  6. Joyal and Street’s paper connecting category theory diagrams and topology