At a high level, compilers, optimizers, and synthesizers are all trying to do the same thing: transform programs into better, equivalent programs. We see this pattern all over the place: in vectorizing computations for digital signal processing, performing operator fusion for machine learning kernels, and automating numerical analysis to improve floating point accuracy. This pattern even shows up in CAD when we want to reverse engineer some low-level, geometric description of an object back up to a high-level, editable design (above).

What if, rather than developing ad hoc, application-specific search strategies for each problem, we could adopt a general, principled approach for searching through the space of programs equivalent to our input to find the “best” simplified candidate?

It may seem that such searches will inevitably suffer exponential blow up. However, if we use a rewriting technique known as *Equality Saturation* (EqSat) coupled with recent insights for efficiently implementing *equality graphs* (e-graphs, the core data structure driving EqSat), we can develop a generic, reusable library for building compilers, optimizers, and synthesizers across many domains.

This post introduces *egg*, our fast and flexible e-graph library implemented in Rust. egg lets you bring the power of EqSat to your problem domain. egg is open-source, packaged, and documented, and many research projects across diverse domains have used its speed and flexibility to achieve state-of-the-art results:

- Szalinski shrinks 3D CAD programs to make them more editable. [PLDI 2020]
- Diospyros automatically vectorizes digital signal processing code. [ASPLOS 2021]
- Tensat optimizes deep learning compute graphs both better and faster (up to 50x) than the state of the art. [MLSys 2021]
- Herbie improves the accuracy of floating point expressions. The egg-herbie library made parts of Herbie over 3000x faster! [PLDI 2015]
- SPORES optimizes linear algebra expressions up to 5x better than state-of-the-art. [VLDB 2020]

We’ll begin with background on term rewriting and some its problems, and then we’ll introduce e-graphs and equality saturation and show how they address those issues. Finally, we’ll touch on some of the novel techniques that egg brings to the table to make EqSat faster and more flexible than before. More information about anything in this post can be found on the egg website or our POPL 2021 Distinguished Paper.

# Term Rewriting

*Term rewriting* is a common technique for working with terms, whether in optimizers, synthesizers, or verifiers. You start with a term *t* and a set of rewrites, each of which take the form *l → r*. For each rewrite, you try to match the pattern *l* against the term *t*, generating a substitution *σ* at some subterm, and then you apply that substitution to the right-hand pattern *r* to generate a new subterm that replaces the matched one.

For example, consider the term 42 * (7 + 7) and the rewrite *x + x → 2 * x*:

- The left-hand side would match at the subterm 7 + 7, generating the substitution
*σ*= {*x*: 7}. - Applying the substitution gives
*r[σ] =*2 * 7. (*r[σ]*means apply the substitution*σ*to*r*) - Replacing the matched subterm 7 + 7 with
*r[σ] =*2 * 7 gives the result 42 * (2 * 7).

One issue with term rewriting (and other programming language techniques that involve manipulating terms) is the matter of choice. Rewriting is *destructive*, meaning that once you do a rewrite, the initial term is gone. If you have many rewrites to choose from (or many ways to apply a single rewrite), choosing one is a fork in the road: going back and choosing another way is expensive or impossible, so you’re committed to the path you’ve chosen.

Picking the right rewrite at the right time is difficult: a good choice locally can be a bad choice globally, or after later rewrites. As an example, should a C optimizer rewrite a * 2 to a << 1? Normally yes; but what about (a * 2) / 2? Here, (a << 1) / 2 is good but makes the even better expression “a” harder to find. Even rewrites which seem to “go backwards” might enable some critical rewrite much later.

There are some techniques to mitigate this (backtracking and value-numbering), but fundamentally, the only way around it is to take *all choices at the same time*. But since traditional rewriting is destructive, applying *n* rewrites to a term results in *n* terms, and then applying *n* rewrites to each of those *n* terms yields *n ^{2}* more terms, and so on. If you want to explore

*m*“layers” deep in this tree, you’ll have to store

*n*terms, which is prohibitively expensive.

^{m}# E-graphs and Equality Saturation

E-graphs (Gregory Nelson’s PhD Thesis, 1980) solve this representation problem. When rewriting a bunch of terms, the results are often structurally similar. So e-graphs store large sets of structurally similar expressions compactly. Using e-graphs, it’s possible to apply many rewrites simultaneously without multiplicative growth in the space needed. Equality saturation (Tate et. al., 2009) is a technique to do this kind of rewriting for program optimization.

So, what’s an e-graph? An *e-graph* is a data structure to compactly store an equivalence relation (really a congruence relation) over terms. Some e-graphs are shown below. An e-graph is a set of equivalence classes (*e-classes*, dotted boxes), each of which contains equivalent *e-nodes* (solid boxes). An e-node is an operator with children, but instead of children being other operators or values, the children are e-classes; arrows point from parent e-node to child e-class.

Even small e-graphs can represent a large number of expressions, exponential in the number of e-nodes. This compactness is what makes e-graphs a compelling data structure. We define what it means to *represent* (or *contain*) a term as follows:

- An e-graph represents a term if any of its e-classes do.
- An e-class represents a term if any of its e-nodes do.
- An e-node f(n1, n2, …) represents a term f(t1, t2, …) if e-node ni represents term ti.

For example, in the second e-graph above, the division e-node is the only one in its e-class, but the e-class represents two terms: *(a * 2) / 2 *and *(a << 1) / 2.*

Since we can represent many terms in a single e-class, we can now do term “rewriting” in a way that* never forgets* old versions of terms.

- A procedure called
*e-matching*(matching up to equivalence) searches an e-graph for represented terms that match the pattern. For a rewrite*l → r*, e-matching*l*yields pairs of (*c*,*σ*), where*c*is the matching e-class and*σ*is the substitution. For each pair (*c*,*σ*), do the following: - Add
*r[σ]*to the matching e-graph. The substitution here*σ*maps variables from the pattern*l*to e-classes, not sub-terms. So*r[σ]*is actually an entire e-class of equivalent terms, rather than a single term! - Union (or combine) the e-class
*c*with the e-class containing*r[σ]*. Combined with step 2, this has the effect of adding*r[σ]*to e-class*c*.

Let’s put it all together with an example referring to the four e-graphs in the image above.

- The initial e-graph represents the term
*(a * 2) / 2*. Since each e-class only has one e-node, the e-graph is basically an abstract syntax tree with sharing (the 2 is not duplicated). - Applying the rewrite
*x * 2 → x << 1*has recorded the fact that*a * 2 = a << 1*without forgetting about*a * 2*. Note how the newly added*a << 1*refers to the existing “*a*” e-node, and the “<<” e-node has been added into the same e-class as the equivalent “*” e-node where the pattern*x * 2*matched. - The rewrite
*(x * y) / z → x * (y / z)*associates division with multiplication. This rewrite is critical to discovering the cancellation of 2s that we are looking for, and it still works despite the fact that we applied the “wrong” rewrite previously. - Applying rewrites
*x / x → 1*and*1 * x → x*doesn’t add any new e-nodes, since all the e-nodes were already present in the e-graph.So these rewrites only union e-classes, meaning that the e-graph actually got*smaller*from applying these rewrites, even though it now represents more terms. In fact, observe that the top-right “*” e-node’s left child is*itself*; this cycle means the e-class represents*infinite*(!) terms, including*a*,*a * 1*,*a * 1 * 1*, and so on, as well as*a * (2 / 2), a * (2 / 2) * 1*, etc.

Now that we can apply rewrites non-destructively, we can finally perform *equality saturation*, which basically just applies all of the rewrites over and over.

Once a timeout is reached or the e-graph saturates (meaning no rewrites are adding new information), we’re left with an e-graph that represents a massive (even infinite) number of terms. All that remains is to answer our original question. If we wanted to know if two terms are equivalent, we can just check to see if they are represented in the same e-class. If we are doing optimization, we use a procedure called *extraction* to choose a “best” represented term that is equivalent to our input term. Extraction takes in an e-class and a cost function, and it returns the minimum-cost term represented in that e-class. If the cost function is relatively simple (e.g. AST size), then a greedy, bottom-up traversal is guaranteed to yield the minimum-cost term. More complex cost functions require more complex extraction; see our paper or the egg docs for more.

# egg

Both e-graphs and equality saturation were discovered well before egg came onto the scene. So what is egg bringing to the table?

egg is the first general-purpose, reusable implementation of e-graphs and equality saturation. This let us focus on ease of use and optimization, knowing that any benefits will be seen across many use cases as opposed to a single, ad hoc implementation. Features like a Runner API that abstracts away the details of equality saturation, dynamic and conditional rewrite rules, custom extraction procedures, and more are natively supported in egg.

In addition to implementing equality saturation, egg extends it with two novel techniques, amortized invariant maintenance and e-class analyses. The former makes equality saturation much faster by loosening the e-graph’s data structure invariants and only enforcing them at the right time; the latter makes it more flexible by allowing users to bring program analyses like constant propagation or nullability into e-graph land. You can read more about both techniques in our POPL 2021 paper.

If you’re interested in using egg in your project, check out the egg website or tutorial.

**Bio:*** Max Willsey is a PhD student (and soon-to-be postdoc) at the University of Washington working with Zachary Tatlock and Luis Ceze.*

**Disclaimer:** *These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.*