Software for physics, geometry, finance, and probability often calls library functions like `sin`, `cos`, `exp`, and `log`. These library functions need to be fast and accurate for science, engineering, and finance to work well.

Unsurprisingly, lots of algorithms have been invented over the years for implementing these math functions, and new algorithms are still being discovered. For example, the RLIBM project—previously featured on this blog—has developed new tools for deriving approximating polynomials for single-precision functions.

The performance benefits of platform-, language-, and application-specific implementations are large. In some previous work, we saw speedups as large as 5× on microbenchmarks, and 10% on full applications. For this reason, hardware vendors (like NVIDIA and Intel), compiler projects (like GCC and LLVM), and applications (like CERN’s Geant4) all use custom math libraries specialized for their hardware, language, and application domain.

But writing a platform-, language-, or application-specific math library is error-prone: opaque constants, low-level bit-twiddling, and tricky algebraic transformations abound. Get any wrong, and the resulting function silently returns the wrong result—no crashes or exceptions help narrow down the issue. Even just spot-checking outputs requires numerical sophistication and access to a known-good library.

So we created a DSL, MegaLibm, to make writing math function implementations simpler and safer. It’s all thanks to the classic tools of programming language theory: languages, type systems, and compilation.

# Symbolic Expression Types

Typically, math library functions are written in a low-level language like C or raw assembler to maximize performance. But general-purpose languages like these don’t help developers avoid semantic errors in mathematical code.

How many times has this happened to you: you’re writing some math computation, and you accidentally write a plus sign instead of a minus sign, or put in the wrong constant? Your programming language can’t catch these bugs for you because its types, like `float` and `double`, don’t distinguish between `x` and `-x` or between different constants.

But numerical code could really benefit from compiler assistance with precisely this task, especially since we expect the user to test out several different implementations of some mathematical expression and compare them for accuracy and performance. Numerical errors really throw a wrench in that process (through misleading performance or accuracy numbers) and MegaLibm therefore aims to prevent them.

It does this through a more stringent type system that tracks the real-valued expression that MegaLibm code is implementing. The main type is `Impl<e>`, whose members are implementations of the symbolic mathematical expression `e`. Operators like negation have signatures like `Impl<e> -> Impl<-e>`, where the input and output have different types and can be distinguished. Two expressions that have the same real-number semantics, like `(1 - x)*(1 + x)` and `1 - x*x`, have the same type.

To make this more concrete, here’s an example MegaLibm program:

```
from megalibm import x
e = 1 - x*x/2
e.typecheck() # returns Impl<1 - x*x/2, [-inf, inf]>
```

Note that the syntax here looks like Python. It actually is Python; MegaLibm is a DSL embedded in Python, making heavy use of operator overloading. Each operation builds up a MegaLibm AST, starting from the MegaLibm variable `x`, and methods like `typecheck` operate over that AST. Also note that `Impl` actually has a second argument, which is the input range for `x`. The default range is all real numbers, `[-inf, inf]`.
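To make the embedding technique concrete, here is a minimal sketch (hypothetical classes, not MegaLibm’s actual internals) of how Python operator overloading can turn ordinary-looking arithmetic into a symbolic AST:

```python
# A minimal sketch of an operator-overloading embedded DSL. Each arithmetic
# operation builds an AST node instead of computing a number.
class Expr:
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __mul__(self, other):
        return Expr("*", self, to_expr(other))

    def __truediv__(self, other):
        return Expr("/", self, to_expr(other))

    def __rsub__(self, other):
        # Called for `1 - x*x/2`: the int 1 defers to the Expr operand.
        return Expr("-", to_expr(other), self)

    def __repr__(self):
        if self.op in ("var", "const"):
            return str(self.args[0])
        return f"({self.args[0]} {self.op} {self.args[1]})"

def to_expr(v):
    return v if isinstance(v, Expr) else Expr("const", v)

x = Expr("var", "x")
e = 1 - x*x/2  # builds an AST, not a float
print(e)  # (1 - ((x * x) / 2))
```

Methods like `typecheck` can then walk an AST like this one; the real MegaLibm additionally tracks the input range alongside the symbolic expression.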

# Approximation

Of course there’s a downside to reflecting the program source at the type level: if every real expression gets a different type, we can’t hide implementation details, and we lose modularity. Luckily, MegaLibm has an answer: an `approx` function, for when some computation doesn’t exactly evaluate a given expression.

For example, `1 - x*x/2` is a passable approximation to `cos(x)`, at least near 0. The MegaLibm `approx` function lets us express this:

```
from megalibm import cos, approx, narrow
p = narrow(1 - x*x/2, [-1e-7, 1e-7])
e = approx(cos(x), 1e-12, p)
e.typecheck() # returns Impl<cos(x), [-1e-7, 1e-7]>
```

Here `p` is the same `1 - x*x/2` expression as before, except that `narrow` restricts the input range under consideration to `[-1e-7, 1e-7]`. The `approx` line then expresses that evaluating `p` approximates `cos(x)` to an accuracy of `1e-12`. In other words, `approx` implements a mathematical function by evaluating something that approximates it, usually a polynomial or rational function.

At the type level, `approx` behaves like a cast, an escape hatch, sort of like `unsafe` or `Obj.magic` in mainstream programming languages. But in MegaLibm it’s also a place where we can plug in additional tools. MegaLibm, for example, can use a verification tool like Sollya or Gelpia to check that the approximation bound actually holds, making this “escape hatch” safe.
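As a rough illustration of what such a check establishes, here is a sampling-based spot check in plain Python. It is an unsound stand-in for a real tool like Sollya or Gelpia, and `check_approx_bound` is a hypothetical helper, not MegaLibm API:

```python
import math
import random

# Unsound, sampling-based check that 1 - x*x/2 approximates cos(x) to
# within 1e-12 on [-1e-7, 1e-7]. A sound tool would bound the error over
# the whole interval, not just at sampled points.
def check_approx_bound(f, p, lo, hi, bound, samples=10_000):
    random.seed(0)  # deterministic sampling for reproducibility
    worst = 0.0
    for _ in range(samples):
        x = random.uniform(lo, hi)
        worst = max(worst, abs(f(x) - p(x)))
    return worst <= bound

ok = check_approx_bound(math.cos, lambda x: 1 - x*x/2, -1e-7, 1e-7, 1e-12)
print(ok)  # True: near 0 the error behaves like x**4/24, far below 1e-12
```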

# Range Reduction

This type information helps MegaLibm catch common implementation bugs. For example, consider how the fdlibm library implements the `asin` function. For small inputs, `asin` is well approximated by a polynomial. But near ±1, `asin` has infinite slope, and approximating it by a polynomial becomes hard. So fdlibm instead leverages this identity:

```
asin(x) = pi/2 - 2 asin(sqrt((1-x)/2))
```

The genius of this identity is that, if `x` is between `0.5` and `1`, then `sqrt((1 - x) / 2)` is between `0` and `0.5`. So this identity is a kind of Rosetta Stone that lets us translate `asin` calls with arguments between `0.5` and `1` into `asin` calls with easier arguments between `0` and `0.5`. The C code would look something like this:

```
double asin(double original_x) {
    double x = original_x > 0.5
        ? sqrt((1 - original_x) / 2)
        : original_x;
    double y = /* … compute asin(x) */;
    return original_x > 0.5 ? M_PI_2 - 2*y : y;
}
```

Tricks like this are common in library function implementations, and are absolutely necessary for some functions like `sin`. But they’re easy to get wrong. For example, in the `asin` identity, it’s pretty easy to forget the “multiply by 2” step, which only matters for values close to 0.5.
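A quick numerical sanity check (plain Python, not MegaLibm) confirms the identity and shows how badly the forgotten factor of 2 hurts near 0.5:

```python
import math

# The fdlibm-style range reduction for asin, checked numerically.
def reduced_asin(x):
    return math.pi/2 - 2*math.asin(math.sqrt((1 - x)/2))

def buggy_asin(x):
    # Same, but with the "multiply by 2" step forgotten.
    return math.pi/2 - math.asin(math.sqrt((1 - x)/2))

for x in [0.5, 0.6, 0.8, 0.99, 1.0]:
    assert math.isclose(reduced_asin(x), math.asin(x), rel_tol=1e-12)

# The buggy version's error equals asin(sqrt((1-x)/2)): large near 0.5,
# shrinking to zero as x approaches 1.
print(abs(buggy_asin(0.51) - math.asin(0.51)))  # about 0.52
print(abs(buggy_asin(0.99) - math.asin(0.99)))  # about 0.07
```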

MegaLibm can catch these kinds of mistakes using types. MegaLibm’s `right` construct, which captures this pattern of reducing `x` to a smaller domain and then using the output on that smaller domain to reconstruct the output for the original `x`, has the following type:

```
right(
    reduce      : Impl<s, I>,
    body        : Impl<e, J>,
    reconstruct : Impl<t, K>
) -> Impl<e, I | J>
```

Our example would look like this:

```
from megalibm import x, y, sqrt, pi
p = # …
p.typecheck() # => Impl<asin(x), [0, 0.5]>
e = right(sqrt((1 - x)/2), p, pi/2 - 2*y)
e.typecheck() # => Impl<asin(x), [0, 1.0]>
```

Here `p` is only called on inputs in `[0, 0.5]`, so it can be something like a polynomial approximation, and `right` wraps that implementation to handle inputs on `[0, 1.0]`. Importantly, every time you use `right`, MegaLibm can try to check the identity you’re trying to use, and warn you if it doesn’t check out. Formally, MegaLibm checks the side condition that reconstructing from the reduced call reproduces the original function: in our example, that `pi/2 - 2*asin(sqrt((1 - x)/2))` equals `asin(x)` on `[0, 1]`.

You can check these side conditions by invoking MegaLibm’s `verify` function:

```
e.verify() # Should not error
```

Like with `approx`, MegaLibm dispatches these side conditions (and a variety of other type-checking constraints) to verification tools, including both sound rewrite-based equality checkers and unsound options like Sollya’s `dirtyinfnorm`. (It can even just compare randomly sampled inputs up to a tolerance.) The unsound tools enable faster iteration, and then a final check can use a sound tool.

Overall, I think MegaLibm’s type system is a good example of taking a complex design problem (range reduction and reconstruction algorithms) and extracting its natural modularity into a type system. That type system can then make the design problem easier to solve and safer as well.

# Tunable Compilation

So MegaLibm’s type system helps catch the kinds of typos and errors that make math functions so hard to implement. But how are the resulting functions run?

Naturally, for maximum performance, MegaLibm compiles to a low-level language: C. But compiling to a low-level language reintroduces all the challenges MegaLibm aimed to solve: the developer might now want low-level control over exactly how a polynomial is evaluated or what precision various computations are done in. In C, developers have that control precisely because the compiler stays fairly ignorant of what the code means.

For example, a polynomial `a + b x + c x^2 + d x^3` can be evaluated in either Horner form:

```
a + x (b + x (c + x (d)))
```

Or in Estrin form:

```
(a + x b) + x^2 (c + x d)
```

The choice matters: Horner form is typically more accurate while Estrin form is typically faster (due to a shorter critical path). For longer polynomials, it’s also possible to mix the two forms (Horner form for the first few terms, and then Estrin for later ones) and to use different precisions for different steps in the evaluation. For MegaLibm to be useful, it has to expose this level of control.
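A small Python sketch (illustrative, not MegaLibm’s generated code) shows that the two forms compute the same polynomial while grouping the operations differently:

```python
def horner(coeffs, x):
    # Nested form: each step depends on the previous one (long critical path).
    a, b, c, d = coeffs
    return a + x*(b + x*(c + x*d))

def estrin(coeffs, x):
    # Paired form: (a + x*b) and (c + x*d) can be computed in parallel.
    a, b, c, d = coeffs
    x2 = x*x
    return (a + x*b) + x2*(c + x*d)

coeffs = [1.0, -0.5, 0.25, -0.125]  # arbitrary example coefficients
# Mathematically equal results; the last few bits may differ because the
# roundings happen in a different order.
print(horner(coeffs, 0.3), estrin(coeffs, 0.3))
```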

Still, it’s usually best to delay low-level choices like polynomial evaluation until the implementation can be run on real inputs to test various options. So MegaLibm takes inspiration from projects like Halide, which separate the semantics of the language from the semantics of tuning parameters that developers can adjust to maximize performance.

In MegaLibm, the tuning parameters are simple keyword arguments that can be added to MegaLibm constructs. For example, the polynomial above can be defined in MegaLibm with the `polynomial` shorthand:

```
from megalibm import polynomial
p = polynomial([a, b, c, d])
```

Then to evaluate it in Estrin form (rather than the default Horner form) one would use:

```
p = polynomial([a, b, c, d], scheme="estrin")
```

And to evaluate the polynomial in a higher precision, one could pass:

```
p = polynomial([a, b, c, d], scheme="estrin", prec="fp64")
```

More generally, all of the MegaLibm constructs have these kinds of tuning parameters, which are guaranteed not to affect the real-number semantics (that is, they don’t influence the type system) and can be freely tuned by the developer to trade off accuracy and performance until the implementation is perfect for the developer’s use case.

I think this kind of tunable compilation is a powerful language design primitive for performance-sensitive code. Separating performance tuning from correctness reasoning makes it easier to do both: correctness reasoning can happen at a higher level, and performance tuning can be done fearlessly. Moreover, it supports an iterative, data-driven tuning workflow.

# Jupyter as UI

The type system and tuning parameter design ultimately enable an iterative workflow for writing math functions. Developers write an implementation, type check it, and then tune it for accuracy and performance by tweaking tuning parameters. To really make this workflow click, MegaLibm is made for use in Jupyter.

Of course, in one sense any Python library is usable from Jupyter (and Jupyter increasingly supports other languages as well). But MegaLibm makes a specific effort to support the Jupyter workflow.

Part of that is plotting. Any implementation can be run and its accuracy plotted using the `plot_lambda` function, and two implementations can be compared with `compare_plot_lambda`. The Jupyter environment means the resulting plot can be shown inline, allowing the developer to test several implementations and record why a specific design decision was made.

Other MegaLibm functions, which synthesize polynomials (and even full implementations) with Sollya, also leverage Jupyter’s ability to show synthesis results inline, where the results can be copy-and-pasted into a new cell to form a new implementation.

Jupyter notebooks have proven successful in data science, and MegaLibm is similar in many ways: it, too, involves writing a program iteratively, testing lots of approaches, and paying attention to evaluation results. Jupyter makes that workflow convenient, much more so than the usual mess of bash scripts, shell commands, and copy-and-paste. It also makes it easy to record why specific decisions were made, which helps maintainability.

Using Jupyter as the UI for MegaLibm was an unexpected success, and I plan to try it for future projects. While it’s not as good as a dedicated user interface for numerical programming might be, it involved almost no additional work and immediately made using MegaLibm easier. Where it required changes to MegaLibm’s APIs, those changes were almost always positive ones.

# Using MegaLibm

MegaLibm is a research project, and not directly usable. Ian Briggs, who built most of it, recently defended his PhD and joined Annapurna Labs, a part of Amazon. Co-author Yash Lad has also moved on to other pursuits. And some of MegaLibm’s dependencies have already broken compatibility. So the code, which can be found on GitHub, is not in good health.

But the real test of a research project is transfer to other projects, and in that sense MegaLibm has already shown results. The Herbie numerical repair tool is investigating using a MegaLibm-like DSL as an IR, thanks to the work of undergraduate Jackson Brough (he’ll be looking for PhD programs soon—get in touch!) and graduate student Brett Saiki. MegaLibm-like explicit `approx` and structured range reduction should help organize and simplify Herbie’s internal passes. And I also hope that MegaLibm’s incremental, iterative design style becomes more widely used in other verification and synthesis tools.

**Bio:** Pavel Panchekha is an assistant professor at the University of Utah. Besides his research on numerical methods, including maintaining the Herbie numerical assistant, he is also interested in web page layout and web browsers more generally. Ian Briggs is a Formal Verification Engineer at Annapurna Labs and was the lead researcher on MegaLibm. Yash Lad is an MS student at the University of Utah.

**Disclaimer:** *These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.*