Program synthesis addresses an age-old problem in computer science: can a computer program itself? This isn’t just a matter of laziness on the part of programmers; humans are notorious for introducing bugs in their programs. And, to compound this risk, today’s highly asynchronous, heterogeneous, and distributed systems are even more prone to programming errors. What if we could just tell the computer what it should do—provide it with a specification of what we want—and let it do the legwork of figuring out how to do it? This is the promise of program synthesis: given a specification, automatically generate a program that satisfies it.
The art of specification
Before the computer can automatically generate a program, we need to give it a specification of what the program should do. The specification needs to describe the program’s desired behavior in enough detail to ensure the program does what we want, but ideally, it should be much simpler than the program itself, since otherwise we may as well just write the program.
We should be at least a little skeptical that the task of writing a specification is feasible for anything but the most trivial examples. De Millo, Lipton, and Perlis argued as much in their seminal 1979 treatise against software verification. However, our conception of specifications has evolved over the past several decades. Indeed, we have ample evidence that simple, trustworthy specifications abound in practice.
Traditional formal specifications are logical artifacts that precisely capture every behavior of a system. As a trivial example, we might write a specification “∀x. P(x) = 3x”, saying that when given an input number x, our program P should return 3*x. The formal methods community has built formal specifications for many applications, from operating systems to encryption schemes and beyond, and the DeepSpec collaboration is exploring building these specifications for even larger systems.
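To make this concrete, here is a minimal sketch of checking a candidate program against that specification using the Z3 SMT solver’s Python bindings (the z3-solver package); the candidate program itself is an arbitrary choice for illustration:

```python
from z3 import Int, Solver, sat

x = Int("x")

def candidate(x):
    # A hypothetical candidate program: P(x) = x + x + x.
    return x + x + x

s = Solver()
# The spec is "forall x. P(x) = 3x"; we ask the solver for an input
# that violates it. An unsat answer means no violating input exists.
s.add(candidate(x) != 3 * x)
if s.check() == sat:
    print("counterexample:", s.model()[x])
else:
    print("candidate provably satisfies the spec")
```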
Even without a formal specification, program synthesis can still be effective. One popular style of specification is examples of desired input-output pairs. To specify our 3*x program above, we might instead provide the input-output examples (0, 0) and (1, 3). But this style of specification is often ambiguous—for our example pairs, the program “if x == 0 then 0 else 3” would also work—and so a synthesizer for these specifications needs a way to select “good” programs.
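A few lines of plain Python make this ambiguity concrete: both programs below are consistent with the two example pairs, yet they disagree on every other input.

```python
def p1(x):
    return 3 * x

def p2(x):
    return 0 if x == 0 else 3

examples = [(0, 0), (1, 3)]
# Both candidates satisfy the examples...
assert all(p(x) == y for x, y in examples for p in (p1, p2))
# ...but they are not the same program.
print(p1(2), p2(2))  # 6 vs 3
```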
The first wave of synthesis: deductive reasoning
The earliest examples of program synthesis used formal specifications to automatically deduce a correct program. The idea is to take a formal specification and work backwards, searching for a program that provably satisfies the specification. Some of these early synthesizers built on the successes of automated theorem provers, using them to automatically construct a proof of the specification, and then extract a program from that proof.
Another deductive synthesis technique is to apply a comprehensive set of rewriting rules to try to transform the specification into an equivalent program. This technique looks a lot like a regular compiler, which transforms your input program into an output program through rewriting rules. The difference is that the synthesizer’s rewriting rules should be complete—they should be able to transform the specification into every equivalent program, so that the synthesizer can find the best program. Alexia Massalin’s work on superoptimization and the Synthesis kernel are landmarks in this area, showing the early promise of synthesis techniques, and now might be a good time to revive them.
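To give a flavor of this approach, here is a toy rewriting sketch, assuming a term language of nested tuples; the two rules shown are illustrative, not the rule set of any real superoptimizer:

```python
RULES = [
    # x * 2  ->  x << 1  (strength reduction)
    (("*", "x", 2), ("<<", "x", 1)),
    # x + x  ->  x << 1
    (("+", "x", "x"), ("<<", "x", 1)),
]

def rewrite_once(term):
    """Return every term reachable from `term` by one rule application."""
    results = []
    for lhs, rhs in RULES:
        if term == lhs:
            results.append(rhs)
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            for new_sub in rewrite_once(sub):
                results.append(term[:i] + (new_sub,) + term[i + 1:])
    return results

# A synthesizer would search the space of reachable programs for the
# cheapest one; here we just take a single rewriting step.
print(rewrite_once(("*", "x", 2)))  # [('<<', 'x', 1)]
```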
The second wave of synthesis: inductive reasoning
The disadvantage of deductive synthesis is that it requires both a complete axiomatization of the target language and a complete formal specification. The second wave of synthesis research focuses on inductive synthesis to try to relax these requirements. The idea of inductive reasoning is to find programs that work on a subset of inputs and generalize them to work for every input. Done right, this technique promises to be much more tractable than deductive synthesis, since we need to reason about only a few inputs.

Counterexample-guided inductive synthesis
The most popular style of inductive synthesis is counterexample-guided inductive synthesis (CEGIS), pioneered by the Sketch synthesis system. The idea of CEGIS is to guess a program that works on a few inputs, and then check whether that program works on every input. If not, we learn something from the failure to inform future guesses, and repeat the guess-check cycle. There are many different CEGIS techniques, ranging from satisfiability solving to brute-force enumeration, but they all share this common structure.
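Here is a minimal CEGIS loop in plain Python, assuming a tiny search space of programs of the form a*x + b and a verifier that simply checks every input in a fixed range; real tools like Sketch use SAT/SMT solvers for both phases:

```python
import itertools

def spec(x):
    return 3 * x  # the reference specification

def verify(a, b):
    """Check the guess on every input; return a counterexample or None."""
    for x in range(-100, 100):
        if a * x + b != spec(x):
            return x
    return None

examples = [0]  # start from one arbitrary input
while True:
    # Guess: find a candidate consistent with the examples seen so far.
    guess = next((a, b)
                 for a, b in itertools.product(range(-4, 5), repeat=2)
                 if all(a * x + b == spec(x) for x in examples))
    # Check: look for an input where the candidate violates the spec.
    cex = verify(*guess)
    if cex is None:
        print(f"synthesized: {guess[0]}*x + {guess[1]}")
        break
    examples.append(cex)  # learn from the failure and try again
```

On this toy problem the loop first guesses -4*x + 0 (consistent with the single input x = 0), the checker reports a counterexample, and the refined guess 3*x + 0 verifies.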
Flash Fill, a data transformation feature in Microsoft Excel developed by Sumit Gulwani, is a flagship success of the inductive approach. Flash Fill automatically synthesizes string transformations in spreadsheets from very few examples of the desired change (often just one). Synthesizing from so few examples is particularly challenging—many string transformations would produce the desired result—and so much of the art of Flash Fill is in selecting programs that match what the user probably intended.
Underpinning many inductive synthesis tools is the idea of a domain-specific language (DSL). Programming environments such as Sketch or Rosette make it easy to build a DSL for a problem domain and then construct synthesis tools for that DSL. This solver-aided language approach to synthesis has yielded impactful synthesis tools in domains such as superoptimization, memory consistency, education, and databases.
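To illustrate the DSL idea, here is a sketch of a toy string-transformation language with just three operations and its interpreter; solver-aided environments like Rosette supply this kind of scaffolding (plus solver-backed search over the DSL) for real domains:

```python
from dataclasses import dataclass

@dataclass
class Upper:        # uppercase the whole input
    pass

@dataclass
class Substr:       # take input[start:end]
    start: int
    end: int

@dataclass
class Concat:       # run two sub-programs and join their outputs
    left: object
    right: object

def run(prog, s):
    """Interpreter: the DSL's semantics, which any synthesizer for
    this language would search against."""
    if isinstance(prog, Upper):
        return s.upper()
    if isinstance(prog, Substr):
        return s[prog.start:prog.end]
    if isinstance(prog, Concat):
        return run(prog.left, s) + run(prog.right, s)
    raise ValueError(prog)

# Initials, Flash Fill-style: the first letter of each word.
print(run(Concat(Substr(0, 1), Substr(6, 7)), "james bornholt"))  # "jb"
```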
The third wave of synthesis: statistical reasoning
The recent surge in the effectiveness of machine learning has driven interest in program synthesis using statistical techniques. One approach is to train a model to guide existing synthesis techniques more effectively—for example, DeepCoder uses a learned model to guide the brute-force enumeration of candidate programs. These learned models replace hand-tuned heuristics with ones learned from data.
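A minimal sketch of this guidance idea, with a hard-coded stand-in for the learned model (predict_op_scores is an assumption for illustration, not DeepCoder’s actual network): the enumerator tries operations in order of their predicted likelihood.

```python
def predict_op_scores(examples):
    # A real system runs a neural network over the input-output
    # examples; this stand-in just returns fixed scores.
    return {"mul3": 0.8, "add1": 0.15, "neg": 0.05}

OPS = {"mul3": lambda x: 3 * x, "add1": lambda x: x + 1, "neg": lambda x: -x}

def enumerate_guided(examples):
    scores = predict_op_scores(examples)
    # Try single-operation programs, highest predicted score first.
    for name in sorted(OPS, key=lambda n: -scores[n]):
        if all(OPS[name](x) == y for x, y in examples):
            return name
    return None

print(enumerate_guided([(0, 0), (1, 3)]))  # "mul3", found on the first try
```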
A more aggressive approach is to replace existing synthesis techniques entirely, by training a model to understand program semantics and using it to infer programs that may satisfy the specification. Training such a model is notoriously difficult, but we can draw inspiration from recent advances in natural language processing. We can also use synthesis techniques to improve machine learning, by helping to build interpretable ML models or generating more efficient inference code.
One of the most exciting reasons to use machine learning for synthesis is that it can extend our reach beyond written specifications. For example, we can use images as specifications to synthesize the programs that draw them, or reverse-engineer 3D printing designs to customize them for our own use. Data-driven specifications like these bring synthesis techniques to more users without imposing a formality burden.
What’s next?
Synthesis techniques are notorious for poor scalability—working on small examples (e.g., Excel macros), but not on many of the types of programs we care about. Tackling the scalability challenge is one of the most important directions for synthesis research. There’s been recent work on profiling synthesis tools and on automatically inferring good synthesis templates, but there’s plenty more to do here.
Another important question is to understand when program synthesis is effective—when should you reach for it as a tool? Recent superoptimization tools like Lens and Souper blur the distinction between synthesis and compilation—they are particularly effective at finding small sequences of unusual optimized code, but integrating them into a larger compilation pipeline remains a challenge.
Bio: James Bornholt is an Assistant Professor in the Department of Computer Science at the University of Texas at Austin. His research is in programming languages and formal methods, with a focus on automated program verification and synthesis.
Disclaimer: These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.