Every month, the Communications of the ACM presents one or two Research Highlights, showcasing top research to its broad computer science audience. The selection criteria for these articles are special: a Research Highlight needs to be of the highest quality, but it must also be of broad interest and accessible to the whole computer science community. As such, papers that are core to a particular conference and highly rated therein may not be great candidates, since they may be inaccessible outside a specialist audience.
Like all CS sub-areas, PL has its share of specialist topics and techniques, but it also strives to develop and apply tools in its expanding toolbox to problems of broad interest. We have previously blogged about PL research’s connections to other areas. CACM Research Highlights thus affords a great opportunity to showcase these sorts of results.
In 2009, the SIGPLAN Research Highlights committee was formed to systematically and rigorously select articles published in SIGPLAN venues that meet CACM’s criteria, and collect them as SIGPLAN Research Highlights. The collected articles were then nominated for consideration as CACM Research Highlights. The SIGPLAN RH committee has had a remarkable record of success: in recent years, more than two-thirds of our nominated papers have been selected. (See this list for more info: https://www.sigplan.org/Highlights/Papers/.)
This year, four papers were chosen as SIGPLAN Research Highlights, and so far, three have been invited to appear in CACM. These papers showcase PL connections to areas as diverse as chemical microfluidics, blockchain smart contracts, and automated debugging. Below we provide a brief description of each paper. Adrian Colyer (author of the well-known blog “The Morning Paper”) wrote about two of them, and we’ve included links to those posts, along with videos of their presentations if available.
TYSON LOVELESS, University of California, Riverside, USA
CHRIS CURTIS, University of California, Riverside, USA
MOHSEN LESANI, University of California, Riverside, USA
PHILIP BRISK, University of California, Riverside, USA
Laboratories on a chip (LoCs) can be seen as the ‘general-purpose’ CPU of microfluidic devices. This paper describes the design and implementation of “BioScript”, a programming language for LoCs that allows biochemists to express microfluidic experiments as programs. The research exemplifies how techniques from programming language theory and practice can satisfy the constraints imposed by these systems while producing a language that is fit for purpose. This includes designing a type system that helps prevent scientists from accidentally causing unsafe chemical reactions that could damage the lab itself or create toxic gas.
The authors demonstrate that as more platforms require programming, increasingly by non-computer scientists, the creative mix and application of well-studied theory can provide meaningful design guidance.
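To make the type-system idea in the BioScript summary more concrete, here is a minimal Python sketch. It is not BioScript and does not reproduce its type system: the reactivity groups, the `INCOMPATIBLE` table, and the `check_mix` function are all hypothetical names invented for illustration. The sketch treats a fluid's "type" as the set of reactivity groups it may contain and rejects a mix before it runs if two of those groups are listed as incompatible.

```python
from itertools import combinations

# Hypothetical reactivity groups and incompatibilities, for illustration only.
# They are not taken from BioScript or from any real hazard table.
INCOMPATIBLE = {
    frozenset({"strong_acid", "cyanide_salt"}),  # assumed: could release toxic gas
    frozenset({"strong_acid", "strong_base"}),   # assumed: could release heat violently
}


class UnsafeMixError(TypeError):
    """Raised when a mix would combine incompatible reactivity groups."""


def check_mix(*fluids):
    """Return the 'type' of a mixture: the union of its inputs' groups.

    The mix is rejected if any two groups present in the result are
    listed as incompatible.
    """
    result = frozenset().union(*fluids)
    for a, b in combinations(sorted(result), 2):
        if frozenset({a, b}) in INCOMPATIBLE:
            raise UnsafeMixError(f"unsafe mix: {a} + {b}")
    return result


water = frozenset({"water"})
acid = frozenset({"strong_acid"})
cyanide = frozenset({"cyanide_salt"})

diluted = check_mix(water, acid)  # accepted: no incompatible pair
try:
    check_mix(diluted, cyanide)   # rejected before any fluid is moved
except UnsafeMixError as err:
    print("rejected:", err)
```

Because the check works on sets of groups rather than on concrete chemicals, it is conservative: it may reject a mix that would be harmless in practice, which is the usual trade-off of a static check.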
MICHAEL KONG, The University of Sydney, Australia
ANTON JURISEVIC, The University of Sydney, Australia
LEXI BRENT, The University of Sydney, Australia
BERNHARD SCHOLZ, The University of Sydney, Australia
YANNIS SMARAGDAKIS, University of Athens, Greece
Ethereum is the world’s second-largest cryptocurrency, with a current market capitalization over $20 billion. It was the first to support smart contracts: Turing-complete programs stored on the blockchain and executed by the Ethereum Virtual Machine, which perform transactions with no human interaction. Bugs in these contracts can enable theft, or render the managed funds permanently inaccessible.
This paper addresses a particular kind of bug that exploits the fact that smart contracts consume “gas” as they run, and that calls that run out of gas are aborted. If an attacker can corrupt the state of a contract so that all future calls run out of gas, then the funds it manages are permanently lost. To address the bug, the paper proposes a static program analysis technique called MadMax. MadMax can analyze any Ethereum contract by decompiling its bytecode and then searching for code patterns that are likely to be associated with three common kinds of “out of gas” vulnerabilities. It uses the successful Datalog approach to implement data-flow analysis, context-sensitive flow analysis, and memory modeling for data structures. The result is an accurate and scalable analysis that the authors applied to every smart contract on the Ethereum blockchain, over 90,000 of them. In 10 hours of analysis, MadMax flagged over 5% of these contracts (managing about $2.8 billion) as potentially vulnerable to one of the three attacks. Inspection of the first 13 flagged contracts, with 16 flagged vulnerabilities, showed that 13 vulnerabilities were real, so only about 19% (3 of 16) of the inspected flags were false positives.
Blockchain technology is currently attracting significant interest, and security requirements are at its core. This paper shows how state-of-the-art programming-language techniques can be brought to bear to improve the security of the blockchain infrastructure we may all depend on in the future.
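The out-of-gas problem described in the MadMax summary can be illustrated with a toy Python model. This is only a sketch under stated assumptions: it is not EVM code, the gas numbers are made up, and `ToyContract`, `invest`, and `pay_all` are hypothetical names. It shows the general shape of the bug, where a call loops over persistent state that any caller can grow, so an attacker can push every future call over its gas budget.

```python
class OutOfGas(Exception):
    """The call exceeded its gas budget and was aborted."""


# Assumed numbers for this toy model; real EVM gas costs differ.
GAS_LIMIT = 1_000
GAS_PER_ITERATION = 50


class ToyContract:
    """A contract whose payout loop iterates over a list anyone can grow."""

    def __init__(self):
        self.investors = []
        self.paid_out = 0

    def invest(self, who):
        # Cheap, constant-cost call that adds one entry to persistent state.
        self.investors.append(who)

    def pay_all(self):
        # The loop bound depends on state that any caller can enlarge.
        gas = GAS_LIMIT
        paid = 0
        for _ in self.investors:
            gas -= GAS_PER_ITERATION
            if gas < 0:
                # Abort before any state is updated, as a failed call would.
                raise OutOfGas("pay_all aborted")
            paid += 1
        self.paid_out += paid
        return paid


contract = ToyContract()
for name in ("alice", "bob", "carol"):
    contract.invest(name)
contract.pay_all()                 # three iterations fit in the budget

for i in range(30):                # attacker cheaply grows the list
    contract.invest(f"sybil-{i}")
try:
    contract.pay_all()
except OutOfGas as err:
    print("funds locked:", err)    # every later call fails the same way
```

In this model the list never shrinks, so once it is long enough no call to `pay_all` can ever complete. A static analysis in the spirit described above would look for this pattern (a loop bounded by attacker-growable storage) without running the contract.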
EMERY D. BERGER, University of Massachusetts Amherst, USA
Web browsers are one of the most popular application platforms. They also have an established reputation for consuming significant amounts of memory, and memory leaks in web applications only exacerbate this problem. Memory leaks, caused when the web application inadvertently maintains references to state that would otherwise be reclaimed by a garbage collector, gradually slow applications and can cause them to fail by running out of memory. This paper observes that leaks in web applications differ from traditional memory leaks in ways that make all past approaches to identifying leaks inapplicable.
The paper presents BLeak, a framework for detecting memory leaks in web applications. It takes advantage of the fact that nearly all users of web applications repeatedly return to the same “visual state” (such as the inbox view in Google Mail, or the map of all properties in Airbnb). Because these states are semantically identical, any persistent memory growth across multiple round trips indicates a memory leak. To use BLeak, the programmer provides a short script that performs a round trip to the same visual state; the BLeak system then operates automatically, flagging objects that consistently exhibit a growing set of outgoing references as potential leaks and identifying their root cause in the source code. To help programmers decide where to focus their debugging effort first, BLeak ranks leak roots by summing the sizes of the nodes reachable from each root, with each node’s size divided by the number of leak roots that can reach it.
The results demonstrate the effectiveness and value of this approach: using BLeak, the authors identified and fixed nearly 60 real, previously unknown bugs in a wide range of real-world applications, leading to significant reductions in memory consumption.
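The two ideas in the BLeak summary, flagging objects whose outgoing references keep growing across round trips and ranking leak roots by the memory they retain, can be sketched in a few lines of Python. This is a simplified illustration, not BLeak's implementation: the snapshot format (a dict from object id to the set of ids it references) and the function names are assumptions, and the sketch omits many details of the real system.

```python
from collections import deque


def growing_objects(snapshots):
    """Ids whose outgoing-reference count grew in every consecutive snapshot."""
    if len(snapshots) < 2:
        return set()
    candidates = set(snapshots[0])
    prev = snapshots[0]
    for snap in snapshots[1:]:
        candidates = {
            obj for obj in candidates
            if obj in snap and len(snap[obj]) > len(prev[obj])
        }
        prev = snap
    return candidates


def reachable_from(heap, root):
    """Ids reachable from root by following references (breadth-first)."""
    seen = set()
    queue = deque(heap.get(root, ()))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(heap.get(node, ()))
    return seen


def rank_leak_roots(heap, sizes, roots):
    """Score each root by the sizes it reaches, sharing nodes among roots."""
    reach = {root: reachable_from(heap, root) for root in roots}
    counts = {}
    for nodes in reach.values():
        for node in nodes:
            counts[node] = counts.get(node, 0) + 1
    scores = {
        root: sum(sizes.get(node, 0) / counts[node] for node in nodes)
        for root, nodes in reach.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One snapshot per round trip to the same visual state (made-up data).
snapshots = [
    {"cache": {"a"}, "listeners": {"c"}, "config": {"x"}},
    {"cache": {"a", "b"}, "listeners": {"c", "d"}, "config": {"x"}},
    {"cache": {"a", "b", "e"}, "listeners": {"c", "d", "f"}, "config": {"x"}},
]
roots = growing_objects(snapshots)  # "config" never grows, so it is excluded

heap = {"cache": {"a", "b", "c"}, "listeners": {"c"}}
sizes = {"a": 100, "b": 100, "c": 40}
ranking = rank_leak_roots(heap, sizes, ["cache", "listeners"])
```

Dividing a shared node's size among the roots that reach it keeps one large shared object from inflating the score of every root, so the root that exclusively retains the most memory rises to the top of the list.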
Bios: Michael Hicks is a Professor of Computer Science at the University of Maryland, the past SIGPLAN Chair (2015-2018), and the editor of this blog. Emery Berger is Professor of Computer Science at the University of Massachusetts, Amherst, and a two-term member of the SIGPLAN Executive Committee; in that function, he is Chair of the SIGPLAN Research Highlights Committee. Emery would like to point out that he was not involved in the selection of his own paper as an RH.
Disclaimer: These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.