A differential equation-solving analog device is a reconfigurable computing platform which leverages the physics of the underlying substrate to implement dynamical system computations. In the last article, we discussed how a programmer would manually configure an analog device to perform computation. In this article, we will discuss how to automatically configure an analog device to run a target dynamical system. We frame this problem of automatically configuring the analog hardware as a compilation problem.
The compiler takes, as input, a specification of the dynamical system and a specification of the analog device. The dynamical system specification provides the differential equations to implement on the hardware. The analog device specification describes the input ports, output ports, programming interface (data fields and modes), and behavior of each block. This specification also defines all of the possible digitally settable connections available on the device. The compiler produces as output an analog circuit specification which describes a circuit made up of configured blocks. The analog circuit specification annotates the signals in the described circuit which implement dynamical system variables.
The compiler produces the analog circuit in two stages:
- The circuit synthesis stage of compilation forms a circuit whose physics matches the dynamics of the target dynamical system. The circuit synthesis stage of compilation produces an analog circuit specification which implements the provided dynamical system assuming all the analog blocks perfectly implement the promised behavior under all operating conditions.
- The circuit scaling stage of compilation transforms a circuit to respect the low-level physical behaviors present in the analog hardware while also preserving the dynamical system dynamics. The transformed circuit respects all of the operating range and frequency limitations imposed by the hardware and reduces the effect of noise, process variation, and quantization error on the computation.
This article covers the circuit synthesis step of compilation and demonstrates the operation of this compilation pass on the harmonic oscillator and HCDCv2 analog hardware introduced previously. Reconfigurable analog devices are a challenging target for circuit synthesis for the following reasons:
- Complex Blocks: Reconfigurable analog devices often offer highly parametric blocks and complex blocks which implement non-standard functions. The compiler needs to creatively program blocks to effectively map the dynamical system to the analog hardware.
- Non-Computational Blocks: Analog devices often provide special-use blocks which copy and convert signals (assembly blocks) or internally route signals (route blocks) throughout the device. The compiler needs to specially handle such blocks to use them correctly.
- Restrictive Routing: Analog devices support a limited number of digitally settable connections and typically prioritize offering connections between spatially co-located blocks. Long-distance connections often involve multiple route blocks and are limited in supply. The compiler needs to intelligently lay out the circuit on the device so that all of the connections can be implemented on the hardware.
A Bird’s Eye View of Circuit Synthesis
The compiler breaks down the circuit synthesis process into multiple, more specialized passes, where each pass refines the circuit by adding and configuring blocks. This compiler architecture enables the compiler to more scalably map dynamical systems to the target hardware:
- The compiler invokes the fragment synthesis procedure for each differential equation and algebraic relation in the dynamical system to obtain a collection of circuit fragments which implement the dynamical system variables. The fragment synthesis procedure uses a powerful, but computationally expensive, algebraic rewrite-based search algorithm to identify non-trivial compositions of blocks. The performance of the algorithm depends on the complexity of the target equation rather than the size of the overall system.
- The assembly pass assembles together the individual fragments produced in the fragment synthesis step to form a complete circuit, inserting assembly blocks when necessary to copy and convert signals. The assembly pass is lightweight and scales well to large dynamical systems.
- The place+route pass maps the circuit blocks to locations on the analog device, inserting route blocks when necessary. Route blocks are specialized blocks which forward signals between spatially distant regions within the device. The place and route procedure carefully partitions the circuit to reduce the number of long-distance connections made.
The compiler produces circuits which implement dynamical systems that are algebraically equivalent to the provided dynamical system by construction. Two dynamical systems are algebraically equivalent if one can be transformed into the other by successively applying algebraic rewrite rules.
HCDCv2 Revisited: The compiler targets the HCDCv2 analog hardware from the previous article. The HCDCv2 provides several types of computational, assembly and route blocks:
Type | Blocks |
Compute Blocks | MUL,INT,OBS |
Assembly Blocks | COPY |
Route Blocks | TIN,TOUT |
Harmonic Oscillator Revisited: In this article, we will be mapping the harmonic oscillator to the HCDCv2 analog hardware. The harmonic oscillator dynamical system has differential equations modeling the oscillator position (P) and velocity (V) and an algebraic relation which records the position over time to an observation variable (Pos):
V = integ(-0.25*P, 0) | // the position is the velocity integrated over time |
P = integ(V, 10) | // the position is the velocity integrated over time |
Pos = emit(p) | // observe p and name the observation Pos |
Circuit Fragment Synthesis
Pictured above are the harmonic oscillator circuit fragments generated by the compiler. Each circuit fragment has one or more input and output signals (grey circle and blue box) which implement dynamical system expressions. The circuit fragment also instantiates the mode (green box) and initializes the digitally settable values (yellow callouts) for each block. We identify a specific block port with the notation Block.Port.
Fragment 1: This fragment models the position P=integ(V, 10) of the harmonic oscillator. It produces a current implementing integ(0.1*10*V, 2.0*5.0) at output port INT.z provided the input port MUL.x is supplied with a current implementing the velocity V.
Fragment 2: This fragment models the velocity V=integ(-0.25*P, 0) of the harmonic oscillator. The fragment produces a current implementing integ(0.25*(-P), 2.0*0) at output port INT.z provided the input port MUL.x is supplied with a current implementing the negated position –P.
Fragment 3: This fragment implements the externally observable position Pos=P. The fragment produces a current implementing 0.6*1.66*P at port OBS.z provided MUL.x is supplied with a current implementing P.
We next describe how the circuit synthesis procedure derives a fragment, using the first fragment as an example.
The fragment synthesis procedure accepts as input a starting differential equation or algebraic relation (starting equation) and synthesizes a circuit fragment which implements the provided relation (steps 1-3). The fragment synthesis problem is formulated as a search over tableaus. Each tableau contains a circuit fragment, a set of goals to solve, and a set of available computational blocks. The procedure begins with an initial tableau and searches over tableaus until it discovers a solved tableau. The initial tableau contains a single goal specifying the algebraic relation to implement and contains an empty circuit fragment. The solved tableau has no goals left and presents a circuit fragment which implements the algebraic relation from the initial tableau. The solved tableau’s circuit fragment only requires input signals which can be fulfilled during the assembly pass of compilation.
The synthesis procedure repeatedly applies the tableau transition relation to derive new tableaus from a target tableau. The transition relation non-deterministically selects a tableau goal and available block and attempts to solve the goal by configuring and incorporating the selected block into the tableau circuit fragment. If successful, the transition relation returns a new tableau which captures the state of the circuit fragment after the goal has been solved.
Starting Tableau (Starting Equation): The starting tableau contains the starting algebraic relation, an empty circuit, and the collection of the available computational blocks (MUL, INT, and OBS) on the HCDCv2. Each available block is partially configured with a block mode — available blocks are written with the notation Block[Mode]. Refer to the previous article for a summary of the mode-dependent functions implemented by the blocks presented above.
First Derived Tableau (Step 1): The compiler first applies the transition relation to the starting tableau. It solves the goal P=integ(V, 10) with the INT[hm] block which computes integ(0.1*x, 2*z0) at INT.z. The INT[hm] block implements the expression integ(0.1*(10*V), 2*5) at output port INT.z when input port INT.x is supplied with 10*V and the digital value z0 is set to 5. The transition relation affixes a label indicating the signal at INT.z implements the dynamical system variable P. The INT.x=10*V goal in the derived tableau ensures that a signal implementing 10*V is provided to port INT.x.
Solved Tableau (Steps 2 and 3): The compiler applies the transition relation again to solve the INT.x=10*V goal. It selects the MUL[mm] block which computes c*x at port MUL.z. The MUL[mm] block implements the expression 10*V when c is 10 and MUL.x is V. The port MUL.z is connected to INT.x to resolve the INT.x=10*V goal. The transition relation then affixes a V label to input port MUL.x. This label tells the compiler to apply a signal implementing V to port MUL.x in the assembly step of compilation.
The above derived tableau is is considered a solved tableau since it has no goals left. The circuit fragment in the solved tableau implements the starting relation P=integ(0.1*10*V, 2.0*5.0) at INT.z if a signal implementing V is supplied to the MUL.x port.
Assembly
The assembly pass links together the circuit fragments to form a fully connected circuit. The assembler routes each circuit fragment output signal (blue box) to all matching inputs (grey circle). Because analog currents cannot be directly routed to multiple destinations without compromising the signal integrity, the assembler introduces current copiers which duplicate the input signal. Each block is an assigned a numerical identifier (number under block name) which distinguishes between multiple uses of the same computational block — we use the notation Block# to identify a block of type Block with identifier #. This identifier is later mapped to a device location during the place+route step.
In the above assembled circuit, the COPY0[pn] block is inserted to produce a positive and negative copy of the signal implementing the oscillator position P. The positive copy is forwarded to port MUL2.x of the observation (Obs) fragment. The negative copy is forwarded to port MUL0.x of the velocity (V) fragment. Note that the assembly pass re-numbers the fragment blocks so that each block instance is distinct when forming the completed circuit.
Place and Route
The place+route pass maps blocks in the analog circuit to block locations, inserting route blocks when necessary. The HCDCv2 contains a total of four distinct locations spread over two tiles. Locations (0,0), (0,1) reside on tile 0 and locations (1,0), and (1,1) reside on tile 1. All locations contain one MUL,INT, and COPY block. The (0,0) and (1,0) locations additionally each contain one OBS,TIN, and TOUT block. The HCDCv2 supports the following connections:
Source Block | Source Location | Destination Block | Destination Location |
MUL,INT, TIN, or COPY | (0,0) or (0,1) | MUL,INT. OBS, TOUT, or COPY | (0,0) or (0,1) |
MUL,INT, TIN, or COPY | (1,0) or (1,1) | MUL,INT. OBS, TOUT, or COPY | (1,0) or (1,1) |
TOUT | (0,0) | TIN | (1,0) |
TOUT | (1,0) | TIN | (0,0) |
The HCDCv2 supports direct connections between blocks residing on the same tile. The HCDCv2 also supports indirect connections between blocks residing on different tiles — all signals spanning two tiles must be routed through a TOUT and a TIN block. Because there are only two pairs of TOUT and TIN blocks, the HCDCv2 only supports a maximum of two inter-tile connections.
In the above mapped circuit, the MUL0, INT0, and COPY0 blocks are mapped to location (0,0), the MUL1 and INT1 blocks are mapped to location (0,1), and the MUL2 and OBS0 blocks are mapped to location (1,0). Because the device does not offer any digitally settable connections between the COPY block at (0,0) and the MUL block at (1,0), the place+route pass routes the signal through a TOUT route block at (0,0) and a TIN route block at (1,0) to make the connection.
The compiler decomposes the place+route problem into a sequence of placement problems which map blocks to progressively finer grain structures on the device. This decomposition works well in practice because connections between large structures on the chip are much scarcer than connections between small, co-located structures. Each placement problem is an integer linear programming (ILP) problem which encodes the following placement properties:
- The compiler does not map more blocks than are available at each structure. At most two MUL blocks are mapped to tile 0 (locations (0,0) or (0,1)).
- Each connection in the analog circuit must be mapped to a distinct path in the analog device. This path may contain any number of route blocks. There is at most one connection which routes a signal from tile 0 (locations (0,0) or (0,1)) to tile 1 (locations (1,0) or (1,1)).
- If a block has been previously mapped to a coarse-grained structure, ensure that it is only mapped to locations within that structure. MUL2 was previously mapped to tile 1 and therefore can only be mapped to locations (1,0) or (1,1).
Next Up
In the next article, we will discuss the circuit scaling step of compilation. Stay tuned!
Bio: Sara Achour is a PhD candidate at the Computer Science and Artificial Intelligence Laboratory at Massachusetts Institute of Technology (CSAIL MIT). She will be joining Stanford’s Computer Science department as an Assistant Professor starting Summer 2021.
Disclaimer: These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.