Energy has become one of the biggest problems today's computing systems face. Already, many transistors in today's multicore processors and systems on chips (SoCs) exist as dark silicon: they go unused for long periods because of the risk of the entire chip overheating when switched on. The problem is set to increase as engineers move to three-dimensional integration: stacking layers of logic transistors on top of each other in order to overcome the problems of two-dimensional (2D) scaling, which the IEEE International Roadmap for Devices and Systems predicts will slow to almost a complete halt within a decade. Circuitry in the inner layers runs even higher risks of overheating.
"If you want to do real 3D (three-dimensional) integration right now, you get one logic layer and that's it, because you just can't get the heat out," says Gregory Snider, a professor of electrical engineering at the University of Notre Dame.
There is no need to rely on circuit designs that generate such large quantities of heat, argues Michael Frank, senior member of technical staff at Sandia National Laboratories' Center for Computing Research. The answer he sees as essential to future development is to build computers that can reverse every calculation they make, so that they avoid deleting information unnecessarily. This counterintuitive proposal stems from a branch of information theory developed by German-American physicist Rolf Landauer and colleagues at IBM more than 50 years ago.
Landauer argued that conventional computations, in which information is erased after a calculation, impose a fundamental lower limit on the energy needed for each bit erased, a limit set by the temperature and the Boltzmann constant that underpins much of thermodynamic theory. This value, approximately 3 zeptojoules at room temperature, is far below the energy lost through resistive heating in practical computers. But it is a loss that is entirely avoidable. Embracing reversibility would mean no theoretical lower limit to energy demand other than the parasitics that afflict any computing technology. In turn, according to proponents, this would allow density scaling far beyond what is possible using irreversible circuits.
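The "approximately 3 zeptojoules" figure can be checked with a short calculation. The sketch below assumes the standard statement of Landauer's bound, k × T × ln 2 per erased bit, and takes room temperature to be 300 K; neither the formula nor the temperature is spelled out in the article, and the function name is purely illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated when one bit is erased: k * T * ln(2)."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


# Assumption: "room temperature" is taken as 300 K.
room_temperature_k = 300.0
limit_zj = landauer_limit_joules(room_temperature_k) / 1e-21  # joules -> zeptojoules

print(f"Landauer limit at {room_temperature_k:.0f} K: {limit_zj:.2f} zJ per erased bit")
```

Under these assumptions the result is a little under 3 zJ, consistent with the rounded value quoted above.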
Full reversibility is not easy to implement and involves a step away from familiar Boolean logic functions. Snider likens it to the decisions we see in the wider environment: "Recycling is a pain: if you can throw some toxic waste in the river, it's often easier to do that than to recycle it."
Even a basic Boolean AND gate represents an irreversible operation that discards information. Only an output of 1 reveals what the inputs were, because it requires both to be true; a 0 output could have come from any of the other three input combinations. Circuit designers must perform a careful balancing act to ensure that, for any output, it is possible to recover the original inputs. That can result not just in an increase in the number of signal lines (many of which may be "garbage outputs" that exist only as a way of preserving information) but also in the total number of gates the operation needs, which increases energy use where parasitics outweigh the benefits of reversibility. Some researchers, such as Robert Wille, professor of design automation at Germany's Technical University of Munich, have worked on synthesis methods that attempt to minimize the number of garbage outputs and gate counts by merging operations to try to map a conventional circuit onto reversible equivalents element by element.
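A small sketch can make the contrast concrete. It compares a plain AND gate with the Toffoli (controlled-controlled-NOT) gate, a standard textbook reversible gate that is used here only as an illustration; the article does not say which gates or synthesis methods the researchers use. With its third input held at 0, the Toffoli gate computes AND on its third output while passing both inputs through unchanged, and those pass-through lines play the role of the extra outputs that preserve information.

```python
from itertools import product
from typing import Dict, List, Tuple


def and_gate(a: int, b: int) -> int:
    """Conventional AND: two bits in, one bit out."""
    return a & b


def toffoli(a: int, b: int, c: int) -> Tuple[int, int, int]:
    """Toffoli gate: passes a and b through, flips c only when a and b are both 1."""
    return a, b, c ^ (a & b)


# Group the AND gate's inputs by the output they produce.
and_preimages: Dict[int, List[Tuple[int, int]]] = {}
for a, b in product((0, 1), repeat=2):
    and_preimages.setdefault(and_gate(a, b), []).append((a, b))

# Check that the Toffoli gate maps the eight 3-bit states one-to-one
# and that applying it twice restores the original state.
states = list(product((0, 1), repeat=3))
is_bijection = len({toffoli(*s) for s in states}) == len(states)
is_self_inverse = all(toffoli(*toffoli(*s)) == s for s in states)

print("AND output 0 comes from:", and_preimages[0])
print("AND output 1 comes from:", and_preimages[1])
print("Toffoli is one-to-one:", is_bijection)
print("Toffoli undoes itself:", is_self_inverse)
print("Toffoli(1, 1, 0) =", toffoli(1, 1, 0))
```

The cost is visible in the signature: three lines in and three lines out to compute a function that conventionally needs two inputs and one output.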
The lack of a lower limit does not matter if circuit designers cannot deal with the losses inherent in conventional electronic circuitry. This is where a technique first developed in the 1980s by Charles Seitz and colleagues at the California Institute of Technology (Caltech) can step in. They deduced that by carefully controlling the rate at which metal-oxide semiconductor (MOS) devices switch state, they could avoid much of the energy used simply being wasted as heat. The secret lies in altering the voltage only gradually as the device switches rather than trying to force a rapid transition, which results in high instantaneous resistance and, consequently, large losses through heating. Were the transitions slowed to be infinitely long, the switching operation would dissipate no heat other than that caused by leakage currents.
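The benefit of slowing the transition can be estimated with a first-order textbook model, which is an assumption here and not a description of the Caltech circuits. Charging a capacitance C abruptly from a fixed supply V dissipates C × V² / 2 in the resistance, whereas charging it with a linear ramp of duration T much longer than R × C dissipates roughly (R × C / T) × C × V². The component values below are hypothetical, and leakage is ignored.

```python
import math


def abrupt_switch_loss(c_farads: float, v_volts: float) -> float:
    """Energy lost as heat when C is charged abruptly from a fixed supply: C*V^2/2."""
    return 0.5 * c_farads * v_volts ** 2


def ramped_switch_loss(r_ohms: float, c_farads: float, v_volts: float,
                       ramp_seconds: float) -> float:
    """Approximate loss for a linear ramp lasting much longer than R*C: (R*C/T)*C*V^2."""
    return (r_ohms * c_farads / ramp_seconds) * c_farads * v_volts ** 2


# Hypothetical values chosen only for illustration.
R_OHMS = 1_000.0     # effective switch resistance
C_FARADS = 1e-15     # load capacitance (1 fF)
V_VOLTS = 0.8        # logic swing
RC_SECONDS = R_OHMS * C_FARADS

baseline = abrupt_switch_loss(C_FARADS, V_VOLTS)
loss_ratios = {}
for multiple in (10, 100, 1000):
    ramp = multiple * RC_SECONDS
    loss_ratios[multiple] = ramped_switch_loss(R_OHMS, C_FARADS, V_VOLTS, ramp) / baseline
    print(f"ramp = {multiple:>4} x RC -> loss is {loss_ratios[multiple]:.3f} of the abrupt case")
```

In this model the loss falls in proportion to the ramp time, so every tenfold reduction in dissipation costs a tenfold slower transition.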
To provide such gradual transitions, the Caltech team proposed using a collection of near-sinusoidal clock waveforms in place of conventional square waves, a scheme that is now classified by researchers in the field as quasi-adiabatic. The term is a nod to thermodynamic processes in which no heat is transferred when a gas expands or contracts.
The Caltech team also observed that the Boolean logic states, represented as stored charge in the circuitry, provide a cache of energy that simply goes to waste when the time comes to switch states again. In typical MOS logic, even today, once a state is discarded, all the charge used to represent it is directed to ground. Seitz and colleagues proposed using inductors to generate the undulating clocks, not just to supply charge for the capacitances within circuits that encode logic states, but also to recycle it when the capacitors discharge.
A major downside of quasi-adiabatic operation is that it needs to switch far more slowly and gently than conventional logic, which has limited its appeal to designers, though various research teams have developed variants that differ in silicon overhead and in their ability to support fully reversible or adiabatic operation. However, trends in silicon design have worked in adiabatic computing's favor.
The adiabatic switching rate is dominated by parasitic capacitances and resistances in the circuit. These have been reduced significantly through silicon scaling over the past three decades. CMOS logic speed, on the other hand, has not increased in more than 15 years because of its thermal issues. Even though nanometer-scale transistors can easily switch at more than 100GHz, they have had to be limited in logic circuits to less than 5GHz. By the time Snider's group at Notre Dame came to apply for a grant from the U.S. Air Force to design and build an experimental 16-bit processor using adiabatic principles, that gap had closed to less than 10x. Their device has a target operating frequency of 500MHz.
Frank argues that by embracing greater parallelism, designers can deliver improved aggregate performance by avoiding the thermal limits that lead to much of the silicon in today's processors having to remain dormant much of the time. "Even in today's technology, you can increase raw throughput in terms of switching events per second per unit area substantially with adiabatic switching due to the constraints [caused by heating at high power densities]," Frank says. "The advantages only increase further down the roadmap."
In the nearer term, the focus for designers aiming to exploit adiabatic principles still seems most likely to lie away from high-performance applications, in areas where switching frequency is far less of an issue. "The place where it probably has the biggest payoff lies in things where the energy supply is constrained, either energy supply or cooling," says Snider, adding that one possible target for a processor like the Notre Dame implementation lies in satellites. Another is in the interface circuitry for cryogenically cooled quantum computers, which themselves use reversible circuits. Conventional CMOS runs the risk of delivering unwanted heat to the sensitive qubit chambers. "The effect of any heat dissipation that happens at the qubit level gets multiplied by a factor of 1,000," Snider points out, adding that adiabatic logic can avoid this issue.
The benefits of reversibility and quasi-adiabatic charge recycling are not evenly distributed across circuits, which may limit the short-term advantages of their adoption. In work to evaluate the efficiency of random logic in contrast to the datapath-focused work of many earlier projects, Krishnan Rengarajan and coworkers at the Birla Institute of Technology and Science in Hyderabad, India, found the likely power-efficiency benefit for a full SoC could be somewhat lower than expected.
Their work, which used an adiabatic logic family with comparatively low area overhead, pointed to just a threefold saving in energy over CMOS, without taking into account the power consumed by the multiple clock-generation and recycling circuits it needs. "Relatively, adders and such pipelined or chain structures are easier to implement with adiabatic logic and can misleadingly show a high promise for adiabatic logic," Rengarajan says. Datapath circuitry, by contrast, can deliver 10x improvements in efficiency using today's transistors.
The remaining issue that presents an obstacle to more widespread adoption of adiabatic computing lies in design tools, which do not take into account the major changes in logic cells that it needs. Notre Dame's team could not use conventional electronic design automation (EDA) tools to design the circuitry for their processor. A key gap lay in logic synthesis, which forced the researchers to create and lay out the circuits more or less by hand. Designs based on post-CMOS devices that may suit reversible computing better will likely face the same issue.
"The EDA tool vendors aren't going to extend their flows to support adiabatic design until there's a major customer that's demanding it and willing to pay to help support the tool development. And without tool support, these techniques aren't very accessible to ordinary designers," says Frank, who calls for greater focus on the technology. "It's important to start now, with substantial industry investments, because it will take a while for the enabling technologies, including EDA tools, to be developed, and it will definitely be needed to deliver large improvements in efficiency needed for general digital applications."
Younis, S.G. and Knight, T.F.
Asymptotically Zero Energy Split-Level Charge Recovery Logic, Technical Report AITR-1500, MIT AI Laboratory (1994)
DeBenedictis, E., Frank, M., Ganesh, N., and Anderson, N.
A Path Toward Ultra-Low-Energy Computing, Proceedings of the 2016 IEEE International Conference on Rebooting Computing
Reversible Computing: The Design of an Adiabatic Microprocessor, Ph.D. Thesis, University of Notre Dame (2019)
Rengarajan, K.S., Mondal, S., and Kapre, R.
Challenges to Adopting Adiabatic Circuits for Systems-on-a-Chip, IET Circuits, Devices & Systems, Volume 15, Issue 6 (2021)
©2022 ACM 0001-0782/22/11