Title: A Case for Fine-Grain Many-Core Computational Arrays
Speaker: Prof. Bevan Baas
University of California, Davis
When: 10:00-11:30 am, October 12, 2011 (Wednesday)
Where: Room 401, Building of School of Microelectronics
Host: Prof. Yuzhuo Fu
Abstract:
Future fabrication technologies will provide ever-increasing numbers of available devices, but both system-level and chip-level power constraints will limit achievable throughput, which highlights the importance of energy-efficient design. Die device counts in the hundreds of millions and billions virtually guarantee that designs will have many processors, and interesting research questions concern how the available devices and wires are organized into hundreds or thousands of processing elements per die.
The Asynchronous Array of Simple Processors (AsAP) project explores a region of the core-granularity spectrum that is not well explored but is well-matched to DSP, multimedia, embedded, and other applications.
AsAP is composed of a large number of programmable reduced-complexity processors with per-processor digitally-tunable clock oscillators operating completely independently with respect to each other (GALS) and without PLLs, DLLs, or crystal oscillators. Oscillators fully halt when there is no work to do, and restart at full speed in less than one cycle after work becomes available.
The homogeneous array is well suited for deep submicron VLSI fabrication technologies: mapping tools place tasks onto the array while accommodating process variations and avoiding faulty processors, either to increase yield or for self-healing. Overall system throughput is actually increased in the presence of systematic variations by mapping critical tasks to the highest-performance regions of the processor array.
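The mapping idea can be illustrated with a minimal sketch. This is not the AsAP mapping tool. It is a hypothetical greedy assignment, and the function name, task names, criticality scores, and per-processor maximum frequencies are all invented for illustration. It skips processors marked faulty and gives the most critical tasks to the fastest remaining processors.

```python
def map_tasks(task_criticality, proc_max_mhz, faulty=frozenset()):
    """Greedy sketch: most critical task goes to the fastest working processor.

    task_criticality: dict of task name -> criticality score (hypothetical).
    proc_max_mhz: dict of processor (row, col) -> measured max clock in MHz
                  (hypothetical values standing in for process variation).
    faulty: set of processor coordinates to avoid.
    """
    # Keep only working processors, fastest first.
    usable = sorted(
        (p for p in proc_max_mhz if p not in faulty),
        key=lambda p: proc_max_mhz[p],
        reverse=True,
    )
    if len(task_criticality) > len(usable):
        raise ValueError("not enough working processors for all tasks")
    # Order tasks from most to least critical.
    ordered = sorted(task_criticality, key=task_criticality.get, reverse=True)
    # Pair them off one-to-one (one task per processor).
    return dict(zip(ordered, usable))


tasks = {"fft": 3, "viterbi": 5, "scale": 1}
procs = {(0, 0): 1180, (0, 1): 1230, (1, 0): 1090, (1, 1): 1205}
mapping = map_tasks(tasks, procs, faulty={(0, 1)})
print(mapping)
```

In this example the fastest processor, (0, 1), is marked faulty, so the most critical task lands on the next fastest one. A real mapper would also have to account for inter-processor communication distance, which this sketch ignores.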
A chip containing 36 programmable processors running at 610 MHz was fabricated in 0.18 um CMOS and is fully functional. [ISSCC06]
A second-generation 65 nm CMOS design contains 167 processors, including programmable processors that are able to individually and dynamically change their clock frequency and supply voltage (choosing among VddHi, VddLo, or disconnected). The chip is fully functional, with measurements showing the programmable processors operating up to 1.2 GHz at 1.3 V. At 1.2 V, they operate at 1.07 GHz and dissipate 47 mW when 100% active. At 0.675 V, they operate at 66 MHz and dissipate only 608 uW when 100% active. [SympVLSI08, JSSC09]
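The two operating points quoted above can be turned into energy per clock cycle by dividing power by frequency. The short sketch below does that arithmetic. The function name is hypothetical, and the calculation assumes the quoted power is the per-processor average at 100% activity.

```python
def energy_per_cycle_pj(power_w, freq_hz):
    """Energy per clock cycle in picojoules: E = P / f."""
    return power_w / freq_hz * 1e12


# Operating points quoted in the abstract (100% active).
high = energy_per_cycle_pj(47e-3, 1.07e9)   # 1.2 V: 47 mW at 1.07 GHz
low = energy_per_cycle_pj(608e-6, 66e6)     # 0.675 V: 608 uW at 66 MHz

print(f"1.2 V:   {high:.1f} pJ/cycle")
print(f"0.675 V: {low:.1f} pJ/cycle")
print(f"ratio:   {high / low:.1f}x")
```

Under that assumption, the low-voltage point spends roughly one fifth of the energy per cycle while running about 16 times slower. That is the trade-off per-processor voltage and frequency selection lets a mapper exploit for non-critical tasks.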
Recent work explores non-rectangular 2D processor meshes and shows significant improvements with novel processor shapes and inter-processor connection topologies.
Coded applications include several dozen DSP and general tasks, JPEG encoders, AES encryption engines, a full-rate 1080p 30fps HDTV residual encoder, a fully-compliant 802.11a/11g Wi-Fi wireless LAN baseband transmitter and receiver, a complete first-pass H.264 encoder, and a large portion of the mid- and back-end processing for a medical ultrasound unit. Power, throughput, and chip area results compare very well with solutions on existing programmable DSP processors. A simple C compiler and automatic mapping tool greatly simplify programming.
Biography
Bevan Baas received M.S. and Ph.D. degrees in electrical engineering from Stanford University in 1990 and 1999 respectively. After graduation, he joined Atheros Communications as the second full-time employee after the founders and served as a core member of the team which developed the first IEEE 802.11a (54 Mbps, 5 GHz) Wi-Fi solution. In 2003, he joined the Department of Electrical and Computer Engineering at the University of California, Davis where he is now an Associate Professor.
Dr. Baas’ research interests are in the algorithms, architectures, circuits, and VLSI for high-performance, energy-efficient, and area-efficient computation with strong consideration of the challenges and opportunities of future fabrication technologies. He is interested in both programmable and special-purpose processors with an emphasis on DSP, multimedia, embedded, and other workloads.
Dr. Baas was an NSF Fellow from 1990-93 and a NASA GSR Fellow from 1993-96. He received the National Science Foundation CAREER award in 2006, and the Most Promising Engineer/Scientist Award from AISES in 2006. Since 2007 he has been an Associate Editor for the IEEE Journal of Solid-State Circuits. He has served or is serving as: Program Committee Co-Chair of the IEEE HotChips Symposium on High-Performance Chips in 2011 and Program Committee member in 2009-10; Parallel Architecture Co-Chair of the 2011 Design Automation Conference (DAC) Workshop on Parallel Algorithms, Programming, and Architectures; Technical Program Committee member of the International Conference on Computer Design (ICCD) in 2004-05 and 2007-09; Technical Program Committee member of the IEEE International Symposium on Asynchronous Circuits and Systems in 2010; International Solid-State Circuits Conference (ISSCC) Student Research Preview Committee member in 2012; IEEE Micro Guest Editor in April 2012; and member of the Technical Advisory Board of an early-stage technology company.
http://www.ece.ucdavis.edu/~bbaas/