Managing Mil Tasks With Multicore Processors
Jun 10, 2011 3:19 PM
William Wong, Technology Editor
The growing number of low-power microprocessors with multiple cores is providing military designers with numerous options in terms of how the computer processing is segmented in their systems.
Military electronics applications such as phased-array radars and software-defined radios (SDRs) require a great deal of computer processing power, and circuit and system designs have long leaned heavily on multicore processors and field-programmable gate arrays (FPGAs) to supply it. Many of these applications are pushing processors into the supercomputing realm, also known as high-performance computing (HPC). One of the challenges is providing the processing power without increasing the input supply power, and companies such as Adapteva (www.adapteva.com) and BittWare (www.bittware.com) offer low-power, high-performance solutions that target this space.
In some cases, FPGAs can provide the processing power, but they also require added system-level development time, not to mention additional cost and power consumption. FPGAs lack the power management that microprocessors and application-specific integrated circuits (ASICs) can provide.
Multicore microprocessors are now receiving their share of attention for their increasing processing power and efficiency in many military systems. For example, microprocessors from Intel Corp. (www.intel.com) and Advanced Micro Devices (www.amd.com) may integrate as many as a dozen cores for processing flexibility. They can run in symmetrical multiprocessing (SMP) mode, and multiple chips can be combined into a nonuniform-memory-architecture (NUMA) configuration for larger systems. These microprocessors are also growing in power: Intel is rumored to have a 16-core Atom-based chip in the works.
At the other end of the programming spectrum, graphics processing units (GPUs) incorporate a large number of cores, can process large amounts of integer and floating-point data, and work best when running through large arrays. For example, Tesla boards from NVidia (www.nvidia.com) have 240 cores per chip (see “SIMT Architecture Delivers Double-Precision Teraflops,” on www.electronicdesign.com).
Programming frameworks like OpenCL and NVidia’s CUDA simplify GPU programming chores, but the greatest advantage occurs when the algorithms and the GPU architecture mesh well. Algorithms developed using tools such as MATLAB from The Mathworks (www.mathworks.com) can automatically target GPU platforms (for more details, see “Mathworks Matlab GPU Q and A,” on www.electronicdesign.com).
Part of the challenge is connecting a large number of cores together while staying within the limitations of the underlying semiconductor technology. Mesh networks are one popular approach. On the high side, a 1.5-GHz Tile-GX chip from Tilera (www.tilera.com) has 100 very-long-instruction-word (VLIW), 64-b cores (see “Single Chip Packs In 100 VLIW Cores,” on www.electronicdesign.com). This chip targets communication environments and includes numerous high-speed serial interfaces with Ethernet support. The chips can also be linked together to enlarge the mesh. Programmers can partition these devices into isolated SMP computing blocks with shared memory. The caching system uses the underlying mesh network, which programmers can effectively ignore.
This is where Adapteva’s new Epiphany multicore architecture (Fig. 1) comes into play. At a very high level, Epiphany looks like Tilera’s mesh: each processing core is paired with a communications node positioned at an intersection of the communications array. Data flow is similar. The processor hands data to its communications node, which routes it through peer nodes to the destination. Communication is still transparent, with the communications node handling transfers and performing all handshaking and error control.
The difference is on the processing side. The cores are 32-b processors that support single-precision floating point. Each has 32 kB of memory that is shared between code and data. There are no caches, which makes deterministic programming easier.
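The mesh data flow described above, in which a communications node forwards data through its peers until it reaches the destination, can be illustrated with a short sketch. The article does not say which routing algorithm either chip uses, so this example assumes simple dimension-order (XY) routing, a common choice for 2D meshes. The grid size matches the 4 x 4 array of the initial Epiphany chips, and the names `GRID`, `xy_route`, and `hop_count` are hypothetical.

```python
# Illustrative only: dimension-order (XY) routing on a 4 x 4 mesh.
# The routing policy is an assumption; the article does not specify one.

GRID = 4  # hypothetical constant: 4 x 4 array of core/router pairs


def xy_route(src, dst):
    """Return the (row, col) nodes visited from src to dst, inclusive."""
    for r, c in (src, dst):
        if not (0 <= r < GRID and 0 <= c < GRID):
            raise ValueError("node outside the mesh")
    row, col = src
    path = [(row, col)]
    # Move along the row until the destination column is reached.
    while col != dst[1]:
        col += 1 if dst[1] > col else -1
        path.append((row, col))
    # Then move along the column until the destination row is reached.
    while row != dst[0]:
        row += 1 if dst[0] > row else -1
        path.append((row, col))
    return path


def hop_count(src, dst):
    """Number of router-to-router hops on the XY route."""
    return len(xy_route(src, dst)) - 1


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 3)))
    print(hop_count((0, 0), (3, 3)))  # corner to corner on a 4 x 4 mesh
```

Under this assumption the longest trip on a 4 x 4 mesh is six hops, which is one reason nearest-neighbor traffic is attractive on mesh-connected cores.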
Core-to-core communication is explicit, as in IBM’s Cell processor (see “CELL Processor Gets Ready To Entertain The Masses,” on www.electronicdesign.com), which is found inside Sony’s PlayStation 3 (PS3). The Cell processor has eight Synergistic Processing Elements (SPEs), each with 256 kB of memory. They also use explicit data exchange, which puts communication and caching into software.
Adapteva’s Epiphany architecture lends itself to local computations, with results sent to an adjacent core for added processing. The system supports a zero-start-up message-passing system with DMA support for larger amounts of memory. The approach leads to a very compact node that occupies only 0.5 mm² in a 65-nm process. The initial 1-GHz chips have 16 cores in a 4 x 4 grid. The roadmap targets 64-, 256-, 1024-, and 4096-core chips in a 28-nm process.
The E16G301 chip can deliver 32 GFLOPS of computational power. The chip has a 64-word register file and 512 kB of memory, with a memory bandwidth of 32 GB/s per processor. There are two DMA channels per core. The communication link bandwidth is 128 GB/s full duplex. Four 8-GB/s external LVDS interfaces are oriented so chips can be efficiently laid out in a grid. These 8-b, full-duplex ports run at 500 MHz DDR. The 15 x 15 mm BGA package has 324 balls. The chip consumes only 2 W. Quantity pricing for the E16G301 starts at $499.
BittWare’s FMC (VITA 57) module (Fig. 2) packs four Epiphany-based Anemone chips on a single board. The chips are linked using the four high-speed ports. An FMC interface is easily incorporated into the system, since the links were designed to be terminated in an FPGA. VITA 57 sites are common on VPX (VITA 46/48/65) boards. They can also be found on AdvancedMC (AMC) boards. Multiple BittWare boards can be linked for larger arrays.
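The E16G301 figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only numbers from the article; the variable names are illustrative, and the link calculation is a raw signaling rate that ignores any protocol overhead. Reading the quoted 8 GB/s as the aggregate across all four ports and both directions is an interpretation, not something the article states.

```python
# Back-of-the-envelope checks using the E16G301 figures quoted in the article.
# Variable names are illustrative; link math ignores protocol overhead.

CORES = 16
CLOCK_HZ = 1_000_000_000          # 1-GHz cores
PEAK_FLOPS = 32_000_000_000       # 32 GFLOPS per chip
MEM_PER_CORE_KB = 32
CHIP_POWER_W = 2
LINK_WIDTH_BITS = 8               # 8-b ports
LINK_CLOCK_HZ = 500_000_000       # 500 MHz, double data rate
LINK_PORTS = 4

# Peak floating-point operations each core must retire per clock cycle.
flops_per_core_per_cycle = PEAK_FLOPS / (CORES * CLOCK_HZ)

# Total on-chip memory if every core carries 32 kB.
total_mem_kb = CORES * MEM_PER_CORE_KB

# Peak efficiency at the quoted power figure.
gflops_per_watt = PEAK_FLOPS / 1e9 / CHIP_POWER_W

# Raw rate of one port in one direction: width x clock x 2 (DDR), in bytes/s.
link_one_way_bytes = LINK_WIDTH_BITS * LINK_CLOCK_HZ * 2 // 8

# All four ports, both directions (assumed reading of the 8-GB/s figure).
link_aggregate_bytes = link_one_way_bytes * 2 * LINK_PORTS

if __name__ == "__main__":
    print(flops_per_core_per_cycle)       # 2.0
    print(total_mem_kb)                   # 512
    print(gflops_per_watt)                # 16.0
    print(link_one_way_bytes / 1e9)       # 1.0
    print(link_aggregate_bytes / 1e9)     # 8.0
```

Two operations per core per cycle would be consistent with, for example, a multiply and an add completing each clock, though the article does not say how the peak figure is reached. The 512 kB total matches 16 cores with 32 kB each.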