RF Design Magazine


Low power design techniques for high-speed programmable embedded Infosec systems
Sep 1, 2003 12:00 PM  By Todd Moore and Rick Schmalbach

Many of today's high-assurance embedded systems require highly capable, high-speed cryptographic solutions. Many applications, such as unmanned aerial vehicles, smart sensors, smart munitions, and handheld soldier radios, must sustain lengthy mission profiles using standard, commercial battery technology. Low power consumption is essential to the success of these missions because it maximizes the application's battery life. This paper describes various techniques that can be used to meet the requirements of these programmable Infosec systems.

FPGA power consumption

Programmable Infosec solutions typically have been composed of large FPGA-like programmable elements capable of implementing a variety of cryptographic algorithms. These programmable elements require complex structures that need large numbers of gates. Incorporation of these elements inflates the die size, increases cost, increases power, and reduces the maximum operating frequency.

The dynamic power consumption on FPGAs can be separated into three parts: datapath, synchronization and off-chip power.

  • Datapath corresponds to the combinational blocks and associated interconnection power.
  • Synchronization is the consumption by registers, clock lines and buffers.
  • Off-chip power is the fraction dissipated in the circuit output pads.

Knowledge of the relationship between these components for a given FPGA technology is fundamental in calculating an FPGA-based system's power consumption [1].

The power consumed by the datapath interconnection (the source of programmability) is the highest of the three parts and increases linearly with the input clock frequency. Various techniques, including pipelining and partitioning, can be applied to an FPGA design to reduce this datapath power consumption [1]. Although an FPGA-based design provides the highest degree of flexibility and programmability, it also consumes the most power.
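
The linear dependence on clock frequency follows from the standard first-order model of CMOS switching power. The article does not state this model explicitly, so the symbols below are the conventional textbook ones rather than anything taken from the original figures:

```latex
P_{\text{dyn}} = \alpha \, C_{\text{eff}} \, V_{DD}^{2} \, f_{\text{clk}}
```

Here \alpha is the switching activity factor, C_eff the effective switched capacitance, V_DD the supply voltage, and f_clk the clock frequency. With the other three terms held fixed, halving f_clk halves the dynamic power.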

System-on-chip designs

One alternative to an FPGA-based design is a system-on-chip (SoC) application specific integrated circuit (ASIC). The SoC ASIC provides the optimal mix between hardware and software, allowing functional components to be partitioned to provide the best mix of speed and power enhancements.

In particular, components that benefit from a hardware implementation are implemented in hardware accelerators and discrete logic. Software provides the necessary hardware initialization and configuration, but many time-intensive, power-hungry number-crunching operations are performed by the hardware.

Overall, the power budget of an SoC ASIC will be much lower than that of an FPGA-based design. The tradeoff is that programmability will be limited to the flexibility of the hardware accelerators. Lower power consumption (and, in some cases, higher speed) may be an acceptable compromise for many power-conscious applications. Additional hardware interfaces, as well as software functionality, will help offset any programmability concerns.

Power reduction techniques

Various techniques can be used to reduce the power consumption of SoC ASIC designs, including dynamic frequency control, dynamic power management and the ability to idle embedded processors.

An SoC ASIC's external reference clock and internal clock generator can be used to provide dynamic frequency control. The SoC ASIC's power consumption scales with the reference clock frequency: a lower reference clock frequency results in lower power consumption.

The reference clock is provided by the system (host) and can be scaled (externally) based on the intended mode of operation. An internal clock generator can also be used to scale system clock frequencies (and power consumption) dependent on the desired mode of operation. This internal clock generator will contain a PLL used for setting the internal clock rate.

The PLL logic contains three programmable dividers designated as reference, feedback, and output. The maximum and minimum values of the reference clock frequency input and the VCO output affect the phase jitter, which affects the ASIC's performance. Figure 1 shows a sample PLL-based variable clock-generator circuit.
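
As a rough illustration of how the three dividers set the internal clock rate, the sketch below assumes the common integer-N arrangement, in which the VCO locks at f_ref × feedback / reference and the output divider scales that down. The article's Figure 1 may differ in detail, and the function name, the VCO limits, and the example frequencies are hypothetical.

```python
from fractions import Fraction


def pll_output_hz(f_ref_hz, ref_div, fb_div, out_div, vco_min_hz, vco_max_hz):
    """Output frequency of an assumed integer-N PLL clock generator.

    All parameter names are illustrative; they are not taken from the article.
    """
    if min(ref_div, fb_div, out_div) < 1:
        raise ValueError("dividers must be positive integers")
    # At lock the phase detector sees f_ref / ref_div == f_vco / fb_div.
    f_vco = Fraction(f_ref_hz) * fb_div / ref_div
    # Reject settings that push the VCO outside its (assumed) operating range.
    if not (vco_min_hz <= f_vco <= vco_max_hz):
        raise ValueError("VCO frequency out of range")
    return float(f_vco / out_div)


# Hypothetical modes: same 10 MHz reference, different output divider.
VCO_MIN, VCO_MAX = 100_000_000, 400_000_000
full_speed = pll_output_hz(10_000_000, 2, 40, 4, VCO_MIN, VCO_MAX)
low_power = pll_output_hz(10_000_000, 2, 40, 16, VCO_MIN, VCO_MAX)
```

With these made-up settings the VCO runs at 200 MHz in both modes, and only the output divider changes, giving 50 MHz for the faster mode and 12.5 MHz for the low-power mode.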

Disabling, or turning off, the internal clock to unused or idle functional SoC ASIC sub-blocks decreases the amount of power consumed, because every piece of logic hardware (or gate) that is clocked consumes some amount of power. By applying the appropriate amount of dynamic clock control or power management, power consumption can be reduced significantly for a specific mode of operation.

Dynamic power management requires some degree of up-front planning and organization. The SoC ASIC needs to be divided into the appropriate functional blocks to ensure that the maximum benefit can be achieved by disabling a specific piece of the hardware design.

The SoC ASIC will need to contain the logic necessary to control power up, power down, and reset of individual function blocks. This may include a clock tree register that enables or disables the clock to a specific functional block.

Each functional block can be powered down by setting the appropriate power down bit in this register that disables the clock to that block. Each functional block can also be initialized to a known state by setting the reset bit. Dynamic power management is an internal SoC ASIC function controlled by external software.
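
A software view of such a clock tree register might look like the following minimal sketch. The block names, bit positions, and the placement of the reset bits in the upper byte are all assumptions made for illustration; the article does not give the actual register layout.

```python
from enum import IntEnum


class Block(IntEnum):
    """Hypothetical functional blocks; bit positions are illustrative only."""
    PARALLEL_ENGINES = 0
    SERIAL_ENGINES = 1
    PERMUTER_COMBINER = 2
    PROCESSOR_IF = 3


class ClockTreeRegister:
    """Model of a register with one power-down bit and one reset bit per block."""

    RESET_SHIFT = 8  # assumed: reset bits occupy the upper byte

    def __init__(self):
        self.value = 0  # all blocks clocked, none held in reset

    def power_down(self, block):
        # Setting the power-down bit disables the clock to that block.
        self.value |= 1 << block

    def power_up(self, block):
        self.value &= ~(1 << block)

    def is_clocked(self, block):
        return not (self.value >> block) & 1

    def pulse_reset(self, block):
        # Assert, then release, the reset bit to put the block in a known state.
        mask = 1 << (self.RESET_SHIFT + block)
        self.value |= mask
        self.value &= ~mask


reg = ClockTreeRegister()
reg.power_down(Block.SERIAL_ENGINES)   # a mode that uses only parallel engines
```

In a real device the write would go to a memory-mapped register rather than a Python attribute; the point here is only the one-bit-per-block bookkeeping.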

Some SoC ASIC designs contain an embedded processor. Software is written for this processor to perform the necessary configuration and control operations. Most modern embedded processors provide an instruction that places the processor into an idle, or sleep, state. Once the processor enters this state, only an external stimulus (such as a defined interrupt) can wake it up.

The processor consumes very little power while in the idle state. This is a benefit for SoC ASIC designs in which a particular function requires little or no software intervention.

Once the SoC ASIC has been configured, the processor can idle itself and only be utilized during specific times (such as initialization or mode change). A complete up-front system design and hardware/software partitioning is required to reap the maximum benefits of processor idling.
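
The benefit of idling can be estimated with a simple duty-cycle average. The sketch below is back-of-the-envelope arithmetic only: the function name is hypothetical and the power figures are illustrative placeholders, not measurements from any device discussed in this article.

```python
def average_power_mw(active_mw, idle_mw, active_fraction):
    """Time-weighted average power of a processor that idles between events."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_fraction * active_mw + (1.0 - active_fraction) * idle_mw


# Illustrative numbers: 100 mW when running, 2 mW when idle,
# awake 5% of the time for initialization and mode changes.
estimate = average_power_mw(100.0, 2.0, 0.05)  # about 6.9 mW
```

The smaller the fraction of time that software must intervene, the closer the average falls to the idle figure, which is why the hardware/software partitioning has to be settled up front.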

These techniques can be generically applied to SoC ASIC designs, but may not be adequate in themselves to meet programmable Infosec device requirements. The SoC ASIC approach may need to be extended to programmable Infosec devices by dividing the composite set of cryptographic algorithms into classes and applying optimal hardware/software tradeoffs.

Infosec considerations

An SoC ASIC approach can be applied to today's cryptographic algorithms to meet Infosec device requirements. Reviewing the requirements of the various cryptographic algorithms makes it possible to get the maximum benefit from SoC ASIC technology while keeping power consumption low.

Cryptographic algorithm functionality can be broken down into two classes: parallel and serial. Other unique cryptographic functions, including pattern detection, shift registers, multiplication, randomization, hash, permutation and combining, can be optimized in hardware.

The appropriate segmentation of each cryptographic algorithm will ensure that the optimal engine interface can be designed, thus reducing the amount of required software intervention. Optimizing interfaces to a specific cryptographic engine will reduce the time needed to perform various types of bit manipulation. For example, data conversion from a serial to a parallel interface is both time and power consuming. Cryptographic engine segmentation will produce the maximum clock and power control.
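
To see why serial-to-parallel conversion is costly when left to software, consider the minimal sketch below. It is not taken from any particular engine; the function name, the most-significant-bit-first ordering, and the 8-bit word width are assumptions. Note that it needs a shift, an OR, and a counter check for every single bit.

```python
def serial_to_parallel(bits, width=8):
    """Pack a stream of 0/1 values into width-bit words, MSB first.

    A trailing group shorter than `width` is dropped.
    """
    words = []
    word = 0
    for count, bit in enumerate(bits, start=1):
        word = (word << 1) | (bit & 1)  # one shift and one OR per input bit
        if count % width == 0:
            words.append(word)
            word = 0
    return words


packed = serial_to_parallel([1, 0, 1, 0, 0, 1, 0, 1])  # one byte, 0xA5
```

An engine interface that already matches the data format avoids this per-bit work entirely, which is the motivation for segmenting engines by interface type.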

Cryptographic engines can be grouped into two classes: parallel and serial. Parallel engines are designed to interface directly with first-in-first-out (FIFO) buffers to provide high-speed data throughput. Serial engines interface to serial channels using a selector or router with various serial transceivers. In general, legacy algorithms relied on serial engines, whereas newer algorithms use parallel engines.

Interactions between various cryptographic engines of the same class should utilize a common bus with a common controller. Each class of cryptographic algorithm should have its own bus controller. This architecture allows various cryptographic engine classes to be disabled (not clocked) for minimum power consumption.
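
The per-class gating decision can be sketched as a lookup from the engines a mode needs to the class buses that must stay clocked. The engine names and the mapping below are hypothetical placeholders; only the two class names come from the article.

```python
# Hypothetical engine-to-class assignment, for illustration only.
ENGINE_CLASS = {
    "parallel_engine_0": "parallel",
    "parallel_engine_1": "parallel",
    "serial_engine_0": "serial",
}

CLASSES = ("parallel", "serial")


def clock_enables(active_engines):
    """Return {class_name: bool} saying which class buses must stay clocked."""
    needed = {ENGINE_CLASS[name] for name in active_engines}
    return {cls: cls in needed for cls in CLASSES}


# A mode that uses only a parallel engine leaves the serial bus unclocked.
enables = clock_enables(["parallel_engine_0"])
```

Because each class sits behind its own bus controller, a single enable per class is enough to stop the clock to every engine in that class.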

Hardware functions required by both the parallel and serial engines should also reside on their own unique bus structure. A particular cryptographic engine class may only need some hardware functions. In this case, the hardware function should be optimized for that cryptographic engine. For example, a permuter/combiner function is only required for serial cryptographic engines and should be optimized for their use.

The parallel and serial engines may also provide connectivity to a processor so that the processor can manipulate a particular data or traffic stream, for example to generate or decode cryptographic preambles or packet header information. This type of processor intervention should be limited, as there is a direct correlation between processor intervention and higher power consumption.

Cryptographic ASICs

Figure 2 shows an example of a partitioned cryptographic engine SoC ASIC: a Harris Corp. SoC ASIC (referred to as the Raven ASIC) for the Sierra II programmable, embeddable cryptographic Infosec module.

The Raven ASIC contains redundant RISC processors and partitioned serial and parallel cryptographic hardware accelerators. The processors perform the overall control functions, command decode processing and data flow to and from the peripherals and input/output ports.

The ASIC was designed with power consumption in mind. Through the proper classification of the cryptographic engines and basic power management design techniques, the Raven ASIC provides lower power consumption in many applications. The ASIC's clock rate can be dynamically varied to the minimum necessary rate for a specific application (function). The ASIC was designed (partitioned) such that the clock can be turned off to circuit blocks that are not in use. The RISC processors can be disabled when not in use.

Conclusion

Through the proper partitioning and classification of hardware and software requirements, optimal SoC ASICs can be designed and developed. Dynamic frequency and clock control, processor idling and functional grouping are common techniques to provide low power consumption for SoC ASIC designs.

References

  1. E. Boemo, “Some Notes on Power Management on FPGA-based Systems,” Lecture Notes in Computer Science, No. 975, pp. 149-157 (Berlin: Springer-Verlag 1995).

  2. N. Sklavos, “Low-Power Implementation of an Encryption/Decryption System with Asynchronous Techniques,” VLSI Design, vol. 15, Issue 1, 2000, pp. 455-468.

  3. R. Schmalbach, “Specifications for Sierra II ASIC,” 12016-0317, rev 5.1, February 2003.

ABOUT THE AUTHORS

Todd Moore is senior engineering manager of Harris Corp.'s (www.harris.com) Annapolis Junction, Maryland, engineering facility. He is responsible for Harris' Sierra II product development. Moore received his B.S.E.E. degree from Cornell University and M.S.E.E. degree from the Rochester Institute of Technology. He can be reached at tmoore@harris.com.

Rick Schmalbach has been involved in the design and development of Infosec systems for over 20 years, primarily in the areas of link encryption and tactical communications. He is currently leading the system design and certification effort on the Sierra II programmable cryptologic module for Harris Corp., and is involved in developing Sierra II applications.


