Asif Iqbal, SDM '11
Fig 1. Evolution of cyber-security threats over time[ii]
So, what's the next big thing in cyber security—the ultimate level of sophistication, the unthinkable destructive impact, and the crack in the backbone? The following short excerpt from an article in IEEE Spectrum[iii] builds context for the discussions to follow.
September 2007—Israeli jets bombed a suspected nuclear installation in northeastern Syria. Among the many mysteries still surrounding that strike was the failure of Syrian radar, supposedly state of the art, to warn the Syrian military of the incoming assault. It wasn't long before military and technology bloggers concluded that this was an incident of electronic warfare and not just any kind. Post after post speculated that the commercial off-the-shelf microprocessors in the Syrian radar might have been purposely fabricated with a hidden "back door" inside. By sending a preprogrammed code to those chips, an unknown antagonist had disrupted the chips' function and temporarily blocked the radar.
The above example was a case of an infected integrated circuit (IC) leaking information, a Type II attack that will be discussed later. If we think of the case mentioned above, the damage was the leak of information. However, thinking this through more deeply, it could have easily been a "kill switch" (Type III attack) with the potential to detonate the missile in the carrier jet or a Type IV attack capable of changing the target's location. This is an infection at the most fundamental level, difficult to detect, incurable, and potentially destructive not only to finance and global resources, but also to human life.
Recently there have been numerous media reports that confirm this. For years, fake and infected ICs have been deeply infiltrating military warfare systems. With embedded smart processors handling data of increasing value, such as consumer banking credentials, security of other critical infrastructures is at risk. There are additional case studies noted in the appendix.
In response to this threat, hardware security has started to emerge as an important research topic. In the current literature, the agent for malicious tampering is referred to as a hardware Trojan horse (HTH). An HTH causes an integrated circuit to malfunction or to perform additional malicious functions along with the intended one(s). Conventional design-time verification and post-manufacturing testing cannot readily be extended to detect HTHs because of their stealthy nature, the inordinately large number of possible instances, and the large variety of structures and operating modes.
An HTH can be designed to disable or destroy a system at some future time, or to leak confidential information and secret keys covertly to the adversary[iv]. Trojans can be implemented as hardware modifications to microprocessors, digital signal processors (DSP), application-specific ICs (ASIC) and commercial off-the-shelf (COTS) parts. They can also be implemented as FPGA bit streams[v].
This paper borrows theoretical concepts and design examples from current research literature and my prior experience in circuit design. To build a theoretical context, I will start with the definition of hardware security and explain the intent of a secure hardware design. Building on this concept, I will expose threats posed by HTHs and methods for detecting them. Types of attacks with associated agents will be discussed. In the latter half of this paper, taxonomy is also presented along with design examples for a few classes.
What Is Hardware Security?
In abstract terms, the word "security" can be used to cover several very different underlying features of a design. Every system design will require a different set of security properties, depending on the type and value of the assets or the resource worth protecting; security is about trying to defend against malicious attack.
A property of the system that ensures that resources of value cannot be copied, damaged, or made unavailable to genuine users.
The fundamental security properties on which nearly every higher-level property can be based are those of confidentiality and integrity.
Confidentiality
An asset that is confidential cannot be copied or stolen by a defined set of attacks. This property is essential for assets such as passwords and cryptographic keys.
Integrity
An asset that has its integrity assured is defended against modification by a known set of attacks. This property is essential for some of the on-chip root secrets (keys, encryption algorithms) on which the rest of the system's security is based.
Authenticity
In some circumstances, a design cannot provide integrity and instead provides the property of authenticity. In this case, an attacker can change the value of the asset, but the defender will be able to detect the change (by verifying authenticity) before the chip function is compromised. In some implementations, the chip may cease to function in the event of tampering.
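The authenticity property can be illustrated with a minimal software sketch. It assumes an HMAC-SHA256 tag stands in for whatever on-chip mechanism a real design would use; the key, the asset contents, and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical device secret; in real hardware this would be an on-chip root key.
DEVICE_KEY = b"example-root-key"


def tag_asset(asset: bytes, key: bytes = DEVICE_KEY) -> bytes:
    """Compute an authentication tag over the asset (HMAC-SHA256)."""
    return hmac.new(key, asset, hashlib.sha256).digest()


def is_authentic(asset: bytes, tag: bytes, key: bytes = DEVICE_KEY) -> bool:
    """Return True only if the asset still matches its stored tag."""
    return hmac.compare_digest(tag_asset(asset, key), tag)


firmware = b"boot configuration v1"
stored_tag = tag_asset(firmware)          # recorded while the asset is trusted

tampered = b"boot configuration v2"       # attacker changes the asset
print(is_authentic(firmware, stored_tag))  # unmodified asset is accepted
print(is_authentic(tampered, stored_tag))  # modification is detected
```

The attacker can still change the asset, which is why this is authenticity rather than integrity: the change is detected, not prevented.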
Types of Attacks
IC security issues are mainly attributed or at least traced back to the physical security of the design or manufacturing facilities. Different mechanisms for performing attacks are broken down into four classes: hack attacks, shack attacks, lab attacks, and fab attacks.
Hack Attack
A hack attack is one where the hacker is only capable of executing a software attack. Examples include viruses and malware, which are downloaded to the device via a physical or a wireless connection. In many cases of a successful hack attack, the device user inadvertently approves the installation of the software, which then executes the attack. This is either because the malware pretends to be a piece of software that the user actually wants to install or because the user does not understand the warning messages displayed by the operating environment.
Shack Attack
A shack attack is a low-budget hardware attack using equipment that could be bought from a store like Radio Shack. In this scenario, attackers have physical access to the device, but not enough equipment or expertise to attack within the integrated circuit packages. They can use logic probes and network analyzers to snoop bus lines, pins, and system signals. They may be able to perform simple active hardware attacks, such as forcing pins and bus lines to be at a high or low voltage, reprogramming memory devices, or replacing hardware components with malicious alternatives. Some of the existing IC testability features, such as JTAG debug, boundary scan I/O, and BIST (built-in self-test) facilities, can be used to hack a chip's functional state.
Lab Attack
The lab attack is more comprehensive and invasive. If attackers have access to laboratory equipment, such as electron microscopes, they can perform unlimited reverse engineering of the device. It must be assumed that attackers can reverse engineer transistor-level detail for any sensitive part of the design, including logic and memory. Attackers can reverse engineer a design, attach microscopic logic probes to silicon metal layers, and introduce glitches into a running circuit using lasers or other techniques. They can also monitor analog signals, such as device power usage and electromagnetic emissions, to perform attacks such as cryptographic key analysis.
Fab Attack
A fab attack is the lowest level of attack, wherein malicious logic is inserted into the netlist or layout of an integrated circuit in the foundry or fabrication plant. Circuitry fabricated in the chip cannot be easily detected by chip validation.
Trust in Integrated Circuits
Security in integrated circuit design and manufacture is the final line of defense for securing hardware systems. Because of the semiconductor industry's fabless business model, third-party IP reuse, and untrusted manufacturing, ICs are becoming increasingly vulnerable to malicious activities and alterations.[vi] [vii] These concerns have caused the Defense Advanced Research Projects Agency (DARPA) to initiate the Trust in ICs program.[viii]
An IC product development process contains three major steps and agents: design, fabrication, and test and validation. These steps are pictorially represented below along with their trust levels. An untrusted agent is a potential source of infection. IC security is more of a physical security issue, which can be held in check by tight control and vertical integration over the complete manufacturing process.
Fig 2. Trusted and untrusted components of design and manufacturing chain
Specification
Design starts with specifications wherein alterations can be made to modify the functions and protocols or design constraints. This is considered to be a trusted component and insider attack is very unlikely. From my research to date, no cases have been reported; however, the possibility cannot be negated.
Third-party IPs and Libraries
Due to the ever-increasing complexity of designs and time-to-market constraints, high reuse is prevalent in the IC industry. This includes third-party soft/firm/hard IP blocks, models, and standard cells used by the designer during the design process and by the foundry during the post-design processes. These third-party IPs and libraries are considered untrusted.
CAD Tools
Cadence, Mentor Graphics, Magma, and Synopsys provide the industry-standard CAD tools for design. These tools are considered trusted. However, from my personal experience and interviews, design engineers have been using untrusted third-party TCL[ix] scripts (open source or proprietary) on trusted CAD software for design automation even in big design houses.
Fabrication
Fabrication involves preparing masks and wafers, which is an integrated manufacturing process of oxidation, diffusion, ion implantation, chemical vapor deposition, metallization, and lithography. In the present context, with fabrication being outsourced to third-party foundries, trust is in question. The adversary could change the parameters of the manufacturing process, geometries of the mask, or even embed a malicious circuit at the mask layout level. The mask information is contained in an electronic file format called GDS. Entire mask sets may be replaced by replacing the GDS, and the adversary could substitute a compromised Trojan IC mask for the genuine one.[x]
Manufacturing Test
In the testing phase, test vectors are applied to the inputs of the manufactured IC, and output ports are monitored for expected behavior. Generally, automated test equipment fails to detect a Trojan. Worse, test vectors or automated test equipment can be constructed to mask Trojans. Hence testing would be considered trusted only if it is done in the production test center of the client (semiconductor company or government agency).
Fig.3. Vulnerable steps of modern IC life cycle [Source: R.S. Chakraborty et al.]
Design Abstraction Levels
Trojan circuits can be embedded at various hardware abstraction levels. As we move to a lower abstraction level, the level of sophistication required increases, i.e., it is more difficult to embed a desired malicious functionality into lower levels of abstraction, as compared to higher levels.
The netlist or the gate level of a design is considered to be secure and must not be tampered with by hand. It is interesting to note that changes are made directly in the netlist or gate level at late design stages for legitimate purposes. An experienced engineer can insert a malicious circuit directly in the gate level.
The different levels of abstraction at which design is done and a Trojan may be inserted are listed below.
- At the system level in different hardware modules and interconnection and communication protocols. This requires a low level of sophistication.
- At the register transfer level (RTL), a Trojan can be inserted by coding its behavioral description along with the intended functionality of the chip. This is difficult in terms of physical access, but low in complexity of attack.
- At the gate level a hacker can carefully control all aspects of the inserted Trojan, including size and location. Physical access is difficult and the hack is complicated.
- At the transistor level, hacks are related to changing circuit parameters to compromise the reliability of the chip and cause ultimate mission mode failure. This is a very sophisticated attack, still in the trusted zone with difficult physical access.
- At the layout level, hacks are related to foundry attacks and physical access is easier because of the untrusted zone. However, this hack has the highest level of sophistication.
Ensuring Authenticity
There are two main options to ensure that a chip used by a client is authentic, meaning it performs only those functions originally intended and nothing more. They are:
- Make the entire fabrication process trusted.
- Verify the trustworthiness of manufactured chips upon return to the clients.
Deep Dive into Hardware Trojans
Hardware Trojans are modifications to original circuitry that are inserted by adversaries who have the malicious intent of using hardware or hardware mechanisms to gain access to data or software running on the chips. The example in Figure 4 shows cryptographic hardware with the output bypassed with a simple multiplexer. When the select line is high, the unencrypted input is sent to the output. The multiplexer is the Trojan here, which when activated by a trigger alters the intended functionality and sends the unencrypted data to the adversary.
Fig. 4. A simple Trojan [Source: J Rajendran et al.]
An interesting point to note here is that bypass structures like the one in Figure 4 are used routinely in design for debug and design for testability (DFT).[xii] It is very difficult to distinguish such modifications and detect this type of Trojan, which may be disguised as a normal debug function. There are many other characteristics of a hardware Trojan, such as small area and rare trigger, which make it difficult to detect. Hardware Trojan detection is still a fairly new research area, but it has gained significant traction in the past few years.
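The bypass Trojan of Figure 4 can be modeled behaviorally in a few lines. This is only a sketch: the XOR "cipher" is a toy stand-in for the cryptographic block, and the function names and values are hypothetical.

```python
def encrypt(byte: int, key: int) -> int:
    """Toy stand-in for the crypto block: XOR with a key byte (not real encryption)."""
    return (byte ^ key) & 0xFF


def crypto_with_trojan(byte: int, key: int, select: int) -> int:
    """Output multiplexer: select=0 passes ciphertext, select=1 bypasses the cipher."""
    ciphertext = encrypt(byte, key)
    return byte if select else ciphertext


plaintext, key = 0x41, 0x5A
normal = crypto_with_trojan(plaintext, key, select=0)  # intended behavior
leaked = crypto_with_trojan(plaintext, key, select=1)  # Trojan triggered

print(hex(normal))  # ciphertext leaves the chip
print(hex(leaked))  # plaintext leaves the chip unchanged
```

The same two-input multiplexer would look identical to a legitimate debug bypass, which is why inspecting the structure alone does not reveal intent.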
Difficulty of Detection
Detection of malicious alterations is extremely difficult, for several reasons.
- Reuse. There is a great deal of third-party soft or hard intellectual property (IP) integration in ICs to accelerate the time to market. The IPs are getting increasingly small and detecting a small malicious alteration in a third-party IP is extremely difficult.
- Small Size. Small, submicron, IC feature sizes make detection by physical inspection and destructive reverse engineering very difficult and costly. Moreover, destructive reverse engineering does not guarantee a comprehensive test, especially when Trojans are dispersed throughout the entire chip.
- Low Activation Probability. Trojan circuits, by design, are activated under very specific low-probability conditions, such as sensing a specific low-frequency toggling design signal or such analog parameters as power or temperature. This makes them unlikely to be activated and detected using random or functional stimuli during limited test times, but more easily triggered during the mission mode.
- Insufficient Manufacturing Tests. Tests of manufacturing faults, such as stuck-at and delay faults, cannot guarantee detection of Trojans. Such tests are limited by test times, which are typically a few milliseconds per chip. Within this time frame, they cannot activate and detect Trojans. Even when 100 percent fault coverage for all types of manufacturing faults is possible, there are no guarantees as far as Trojans are concerned, since all functional use cases and state vectors are not exercised.
- Decreasing Physical Geometry. Devices are getting smaller each day because of improvements in lithography. As physical feature sizes decrease, process (PVT) and environmental variations have a greater impact on the integrity of the circuit parameters (voltages, current, power, and I/O delay). This makes parametric detection of Trojans using simple measurement of signals ineffective.
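The low-activation-probability argument can be made concrete with a short calculation. The sketch below assumes a simplified model that is not taken from the cited literature: the trigger fires only when a given number of monitored signals all take one specific value, and each test vector sets those signals independently and uniformly at random.

```python
def activation_probability(trigger_bits: int, num_vectors: int) -> float:
    """Chance that at least one of num_vectors uniformly random test vectors
    sets all trigger_bits monitored signals to the required values."""
    p_single = 2.0 ** -trigger_bits          # one vector hits the rare condition
    return 1.0 - (1.0 - p_single) ** num_vectors


# A million random vectors against triggers of increasing width.
for bits in (8, 16, 32):
    print(bits, activation_probability(bits, num_vectors=1_000_000))
```

Under these assumptions an 8-bit trigger is almost certain to fire during test, while a 32-bit trigger is very unlikely to fire even once, which matches the point that limited test time cannot be relied on to activate a well-hidden Trojan.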
Taxonomy of Trojans
Wang, Tehranipoor, and Plusquellic[xiii] developed a detailed taxonomy for hardware Trojans. Wang et al. suggest three main categories of Trojans according to their physical, activation, and action characteristics. Although Trojans could be hybrids of this classification (for instance, they could have more than one activation characteristic), this taxonomy captures the elemental characteristics of Trojans and is useful for defining and evaluating the capabilities of various detection strategies.
Fig. 5. Detailed taxonomy of hardware Trojans [Source: Wang et al.][xiii]
Physical Characteristics
The physical category describes the various hardware manifestations of Trojans. The type category partitions Trojans into functional and parametric classes. The functional class includes Trojans that are physically realized through the addition or deletion of transistors or gates, whereas the parametric class refers to Trojans that are realized through modifications of existing wires and logic.
The size category accounts for the number of components in the chip that have been added, deleted, or compromised. The distribution category describes the location of the Trojan in the chip's physical layout. The structure category refers to the case when an adversary is forced to regenerate the layout to insert a Trojan, which could then cause the chip's physical form to change. Such changes could result in different placement for some or all design components. Any malicious changes in physical layout that could change the chip's delay and power characteristics would facilitate Trojan detection.
Trigger Characteristics
Trojans can also be classified based on their activation or trigger characteristics. A Trojan consists of a trigger and a payload. The trigger function causes the payload to be active and carry out its malicious function. Once activated, the Trojan may continue to be in an activated state or return to its base state (one-shot activation). These triggers are further divided into two categories, externally triggered and internally triggered.
Externally triggered Trojans require external inputs to act. The external trigger can be an adversary input or a legitimate user input or even a lab component's output. User input triggers may include push buttons, switches, keyboards, or keywords/phrases in the input data stream. An external component trigger could be a signal that is received by an antenna or sensor and triggers a payload inside the circuit. The activation condition could be based on the output of a sensor that monitors temperature, voltage, or any type of external environmental condition (such as electromagnetic interference, humidity, or altitude).
An internally triggered Trojan is activated by an event that occurs within the target device. The event may be either time–based or physical condition–based. Common methods include hardware counters, which can trigger the Trojan at a predetermined time. These are also called time bombs. Triggering circuitry may monitor physical parameters such as temperature and power consumption of the target device. When these parameters reach a predetermined value, they trigger the Trojan. The Trojan in this case is implemented by adding logic gates and/or flip-flops to the chip, and hence is represented as a combinational or sequential circuit. Action characteristics identify the types of disruptive behavior introduced by the Trojan.
"Always On" Trigger
The "always on" trigger keeps the Trojan active, continuously deteriorating the chip's performance. This trigger can disrupt the chip's normal reliability and function at any time. This subclass covers Trojans that are implemented by modifying the chip's geometries such that certain nodes or paths have a higher susceptibility to failure.
Another classification of Trojans based on triggers is given by Chakraborty et al.[xiv] In this classification, trigger mechanisms are of two types: digital and analog.
Fig. 6. Classification of triggers based on digital/analog mechanisms[xiv]
Analog-triggered Trojans are based on detection methods of chip power or current levels. Digital-triggered Trojans can again be classified into combinational and sequential types. A combinational trigger is a logic function of internal circuit state variables. Typically, an attacker would choose a rare activation condition so that it is very unlikely for the Trojan to trigger during a conventional manufacturing test. On the other hand, sequentially triggered Trojans are activated by the occurrence of a sequence, or a period of continuous operation. The simplest sequential Trojan triggers are synchronous stand-alone counters, which trigger a malfunction on reaching a particular count. In general, detecting sequential Trojans is more difficult because the activation probability is lower due to the content and timing variables. Additionally, the number of such sequential trigger conditions for arbitrary Trojan instances can be insurmountably large for a deterministic logic testing approach, making testing and detection impractical.
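The two digital trigger styles can be sketched behaviorally. The code below is an illustrative model only: the 16-bit pattern, the count target, and the class and function names are hypothetical, not taken from Chakraborty et al.

```python
def combinational_trigger(state: int, pattern: int = 0xDEAD) -> bool:
    """Fires only when a 16-bit internal state equals one rare pattern."""
    return (state & 0xFFFF) == pattern


class CounterTrigger:
    """Synchronous stand-alone counter that fires on reaching a target count."""

    def __init__(self, target: int):
        self.target = target
        self.count = 0

    def clock(self) -> bool:
        """Advance one clock cycle; return True once the count has reached target."""
        if self.count < self.target:
            self.count += 1
        return self.count >= self.target


# Combinational: one specific state out of 65,536 activates the Trojan.
print(combinational_trigger(0xDEAD), combinational_trigger(0xDEAE))

# Sequential: the Trojan stays dormant until enough cycles have elapsed.
trigger = CounterTrigger(target=1000)
fired_at = next(cycle for cycle in range(1, 2001) if trigger.clock())
print(fired_at)
```

A test that runs for fewer cycles than the counter target never observes the sequential trigger, regardless of which input vectors are applied.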
Fig. 7. Example of Trojans with trigger mechanisms [Source: R.S. Chakraborty et al.][xiv]
The payload consists of the circuitry that carries out the Trojan's intended malicious function. Payloads can be characterized by the severity of their effect. A Trojan can change the function of the target device and can cause errors that may be difficult to detect in testing but are detrimental in mission mode. Another class of Trojans can change specifications by changing device parameters. They may change the reliability, functional, or parametric specifications (such as power and delay). Trojans can also leak sensitive information through a secret or already existing channel. Information can be leaked by radio frequency, optical and thermal means, and via interfaces such as RS-232 and JTAG. Trojans can also be designed to create backdoor access to assist in software-based attacks like privilege escalation and password theft. Trojans can hog chip resources, including bandwidth, computation, and battery power, causing the chip to malfunction, emulating a denial of service. Some Trojans may physically destroy, disable, or alter the configuration of the device (kill switches).
Another way to categorize Trojans is based on the type of circuitry: digital and analog. Digital Trojans can either affect the logic values at chosen internal nodes, or can modify the contents of memory locations. Analog payload Trojans, on the other hand, affect circuit parameters, such as performance, power, and noise margin. Another form of analog payload would be generation of excess activity in the circuit and accelerating the aging process of an IC and shortening its lifespan. All this happens without affecting the IC functionality.
Current Trojan Detection Methods
Detection of Trojans is extremely difficult for the reasons discussed in the previous sections. It is an important area of research that has led to the development of some Trojan detection methods over the past few years. These are categorized mainly as chip-level solutions and architectural-level Trojan detection solutions.
Chip-level Methods
Power and Current Measurement
Trojans typically change a design's parametric characteristic by, for example, hampering performance, increasing or decreasing power, or causing reliability problems in the chip. Measuring current and voltage can provide information about the internal structure and activities within the IC, enabling detection of Trojans without fully activating them.
A weakness of such methods is that a Trojan can draw only a very small amount of current and that it could be submerged below the noise floor and process variation effects, thus making it undetectable by conventional measurement equipment. However, Trojan detection capability can be greatly enhanced by measuring current locally and from multiple power ports or pads, switching off certain sections of the chip, and thus increasing the small differential of voltage or current with respect to the normal operating parameters.
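A rough calculation shows why local measurement helps. The sketch assumes, purely for illustration, that the chip's current is split evenly across separately measured power ports and that the Trojan sits entirely within one of them; all current values are hypothetical.

```python
def relative_trojan_signal(trojan_ua: float, region_ua: float) -> float:
    """Fraction of the measured current that is contributed by the Trojan."""
    return trojan_ua / (region_ua + trojan_ua)


chip_current_ua = 100_000.0   # hypothetical whole-chip supply current
trojan_current_ua = 5.0       # hypothetical Trojan draw
num_regions = 16              # hypothetical number of separately measured power ports

whole_chip = relative_trojan_signal(trojan_current_ua, chip_current_ua)
one_region = relative_trojan_signal(trojan_current_ua, chip_current_ua / num_regions)

print(whole_chip)               # Trojan share of a single whole-chip measurement
print(one_region)               # Trojan share when only its region is measured
print(one_region / whole_chip)  # improvement from measuring locally
```

Under these assumptions the Trojan's share of the measured current grows roughly in proportion to the number of regions, which makes it easier to lift above noise and process variation.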
In timing-based methods, Trojans can be detected by measuring the delays between a circuit's inputs and outputs. Trojans can be detected when one or a group of path delays are extended beyond the threshold determined by the process variations level.
Many different samples from a process lot are checked under the same test patterns and compared. An outlier is suspected of Trojan infection. This method uses statistical analysis to deal with process variations. However, it is not suitable for today's complex circuits, which contain millions of paths between inputs and outputs. Measuring all these paths, especially the short ones, is not easy.
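The outlier comparison across a process lot can be sketched as follows. The median and median-absolute-deviation rule used here is a generic robust-statistics technique chosen for illustration, not the analysis used in the cited work; the chip names, delay values, and the multiplier `k` are hypothetical.

```python
from statistics import median


def find_delay_outliers(delays_ns: dict[str, float], k: float = 5.0) -> list[str]:
    """Flag chips whose path delay lies more than k median-absolute-deviations
    from the lot median for the same test pattern."""
    values = list(delays_ns.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    threshold = k * max(mad, 1e-9)   # guard against a zero spread
    return [chip for chip, d in delays_ns.items() if abs(d - center) > threshold]


# One path delay measured on eight chips from the same lot (hypothetical data).
lot = {
    "chip_a": 10.02, "chip_b": 9.98, "chip_c": 10.05, "chip_d": 9.95,
    "chip_e": 10.01, "chip_f": 10.00, "chip_g": 9.97, "chip_h": 10.60,
}
suspects = find_delay_outliers(lot)
print(suspects)
```

A median-based spread is used instead of the standard deviation so that a single infected chip does not inflate the threshold that is supposed to expose it. The sketch covers one path; repeating it over millions of paths is exactly the scaling problem noted above.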
Architecture-level Trojan Detection
An attack can occur at different levels of design abstraction, for example at the specification, RTL, gate level, or post-layout level. At the most abstract level, the adversary can access the interpreter and perform software tampering, scan-chain readout, or a fault attack. At the hardware microarchitecture and circuit levels, the attacker exploits power consumption or electromagnetic emissions. As we ascend to a higher level of abstraction, the required sophistication of the attacking agent decreases and the detectability of the Trojan also decreases, because automated synthesis and automated place-and-route distribute the inserted logic all over the chip area.
Design for Trust
One approach is to design chips for detectability of any tampering. The CAD and test community has long benefited from Design for Testability (DFT) and Design for Manufacturability (DFM). Design for Trust is another "ility" that is critical for Trojan detection. These design methods, proposed by the hardware security and trust community, improve Trojan detection and isolation by changing or modifying the design flow. They help prevent insertion of Trojans, facilitate easier detection, and provide effective IC authentication.
Some methods are physical-level tamper-proofing techniques, such as placing security parts into special casings with light, temperature, tampering, or motion sensors.
Suh, Deng, and Chan[xv] have proposed a design-level tamper-proofing method. In their paper, they discuss an encryption microarchitecture featuring a high-end secure microprocessor. A secure processor is authenticated by a checksum response to a challenge within a time limit. The unique checksum is based on the cycle-to-cycle activities of the processor's specific internal micro-architectural mechanism. The authors showed that small differences in the crypto-architecture result in significant deviations in the checksum.
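The challenge, checksum, and time-limit idea can be illustrated with a loose software analogy. This is not the scheme from the cited paper: a SHA-256 hash over a byte string stands in for the checksum derived from micro-architectural activity, and every name, state string, and timing value below is a hypothetical placeholder.

```python
import hashlib
import time


def processor_checksum(challenge: bytes, microarch_state: bytes) -> bytes:
    """Stand-in checksum: a hash over the challenge and a string that
    represents the processor's internal micro-architectural behavior."""
    return hashlib.sha256(challenge + microarch_state).digest()


def authenticate(respond, challenge: bytes, expected: bytes, time_limit_s: float) -> bool:
    """Accept only a correct checksum that arrives within the time limit."""
    start = time.perf_counter()
    response = respond(challenge)
    elapsed = time.perf_counter() - start
    return response == expected and elapsed <= time_limit_s


def genuine(challenge: bytes) -> bytes:
    return processor_checksum(challenge, b"genuine-pipeline")


def tampered(challenge: bytes) -> bytes:
    # A small architectural difference changes the checksum completely.
    return processor_checksum(challenge, b"genuine-pipeline+trojan")


def slow_emulator(challenge: bytes) -> bytes:
    # Correct answer, but produced too slowly (e.g. by simulation).
    time.sleep(0.3)
    return processor_checksum(challenge, b"genuine-pipeline")


challenge = b"nonce-0001"
expected = processor_checksum(challenge, b"genuine-pipeline")
results = {
    name: authenticate(fn, challenge, expected, time_limit_s=0.2)
    for name, fn in [("genuine", genuine), ("tampered", tampered), ("slow", slow_emulator)]
}
print(results)
```

The time limit matters as much as the checksum value: an emulator that can eventually compute the right answer is still rejected if it cannot match the real hardware's speed.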
The architectural detection methods are design-specific and have to be built into the design for easy tamper detectability. The chip-level methods demand high measurement precision and are error-prone because it is so difficult to isolate a Trojan's effect in the presence of chip noise and process variation.
Conclusion
The issue of IC security and effective countermeasures has drawn considerable research interest in recent times. This paper presents a survey of different Trojan types and emerging methods of detection. Analog Trojans present a major future challenge because there are numerous types of activation and observation conditions. Considering the varied nature and types of IC vulnerabilities, a combination of design and test methods would be required to provide an acceptable level of security.
Designs are being made more secure each day. However, the hacker is always one step ahead! Engineers are reacting to changing security needs. They are proactively designing in "trust-ability" and making designs more secure, but physical access is something beyond the control of the academic and engineering communities. Businesses have to be aware and procurement policies have to be improved. The threats to IC security are more severe with regard to physical security. Vertical integration of the entire manufacturing chain would increase trust in the manufacturing process, enabling many Trojans to be controlled.
Appendix: Short Cases of IC Vulnerability[xvi]
The sensitive assets that each market sector tries to protect against attack are diverse. For example, mobile handsets aim to protect the integrity of radio networks, while television set-top boxes prevent unauthorized access to subscription channels. The varied type and value of the assets being protected, combined with the different underlying system implementations, mean that the attacks experienced by each also vary.
Mobile Sector
Two critical parts of a GSM handset are the International Mobile Equipment Identity (IMEI) code, a unique 15-digit code used to identify an individual handset when it connects to the network, and the low-level SIMLock protocol that is used to bind a particular device to SIM cards of a particular network operator.
Both of these components are used to provide a security feature: the IMEI is used to block stolen handsets from accessing a network, and the SIMLock protocol is used to tie the device to the operator for a contract's duration. On many handsets both of these protection mechanisms can be bypassed with little effort, typically using a USB cable and a reprogramming tool running on a desktop workstation.
The result of these insecurities in the implementation is an opportunity for fraud to be committed on such a large scale that statistics reported by Reuters UK suggest it is driving half of all street crime through mobile phone thefts, costing the industry billions of dollars every year.
Security requirements placed on new mobile devices no longer relate only to the network, but also to content and services available on the device. Protection of digital media content through Digital Rights Management (DRM) and protection of confidential user data, such as synchronized email accounts, is becoming critical as both operators and users try to obtain more value from their devices.
Consumer Electronics and Embedded Sector
The requirements placed on consumer electronics, such as portable game consoles and home movie players, are converging with those seen in the mobile market. Increasing wired and wireless connectivity, greater storage of user data, dynamic download of programmable content, and handling of higher value services all suggest the need for a high-performance and robust security environment.
Security attacks are not limited to open systems with user-extensible software stacks. Within the automotive market most systems are closed or deeply embedded, yet odometer fraud, in which the mileage reading is rolled back to inflate the price of a secondhand vehicle, is still prevalent. The US Department of Transportation reports that this fraud alone costs American consumers hundreds of millions of dollars every year in inflated vehicle prices.
Security features typically encountered in these embedded systems are those that verify that firmware updates are authentic and those that ensure that debug mechanisms cannot be used maliciously.
[i] Cyber Security in Federal Government, Booz Allen Hamilton
[ii] Source: Booz Allen Hamilton. www.boozallen.com
[iii] "The Hunt for the Kill Switch," IEEE Spectrum, May 2008
[iv] "The Hunt for the Kill Switch," IEEE Spectrum, May 2008
[v] An FPGA, or field-programmable gate array, is a general-purpose programmable chip with logic blocks and programmable interconnections. FPGAs often replace application-specific ICs for small-volume applications. A bit stream is the interconnection information between the logic elements of the FPGA. A bit stream defines the function of the FPGA.
[vi] Report of the Defense Science Board Task Force on High Performance Microchip Supply, Defense Science Board, US Department of Defense, February 2005; http://www.acq.osd.mil/dsb/reports/2005-02-HPMS_Report_Final.pdf.
[vii] Innovation at Risk: Intellectual Property Challenges and Opportunities, white paper, Semiconductor Equipment and Materials International, June 2008.
[ix] Tool Command Language (Tcl): Standard CAD tools support a common scripting language for automating design flows and batch mode jobs.
[x] "The Hunt for the Kill Switch," IEEE Spectrum, May 2008
[xi] Towards a Comprehensive and Systematic Classification of Hardware Trojans, J. Rajendran et al.
[xiii] X. Wang, M. Tehranipoor, and J. Plusquellic, "Detecting Malicious Inclusions in Secure Hardware: Challenges and Solutions," Proc. IEEE Int'l Workshop Hardware-Oriented Security and Trust (HOST 08), IEEE CS Press, 2008, pp. 15-19.
[xiv] Hardware Trojan: Threats and Emerging Solutions, Rajat Subhra Chakraborty et al.
[xv] G.E. Suh, D. Deng, and A. Chan, "Hardware Authentication Leveraging Performance Limits in Detailed Simulations and Emulations," Proc. 46th Design Automation Conf. (DAC 09), ACM Press, 2009, pp. 682-687.
[xvi] Source: Building a Secure System Using TrustZone™ Technology, ARM Technologies white paper