
High-temperature operating life (HTOL) is a reliability test applied to integrated circuits (ICs) to determine their intrinsic reliability. This test stresses the IC at an elevated temperature, high voltage and dynamic operation for a predefined period of time. The IC is usually monitored under stress and tested at intermediate intervals. This reliability stress test is sometimes referred to as a "lifetime test", "device life test" or "extended burn-in test" and is used to trigger potential failure modes and assess IC lifetime. There are several types of HTOL:
* AEC documents
* JEDEC standards
* MIL standards


Design considerations

The main aim of HTOL is to age the device so that a short experiment allows the lifetime of the IC to be predicted (e.g. 1,000 HTOL hours shall predict a minimum of "X" years of operation). A good HTOL process avoids both relaxed operation, which under-stresses the IC, and overstressing it. The method ages all of the IC's building blocks so that the relevant failure modes are triggered within a short reliability experiment. A precise multiplier, known as the acceleration factor (AF), relates the stress time to long-lifetime operation: it represents the accelerated aging relative to the useful-life application conditions. For effective HTOL stress testing, several variables should be considered:
# Digital toggling factor
# Analog modules operation
# I/O ring activity
# Monitor design
# Ambient temperature (Ta)
# Junction temperature (Tj)
# Voltage stress (Vstrs)
# Acceleration factor (AF)
# Test duration (t)
# Sample size (SS)
Each of these variables is described below, together with its HTOL design considerations, using a hypothetical, simplified IC with several RAMs, digital logic, an analog voltage regulator module and an I/O ring.


Digital toggling factor

The digital toggling factor (DTF) represents the number of transistors that change their state during the stress test, relative to the total number of gates in the digital portion of the IC. In effect, the DTF is the percentage of transistors toggling in one time unit. The time unit is set by the toggling frequency, which is usually limited by the HTOL setup to the range of 10–20 MHz. Reliability engineers strive to toggle as many transistors as possible in each time unit. The RAMs (and other memory types) are usually activated using the BIST function, while the logic is usually activated with the SCAN function, an LFSR or logic BIST. The power and the self-heating of the digital portion of the IC are evaluated and the device's aging is estimated. These two measures are aligned so that the resulting aging is similar to that of the other elements of the IC. The degrees of freedom for aligning these measures are the voltage stress and/or the time period during which the HTOL program loops these blocks relative to other IC blocks.


Analog modules operation

The recent trend of integrating as many electronic components as possible into a single chip is known as system on a chip (SoC). This trend complicates the work of reliability engineers because the analog portion of the chip usually dissipates more power than the other IC elements. This higher power may generate hot spots and areas of accelerated aging. Reliability engineers must understand the power distribution on the chip and align the aging so that it is similar for all elements of the IC. In our hypothetical SoC the analog module only includes a voltage regulator. In reality, there may be additional analog modules, e.g. PMICs, oscillators or charge pumps. To perform efficient stress tests on the analog elements, reliability engineers must identify the worst-case scenario for the relevant analog blocks in the IC. For example, the worst-case scenario for voltage regulators may be the maximum regulation voltage and maximum load current; for charge pumps it may be the minimum supply voltage and maximum load current. Good engineering practice calls for the use of external loads (external R, L, C) to force the necessary currents. This practice avoids loading differences due to the chip's different operational schemes and the trimming of its analog parts. Statistical methods are used to check the tolerances, variation and temperature stability of the loads used, and to define the right confidence bands for the loads so as to avoid over- or under-stress across the HTOL operating range. The degrees of freedom for aligning the aging magnitude of the analog parts are usually the duty cycle, the external load values and the voltage stress.


I/O ring activity

The interface between the "outside world" and the IC is made via the input/output (I/O) ring. This ring contains power I/O ports, digital I/O ports and analog I/O ports. The I/Os are usually wired via the IC package to the "outside world" and each I/O serves its own specific function, e.g. JTAG ports or IC power supply ports. Reliability engineering aims to age all I/Os in the same way as the other IC elements. This can be achieved by using a boundary scan operation.


Monitor design

As previously mentioned, the main aim of HTOL is to age the samples by dynamic stress at elevated voltage and/or temperature. During HTOL operation, the IC must be shown to be active, toggling and constantly functioning. At the same time, the point at which the IC stops responding must be known; these data are important for calculating precise reliability indices and for facilitating failure analysis (FA). This is done by monitoring the device via one or more vital IC signals, which are communicated to and logged by the HTOL machine and provide a continuous indication of the IC's functionality throughout the HTOL run time. Examples of commonly used monitors include the BIST "done" flag signal, the SCAN output chain and the analog module output. There are three types of monitoring:
# Pattern matching: The actual output signal is compared to the expected one and any deviation raises an alert. The main disadvantage of this monitor type is its sensitivity to any minor deviation from the expected signal. During HTOL, the IC runs at a temperature and/or voltage that occasionally falls outside its specification, which may cause artificial sensitivity and/or a malfunction that fails the matching but is not a real failure.
# Activity: The monitor counts the number of toggles and indicates OK if the count is higher than a predefined threshold. The main disadvantage of this type of monitoring is the chance that unexpected noise or a spurious signal is wrongly interpreted as activity. This issue arises mainly with monitors that have a low toggle count.
# Activity within a predefined range: The monitor checks that the response falls within predefined limits, for example that the number of toggles is within a predefined limit or that the output of the voltage regulator is within a predefined range.
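The three monitor types can be illustrated with a minimal Python sketch. The function names, the representation of a monitor signal as a list of sampled 0/1 values, and the thresholds are all assumptions made for illustration; real HTOL equipment implements these checks in its own hardware and software.

```python
from typing import Sequence


def pattern_match_ok(observed: Sequence[int], expected: Sequence[int]) -> bool:
    """Pattern matching: pass only if every sample equals the expected sample."""
    return len(observed) == len(expected) and all(
        o == e for o, e in zip(observed, expected)
    )


def count_toggles(samples: Sequence[int]) -> int:
    """Count state changes between consecutive samples of a monitor signal."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def activity_ok(samples: Sequence[int], min_toggles: int) -> bool:
    """Activity: pass if the toggle count is higher than the threshold."""
    return count_toggles(samples) > min_toggles


def activity_in_range_ok(samples: Sequence[int], low: int, high: int) -> bool:
    """Activity within a range: pass if the toggle count lies in [low, high]."""
    return low <= count_toggles(samples) <= high


def value_in_range_ok(value: float, low: float, high: float) -> bool:
    """Range check for an analog monitor, e.g. a regulator output voltage."""
    return low <= value <= high


# Hypothetical sampled monitor signal with four state changes.
signal = [0, 1, 0, 1, 1, 0]
print(count_toggles(signal))                # 4
print(activity_ok(signal, min_toggles=3))   # True
print(activity_in_range_ok(signal, 5, 8))   # False: too few toggles
```

The range-based check is the only one of the three that can flag both a stuck signal (too few toggles) and a noisy one (too many), which is why the sketch takes both a lower and an upper limit.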


Ambient temperature (Ta)

According to JEDEC standards, the environmental chamber should be capable of maintaining the specified temperature within a tolerance of ±5 °C throughout while parts are loaded and unpowered. Today's environmental chambers have better capabilities and can exhibit temperature stability within a range of ±3 °C throughout.


Junction temperature (Tj)

Low-power ICs can be stressed without major attention to self-heating effects. However, due to technology scaling and manufacturing variations, power dissipation within a single production lot of devices can vary by as much as 40%. This variation, together with the self-heating of high-power ICs, makes advanced contact temperature control necessary, with an individual control system for each IC.
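The effect of power variation on junction temperature can be estimated with the common first-order relation Tj = Ta + P × θJA. This relation and every number below (ambient temperature, nominal power, thermal resistance) are assumptions introduced for illustration and do not come from the text above.

```python
def junction_temp_c(ambient_c: float, power_w: float, theta_ja_c_per_w: float) -> float:
    """First-order estimate: Tj = Ta + P * theta_JA (assumed model)."""
    return ambient_c + power_w * theta_ja_c_per_w


def ambient_for_target_tj(target_tj_c: float, power_w: float, theta_ja_c_per_w: float) -> float:
    """Ambient (or contact) temperature needed to hold a target Tj for a given device."""
    return target_tj_c - power_w * theta_ja_c_per_w


# Hypothetical values: 100 °C chamber, 1.0 W nominal device, 25 °C/W package.
THETA_JA = 25.0
nominal = junction_temp_c(100.0, 1.0, THETA_JA)
high_power = junction_temp_c(100.0, 1.4, THETA_JA)  # device dissipating 40% more
print(f"nominal Tj = {nominal:.1f} °C, high-power Tj = {high_power:.1f} °C")

# Per-device setpoint that would bring the high-power device back to 125 °C.
print(f"setpoint = {ambient_for_target_tj(125.0, 1.4, THETA_JA):.1f} °C")
```

Under these assumed values a 40% spread in power shifts Tj by about 10 °C in a shared chamber, which is the kind of spread that per-device temperature control is meant to remove.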


Voltage stress (Vstrs)

The operating voltage should be at least the maximum specified for the device. In some cases a higher voltage is applied to obtain lifetime acceleration from voltage as well as temperature. To define the maximum permitted voltage stress, the following methods can be considered:
# Force 80% of the breakdown voltage;
# Force a voltage six sigma below the breakdown voltage;
# Set an overvoltage higher than the maximum specified voltage. An overvoltage level of 140% of the maximum voltage is occasionally used for MIL and automotive applications.
Reliability engineers must check that Vstrs does not exceed the maximum rated voltage for the relevant technology, as specified by the fab.
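The three methods and the final check against the technology rating can be sketched as follows. Treating the breakdown voltage as a distribution with a mean and a standard deviation, and all numeric values, are assumptions made for illustration.

```python
def vstress_candidates(v_max_spec: float, v_bd_mean: float, v_bd_sigma: float,
                       overvoltage_ratio: float = 1.4) -> dict:
    """Return the stress voltage proposed by each of the three methods."""
    return {
        "80% of breakdown": 0.8 * v_bd_mean,
        "six sigma below breakdown": v_bd_mean - 6.0 * v_bd_sigma,
        "overvoltage": overvoltage_ratio * v_max_spec,
    }


def permitted(candidates: dict, v_max_spec: float, v_tech_max: float) -> dict:
    """Keep candidates at or above the device maximum and within the technology rating."""
    return {name: v for name, v in candidates.items() if v_max_spec <= v <= v_tech_max}


# Hypothetical device: 1.8 V maximum specified voltage, breakdown at 3.6 V
# (sigma 0.1 V), and a technology maximum rating of 2.75 V.
cands = vstress_candidates(v_max_spec=1.8, v_bd_mean=3.6, v_bd_sigma=0.1)
for name, volts in permitted(cands, v_max_spec=1.8, v_tech_max=2.75).items():
    print(f"{name}: {volts:.2f} V")
```

With these assumed numbers only the 140% overvoltage candidate survives the rating check; the two breakdown-based candidates would exceed the technology limit.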


Acceleration factor (AF)

The acceleration factor (AF) is a multiplier that relates a product's life at an accelerated stress level to its life at the use stress level. An AF of 20 means that 1 hour at the stress condition is equivalent to 20 hours at the use condition. The voltage acceleration factor is represented by AFv. Usually the stress voltage is equal to or higher than the maximum voltage. An elevated voltage provides additional acceleration and can be used to increase effective device hours or to achieve an equivalent life point. There are several AFv models:
# E model, or the constant field/voltage acceleration exponential model;
# 1/E model or, equivalently, the anode hole injection model;
# V model, where the failure rate is exponential in voltage;
# Anode hydrogen release for the power-law model.
AFtemp is the acceleration factor due to changes in temperature and is usually based on the Arrhenius equation. The total acceleration factor is the product of AFv and AFtemp.
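As a rough illustration of how the two factors combine, the sketch below uses the Arrhenius equation for AFtemp and a simple exponential voltage term of the V-model form for AFv. The voltage coefficient `gamma_per_volt` and the example voltages are hypothetical; the appropriate model and coefficient depend on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def af_temperature(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius temperature acceleration factor (temperatures given in °C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


def af_voltage(gamma_per_volt: float, v_use: float, v_stress: float) -> float:
    """Exponential voltage acceleration; gamma_per_volt is an assumed coefficient."""
    return math.exp(gamma_per_volt * (v_stress - v_use))


def af_total(ea_ev, t_use_c, t_stress_c, gamma_per_volt, v_use, v_stress) -> float:
    """Total acceleration factor: the product of AFv and AFtemp."""
    return af_temperature(ea_ev, t_use_c, t_stress_c) * af_voltage(
        gamma_per_volt, v_use, v_stress
    )


# Hypothetical case: 0.7 eV, 55 °C use, 125 °C stress, 1.8 V use, 2.52 V stress.
print(f"AFtemp = {af_temperature(0.7, 55.0, 125.0):.1f}")
print(f"AFv    = {af_voltage(1.0, 1.8, 2.52):.2f}")
print(f"AF     = {af_total(0.7, 55.0, 125.0, 1.0, 1.8, 2.52):.1f}")
```

Because the factors multiply, a modest voltage acceleration of about 2 roughly halves the stress time needed to reach the same equivalent life.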


Test duration (t)

The reliability test duration is chosen to demonstrate that the device meets its lifetime requirement. For example, with an activation energy of 0.7 eV, a stress temperature of 125 °C and a use temperature of 55 °C, the acceleration factor from the Arrhenius equation is 78.6. This means that a stress duration of 1,000 hours is equivalent to about 9 years of use. The reliability engineer decides on the qualification test duration. Good industry practice calls for 1,000 hours at a junction temperature of 125 °C.
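The conversion between stress hours and years of use is simple arithmetic once the acceleration factor is known. The sketch below takes the factor of 78.6 quoted above as a given input; the function names and the 10-year target are illustrative assumptions.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def equivalent_use_years(stress_hours: float, acceleration_factor: float) -> float:
    """Years of use represented by a given number of stress hours."""
    return stress_hours * acceleration_factor / HOURS_PER_YEAR


def required_stress_hours(target_years: float, acceleration_factor: float) -> float:
    """Stress hours needed to represent a target operating life."""
    return target_years * HOURS_PER_YEAR / acceleration_factor


AF = 78.6  # value quoted in the text for 0.7 eV, 125 °C stress, 55 °C use
print(f"1,000 h of stress ~ {equivalent_use_years(1000, AF):.1f} years of use")
print(f"10 years of use   ~ {required_stress_hours(10, AF):.0f} h of stress")
```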


Sample size (SS)

The challenge for new reliability assessment and qualification systems is determining the relevant failure mechanisms in order to optimize the sample size. Sample plans are statistically derived from manufacturer risk, consumer risk and the expected failure rate. The commonly used sampling plan of zero rejects out of 230 samples is equivalent to three rejects out of 668 samples, assuming LTPD = 1 and a 90% confidence level.
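The equivalence of the two sampling plans can be checked numerically. The sketch below assumes a Poisson approximation to the number of rejects, which is one common way of deriving LTPD tables; published tables may round differently. The function names are illustrative.

```python
import math


def poisson_cdf(c: int, mean: float) -> float:
    """P(X <= c) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, c + 1):
        term *= mean / k
        total += term
    return total


def ltpd_sample_size(rejects_allowed: int, ltpd_percent: float,
                     confidence: float = 0.90) -> float:
    """Sample size at which a lot at the LTPD passes with probability 1 - confidence."""
    target = 1.0 - confidence
    lo, hi = 0.0, 1000.0
    for _ in range(200):  # bisection on the expected number of defectives
        mid = (lo + hi) / 2.0
        if poisson_cdf(rejects_allowed, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi / (ltpd_percent / 100.0)


for c in (0, 3):
    print(f"{c} rejects allowed: about {ltpd_sample_size(c, 1.0):.0f} samples")
```

Under this approximation both plans give a lot with 1% defectives the same 10% chance of acceptance, which is the sense in which they are equivalent.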


HTOL policy


Sample selection

Samples shall be drawn from at least three nonconsecutive lots to represent manufacturing variability. All test samples shall be fabricated, handled, screened and assembled in the same way as during the production phase.


Sample preparation

Samples shall be tested prior to stress and at predefined checkpoints. It is good engineering practice to test samples at maximum and minimum rating temperatures as well as at room temperature. Data logs of all functional and parametric tests shall be collated for further analysis.


Test duration

Assuming Tj = 125 °C, commonly used checkpoints are after 48, 168, 500 and 1,000 hours. Different checkpoints for different temperatures can be calculated by using the Arrhenius equation. For example, with an activation energy of 0.7 eV, a Tj of 135 °C and a Tuse of 55 °C, the equivalent checkpoints will be at 29, 102, 303 and 606 hours. Electrical testing should be completed as soon as possible after the samples are removed. If the samples cannot be tested soon after their removal, additional stress time should be applied. The JEDEC standard requires that samples be tested within 168 hours of removal. If testing exceeds the recommended time window, additional stress should be applied according to the table below:
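The checkpoint conversion between two junction temperatures can be sketched as below. Only the ratio of the two Arrhenius factors matters, so the use temperature cancels out. The function name is illustrative, and the results differ from the quoted figures by at most one hour because of rounding.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def scale_checkpoints(checkpoints_h, ea_ev, t_ref_c, t_new_c):
    """Convert checkpoints defined at t_ref_c to equivalent hours at t_new_c."""
    t_ref_k = t_ref_c + 273.15
    t_new_k = t_new_c + 273.15
    # Acceleration of the new temperature relative to the reference temperature.
    ratio = math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_new_k))
    return [hours / ratio for hours in checkpoints_h]


scaled = scale_checkpoints([48, 168, 500, 1000], ea_ev=0.7, t_ref_c=125.0, t_new_c=135.0)
print([round(h, 1) for h in scaled])
```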


Merit numbers

The merit number is the outcome of statistical sampling plans. Sampling plans are fed into SENTENCE, an audit tool, to ensure that the output of a process meets the requirements. SENTENCE simply accepts or rejects the tested lots. The reliability engineer implements statistical sampling plans based on predefined acceptance quality limits, LTPD, manufacturer risk and customer risk. For example, the commonly used sampling plan of 0 rejects out of 230 samples is equivalent to 3 rejects out of 668 samples, assuming LTPD = 1.


HTOL in various industries

The aging process of an IC is relative to its standard use conditions. The tables below provide reference to various commonly used products and the conditions under which they are used. Reliability engineers are tasked with verifying that the stress duration is adequate. For example, for an activation energy of 0.7 eV, a stress temperature of 125 °C and a use temperature of 55 °C, an expected operational life of five years is represented by a 557-hour HTOL experiment.


Commercial use


Automotive use

Example Automotive Use Conditions


Telecommunication use

Example European telecom use conditions definition

Example US telecom use conditions definition


Military use

Example military use conditions


Example

The following symbols are used:

* r = number of failures
* D = number of devices
* H = test hours per device
* T = calculation temperature in kelvin (degrees Celsius + 273)
* T_\text{test} = test temperature (HTRB or other burn-in temperature)
* T_\text{use} = use temperature (standardized at 55 °C, or 328 K)
* E_\text{a} = activation energy (eV)
* \chi^2(\alpha,\nu)/2 = chi-squared/2, the probability estimation for the number of failures at \alpha and \nu
* \alpha = confidence level for the \chi^2 distribution; reliability calculations use \alpha = 60% (0.60)
* \nu = degrees of freedom for the \chi^2 distribution; reliability calculations use \nu = 2r + 2
* A_\text{f} = acceleration factor from the Arrhenius equation
* k = Boltzmann's constant = 8.617 \times 10^{-5} eV/K
* DH = device hours = D \times H
* EDH = equivalent device hours = D \times H \times A_\text{f}
* \lambda_\text{hour} = failure rate per hour
* \lambda_\text{FIT} = failures in time (FIT) = failure rate per billion hours
* MTTF = mean time to failure

The acceleration factor from the Arrhenius equation is:

:A_\text{f} = e^{\frac{E_\text{a}}{k}\left(\frac{1}{T_\text{use}} - \frac{1}{T_\text{test}}\right)}

Failure rate per hour:

:\lambda_\text{hour} = \frac{\chi^2(\alpha,\nu)}{2 \times D \times H \times A_\text{f}} = \frac{\chi^2(\alpha,\nu)}{2 \times \text{EDH}}

Failures in time (failure rate per billion hours):

:\lambda_\text{FIT} = \lambda_\text{hour} \times 10^9

Mean time to failure in hours:

:MTTF_\text{hours} = \frac{1}{\lambda_\text{hour}}

Mean time to failure in years:

:MTTF_\text{years} = \frac{1}{\lambda_\text{hour} \times 24 \times 365}

To include humidity in the acceleration factor, as in the highly accelerated stress test (HAST), the acceleration factor becomes:

:A_\text{f} = e^{\frac{E_\text{a}}{k}\left(\frac{1}{T_\text{use}} - \frac{1}{T_\text{test}}\right)} \times e^{\beta\left(RH_\text{test} - RH_\text{use}\right)}

where
* RH_\text{test} is the stress test relative humidity (in percent), typically 85%;
* RH_\text{use} is the typical use relative humidity (in percent), typically measured at the chip surface at about 10–20%;
* \beta is the failure mechanism scale factor, a value between 0.1 and 0.15.

To include both humidity (HAST) and voltage stress, the acceleration factor becomes:

:A_\text{f} = e^{\frac{E_\text{a}}{k}\left(\frac{1}{T_\text{use}} - \frac{1}{T_\text{test}}\right)} \times e^{\beta\left(RH_\text{test} - RH_\text{use}\right)} \times e^{\zeta\left(V_\text{stress} - V_\text{use}\right)}

where
* V_\text{stress} is the stress voltage (in volts), typically VCC × 1.4, e.g. 1.8 × 1.4 = 2.52 V;
* V_\text{use} is the typical usage voltage or VCC (in volts), typically 1.8 V depending on the design;
* \zeta is the failure mechanism scale factor, a value between 0 and 3.0, typically 0.5 for silicon junction defects.
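The failure-rate calculation can be sketched in Python using only the standard library. For the even degrees of freedom used here (ν = 2r + 2), χ²(α, ν)/2 equals the Poisson mean m at which P(X ≤ r) = 1 − α, so it can be found by bisection without a statistics package. The device count, test hours and acceleration factor in the example are hypothetical inputs.

```python
import math


def poisson_cdf(r: int, mean: float) -> float:
    """P(X <= r) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, r + 1):
        term *= mean / k
        total += term
    return total


def half_chi_squared(confidence: float, failures: int) -> float:
    """chi^2(confidence, 2r + 2) / 2, solved through the Poisson identity."""
    target = 1.0 - confidence
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(failures, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi


def failure_rate_per_hour(failures, devices, hours, accel_factor, confidence=0.60):
    """lambda_hour = chi^2 / (2 * D * H * A_f)."""
    edh = devices * hours * accel_factor  # equivalent device hours
    return half_chi_squared(confidence, failures) / edh


# Hypothetical qualification: 231 devices, 1,000 h each, no failures, A_f = 78.6.
lam = failure_rate_per_hour(failures=0, devices=231, hours=1000, accel_factor=78.6)
fit = lam * 1e9
mttf_years = 1.0 / (lam * 24 * 365)
print(f"FIT = {fit:.1f}, MTTF = {mttf_years:.0f} years")
```

Even with zero observed failures the estimated failure rate is not zero: the χ² term places an upper bound on the rate at the chosen confidence level, and that bound shrinks as the equivalent device hours grow.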


See also

* Transistor aging
* Arrhenius equation
* Stress migration
* Reliability (semiconductor)
* Failure modes of electronics
* Bathtub curve

