The Evolving Thermal Landscape
Managing heat in chips is becoming a precision balancing act at advanced nodes and with advanced packaging. While it’s important to ensure that temperatures don’t rise high enough to cause reliability problems, adding too much circuitry to control heat can reduce performance and lower energy efficiency.
The most common approach to dealing with these issues is thermal simulation, which requires a 3D representation of the system, including the package and the board on which it sits, as well as material properties for all of the constituent parts of the assembly. It sounds straightforward enough, but as more devices are packed together into systems, it is proving to be anything but.
“From a thermal perspective what you are doing is packaging much more silicon in a given square inch, the result of which is the power density,” said Robin Bornoff, product marketing manager for Mentor Graphics’ Mechanical Analysis Division. “The number of watts dissipated goes up considerably. As power density goes up, the temperatures go up, so the need to be able to design an efficient removal of that heat becomes even more important than it ever has been before. If you go back 10 or 15 years, the increase in power density was being driven by an increase in clock speed. Now, as we go to other ways in which we can increase functional density, we find the power density increasing not in clock speed but in functional density of the packaging itself. So it is yet another driver that puts thermal considerations right at the front of a lot of design constraints today.”
Just about every electronics device built today—from airplanes to cars to cell phones—uses some kind of predictive simulation for thermal stress and electromagnetic analysis. This has led to efforts to more tightly couple electrical and mechanical design, using a single simulation that includes everything from chip to package to board, to how it all works together as part of a larger system, said Steve Pytel, electronics product manager at Ansys. Included in that analysis are such factors as power loss inside the copper of the package and how the PCB impacts thermal stress and strain. This is followed by considering the enclosure. If it’s a phone or a tablet, what impact does the cooling have? Is there a fan? And all of that has an impact on reliability over time.
Wire bond degradation, metallization layer mismatch, solder fatigue and die/substrate cracks caused by thermal issues in hybrid/electric vehicle modules. Source: Mentor Graphics.
Finite element analysis
While ‘finite element analysis’ might not be a common term, this mathematical construct underlies many EDA simulation tools today. One of the main drivers for thermal simulation using finite element methods is the finFET process, first introduced by Intel at 22nm, and which will soon be unveiled for 10nm and 7nm by all of the major foundries.
“While we advance on the finFET technology, one of the side impacts is that due to 3D finFET architecture, the heat is easily trapped in the fingers,” said Norman Chang, vice president and senior product strategist at Ansys. “On the substrate side, there is a narrow substrate under the finger and the rest of the material is silicon dioxide, which makes it much harder for heat dissipation through the substrate, through the package, and then through the PCB. That’s one of the main drivers.”
Heat is easily trapped in the fingers of finFETs, which affects heat dissipation.
The self-heat induced at the device level in finFET processes, along with thermal coupling between wires, requires analysis. One approach is using finite element methods to analyze the temperature gradient on the chip and the temperature increase on each wire.
“That’s very important to determine the electromigration because the EM limit is a function of temperature,” said Chang. “Resistance is also a function of temperature, and leakage power is an exponential function of temperature. When temperature increases, leakage power increases, and when leakage power increases, temperature increases. That can become a thermal runaway issue. If you do not have a good enough packaging design, thermal runaway will happen in the chip-packaging system. Another factor that makes heat dissipation more challenging is 3D IC design, which is becoming popular for the CoWoS (chip on wafer on substrate) design from TSMC, or the new integrated fan-out (InFO) on wafer-level package design. That is also going to ship to mass market this year starting in Q2.”
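The feedback loop Chang describes can be illustrated with a lumped model. The sketch below is not taken from any vendor tool. It assumes a single junction-to-ambient thermal resistance (`theta_ja`) and a leakage power that doubles every 20°C, and all numbers are hypothetical. It iterates T = T_ambient + theta_ja × (P_dynamic + P_leak(T)) until the temperature either settles or passes a limit.

```python
def leakage_power(temp_c, p_leak_ref=0.5, t_ref=25.0, doubling_c=20.0):
    """Assumed leakage model: p_leak_ref watts at t_ref, doubling every doubling_c degrees."""
    return p_leak_ref * 2.0 ** ((temp_c - t_ref) / doubling_c)


def settle_temperature(p_dynamic, theta_ja, t_ambient=25.0,
                       t_limit=150.0, max_iter=200, tol=1e-6):
    """Fixed-point iteration on T = t_ambient + theta_ja * (p_dynamic + leakage(T)).

    Returns (temperature_c, converged). converged is False when the
    temperature exceeds t_limit, which this sketch treats as runaway.
    """
    temp = t_ambient
    for _ in range(max_iter):
        new_temp = t_ambient + theta_ja * (p_dynamic + leakage_power(temp))
        if new_temp > t_limit:
            return new_temp, False
        if abs(new_temp - temp) < tol:
            return new_temp, True
        temp = new_temp
    return temp, False


# Hypothetical comparison: same chip, two packages with different thermal resistance.
good_package = settle_temperature(p_dynamic=2.0, theta_ja=10.0)
poor_package = settle_temperature(p_dynamic=2.0, theta_ja=20.0)
print("10 C/W package:", good_package)
print("20 C/W package:", poor_package)
```

With these assumed values the lower-resistance package settles at a stable temperature, while the higher-resistance package has no stable operating point and the iteration diverges.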
Multi-chip packaging already is gaining traction, primarily due to high throughput between processor and memory and smaller form factors. “There will be multiple dies on the package, and with the chip and the package increasingly difficult to separate from each other, they will be very much integrated,” he explained. “In the InFO design the chip will be directly on top of the silicon substrate, and the silicon substrate will have a ball grid array directly on top of the PCB. Because of these technologies, thermal becomes even more of a factor. In automotive applications, thermal is a concern in the harsh environment of automobiles, and the temperature envelope is set at 135° Fahrenheit. If you have multiple MCUs in the car (usually there are more than 100 in cars today) and different kinds of chips like sensors and spark plug electronics, the environment in the vehicle is very harsh for thermal dissipation.”
Mentor’s Bornoff noted that multi-chip packages have multiple junction temperatures that must be considered in a design. “One of the resources a thermal engineer can use to get information about the construction of a package is a spreadsheet, which contains thermal metrics that can be used as input to a simulation tool. These metrics, standardized by JEDEC and other standards bodies, have been very much derived with a monochip assumption. There is a challenge going forward of how to formulate thermal metrics that are appropriate for multi-heat source or multi-die type packages — get it onto a spec sheet to enable an engineer to be able to use that information for more accurate thermal simulation.”
Multi-chip packages have multiple junction temperatures that need to be considered.
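One way to see why single-die metrics fall short is a linear superposition model, in which each die's junction temperature depends on its own power and on the power of its neighbors. The sketch below assumes a hypothetical thermal resistance matrix `theta`, where `theta[i][j]` is the temperature rise of die `i` per watt dissipated in die `j`. It is not a JEDEC-defined metric, and the values are invented for illustration.

```python
def junction_temperatures(theta, powers, t_ambient):
    """Superposition: T_j[i] = t_ambient + sum_j theta[i][j] * powers[j].

    theta is a square matrix in C/W. Diagonal entries are self-heating
    terms and off-diagonal entries are die-to-die coupling terms.
    """
    if any(len(row) != len(powers) for row in theta):
        raise ValueError("theta must have one column per power entry")
    return [
        t_ambient + sum(r * p for r, p in zip(row, powers))
        for row in theta
    ]


# Hypothetical two-die package: a processor (2.0 W) next to a memory die (0.5 W).
theta = [
    [10.0, 3.0],   # processor: self-heating, coupling from memory
    [3.0, 12.0],   # memory: coupling from processor, self-heating
]
temps = junction_temperatures(theta, powers=[2.0, 0.5], t_ambient=25.0)
print(temps)
```

A single junction-to-ambient number cannot capture the off-diagonal terms, which is the gap a multi-source metric would need to fill. The linear model also assumes material properties that do not vary with temperature.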
He noted that standards bodies are adapting to this changing landscape. Rather than just one heat source for a package there are multiple sources, which means multiple junction temperatures.
One way to deal with this is thermal-induced stress analysis, which is another application of finite element methods. “When you increase the temperature, it will be more vulnerable to stress in terms of the on-chip interconnect and the package,” Chang said. “For InFO, there are extreme-low k dielectrics because it goes through the thermal stack rings. When a known good die is used in the InFO process, you will go through 350° to 400° F thermal stack ring so the extreme-low k dielectric material has to suffer through the thermal stack ring process and is vulnerable to stress, to cracks, to fatigue, in addition to drop stress.”
Why predict temperature?
One of the key reasons accurate temperature measurements have become so important is the emphasis on reliability, particularly in markets such as automotive, where electronics must last 10 to 15 years under extreme conditions. Temperature has a direct correlation to how long a device will function properly over time. As long as this can be simulated and junction temperatures accurately accounted for, this is a relatively straightforward design constraint using finite element analysis.
In the past, finite element analysis was focused on the transfer of heat within a solid. “Once the heat gets to the edge of the solid, some assumption has to be made at how effectively the heat is whisked away by the air without actually modeling the physics of the airflow itself,” Bornoff said.
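The conduction-only approach Bornoff describes can be sketched in one dimension. The example below is a minimal illustration, not a production solver. It assembles linear finite elements for a solid slab heated on one face and replaces the airflow on the other face with an assumed heat transfer coefficient `h`. The material values, geometry, and `h` are all hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; inputs are not modified."""
    n = len(diag)
    b, d = list(diag), list(rhs)
    for i in range(1, n):
        m = lower[i - 1] / b[i - 1]
        b[i] -= m * upper[i - 1]
        d[i] -= m * d[i - 1]
    x = [0.0] * n
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - upper[i] * x[i + 1]) / b[i]
    return x


def slab_temperatures(k, length, area, heat_w, h, t_air, n_elem=10):
    """Steady 1D conduction: heat_w enters node 0, convection leaves the last node.

    k is thermal conductivity (W/m-K) and h is the assumed heat transfer
    coefficient (W/m^2-K) standing in for the unmodeled airflow.
    """
    g = k * area / (length / n_elem)      # conductance of one element, W/K
    n = n_elem + 1
    diag = [0.0] * n
    off = [-g] * n_elem                   # symmetric off-diagonal entries
    rhs = [0.0] * n
    for e in range(n_elem):               # assemble element stiffness matrices
        diag[e] += g
        diag[e + 1] += g
    rhs[0] += heat_w                      # applied heat load
    diag[-1] += h * area                  # convective boundary term
    rhs[-1] += h * area * t_air
    return solve_tridiagonal(off, diag, off, rhs)


temps = slab_temperatures(k=150.0, length=1e-3, area=1e-4,
                          heat_w=10.0, h=1000.0, t_air=25.0)
print("hot face: %.3f C, cooled face: %.3f C" % (temps[0], temps[-1]))
```

In this example almost all of the temperature rise comes from the `h` term rather than from conduction through the solid, so the result is only as good as the assumed coefficient.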
It is a commonly held belief that for an accurate prediction, the full physical description of the three modes of heat transfer—conduction, convection, and radiation—should be considered for a full, accurate representation of the entire heat flow path.
“One half of the simulation is ensuring that you solve the right equations. With any simulation model you have to add some input into the model, so a 3D representation of all of the internal construction of the package has to be created. You have to create a 3D representation of the board, the chassis, the air gaps, the heat sinks, and materials, as you need to have an accurate representation of the 3D geometry of the model. And this is where it gets interesting, especially where uncertainty is concerned,” Bornoff said. “For a good thermal simulation, you need to consider material properties, and the most common one is the thermal conductivity of any solid in your model.”
For example, copper has high thermal conductivity, while other materials have low thermal conductivity. Some of these materials are very well understood, which provides very accurate values for simulation input data.
“Other materials are much less well understood, both in terms of their material properties, and also in terms of their size,” he said. “If you look at uncertainties associated with the package manufacturing process, the thing that we hear most of all is die attach and die attachment materials. It is notoriously difficult, especially for the person designing the package, to be able to get good accurate information about the thermal conductivity of the die attach material and its thickness. These parameters are very important with regard to accuracy of the simulation, but are also extremely difficult to get good accurate values for. And that’s a real challenge.”
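The sensitivity to die attach properties can be seen with a simple series resistance estimate, where each layer contributes R = thickness / (conductivity × area). The sketch below assumes one-dimensional heat flow with no spreading. The silicon and copper conductivities are rough textbook values, and the die attach ranges are hypothetical.

```python
def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """1D conduction resistance of one layer in K/W (no heat spreading assumed)."""
    return thickness_m / (conductivity_w_mk * area_m2)


def junction_rise(power_w, layers, area_m2):
    """Temperature rise across a stack of (thickness_m, conductivity_w_mk) layers."""
    total = sum(layer_resistance(t, k, area_m2) for t, k in layers)
    return power_w * total


AREA = 1e-4                      # 1 cm^2 die footprint (assumed)
DIE = (0.3e-3, 150.0)            # silicon, roughly 150 W/m-K
SPREADER = (1.0e-3, 400.0)       # copper, roughly 400 W/m-K

# Hypothetical die attach extremes: thin and conductive versus thick and resistive.
best = junction_rise(10.0, [DIE, (25e-6, 5.0), SPREADER], AREA)
worst = junction_rise(10.0, [DIE, (50e-6, 1.0), SPREADER], AREA)
print("rise across stack: %.2f K to %.2f K" % (best, worst))
```

Under these assumptions the die attach layer dominates the stack, and the uncertainty in its conductivity and thickness changes the predicted rise several times over, while the well-characterized silicon and copper layers contribute little.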
Compounding this is leakage current, which decreased with the introduction of finFETs, but which begins increasing again at each new node after 16/14nm. “The contribution to the total power dissipation from leakage power has increased at much smaller nodes, and this leakage power itself is very temperature-dependent. So in terms of the simulation technology, instead of just being limited to specifying a constant power dissipation into predicted temperature, you need to be able to specify a power dissipation that is a function of temperature. As the temperature goes up, the power dissipation goes up, as well. That relationship still needs to be defined.”
Bigger systems, bigger challenges
The problem is compounded with complex systems at advanced nodes. Today’s smartphones have at least two PCBs and as many as 10 packages on each board, said CT Kao, product engineering architect at Cadence. When a systems company has significant heating problems on one of the chips, and that hot spot is next to another chip, then it must be simulated and analyzed to identify where that hot spot is — all under the enormously complex power scheme of operation.
“The latter part is the key because there are different power operating schemes, and that means the power is really huge in the I/O for a while,” Kao said. “If you want to do finite element analysis for each package on that board, times two, you need granularity to know where the hot spot is. There is no way to simulate an operating scenario for more than a few minutes, given how long that would take the simulation to complete. Nowadays, one method to overcome that is so-called adaptive meshing, because it says that every solid has to be cut into smaller elements, and we don’t need very fine cutting across the board. So people use finite element the first time. Then you identify where the high gradient of temperature is. At that specific location, you put the fine grain analysis there.”
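The coarse-then-refine approach Kao describes can be sketched in one dimension. In the example below, the function `hot_spot_field` is a hypothetical stand-in for the temperature solution; a real flow would re-run the finite element solve on each new mesh. The refinement rule, which splits any element whose end-to-end temperature change exceeds a threshold, is one simple criterion chosen for illustration.

```python
import math


def hot_spot_field(x):
    """Hypothetical temperature profile (C) with a hot spot at x = 0.7."""
    return 40.0 + 50.0 * math.exp(-((x - 0.7) / 0.05) ** 2)


def refine_once(nodes, field, max_step):
    """Insert a midpoint into every element whose temperature change exceeds max_step."""
    refined = [nodes[0]]
    for left, right in zip(nodes, nodes[1:]):
        if abs(field(right) - field(left)) > max_step:
            refined.append(0.5 * (left + right))
        refined.append(right)
    return refined


def adaptive_mesh(field, n_coarse=10, max_step=5.0, max_passes=8):
    """Start from a uniform coarse mesh and refine only where the gradient is high."""
    nodes = [i / n_coarse for i in range(n_coarse + 1)]
    for _ in range(max_passes):
        new_nodes = refine_once(nodes, field, max_step)
        if len(new_nodes) == len(nodes):   # nothing left to split
            break
        nodes = new_nodes
    return nodes


nodes = adaptive_mesh(hot_spot_field)
near = sum(1 for x in nodes if 0.6 <= x <= 0.8)
print("%d nodes total, %d of them between 0.6 and 0.8" % (len(nodes), near))
```

The mesh stays coarse away from the hot spot and becomes fine only around it, which is the saving Kao points to. A criterion based only on endpoint values can miss a peak that sits symmetrically inside one element, so practical tools use more robust error estimates.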
This all boils down to how to best utilize the energy put into the system, turning that energy into useful or non-useful work, Kao said.
http://semiengineering.com/the-evolving-thermal-landscape/