All computer components, under normal operating conditions, generate some level of heat output that must be dissipated if the component is to continue to operate properly. When the component’s operating temperature exceeds its designed maximum operating point, it can suffer either immediate and permanent damage, or reduced lifespan over the long term.
For this reason, many components adopt two methods of thermal protection: thermal throttling and thermal shutdown.
What is Thermal Throttling
Thermal throttling is typically associated with CPUs and GPUs, but other computer components that can overheat, such as solid-state drives (SSDs), also adopt this method of self-protection.
Some SSDs incorporate a built-in thermal sensor to help monitor their operating temperature. When the drive reaches extreme temperature conditions, it uses thermal throttling in an effort to prevent damage. In the case of an SSD, this results in reduced read/write throughput.
However, as stated previously, thermal throttling is most commonly used to refer to CPU and GPU protection, and will be the main focus of this article.
Processors have what is known as a throttle temperature setting, which dictates the maximum safe operating temperature. To find this maximum point, Intel, for example, recommends looking up a CPU’s Tjunction value, which is typically around 100°C, and suggests setting a value well below this (around 80°C or lower) when the CPU is under load.
When this temperature is exceeded, thermal throttling kicks in and the processor starts to reduce power consumption in an attempt to bring the temperature back down to within limits.
Power is reduced either by lowering the voltage supplied to the processor (and thus reducing the current drawn), or by lowering the processor’s operating frequency. This technique is known as dynamic voltage and frequency scaling (DVFS).
To lower the voltage, a DC-to-DC power converter, known as a Voltage Regulator Module (VRM), is used. Some CPUs come equipped with a built-in VRM; otherwise, it is found as a separate component on the motherboard or graphics card.
A processor’s operating frequency is reduced by scaling back the clock multiplier. Assuming a processor has a base clock frequency of 100 MHz, a multiplier of 32 would give an operating frequency of 3.2 GHz. When thermal throttling kicks in, the multiplier might, for example, be reduced by 25% to a value of 24, giving a new operating frequency of 2.4 GHz.
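The multiplier arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation: the function names are made up for this example, and the 25% reduction is the illustrative figure from the paragraph above, not a fixed rule that any particular processor follows.

```python
def effective_frequency_mhz(base_clock_mhz: float, multiplier: int) -> float:
    """Operating frequency is the base clock multiplied by the clock multiplier."""
    return base_clock_mhz * multiplier


def throttled_multiplier(multiplier: int, reduction: float) -> int:
    """Scale the multiplier down by a fractional reduction (0.25 means 25%)."""
    return int(round(multiplier * (1.0 - reduction)))


# Values taken from the example in the text: 100 MHz base clock, multiplier of 32.
normal_mhz = effective_frequency_mhz(100.0, 32)
reduced = throttled_multiplier(32, 0.25)
throttled_mhz = effective_frequency_mhz(100.0, reduced)

print(normal_mhz, reduced, throttled_mhz)  # 3200.0 24 2400.0
```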
Today’s computer systems and processors monitor and regulate their own performance, employing thermal throttling based on internal temperature and workload in the interests of self-preservation. Some systems, however, allow users to alter the operating voltage and frequency limits (e.g. via the BIOS), and thus bypass the designed safeguards.
Typical examples of such “tampering” are overclocking and overvolting, where the processor is often forced to operate at its upper limits, or at least beyond what the system manufacturer’s design intended. The operating frequency is increased, approaching a theoretical maximum, and/or the processor voltage level is increased, in an effort to obtain more processing power.
However, since most systems are not designed to continually operate at such extreme levels, overheating occurs and thermal throttling takes effect, even when elaborate cooling systems have been employed.
It should also be noted that when thermal throttling fails to reduce the temperature to a safe level, the processor will automatically employ thermal shutdown, before any permanent damage is caused.
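The two levels of protection described above can be pictured as a simple decision on the measured temperature. The sketch below is a simplified model, not how any real processor implements it: the function name and both threshold values are assumptions chosen for illustration, and shutdown is modelled as a second, higher threshold that is reached when throttling has failed to bring the temperature down.

```python
# Hypothetical thresholds for illustration only; real values are set by the
# processor manufacturer and vary between models.
THROTTLE_TEMP_C = 80.0
SHUTDOWN_TEMP_C = 100.0


def thermal_action(temp_c: float,
                   throttle_temp_c: float = THROTTLE_TEMP_C,
                   shutdown_temp_c: float = SHUTDOWN_TEMP_C) -> str:
    """Return the protective action for a given temperature reading."""
    if temp_c >= shutdown_temp_c:
        return "shutdown"   # throttling was not enough to keep the part safe
    if temp_c > throttle_temp_c:
        return "throttle"   # reduce voltage and/or frequency
    return "normal"         # within limits, run at full speed


for reading in (65.0, 85.0, 101.0):
    print(reading, thermal_action(reading))
```

With these assumed thresholds, the three sample readings produce "normal", "throttle" and "shutdown" respectively.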
Under the Microscope (What Initiates Thermal Throttling)
The transistors used to construct logic gates in today’s processors are fabricated on a 32 nm process (for example, the Intel Core i7-3970X). A nanometer is only about 10 atoms across, so at such microscopic scales we are approaching the realm of quantum physics, where phenomena like electron tunneling start to take effect.
Electron tunneling causes leaky transistor gates: electrons pass through the gates as if they weren’t there. Since gates are supposed to control the flow of electrons, this makes them ineffective and imposes a physical limit on how small transistors can be fabricated and how fast they can operate.
Every time a gate opens or closes, it produces heat, so the more gates there are and the higher the operating frequency of the processor, the more heat is created.
Consider an Intel Core i7-2600K processor operating at a clock speed of 1.6 GHz with a power consumption of 150 W at full load. When that same processor’s clock speed is increased to 4.8 GHz, the power consumption jumps to over 350 W.
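A common first-order approximation for the switching power of CMOS logic is P ≈ a × C × V² × f, where a is the fraction of gates switching, C is the switched capacitance, V is the supply voltage and f is the clock frequency. The sketch below uses that textbook model to show why raising frequency, and especially voltage, increases heat output. It ignores leakage and everything outside the processor, so it is not expected to reproduce the measured figures quoted above; the function names and the 20% voltage increase are assumptions for illustration.

```python
def dynamic_power_w(capacitance_f: float, voltage_v: float,
                    frequency_hz: float, activity: float = 1.0) -> float:
    """First-order CMOS switching power: P = activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def relative_power(frequency_ratio: float, voltage_ratio: float = 1.0) -> float:
    """How switching power scales when frequency and voltage are multiplied."""
    return frequency_ratio * voltage_ratio ** 2


# Tripling the clock (1.6 GHz -> 4.8 GHz) at an unchanged voltage triples
# switching power in this model.
print(relative_power(3.0))

# Overclocking usually needs more voltage as well; a hypothetical 20% increase
# compounds the effect because voltage enters the formula squared.
print(round(relative_power(3.0, 1.2), 2))
```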
Having reached a speed barrier, manufacturers have adopted other techniques to provide greater processing power, such as multiple cores and multi-threading. Having multiple cores is like having multiple processors in the system, but all on one physical chip.
Multi-core processors allow systems to cope better with increased workloads and with the number of tasks a computer is asked to carry out. Instead of a single core having to handle a game’s video rendering, audio generation, physics calculations, game-play logic and so on, a multi-core processor can assign each task to its own core.
While a simple application that uses just one core would not generate a lot of heat, a typical gameplay scenario like the one above, where all cores are operating at or close to their peak, would generate a significant amount of heat. When that heat approaches or reaches the maximum safe level, thermal throttling is initiated.
Effects of Thermal Throttling
The effects of thermal throttling are not always discernible and depend on the system and the applications it is running. Obvious and immediate effects are:
- an increase in fan speed (which usually equates to increased fan noise)
- sluggish response, especially on computer systems running computation intensive applications
- a drop in game-play frame rate (Frames Per Second) from the graphics card.
In the case of GPUs, artifacts may also appear in the rendered scenes. Artifacts are visual impairments and can range from undetectable to quite severe. Although artifacts are not the result of thermal throttling, they are an indication that the GPU may be operating very close to its thermal limits.
Thermal throttling may eventually kick in, but if artifacts are a common occurrence, they suggest that the processor is overheating, especially if it has been overclocked, and that it can suffer cumulative damage if the problem is left unchecked. Artifacts can also have other causes, such as faulty memory.
Preventing Thermal Throttling
While several factors in combination may cause a system to experience thermal throttling, usually correcting one of these factors (discussed below) will help remedy the situation.
The clock rate, or operating frequency, directly influences the heat output of any component. Keeping the frequency within designed specifications will help avoid thermal throttling. In fact, when thermal throttling kicks in, it lowers the clock rate in an attempt to control any overheating.
Cooling and Airflow
The Thermal Design Power (TDP), also known as the Thermal Design Point, refers to the maximum heat that a processor or component generates under real application loads and that its cooling system is designed to dissipate.
Regardless of the type of cooling system used, whether passive or active, air or liquid cooled, it should be engineered to more than cope with the heat output of the system.
It should also be noted that an effective exhaust system can greatly aid in avoiding thermal throttling, as it expels hot air from the enclosure and draws cool air in.
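As a rough illustration of what "more than cope" means, the sketch below compares a cooler's rated capacity with a component's heat output and checks for a safety margin. The function names, the 20% margin and the wattages are all assumptions made up for this example; they are not manufacturer guidance.

```python
def cooling_headroom_w(cooler_capacity_w: float, heat_output_w: float) -> float:
    """Spare cooling capacity in watts (negative means the cooler is undersized)."""
    return cooler_capacity_w - heat_output_w


def has_margin(cooler_capacity_w: float, heat_output_w: float,
               margin: float = 0.2) -> bool:
    """True if the cooler exceeds the heat output by at least the given fraction."""
    return cooler_capacity_w >= heat_output_w * (1.0 + margin)


# Hypothetical component producing 125 W of heat under load.
print(cooling_headroom_w(180.0, 125.0), has_margin(180.0, 125.0))  # 55.0 True
print(cooling_headroom_w(130.0, 125.0), has_margin(130.0, 125.0))  # 5.0 False
```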
Overclocking / Overvolting
Especially popular among gaming enthusiasts, overclocking and overvolting entail running the CPU/GPU at a higher frequency and/or voltage than it was originally designed for. This is almost always accompanied by replacing the stock cooling system with an enhanced one, usually liquid cooled.
Excessive overclocking or overvolting can lead to permanent damage, so it should be employed with caution. Even at best, any form of overclocking is likely to reduce the lifespan of a system.
Any system that runs for prolonged periods at a constantly high workload may, over time, generate a heat build-up that eventually leads to thermal throttling.
Over time, dust build-up on heatsinks or on the surface of components can also act as an insulator, trapping heat and restricting heat dissipation. It is therefore necessary to remove this dust periodically.
Thermal throttling is a built-in safety feature that preempts overheating and possible permanent damage. However, if a system is continually experiencing thermal throttling, then it may be an indication that the system is either being stretched beyond its limits, or that it may be in need of some maintenance.