What is CPU Cache? – The Hero of Speed
Central processing units, or CPUs, have had a major influence on how modern computing works. Most of the logic and everyday work on a personal computer or mobile phone is handled by the CPU and, depending on the task, by a graphics processor, commonly referred to as a GPU. CPU speed has often been attributed to clock rate, which today is measured in gigahertz (GHz).
Equating clock rate with speed is a common misconception. AMD's Bulldozer family of CPUs, for example, reached 5 GHz in 2013, yet did not perform as well as competing chips despite the high clock speed. The weakness was in the design: pairs of cores shared a single floating-point unit (FPU) as well as cache. The latter matters more, especially for everyday use.
Manufacturers often talk about improved or larger caches, and the numbers might seem impressive. Without knowing what cache actually is, those numbers mean little to the average user. Examining cache in detail will help in understanding them.
What is a CPU Cache? – The Basics
Latency-sensitive work, which covers most of what a CPU does in modern computing, needs fast memory in order to actually feel fast. While clock speed plays an important role in how quickly a CPU performs a requested operation, the CPU cache often plays an even bigger one.
A CPU cache is built from static random-access memory (SRAM) and stores frequently requested data and instructions so that the CPU can access them faster. What is kept is decided in hardware, by replacement and prefetching policies that track which addresses a program has used recently and which it is likely to need next.
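The idea of keeping recently used data and evicting the rest can be sketched with a least-recently-used (LRU) policy. This is only an illustration: real CPUs use hardware approximations of such policies, and the function name, capacity and address trace below are all made up for the example.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity):
    """Count hits and misses for an address trace in a tiny LRU cache."""
    cache = OrderedDict()  # keys are cached addresses, least recently used first
    hits = misses = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)  # mark as most recently used
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits, misses

# Hypothetical trace: a small loop that keeps touching the same three addresses.
trace = [0, 1, 2, 0, 1, 2, 0, 1, 2, 7]
print(simulate_lru(trace, capacity=4))  # prints (6, 4)
```

With the same loop and a capacity of only 2, every access misses, which is why a cache that is slightly too small for a program's working set can perform far worse than one that just fits it.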
A loose analogy is your favorite browser. Opening it a second time during one session is faster than the first, because much of what it needs is already held in faster storage (in that case mostly system memory rather than the CPU cache). The same principle applies inside the CPU. Depending on many factors, the data and instructions of your browser, or video game, for that matter, end up in different tiers, or levels, of cache while the program runs.
Caches became essential because CPU clock speeds started rocketing in the 1980s while memory speeds lagged behind. SRAM is used for caches because it is faster than the dynamic RAM (DRAM) used for main memory, and because a small cache can be placed physically close to the CPU cores, usually on-die. Both properties mean much lower latency.
Latency is a big issue in computing: distance, speed and other factors determine how quickly an operation completes. To reduce latency and improve performance, several levels of cache were created. Most processors have Level 1 and Level 2 caches, and larger ones add Level 3; these are abbreviated L1, L2 and L3.
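The effect of stacking cache levels on latency can be estimated with a simple average-access-time calculation. The sketch below assumes each access checks the levels in order and pays the latency of every level it reaches; the latencies and hit rates are hypothetical round numbers, not measurements of any real processor.

```python
def average_access_time(levels, memory_latency_ns):
    """Estimate the average latency of one memory access.

    levels: list of (hit_latency_ns, hit_rate) tuples, fastest level first.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency_ns, hit_rate in levels:
        total += reach * latency_ns   # every access reaching this level pays its latency
        reach *= (1.0 - hit_rate)     # only the misses continue to the next level
    total += reach * memory_latency_ns  # whatever is left goes to main memory
    return total

# Hypothetical figures: L1, L2, L3, then 100 ns main memory.
with_cache = average_access_time([(1, 0.9), (4, 0.8), (20, 0.5)], 100)
without_cache = average_access_time([], 100)
print(round(with_cache, 2), without_cache)
```

Under these assumed numbers only about 1 access in 100 reaches main memory, so the average stays within a few nanoseconds instead of the full 100.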
CPU Cache Levels – The Hierarchy of Data
When caches first reached mainstream processors, in the late 1970s and early 1980s, most had just a single level, now called L1. Even today, L1 caches hold only tens of kilobytes per core, and are usually split into two halves: L1D for data and L1I for instructions.
The L1 cache is present on virtually all modern processors and is the most important cache for operations, reserved for data that needs the lowest latency. Because it is smaller than L2 and sits closest to the execution units, it is the fastest. Typically, every CPU core has its own L1 cache. Sharing L1 between cores is undesirable, as contention between the cores would add latency.
When a CPU finds the data it needs in L1, that is a cache hit. When it does not, that is a cache miss, and it has to look for the data elsewhere, starting with the next tier of cache. For most processors, that tier is L2.
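The hit-or-miss walk through the tiers can be sketched as a loop over an ordered list of caches. This is a deliberately simplified model: the `lookup` function and tier contents are hypothetical, each tier is an unbounded set with no eviction, and a missed address is simply copied into every tier that missed.

```python
def lookup(address, tiers):
    """Return the name of the first tier holding `address`, else "memory".

    tiers: ordered list of (name, set_of_addresses), fastest tier first.
    """
    missed = []
    for name, contents in tiers:
        if address in contents:
            served_by = name  # cache hit in this tier
            break
        missed.append(contents)  # cache miss, try the next tier
    else:
        served_by = "memory"  # missed in every tier
    for contents in missed:
        contents.add(address)  # fill the tiers that missed
    return served_by

tiers = [("L1", {0x10}), ("L2", {0x10, 0x20}), ("L3", {0x10, 0x20, 0x30})]
for addr in (0x10, 0x20, 0x40, 0x40):
    print(hex(addr), "->", lookup(addr, tiers))
```

The second request for `0x40` is served from L1, because the first one, which went all the way to memory, filled the caches on its way back.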
L2 cache is sometimes shared between two cores, but on desktop processors each core more often has its own. L2 is larger than L1, typically from a few hundred kilobytes to a couple of megabytes per core. Whether L2 also holds a copy of what is in L1 depends on the design: an inclusive L2 duplicates L1's contents, while an exclusive one does not. If data cannot be found in L2, the processor moves down the line, to L3.
L3 is often the largest cache on a processor and, like the others, it is on-die. Unlike them, it is most often shared between cores, simply because it is very large. AMD Zen 2 desktop processors, for example, have up to 64 MB of L3 in total, 32 MB on each chiplet of 6 to 8 cores. L3 catches larger working sets, such as those of video games, that do not fit in L1 or L2.
Some processors have an L4 cache, but not many. Where it exists, L4 is usually built from DRAM rather than SRAM, for example the embedded DRAM on some Intel processors with Iris Pro integrated graphics, where it is shared between the CPU and the GPU. Integrated graphics without such a cache rely on system memory, and one of the reasons they are slow is the greater physical distance to that memory and the lower speed of DRAM.
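To see which of these tiers a particular machine has, Linux exposes per-core cache details under sysfs. The sketch below assumes a Linux system with the usual `/sys/devices/system/cpu/cpu0/cache` layout; the `read_cache_info` helper is a name chosen for this example, and it returns an empty list on systems where that directory is missing.

```python
from pathlib import Path

def read_cache_info(cpu_cache_dir="/sys/devices/system/cpu/cpu0/cache"):
    """Return a list of dicts describing each cache visible to one CPU core."""
    caches = []
    base = Path(cpu_cache_dir)
    if not base.is_dir():
        return caches  # not Linux, or sysfs does not expose cache details
    for index_dir in sorted(base.glob("index*")):
        entry = {}
        for field in ("level", "type", "size", "shared_cpu_list"):
            path = index_dir / field
            # Some fields can be absent, for example inside virtual machines.
            entry[field] = path.read_text().strip() if path.is_file() else None
        caches.append(entry)
    return caches

for cache in read_cache_info():
    print(
        f"L{cache['level']} {cache['type']}: {cache['size']}, "
        f"shared with CPUs {cache['shared_cpu_list']}"
    )
```

On a typical desktop this lists separate L1 data and instruction caches, an L2, and an L3 whose `shared_cpu_list` covers several cores, matching the hierarchy described above.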
CPU Cache and Overclocking – Does it Matter?
Some overclockers tinker with the CPU cache ratio and CPU cache voltage, in effect overclocking their cache. The problem is that cache overclocking typically does little beyond raising temperatures, which calls for better cooling, and potentially reducing stability.
Overclocking the cache is not recommended unless one is competing for a world record. Extreme overclocks tend to be unstable and unsustainable, which can cause crashes and, in rare cases, data loss due to bit flips and write errors. Any professional workstation, or even a gaming rig, would suffer from an unstable overclock.
Conclusion and Summary – Cached Data
Cache is very fast memory in which the processor keeps the data and instructions it is likely to need. It is split into tiers, most often L1, L2 and L3: the most frequently needed data sits in the first level, while larger and less urgently needed data sits in the levels below.
A cache hit is when the processor finds the data it needs, while a cache miss moves the search to the next tier, until either a hit is made or system memory is reached. Cache is most often on-die, very close to the CPU cores. The exception is L4, where present, which can be off-die and built from DRAM.
Finally, cache overclocking is not recommended, unless the overclocker is an expert and either is looking to break a record or is up to the challenge of stability testing.