What is L2 (Level 2) Cache?
L2 (Level 2) cache is a bank of fast memory built into the CPU. It is a bit slower than the L1 cache and is the memory that the processor checks when the L1 cache does not hold what it needs.
The L2 cache usually sits outside the CPU core but on the same chip, which is why it is also referred to as external cache or secondary cache.
- The Level 2 cache is slightly bigger in size than the L1 cache but is also a bit slower than it.
- The fact that the L2 cache is located outside the CPU core allows increasing its size.
- The L2 cache comes with several hardware prefetch units that are complicated but help it observe and handle sequential and strided access patterns.
- The L2 cache is a combined data and instruction cache unlike the L1 cache that has a separate data cache and instruction cache.
Understanding L2 (Level 2) Cache
Level 2 memory cache is a high performing and specialized computer memory typically located on the die of the processor.
It is so called because it is the memory that the CPU tries second, when it cannot find the data or instruction it needs in the Level 1 memory cache.
The Level 2 memory cache is therefore the CPU's second choice, after the Level 1 cache, when it looks for the instructions needed to complete a task. But that does not mean it is not important.
The Level 2 cache may not be as fast as the Level 1 cache of the CPU which is located in the core of the processor but it is quite important for the overall performance of the computer system.
More importantly, this cache memory is still much faster in comparison to the main memory.
The best part of the Level 2 memory cache is that the capacity of this cache memory can be increased easily due to the fact that it is located outside the core of the CPU.
You will often find the Level 2 cache described as a shared cache.
However, that label does not apply in every case.
There are several different Level 2 memory caches that are considered to be unified caches since they cache both data and instructions.
However, not all L2 caches are unified in nature.
Also, there are a few multi-core processor groups that typically share only one Level 2 memory cache among numerous CPU cores.
Ideally, all types of memory caches, including the level 2 memory cache, are designed to expedite the back and forth movement of data and information between the CPU and the main memory.
The time needed for such movements is called the latency.
This latency is lowest in the case of the Level 1 memory cache which makes it fastest of all, middling in the Level 2 cache, and highest in the Level 3 memory cache, making it the slowest of all.
Latency increases when there is a cache miss, since the CPU then needs to look for the data somewhere else.
However, the cost of a miss is lower if the other components of the computer are faster and more efficient.
Therefore, a faster SSD (Solid State Drive) and DDR4 RAM will reduce latency and make the entire system faster.
The Level 2 cache was first introduced by Intel in the computers that were powered by Pentium and Pentium Pro processors.
Since then, this cache memory has been included in every processor apart from the earlier versions of the Celeron processors.
The ‘Cache of a Cache’ Concept
Sometimes, the Level 2 memory cache is referred to as the ‘cache of a cache.’
Usually, in the modern CPUs you will have a multiple level memory hierarchy such as CPU⇒L1⇒L2⇒L3⇒DRAM, in that order.
As this order shows, only the Level 3 memory cache has a direct connection with the external memory. The L1 memory cache and the L2 memory cache do not, because both of them sit higher in the memory hierarchy.
Any traffic between them and the DRAM (Dynamic Random Access Memory) has to pass through the Level 3 memory cache.
Moreover, there are several caches that utilize an inclusive distribution discipline. In this specific arrangement, every level of the memory hierarchy stores a subset of the data or instruction that is stored in the levels below.
This means that everything that you will find in the Level 1 memory cache will also be available in the L2 memory cache, and everything that is stored in the Level 2 cache is also found in the L3 memory cache.
Therefore, such an inclusive scheme supports the concept of a ‘cache of a cache’, since the L1 memory cache holds a subset of the contents of the L2 cache, which in turn holds a subset of the contents of the Level 3 memory cache.
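To make the inclusive scheme more concrete, here is a minimal Python sketch of a two-level hierarchy in which everything held in L1 is also held in L2. The class name `InclusiveTwoLevel`, the tiny line counts and the LRU replacement policy are all assumptions chosen for illustration; they do not describe any particular CPU.

```python
from collections import OrderedDict


class InclusiveTwoLevel:
    """Toy L1/L2 pair in which L1 contents stay a subset of L2 (hypothetical model)."""

    def __init__(self, l1_lines, l2_lines):
        self.l1 = OrderedDict()  # line address -> None, oldest entry first
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines

    def access(self, line):
        if line in self.l1:
            self.l1.move_to_end(line)  # refresh LRU position in L1 only
            return "L1 hit"
        if line in self.l2:
            self.l2.move_to_end(line)
            result = "L2 hit"
        else:
            result = "miss"
            if len(self.l2) >= self.l2_lines:
                victim, _ = self.l2.popitem(last=False)
                # Back-invalidate: a line that leaves L2 must leave L1 too.
                self.l1.pop(victim, None)
            self.l2[line] = None
        if len(self.l1) >= self.l1_lines:
            self.l1.popitem(last=False)  # the L1 victim simply stays in L2
        self.l1[line] = None
        return result


cache = InclusiveTwoLevel(l1_lines=2, l2_lines=3)
for line in [1, 2, 1, 3, 4]:
    cache.access(line)
    assert set(cache.l1) <= set(cache.l2)  # inclusion holds after every access
```

In this run, line 1 is removed from L1 when L2 evicts it, even though L1 used it recently. That back-invalidation is the price the toy model pays for keeping the subset property.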
The inclusivity aspect of the caches is quite complex to understand.
You may note that not all L2 caches, or memory caches in general for that matter, are inclusive.
For example, the first few generations of the AMD 64 CPUs were known to use the exclusive scheme between a Level 1 memory cache and a Level 2 memory cache.
Some Acorn RISC Machine or ARM interconnects, meanwhile, use the same exclusive discipline between the L2 memory cache and the Level 3 memory cache.
In some cases, the different caches may be weakly inclusive as well.
This means that a cache miss may allocate a line in the L1 memory cache, the L2 memory cache, and the L3 memory cache.
However, if there is an eviction from the Level 3 memory cache it does not mean it will be evicted from the L2 memory cache as well.
Similarly, if there is an eviction from the Level 2 memory cache, it may not be evicted from the L1 cache.
This dissimilarity matters more in particular situations such as:
- When you get into multiprocessor systems
- When you use coherence protocols
- When you analyze data access interactions and instruction fetch.
Therefore, inclusivity depends on the cache discipline.
For example, in simple systems, the L1 memory cache may be strictly inclusive in, or tend to be inclusive in, or strictly exclusive with Level 2 memory cache.
The same idea can be extended to L2 cache and L3 memory cache as well.
And, in a hybrid system, there may be strict exclusivity between Level 1 and Level 2 memory cache but L1 and L2 may be strictly inclusive in Level 3 memory cache.
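For contrast with the inclusive case, the strictly exclusive arrangement between L1 and L2 can be sketched as follows. This is a toy model under stated assumptions: the class name `ExclusiveTwoLevel`, the LRU policy and the tiny capacities are hypothetical, and real designs differ in many details.

```python
from collections import OrderedDict


class ExclusiveTwoLevel:
    """Toy L1/L2 pair in which a line lives in L1 or L2, never both (hypothetical model)."""

    def __init__(self, l1_lines, l2_lines):
        self.l1 = OrderedDict()  # line address -> None, oldest entry first
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines

    def access(self, line):
        if line in self.l1:
            self.l1.move_to_end(line)
            return "L1 hit"
        if line in self.l2:
            del self.l2[line]  # the line moves up to L1, so it leaves L2
            result = "L2 hit"
        else:
            result = "miss"
        if len(self.l1) >= self.l1_lines:
            victim, _ = self.l1.popitem(last=False)
            if len(self.l2) >= self.l2_lines:
                self.l2.popitem(last=False)  # make room for the L1 victim
            self.l2[victim] = None  # L2 is filled by lines evicted from L1
        self.l1[line] = None
        return result


cache = ExclusiveTwoLevel(l1_lines=2, l2_lines=2)
for line in [1, 2, 3, 4]:
    cache.access(line)
outcome = cache.access(1)  # line 1 was pushed down into L2 earlier
```

Because no line is duplicated, the two levels together hold four distinct lines here, which is the main attraction of an exclusive scheme when the L2 cache is not much larger than the L1 cache.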
At this point you should remember that a cache layer may be a combination of a cache and a snoop filter. In that case, strict inclusivity is required of the cache and the snoop filter taken together and not of the cache alone.
However, the flexibility offered will be much more when the Level 2 memory cache does not enforce strict inclusivity.
There are also a few technical aspects of inclusivity that are good to know at this point.
The most important one is that if the Level 1 memory cache is inclusive in the Level 2 memory cache, which means that every line of the L1 cache is also held in the L2 cache, then the L2 cache should be of an organization and size that allow the L1 cache to reach its full potential.
This becomes quite complex when the Level 2 memory cache is a unified data and instruction cache and inclusivity is enforced for either or both of the data and instruction caches of the Level 1 memory cache.
Also, when the capacity of the L2 cache is increased, its hit latency increases at the same time.
In an L1 cache, extra latency translates to either a much lower maximum clock frequency (fmax) or a deeper load pipeline.
In the case of the Level 2 memory cache, it translates to a deeper load pipeline.
This is because designers are unlikely to lower the main clock of the CPU.
Instead, in order to accommodate the Level 2 cache, they may either lower the L2 clock or add more pipeline stages.
Where the Level 2 memory cache is shared between different cores in the processor, there is a specific way in which it is shared.
A shared L2 cache services the L1 cache misses of any of the cores that share it.
However, it will depend on several factors as to how it will be filled by the L2 cache. Some of these factors are:
- The inclusivity or exclusivity of the Level 2 cache with the L1 memory cache
- How coherency is handled between the separate L1 caches in each core and the shared L2 memory cache, and more.
Whatever else you do, increasing the capacity of the L2 cache increases both the hit rate and the hit latency.
This means that the gains in hit rate show diminishing returns, and if the capacity is made too large, the added latency will hurt the performance of the Level 2 cache.
Ideally, anything more than 50% should be considered to be a good hit rate for the Level 2 memory cache for a few particular types of workloads if neither it nor the Level 1 cache is very large.
For the sake of comparison, a good hit rate for the L1 memory cache is more than 90%, preferably more than 95%.
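To see why a 50% hit rate in the L2 cache is still worth having, it helps to work out the average access time. The sketch below uses the hit rates quoted above and the 3-cycle and 12-cycle latencies mentioned elsewhere in this article; the 200-cycle cost of going to main memory and the function name `average_access_cycles` are assumptions made purely for illustration.

```python
def average_access_cycles(levels, memory_cycles):
    """Average cycles per access for a chain of cache levels.

    levels: list of (hit_latency_cycles, hit_rate) pairs, ordered from L1 outward.
    memory_cycles: assumed cost of going to main memory after the last miss.
    """
    total = 0.0
    reach = 1.0  # fraction of all accesses that get as far as this level
    for hit_latency, hit_rate in levels:
        total += reach * hit_latency  # every access reaching a level pays its latency
        reach *= 1.0 - hit_rate       # only the misses continue outward
    return total + reach * memory_cycles


l1_only = average_access_cycles([(3, 0.95)], memory_cycles=200)
with_l2 = average_access_cycles([(3, 0.95), (12, 0.50)], memory_cycles=200)
print(round(l1_only, 2), round(with_l2, 2))  # prints 13.0 8.6
```

Under these assumed numbers, an L2 cache that catches only half of the L1 misses still cuts the average access time by roughly a third, because only 2.5% of all accesses now travel all the way to main memory.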
The Level 2 cache is usually bigger in size than the L1 cache.
Usually, the size of a Level 2 cache is measured in kilobytes (KB), but some modern L2 memory caches are large enough to be measured in megabytes (MB).
The size of the L2 cache usually varies depending on the type and model of the CPU.
For example, the high-end AMD Ryzen 5 5600X CPU comes with just 384 KB L1 cache, 3 MB L2 cache, and a 32 MB L3 cache.
However, in general, it usually ranges between 256 KB and 8 MB, where most of the modern processors come with more than 256 KB Level 2 cache.
To give you an idea, here are some implementations from Intel, AMD and IBM, with the size of the Level 2 memory cache found in each:
- 256 KB in each core of the Intel Broadwell 2014 microarchitecture and the Intel Kaby Lake 2016 microarchitecture
- 512 KB in each core of the AMD Zen 2017 microarchitecture, which is 8-way set associative and inclusive
- 512 KB in each core of the AMD Zen 2 2019 microarchitecture, which is also 8-way set associative and inclusive, and
- 256 KB in the IBM Power 7 in a 128B block with 2 nanosecond access latency and write back feature which is also inclusive of L1.
Still, this size today is considered to be pretty small and therefore, some of the more modern and more powerful microprocessors come with a Level 2 memory cache that is much more than the upper limit of 8 MB mentioned above.
As a rule of thumb, the Level 2 memory cache is about 8× to 16× the size of the Level 1 data cache in many processors.
However, that may not always be the case.
There may be some instances where the size of the Level 2 memory cache may be either larger or smaller than the L1d cache.
Ideally, there are some sweet spots in the sizes of the Level 1 and Level 2 caches, and they are determined by the working set of the intended applications.
A large ratio will benefit number crunching and database workloads, while a smaller one may benefit latency sensitive code.
However, the Level 2 memory cache should be quite a bit larger than the L1 memory cache in order to make a significant difference to the hit rate.
The 8× to 16× range mentioned above usually falls out of the trade-off between latency and the amount of SRAM (Static Random Access Memory) available.
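As a quick check of the 8× to 16× range, the Ryzen 5 5600X totals quoted earlier can be reduced to per-core figures. The sketch assumes six cores and an even split of each core's L1 cache between instructions and data; neither detail is stated in this article, and the helper name `l2_to_l1d_ratio` is hypothetical.

```python
def l2_to_l1d_ratio(total_l1_kb, total_l2_kb, cores, l1_data_fraction=0.5):
    """Per-core ratio of L2 size to L1 data cache size.

    l1_data_fraction is an assumption: the share of L1 used for data
    rather than instructions.
    """
    l1d_per_core = total_l1_kb / cores * l1_data_fraction
    l2_per_core = total_l2_kb / cores
    return l2_per_core / l1d_per_core


# 384 KB of L1 and 3 MB (3 * 1024 KB) of L2 in total, assumed spread over 6 cores.
ratio = l2_to_l1d_ratio(total_l1_kb=384, total_l2_kb=3 * 1024, cores=6)
print(ratio)  # prints 16.0
```

Under these assumptions each core has 512 KB of L2 cache for 32 KB of L1 data cache, which lands at the top of the range.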
Other than the speed and size of the Level 2 memory cache, there are some unique features that are also very important for you to know.
These features are completely different from that of the Level 1 cache which makes the L2 cache quite good as well.
The slower, cheaper but larger L2 cache will offer all those benefits that a larger L1 cache would have offered but not at the cost of power consumption or die size.
The Level 2 memory cache has a longer access time, or load-to-use time, which is usually between 12 and 20 cycles as opposed to only about 3 cycles for the Level 1 memory cache.
The Level 1 memory caches are able to handle one write and two reads in a pipelined fashion from the processor in every cycle due to the larger number of ports.
There are also paths in them for the new data flowing from the L2 cache and the victims leaving.
However, the Level 2 memory cache typically has one read/write port for the memory systems and one for the processor to handle the fills, snoops and victims.
The Level 2 caches are typically combined data and instruction caches as opposed to the L1 caches that are normally specialized for data or instructions.
The Level 1 caches use an addressing structure in which they are virtually indexed and physically tagged.
This means that, for example, a 32 KB 8-way set associative cache with 64-byte lines can be indexed in parallel with the lookup in the TLB or Translation Lookaside Buffer.
This boosts its speed. However, a Level 2 memory cache will be completely physically addressed and have no such structural requirements.
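The reason the 32 KB, 8-way, 64-byte-line example works is that each way covers exactly one page, so the bits that select a set are the same in the virtual and the physical address. Here is a small sketch of that check. The 4 KB page size and the function name `index_fits_in_page_offset` are assumptions, and the sizes are expected to be powers of two.

```python
def index_fits_in_page_offset(cache_bytes, ways, line_bytes=64, page_bytes=4096):
    """True if the set index and line offset bits all lie inside the page offset.

    When this holds, the cache can pick a set using address bits that
    translation does not change, while the TLB looks up the physical tag.
    """
    num_sets = cache_bytes // (ways * line_bytes)
    index_and_offset_bits = (num_sets * line_bytes).bit_length() - 1
    page_offset_bits = page_bytes.bit_length() - 1  # 12 bits for an assumed 4 KB page
    return index_and_offset_bits <= page_offset_bits


small_l1 = index_fits_in_page_offset(32 * 1024, ways=8)    # 64 sets: 6 + 6 = 12 bits
bigger_l1 = index_fits_in_page_offset(64 * 1024, ways=8)   # 128 sets: 7 + 6 = 13 bits
wider_l1 = index_fits_in_page_offset(64 * 1024, ways=16)   # 64 sets again: 12 bits
```

This is one reason why L1 caches tend to grow by adding ways rather than sets, and why the much larger L2 cache, which is physically addressed, is free of the constraint.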
The Level 2 memory caches can track the addresses that are stored in the L1 cache.
This allows them to filter the snoops from the memory system so that it does not have to bother the L1 caches due to snoop misses.
Such type of tracking is not required by the Level 1 memory cache because it is the first point of contact for the CPU.
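The snoop filtering described above can be sketched as a simple set of tracked line addresses. The class `SnoopFilter` and its method names are hypothetical, and a real filter is a fixed-size hardware structure rather than an unbounded set.

```python
class SnoopFilter:
    """Toy model of an L2 cache tracking which lines its L1 cache holds (hypothetical)."""

    def __init__(self):
        self.lines_in_l1 = set()
        self.forwarded = 0  # snoops that had to disturb the L1 cache
        self.filtered = 0   # snoops answered without touching the L1 cache

    def l1_fill(self, line):
        self.lines_in_l1.add(line)

    def l1_evict(self, line):
        self.lines_in_l1.discard(line)

    def snoop(self, line):
        """Return True if the snoop must be forwarded to the L1 cache."""
        if line in self.lines_in_l1:
            self.forwarded += 1
            return True
        self.filtered += 1
        return False


snoop_filter = SnoopFilter()
snoop_filter.l1_fill(0x1000)
snoop_filter.l1_fill(0x1040)
snoop_filter.l1_evict(0x1040)
results = [snoop_filter.snoop(line) for line in (0x1000, 0x1040, 0x2000)]
```

Only the first of the three snoops reaches the L1 cache in this example; the other two are snoop misses that the L2 cache absorbs on its own.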
When it comes to atomic operations with locking, the Level 1 cache needs support from the hardware, whereas the Level 2 memory cache does not need such support.
Moreover, unlike the L1 cache, the L2 cache does not need to support partial line writes.
The Level 1 memory caches are designed to deal with unaligned accesses, including those that cross line boundaries.
The Level 2 memory caches do not need to deal with such accesses. Handling them is typically part of the CPU logic rather than part of the cache.
The hardware prefetch unit of the L1 caches is nothing complicated since it is limited only for one line pair.
However, in comparison, the Level 2 memory caches come with multiple, rather complicated hardware prefetch units that help them observe and deal with strided or sequential access patterns.
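To give a feel for what such a prefetch unit does, here is a minimal sketch of stride detection. It is a deliberate simplification: the function name `stride_prefetch`, the rule of waiting for the same stride twice in a row, and the `degree` parameter are assumptions for illustration, not a description of any vendor's prefetcher.

```python
def stride_prefetch(addresses, degree=2):
    """For each access, list the addresses a simple stride prefetcher would request.

    A prefetch is issued only when the distance to the previous access
    matches the distance seen one step earlier.
    """
    issued_per_access = []
    last_address = None
    last_stride = None
    for address in addresses:
        issued = []
        if last_address is not None:
            stride = address - last_address
            if stride != 0 and stride == last_stride:
                # Pattern confirmed: run ahead by `degree` strides.
                issued = [address + stride * step for step in range(1, degree + 1)]
            last_stride = stride
        last_address = address
        issued_per_access.append(issued)
    return issued_per_access


sequential = stride_prefetch([0, 64, 128, 192])
irregular = stride_prefetch([0, 64, 200, 13])
```

With 64-byte steps the pattern is confirmed at the third access, so the sketch requests addresses 192 and 256 ahead of time. The irregular trace never repeats a stride, so nothing is prefetched.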
If there is an L1 cache miss, the CPU will fall back on the fill buffer and the store buffer to handle it.
However, the L2 cache needs to track all outstanding requests from the CPU and the prefetch hardware, as well as snoops from other cores.
This tracking logic can be incorporated into the L2 cache itself or into a bus interface component that connects it with the other parts of the memory system.
The design of the Level 2 memory caches allows adding Error Correction Code or ECC to it comparatively easily.
This is because the L2 cache handles the whole line reads and writes.
Adding ECC is a sensible decision because it lowers the impact of memory errors, whose probability depends directly on the cell design and the scale of the manufacturing process.
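The idea behind ECC can be shown with the smallest textbook code, Hamming(7,4), which stores 4 data bits with 3 check bits and can repair any single flipped bit. This is only an illustration of the principle. The article does not say which code an L2 cache uses, and real designs typically protect much wider words.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    code = [0] * 8  # index 0 is unused so that list indices match bit positions
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def hamming74_correct(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was seen."""
    code = [0] + list(codeword)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position  # a valid codeword XORs to zero
    if syndrome:
        code[syndrome] ^= 1  # the syndrome is the position of the flipped bit
    return [code[3], code[5], code[6], code[7]], syndrome


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single-bit upset at position 5
recovered, error_position = hamming74_correct(stored)
```

Because the L2 cache reads and writes whole lines, check bits like these can be computed once per line transfer, which is what makes ECC comparatively easy to add there.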
How Does It Work?
The Level 2 cache is a small pool of memory or data that the CPU will access next if it does not find it in the L1 cache.
However, an advanced algorithm and a few assumptions about the program code determine which information is to be stored in the L2 cache.
The main idea behind all this is to ensure a cache hit, which means that the CPU finds the necessary data and information when it comes looking for it in the Level 2 memory cache and does not have to go scampering for it somewhere else, a phenomenon called a cache miss.
The larger size of the Level 2 memory cache means that a lot of data can be stored in it, which reduces the chances of a cache miss.
The larger size therefore increases the hit rate.
In a fully associative cache, any block of the RAM data can be stored in any block of the cache.
This also increases the hit rate of the L2 memory cache. However, the time taken can be longer because the processor will have to search the entire cache to find the required data.
If the data is not there, it will have to move on to the L3 cache and further to the DRAM.
Direct-mapped L2 caches may allow faster fetching since each block of the main memory can go into only one specific cache block.
However, this fixed mapping also reduces the hit rate, because memory blocks that map to the same cache block keep evicting each other.
The n-way associative L2 memory caches lie in between these two types of caches.
If the cache is a 2-way associative cache, each main memory block will be mapped in one of the two cache blocks and if it is an 8-way associative cache, it will be mapped in one of the 8 cache blocks.
All these arrangements aim to improve the hit rate, and how well each one does so is different for different applications.
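The difference between the three arrangements shows up clearly in a small simulation. The sketch below counts hits for a cache of a given size and associativity. The function name `count_hits`, the LRU replacement policy and the choice of `block % num_sets` as the mapping are assumptions made for illustration.

```python
from collections import OrderedDict


def count_hits(block_addresses, num_blocks, ways):
    """Count hits in a cache with `num_blocks` blocks split into sets of `ways` blocks.

    ways=1 is direct-mapped, ways=num_blocks is fully associative.
    """
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for block in block_addresses:
        lines = sets[block % num_sets]  # each memory block belongs to exactly one set
        if block in lines:
            hits += 1
            lines.move_to_end(block)  # mark as most recently used
        else:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = None
    return hits


trace = [0, 8, 0, 8, 0, 8, 0, 8]  # two memory blocks that collide when direct-mapped
direct_mapped = count_hits(trace, num_blocks=8, ways=1)
two_way = count_hits(trace, num_blocks=8, ways=2)
fully_associative = count_hits(trace, num_blocks=8, ways=8)
```

On this trace the direct-mapped cache never hits, because blocks 0 and 8 keep replacing each other, while both the 2-way and the fully associative cache miss only on the first two accesses. A different trace could favor a different arrangement, which is why the best choice depends on the application.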
Data flows in the opposite direction to the order in which the CPU searches the caches.
This means the data flows from the RAM to the Level 3 memory cache, then to the L2 cache and then finally to the L1 memory cache.
As for the CPU, it first looks for the data in the L1 cache and, if it is not available there, it will then look for it in the L2 cache, the L3 cache and the main memory of the system, in that particular order.
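The search order and the return path of the data can be sketched in a few lines. This is a simplified model: the function name `find_data` is hypothetical, each cache level is just a set of addresses, and capacity limits and evictions are ignored.

```python
def find_data(address, levels):
    """Search the levels from fastest to slowest and report where the data was found.

    levels: list of (name, set_of_addresses) pairs ordered L1, L2, L3.
    On the way back, the data is copied into every level closer to the CPU.
    """
    for depth, (name, contents) in enumerate(levels):
        if address in contents:
            source = name
            break
    else:
        source = "RAM"  # missed in every cache level
        depth = len(levels)
    for _, contents in levels[:depth]:
        contents.add(address)  # fill the closer levels as the data flows toward the CPU
    return source


l1, l2, l3 = set(), {0x40}, {0x40, 0x80}
hierarchy = [("L1", l1), ("L2", l2), ("L3", l3)]
first = find_data(0x40, hierarchy)   # found in L2, then copied into L1
second = find_data(0x40, hierarchy)  # now served straight from L1
third = find_data(0xC0, hierarchy)   # in no cache, so it comes from RAM
```

The second lookup of the same address is the case every cache level is built for: the data is already waiting one step closer to the CPU than it was the first time.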
What Does a Level 2 Cache Do?
The Level 2 cache actually helps the CPU to find the necessary data for its operation from the memory.
However, it does not have to go to the main system memory for that. It can find it in one of the caches in between.
Therefore, the cache acts as a bridge across the gap between the speed of the processor and the speed of the main memory.
In that way it reduces the time taken to access such data, delays, interruptions and wait states.
It also saves the CPU the hassle of loading previously accessed data from the main memory over and over again every time it needs to be accessed.
Therefore, in short, an L2 memory cache reduces memory access by buffering data that is used repeatedly.
This eventually provides substantial performance gains by the computer system overall.
The microprocessors that you find today sometimes come with a specific feature known as data prefetching.
The job of the Level 2 cache is to improve this feature.
This is done by buffering the different data and instructions of the program that are requested from the memory and required by the processor.
It actually serves as the nearer waiting area as compared to the Random Access Memory.
Where is L2 Cache Located?
The L2 memory cache is a separate memory cache that is typically located outside the core of the microprocessor chip.
However, it is found on the same package of the processor chip.
This specific L2 memory cache is placed closer to the CPU which is why it is much faster in comparison to the Level 3 cache.
However, it is not located in the core as the Level 1 memory cache and therefore is not as fast as the L1 cache.
In the past, the Level 2 cache was located on the motherboard, and, maybe for that reason, it was quite slow.
However, it is now common practice to place it on the microprocessor chip.
Even so, strictly speaking, a Level 2 memory cache is not to be considered a part of the processor core itself.
L2 Cache Speed
As said earlier, the Level 2 memory cache lags behind the Level 1 memory cache in terms of speed but is much faster than the L3 cache or the RAM in your system.
Compared with the system RAM, the L1 memory cache is usually found to be about a hundred times faster and the L2 cache about 25 times faster.
Is L2 Cache Good?
Yes, the L2 cache is quite good and necessary in a CPU as it offers a boost in the performance level.
The impact on performance is quite significant, which makes it an important factor, especially on x86 microprocessors.
If you disable the L2 cache, it can lower the overall performance of the system even more than disabling one of the cores of a dual core processor!
Therefore, you can understand its importance.
However, at this point it is also good to note that the L2 cache memory is not the only factor that may affect the performance of the system.
It is just one of the many important ones.
Therefore, after reading this article, you now know that the second level cache memory helps the CPU to perform better.
It is slower than the Level 1 cache, but since it is cheaper and larger, it is a good inclusion in the CPU design, and now you surely know the reasons for it.