What is the Willow Cove Processor?
Willow Cove is Intel's codename for the processor microarchitecture that succeeds Sunny Cove. Released in September 2020, these processors are fabricated on Intel's 10 nm SuperFin process node, also called 10 SF.
Technically, the Willow Cove x86 core microarchitecture powers the Intel Core mobile processors belonging to the 11th generation, codenamed Tiger Lake.
- Willow Cove is the architecture designed on the 10 nm SuperFin process node by Intel.
- This architecture was released on September 12, 2020. It succeeds the Sunny Cove architecture and precedes the Golden Cove architecture.
- The Willow Cove microarchitecture forms the basic compute core for Tiger Lake.
- The architecture comes with a redesigned cache, newer security features, and transistor optimization as compared to its predecessors.
- These improvements give the Willow Cove processors a 10% to 20% performance increase over the Sunny Cove microarchitecture, although the two share quite a few features.
Understanding the Willow Cove Processor
The 10 nm SuperFin Willow Cove x86 core microarchitecture designed by Intel succeeds Sunny Cove and offers higher performance.
It is used in Intel's Tiger Lake client processors.
Although it shares some features with its predecessor, Willow Cove adds improvements that deliver a 10% to 20% gain in performance. These include the following:
- Security features
- Higher CPU clock speed
- A redesigned cache subsystem
Some of the other notable improvements in the architecture design of the Willow Cove processors are:
- Bigger Level 2 cache
- Higher Level 3 cache
- A new AVX-512 instruction
- Full memory encryption
- Intel Key Locker
- Indirect Branch Tracking
- Shadow Stacks
- Unlocked AVX/AVX2 instructions support for Pentium Gold and Celeron processors
- Control Flow Enforcement Technology that prevents return-oriented and jump-oriented programming exploitation practices
Willow Cove sits between Sunny Cove, which it succeeds, and Golden Cove, which follows it.
As a successor to Sunny Cove, it naturally makes some notable changes to the architecture.
A couple of such changes are:
- Support for LPDDR5 and higher bandwidth in the memory subsystem, and
- Inclusion of the Total Memory Encryption or TME feature.
One of the most significant aspects of the Willow Cove architecture is the improvement in the process node.
Moving to 10 SF and its new SuperFin transistors gives the Willow Cove processors much better scalability in terms of frequency and voltage.
This yields higher performance at the same voltage across the board compared with its predecessor, Sunny Cove.
It also lets the processors reach a higher peak frequency of about 5 GHz, as opposed to about 4 GHz for the predecessor, at the same peak voltage.
The pipeline of the Willow Cove processors comes with the following features:
- It has a minimum of 14 stages.
- It has a maximum of 19 stages.
- It offers Out-of-Order Execution or OoOE support.
- It also allows speculative executions.
- It supports register renaming.
- It allows 5-way decoding.
The Willow Cove processors come with L1, L2 and L3 cache, and each of them offers a significant edge to its performance in its own distinct way.
For example, the Level 1 cache measures 80 KB per core, where 32 KB is reserved for instructions and 48 KB for data.
The Level 2 and Level 3 caches, however, are much larger than those of its predecessor. For example:
- The Level 2 cache can be up to 1.25 MB per core in place of 512 KB per core
- The Level 3 cache can be up to 3 MB per core in place of 2 MB per core.
The larger L2 and L3 caches add to performance, but they also come with some notable tradeoffs.
For example, the non-inclusive, 150% larger, 20-way 1.25 MiB L2 cache may be the biggest update, but it comes at the cost of inclusivity.
As a rule of thumb, doubling the size of a cache reduces its miss rate by a factor of about √2. By that rule, the 2.5x larger L2 cache should cut cache misses by a factor of roughly √2.5 ≈ 1.58, which works out to about 37% fewer misses.
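The square-root rule of thumb can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the rule itself is an approximation, real miss rates depend heavily on the workload, and the function name miss_rate_ratio is made up for illustration.

```python
import math

KIB = 1024


def miss_rate_ratio(old_size: int, new_size: int) -> float:
    """New miss rate divided by old miss rate under the square-root rule.

    The rule of thumb says the miss rate scales with 1 / sqrt(cache size).
    """
    return math.sqrt(old_size / new_size)


old_l2 = 512 * KIB    # L2 per core in the predecessor, as quoted above
new_l2 = 1280 * KIB   # 1.25 MiB L2 per core in Willow Cove

growth = new_l2 / old_l2
ratio = miss_rate_ratio(old_l2, new_l2)

print(f"L2 growth: {growth:.2f}x")
print(f"misses fall by a factor of about {1 / ratio:.2f}")
print(f"estimated reduction in misses: about {(1 - ratio) * 100:.0f}%")
```

A factor of 1.58 and a 58% reduction are easy to confuse: dividing the miss rate by 1.58 leaves about 63% of the original misses.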
On the flip side, bigger caches tend to have longer access latencies. This means that the L2 cache in this case will be somewhat slower to access.
However, the non-inclusive design offers a small additional performance gain because it needs no back-invalidation and does not have to keep an identical copy of entries in the L2 cache.
Still, it has a knock-on effect on power and die area. Moreover, a non-inclusive cache requires additional hardware in the core so that it complies with the rules for cache coherency.
And, most importantly, with the cache size increased, a non-inclusive cache cannot run at the speed of an inclusive cache.
As for the L3 cache, there is a notable increase in its capacity, but a reduction in its associativity. Lower associativity can increase conflict misses, so part of the benefit of the extra capacity may be lost.
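To see how capacity and associativity relate, the number of sets in a set-associative cache is its size divided by (ways x line size). The sketch below applies this to the 20-way 1.25 MiB L2 described above. It assumes 64-byte cache lines, which the article does not state, and the L3 way counts in the second part are illustrative values, not figures from this article.

```python
KIB = 1024
LINE_BYTES = 64  # assumption: 64-byte cache lines (not stated in the article)


def num_sets(size_bytes: int, ways: int, line_bytes: int = LINE_BYTES) -> int:
    """Return the number of sets in a set-associative cache."""
    lines, remainder = divmod(size_bytes, line_bytes)
    if remainder:
        raise ValueError("cache size must be a multiple of the line size")
    sets, remainder = divmod(lines, ways)
    if remainder:
        raise ValueError("line count must be a multiple of the way count")
    return sets


# L2 figures quoted in the article: 1.25 MiB, 20-way.
l2_sets = num_sets(1280 * KIB, ways=20)
print("L2 sets:", l2_sets)

# Illustrative only: the same 3 MiB capacity at two hypothetical way counts.
# Fewer ways means more sets, but fewer places a given line can live.
for ways in (16, 12):
    print(f"3 MiB, {ways}-way:", num_sets(3 * 1024 * KIB, ways), "sets")
```

With fewer ways per set, more addresses compete for each slot, which is why a drop in associativity can raise conflict misses even when total capacity grows.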
Control Flow Enforcement Technology
CET, or Control Flow Enforcement Technology, is enabled in the Willow Cove architecture for added security.
It protects against attacks that target returns and jumps, and prevents the instruction stream from being diverted to unintended code.
The technology works through two mechanisms: Indirect Branch Tracking, which guards against misdirected call or jump targets, and Shadow Stacks, which protect return addresses through page tracking.
However, it needs software specifically built with the new instructions.
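The two CET mechanisms can be pictured with a small software model. The sketch below is a toy simulation in Python, not how the hardware is implemented: the class ToyCpu, its methods, and the exception name are all hypothetical. It only shows the idea that a return is checked against a protected copy of the return address, and that an indirect branch must land on an approved target (on real hardware, one marked with an ENDBR instruction).

```python
class ControlProtectionFault(Exception):
    """Raised by this toy model when a control-flow check fails."""


class ToyCpu:
    def __init__(self, landing_pads=()):
        self.data_stack = []      # ordinary stack, writable by the program
        self.shadow_stack = []    # protected copy holding return addresses only
        self.landing_pads = set(landing_pads)  # valid indirect-branch targets

    def call(self, return_address: int) -> None:
        """Model a CALL: push the return address onto both stacks."""
        self.data_stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self) -> int:
        """Model a RET: both stacks must agree on the return address."""
        address = self.data_stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ControlProtectionFault(
                f"return to {address:#x}, shadow stack says {expected:#x}"
            )
        return address

    def indirect_branch(self, target: int) -> int:
        """Model Indirect Branch Tracking: the target must be a landing pad."""
        if target not in self.landing_pads:
            raise ControlProtectionFault(f"{target:#x} is not a valid target")
        return target


cpu = ToyCpu(landing_pads={0x401000})
cpu.call(0x400123)
cpu.data_stack[-1] = 0xDEADBEEF   # simulate an attacker overwriting the stack
try:
    cpu.ret()
except ControlProtectionFault as fault:
    print("blocked:", fault)
```

In this model a return-oriented attack fails because the attacker can reach only the ordinary stack, while a jump-oriented attack fails because arbitrary addresses are not landing pads.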
Instruction Set Architecture and extension support
The Willow Cove processors support the x86-64 Instruction Set Architecture (ISA) along with many extensions, such as:
- AES-NI or Advanced Encryption Standard New Instructions
- CLMUL or Carry-less Multiplication
- RDRAND or Read Random
- SHA or Secure Hash Algorithm extensions
- TXT or Trusted Execution Technology
- MMX or MultiMedia eXtensions
- SSE or Streaming SIMD Extensions, along with all its variants such as SSE2, SSE3, SSSE3, SSE4, SSE4.1, and SSE4.2
- AVX or Advanced Vector Extensions, along with AVX2 and AVX-512
- FMA3 or Fused Multiply Add
- VT-x and VT-d or Virtualization extensions
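On a Linux system, one rough way to check for some of these extensions is to read the flags line of /proc/cpuinfo. The sketch below assumes the Linux flag spellings (for example, pclmulqdq for CLMUL and sha_ni for SHA); the helper names parse_flags and report are hypothetical, and only a subset of the extensions is covered.

```python
from pathlib import Path

# Extension name -> flag spelling used in Linux /proc/cpuinfo.
FEATURE_FLAGS = {
    "AES-NI": "aes",
    "CLMUL": "pclmulqdq",
    "RDRAND": "rdrand",
    "SHA": "sha_ni",
    "AVX2": "avx2",
    "AVX-512 Foundation": "avx512f",
    "FMA3": "fma",
    "VT-x": "vmx",
}


def parse_flags(cpuinfo_text: str) -> set:
    """Return the CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text: str) -> dict:
    """Map each extension name to True or False based on the flags found."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for name, present in report(cpuinfo.read_text()).items():
            print(f"{name:20s} {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```

A flag can also be missing because firmware or a hypervisor has hidden it, so an absent entry does not by itself prove that the silicon lacks the feature.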
There are also a few new instructions introduced in the Willow Cove microarchitecture, as follows:
- MOVDIRI and MOVDIR64B or Move Direct store instructions
- AVX512_VP2INTERSECT or AVX-512 Vector Intersection Instructions
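To give a feel for what the vector intersection instructions compute, here is a plain-Python model of the result: given two vectors, it produces a pair of bitmasks marking which elements of each vector also appear in the other. The function name vp2intersect and the four-element example are illustrative; the real instructions operate on vector registers and write a pair of mask registers.

```python
def vp2intersect(a, b):
    """Return (mask_a, mask_b) for two equal-length integer vectors.

    Bit i of mask_a is set if a[i] equals any element of b.
    Bit j of mask_b is set if b[j] equals any element of a.
    """
    if len(a) != len(b):
        raise ValueError("both vectors must have the same number of lanes")
    mask_a = 0
    mask_b = 0
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                mask_a |= 1 << i
                mask_b |= 1 << j
    return mask_a, mask_b


a = [1, 5, 9, 12]
b = [5, 7, 12, 20]
k1, k2 = vp2intersect(a, b)
# Masks are printed with lane 0 as the rightmost bit.
print(f"mask for a: {k1:04b}")   # lanes 1 and 3 of a (5 and 12) match
print(f"mask for b: {k2:04b}")   # lanes 0 and 2 of b (5 and 12) match
```

The nested loop makes the all-pairs comparison explicit; doing it in a single instruction is what makes this useful for set intersection over sorted lists.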
Willow Cove vs Golden Cove
- The Willow Cove processors are the older design, released on September 12, 2020, while the Golden Cove processors followed on November 4, 2021.
- The Willow Cove processors are built on the 10 nm SuperFin or 10 SF technology node, while in comparison, the Golden Cove processors are built on the Intel 7 or 10 ESF fabrication process.
- The maximum CPU clock rate of the Willow Cove processors is up to 5 GHz. Golden Cove cores, on the other hand, run at clock rates between 1 GHz and 5.5 GHz.
- The product codename of the Willow Cove processors is typically Tiger Lake, but in comparison, the product code names of the Golden Cove processors are Alder Lake (client) and Sapphire Rapids (server).
- The Instructions per Cycle (IPC) performance of Willow Cove is not as high as that of Golden Cove, which offers about 19% better performance.
- The Willow Cove processors cannot take on as much of AI workloads as the Golden Cove processors.
- The network and 5G performances are not as improved in Willow Cove as they are in the case of the Golden Cove processors.
- The Golden Cove processors come with even more advanced security features in comparison to the Willow Cove processors.
- Willow Cove has one fewer ALU and LEA unit than Golden Cove, which brings the total up to 5.
- The front-end of the Willow Cove processors lags a bit behind that of the Golden Cove architecture, which has a greater number of simple decoders, a larger BTB, and more page entries.
- The back-end of the Willow Cove processors also trails that of the Golden Cove processors, which come with a larger Reorder Buffer (ROB), up to 512 entries from 352, and two more execution ports, bringing the total up to 12.
- The predecessor of Willow Cove is Sunny Cove, but the predecessors of Golden Cove are 10 nm Sunny Cove servers, 14 nm Skylake servers, 10 nm Willow Cove mobile processors and 14 nm Cypress Cove desktop processors.
- The successor of Willow Cove is Golden Cove, but the successor of Golden Cove is Raptor Cove.
The Willow Cove processors are built on the 10 nm manufacturing process and come with a redesigned Middle Level Cache.
It is designed to offer much higher frequencies as compared to its precursor at comparatively lower voltages. This improves its power efficiency and helps it offer a better dynamic range overall.