What is a Goldmont Processor?
The term Goldmont refers to the low-power Pentium, Celeron and Atom branded CPUs used in Systems on a Chip (SoCs) designed for use in smaller devices such as entry-level notebooks and desktop computers.
Technically, it signifies the 2nd generation low-power, Out-of-Order Atom microarchitecture. Designed and manufactured by Intel, the Goldmont microarchitecture officially supports only one thread in each core and is built on the 14 nm manufacturing process.
- Goldmont is Intel's 14 nm microarchitecture, designed for Systems on a Chip in ultra-low-power devices.
- Processors for consumer devices come with up to 4 cores and include Intel's Gen 9 HD Graphics architecture.
- The fetch pipeline is decoupled from the instruction decoder, and the design includes large page support along with a small Level 2 predecode cache.
- The Goldmont processors can execute up to three simple integer operations in one cycle and can generate addresses Out-of-Order.
- Instruction latencies are improved on these processors, and they can decode up to two branches in each cycle.
Understanding the Goldmont Processor
Designed by Intel, the Goldmont processors belong to the 2nd generation of low-power Celeron, Pentium, and Atom families and come with a SoC design.
This particular architecture comes with a lot of features similar to those of the Skylake Core processors.
These features enable the Goldmont processors to offer a performance boost of up to 30% in comparison to the previous generations.
The features and functionalities of the Goldmont processors allow them to be used in low-end, smaller devices where power efficiency is most important. These devices include:
- 2-in-1 netbooks
- Small PCs
- IP cameras
- Vehicle entertainment systems
The Goldmont processors come with three decoder units that can decode a maximum of 20 bytes in one cycle.
The pipeline is three-wide and can retire up to three instructions in one cycle, while the chip itself can execute one load and one store in each clock cycle.
Taken together, these features deliver significantly higher overall performance.
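The decode limits above can be turned into a rough lower bound on front-end cycles. The sketch below is a deliberately simplified model, not a description of the real decoder: it assumes the only constraints are three instructions and 20 bytes per cycle, and the function name `min_decode_cycles` is hypothetical.

```python
import math

DECODE_WIDTH = 3   # instructions decoded per cycle (figure from the text)
DECODE_BYTES = 20  # bytes decoded per cycle (figure from the text)


def min_decode_cycles(instruction_lengths):
    """Lower bound on decode cycles for a list of instruction byte lengths.

    Simplifying assumption: only the per-cycle instruction count and the
    per-cycle byte count limit decoding; real hardware has further limits.
    """
    by_count = math.ceil(len(instruction_lengths) / DECODE_WIDTH)
    by_bytes = math.ceil(sum(instruction_lengths) / DECODE_BYTES)
    return max(by_count, by_bytes)


short_code = [3] * 12  # twelve 3-byte instructions, 36 bytes in total
long_code = [8] * 12   # twelve 8-byte instructions, 96 bytes in total

print(min_decode_cycles(short_code))  # limited by the instruction count
print(min_decode_cycles(long_code))   # limited by the byte count
```

With short instructions the three-per-cycle limit dominates, while long instructions run into the 20-byte limit first.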
The processors based on the Goldmont architecture come in different variants such as:
- Desktop processors
- Server processors
- Mobile processors
- Embedded processors
- Automotive processors
- Tablet processors
Normally, Apollo Lake cores are used in entry-level computers and tablets, and Denverton cores are used in ultra-low power servers, storage, networking, and the Internet of Things (IoT).
Features and technology
The architecture of the Goldmont processors offers a lot of significant enhancements due to its innovative features and different technologies. These are summarized for you as follows:
- A decoder that can take three instructions in each cycle
- A microcode sequencer that can send three µops in each cycle to allocate them to the reservation stations
- Improved retirement that supports a peak rate of three instructions per cycle
- A better branch prediction by decoupling the fetch pipeline from the instruction decoder
- An Out-of-Order Execution or OoOE engine
- A larger Out-of-Order execution window
- Buffers that allow for deeper OoOE across integer, memory instructions and Floating Point (FP) or Single Instruction, Multiple Data (SIMD) types
- A complete OoO execution and disambiguation of memory
- Ability to execute one load and one store per cycle
- A second level Translation Lookaside Buffer or TLB included in the memory execution pipeline to allow 512 entries for 4 KB pages
- An integer execution cluster with three pipelines that can carry out up to three simple integer ALU operations in each cycle
- A 128-bit wide engine to execute FP/SIMD instructions
- Improved latency
- Enhanced instruction throughput, such as doubled SIMD instruction throughput and single-cycle throughput for PSHUFB or Packed Shuffle Bytes
- A 14 nm manufacturing process
- A System on Chip architecture that allows using them in smaller devices as well
- 3D tri-gate transistors
- 2 to 4 cores
- High Efficiency Video Coding or HEVC Main 10 and VP9 Profile 0 video coding and hardware decoding support
- Thermal Design Power (TDP) of 10 watts for desktop or server processors
- TDP of 4 watts to 6 watts for mobile processors
- eMMC 5.0 technology support for connecting NAND flash storage
- Universal Serial Bus or USB 3.1 and USB-C specifications
- Double Data Rate or DDR3L, Low-Power DDR3, and LPDDR4 memory support
- Integrated Sensor Hub (ISH) support, which allows sampling and combining data from separate sensors and operating independently at times when the host platform goes into a low power state
- Image Signal Processor (ISP) support for running up to four simultaneous camera streams
- Audio controller support for playing both LPE and HD Audio
- Improved security subsystem with Trusted Execution Engine 3.0
- Improved and faster branch prediction unit with dedicated Jump Execution Unit or JEU port
- Larger Reorder Buffer or ROB entries and reservation station to offer a larger OoOE window
- Faster and better scalar or packed single, double or extended floating point divides with Radix-1024 FP divider
- Support for new instructions such as Read Processor ID (RDPID)
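As a quick check on the second-level TLB figure in the list above, the amount of memory it can map without a page walk (often called TLB reach) is the entry count multiplied by the page size. The helper name `tlb_reach_bytes` is hypothetical, and the calculation covers only the 4 KB page case quoted in the list.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach_bytes(entries, page_size_bytes):
    """Memory covered by a TLB when every entry maps one page."""
    return entries * page_size_bytes


# Figures from the list above: 512 entries for 4 KB pages.
reach = tlb_reach_bytes(512, 4 * KIB)
print(f"{reach // MIB} MiB")  # prints "2 MiB"
```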
Integrated graphics processor
The Goldmont processors come with Intel Gen 9 HD Graphics. Depending on the driver installed in the system, these graphics processors support a wide range of features such as:
- DirectX 12
- OpenGL 4.6 with updated Windows 10 drivers
- OpenGL 4.5 on Linux
- OpenGL ES 3.2
- OpenCL 2.0
The Gen 9 HD Graphics 400 and HD Graphics 500 come with 12 Execution Units, while the HD Graphics 405 and HD Graphics 505 come with 18 Execution Units.
The memory hierarchy of the Goldmont processors consists of Level 1 and Level 2 caches but no Level 3 cache. The Level 1 cache is split into separate instruction and data caches, each with its own characteristics.
The Level 1 instruction cache comes with the following features:
- 32 KiB of instructions per core
- 8-way set associative
- 64 B line size
The Level 1 data cache comes with the following features:
- 24 KiB of data per core
- 6-way set associative
- 64 B line size
The Level 2 cache comes with the following features:
- 1 to 2 MiB for two cores
- 16-way set associative
- 64 B line size
- 32 B per cycle
- 17 cycle latency
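The cache figures above determine how many sets each cache has: the total size divided by the line size gives the number of lines, and dividing that by the associativity gives the number of sets. The sketch below works this out for the listed sizes; the function name `cache_sets` is hypothetical, and the numbers are taken from the lists above.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    lines, remainder = divmod(size_bytes, line_bytes)
    if remainder or lines % ways:
        raise ValueError("size must be a whole number of ways * lines")
    return lines // ways


# (size, associativity, line size) as listed in the text.
caches = {
    "L1 instruction": (32 * KIB, 8, 64),
    "L1 data": (24 * KIB, 6, 64),
    "L2 (1 MiB)": (1 * MIB, 16, 64),
    "L2 (2 MiB)": (2 * MIB, 16, 64),
}

for name, (size, ways, line) in caches.items():
    print(f"{name}: {cache_sets(size, ways, line)} sets")
```

Both Level 1 caches work out to 64 sets, which shows how a 24 KiB data cache pairs naturally with 6-way associativity.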
There may also be a few Paging Cache Enhancements or PxE/ePxE caches, depending on the type.
Because of the modular design, processors with four cores can share up to 4 MiB of Level 2 cache.
The Random Access Memory or RAM capacity supported varies with the memory configuration, since each of the dual 32-bit channels can support one or two ranks. The supported memory limit can be up to:
- 1 GiB
- 2 GiB
- 4 GiB
- 8 GiB
The features of the pipeline supported by the Goldmont processors are as follows:
- It is a triple-wide and superscalar pipeline in nature
- It supports speculative execution
- It allows register renaming
- It has 12 to 14 stages
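Register renaming, mentioned in the list above, can be illustrated with a small model. This is a generic textbook-style sketch and not Goldmont's actual renamer: it assumes an unlimited supply of physical registers, and the names `rename` and `program` are hypothetical. Each write to an architectural register receives a fresh physical register, so later reuse of the same architectural name no longer creates a false dependency.

```python
from itertools import count


def rename(instructions):
    """Rename architectural registers to fresh physical registers.

    Each instruction is (dest, sources). Sources read the newest mapping,
    and every destination write gets a new physical register, which removes
    write-after-write and write-after-read dependencies.
    """
    fresh = count()
    mapping = {}
    renamed = []

    def lookup(reg):
        # A register read before any write gets an initial physical register.
        if reg not in mapping:
            mapping[reg] = f"p{next(fresh)}"
        return mapping[reg]

    for dest, sources in instructions:
        new_sources = tuple(lookup(src) for src in sources)
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], new_sources))
    return renamed


program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r4", ("r1", "r5")),  # r4 = r1 op r5, a true dependency on r1
    ("r1", ("r6", "r7")),  # r1 = r6 op r7, reuses the name r1
]

for dest, sources in rename(program):
    print(dest, "<-", sources)
```

After renaming, the first and third instructions write different physical registers, so the third can execute without waiting for the second to read the old value of r1.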
Instruction Sets and Extensions Support
The Goldmont x86 architecture supports the x86-64 or Intel 64 Instruction Set Architecture (ISA) along with a range of older and newer extensions, such as:
- MOVBE or Move Data After Swapping Bytes
- MMX or MultiMedia eXtensions
- SMAP or Supervisor Mode Access Prevention
- MPX or Memory Protection Extensions
- POPCNT or Population Count
- XSAVE or Save Processor Extended States
- XSAVEC or Save Processor Extended States with Compaction
- XSAVES or Save Processor Supervisor-Mode Extended States
- XSAVEOPT or Save Processor Extended States Optimized
- FSGSBASE or FS/GS Base Access Instructions
- An optimized PAUSE instruction latency for better power efficiency
- AESNI or Advanced Encryption Standard New Instructions
- Hardware acceleration for Secure Hash Algorithms such as SHA-1 and SHA-256
- PCLMUL or Carry-less Multiplication
- Intel RDRAND instruction support
- VT-x and VT-d or Virtualization extensions
- SSE or Streaming SIMD Extensions, along with all its variants such as SSE2, SSE3, SSSE3, SSE4, SSE4.1, and SSE4.2
- Advanced Encryption Standard (AES) and Carry-Less Multiplication Quadword (PCLMULQDQ) instructions for speeding up encryption and decryption
- RDSEED instruction that allows for random number generation in 16, 32 or 64 bits in accordance with the National Institute of Standards and Technology Special Publications NIST SP 800-90B and NIST SP 800-90C
- CLFLUSHOPT or Flush Cache Line Optimized, which flushes and invalidates the memory operand along with the related cache lines in all caches
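On a Linux system, one way to see which of the extensions listed above a given CPU reports is to read the `flags` line of `/proc/cpuinfo`. The sketch below assumes a Linux host and the kernel's usual flag spellings (for example `sha_ni` for the SHA extensions); the helper names `parse_cpu_flags` and `report` and the `WANTED` table are hypothetical, and the table covers only a subset of the list.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Extension name -> flag name as commonly shown by the Linux kernel.
WANTED = {
    "MOVBE": "movbe",
    "POPCNT": "popcnt",
    "AES-NI": "aes",
    "PCLMULQDQ": "pclmulqdq",
    "RDRAND": "rdrand",
    "RDSEED": "rdseed",
    "SHA": "sha_ni",
    "CLFLUSHOPT": "clflushopt",
    "XSAVEC": "xsavec",
    "SMAP": "smap",
}


def report(cpuinfo_text):
    """Map each extension name to True if its flag is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in WANTED.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or the file is unavailable
    for name, present in report(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

On hosts without `/proc/cpuinfo`, the script simply reports every extension as absent rather than failing.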
The Goldmont processors may not offer huge gains over the earlier generations, but the uplift in performance is quite noticeable.
The transition is worthwhile, and the improvements go further in Goldmont Plus, its successor.
As the features show, these CPUs are well suited to low-power mobile and desktop use.