This article is part of the Technology Insight series, made possible with funding from Intel.
We tend to focus on the latest and greatest technology nodes because they’re used to manufacture the densest, fastest, most power-efficient processors. But as we were reminded during Intel’s recent Architecture Day 2020, a range of transistor designs is needed to build heterogeneous systems.
“No single transistor is optimal across all design points,” stated chief architect Raja Koduri. “The transistor we need for a performance desktop CPU, to hit super-high frequencies, is very different from the transistor we need for high-performance integrated GPUs.”
Here’s the problem: gathering processing cores, fixed-function accelerators, graphics resources, and I/O, and then etching them all onto a monolithic die at 10nm makes manufacturing very, very difficult. But the alternative, breaking them apart and linking the pieces, presents challenges of its own. Innovations in packaging overcome these hurdles by improving the interface between dense circuits and the boards they populate.
Back in 2018, Intel laid out a plan to get smaller devices working together without sacrificing speed. “We said that we need to develop technology to connect chips and chiplets in a package that can match the performance, power efficiency, and cost of a monolithic SoC,” continued Koduri. “We also said we need a high-density interconnect roadmap that enables high bandwidth at low power.”
In an industry eager to name winners and losers based on process technology, innovative approaches to packaging will be force multipliers in the battle for computing supremacy. Let’s take a look at Intel’s current packaging playbook, including the teasers disclosed during its recent Architecture Day 2020.
- The Embedded Multi-die Interconnect Bridge (EMIB) facilitates die-to-die connections using tiny silicon bridges embedded in the package substrate.
- The Advanced Interface Bus (AIB) is an open-source interconnect standard for creating high-bandwidth/low-power connections between chiplets.
- Foveros takes packaging to the third dimension with stacked dies. The first Foveros-based product will target the space between laptops and smartphones.
- Co-EMIB and the Omni-Directional Interface promise scaling beyond Intel’s existing packaging technologies by facilitating greater flexibility.
Overcoming monolithic growing pains with EMIB
Until recently, if you wanted to get heterogeneous dies onto a single package for maximum performance, you placed those dies on a piece of silicon called an interposer and ran wires through the interposer for communication. Through-silicon vias (TSVs), which are electrical connections, passed through the interposer and into a substrate, which formed the package’s base.
The industry refers to this as 2.5D packaging. TSMC used it to manufacture NVIDIA’s Tesla P100 accelerator back in 2016. A year before that, AMD combined a massive GPU and 4GB of high-bandwidth memory (HBM) on a silicon interposer to create the Radeon R9 Fury X. Clearly, the technology works. But it adds an inherent layer of complexity, cutting into yields and adding significant cost.
Intel’s Embedded Multi-die Interconnect Bridge (EMIB) aims to mitigate the limitations of 2.5D packaging by ditching the interposer in favor of tiny silicon bridges embedded in the substrate layer. The bridges are loaded with micro-bumps that facilitate die-to-die connections.
“The current generation of EMIB offers a 55 micron micro-bump pitch with a roadmap to get to 36 microns,” said Ramune Nagisetty, director of process and product integration at Intel. Compare that to the 100-micron bump pitch of a typical organic package. EMIB makes it possible to achieve much higher bump density as a result.
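To put those pitches in perspective, here is a rough sketch that converts bump pitch into areal density. It assumes bumps sit on a uniform square grid, which is a simplification for illustration and not a description of Intel’s actual bump layout. The function name is ours.

```python
def bumps_per_mm2(pitch_um: float) -> float:
    """Approximate bump density, assuming a uniform square grid (our assumption)."""
    bumps_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 microns
    return bumps_per_mm ** 2

# Bump pitches quoted in the article, in microns.
pitches = {
    "typical organic package": 100,
    "EMIB (current)": 55,
    "EMIB (roadmap)": 36,
}

for name, pitch in pitches.items():
    print(f"{name}: {pitch} um pitch -> ~{bumps_per_mm2(pitch):.0f} bumps/mm^2")
```

Under that assumption, moving from a 100-micron to a 55-micron pitch more than triples the density, and a 36-micron pitch yields roughly 7.7 times the density of the 100-micron baseline.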
Small silicon bridges are also a lot cheaper than interposers. Whereas the Tesla P100 and Radeon R9 Fury X were high-dollar flagships, one of Intel’s first products with embedded bridges was Kaby Lake G, a mobile platform that combined eighth-gen Core CPUs and AMD Radeon RX Vega M graphics. Laptops based on Kaby Lake G weren’t cheap by any measure. But they demonstrated EMIB’s ability to get heterogeneous dies onto one package, consolidating valuable board space, augmenting performance, and driving down cost compared to discrete components.
Intel’s Stratix 10 FPGAs also employ EMIB to connect I/O chiplets and HBM from three different foundries, manufactured using six different technology nodes, on one package. By decoupling transceivers, I/O, and memory from the core fabric, Intel can pick and choose the transistor design for each die. Adding support for CXL, faster transceivers, or Ethernet is as easy as swapping out these modular tiles connected via EMIB.
Standardizing die-to-die integration with the Advanced Interface Bus
Before chiplets can be mixed and matched, the reusable IP blocks must know how to talk to one another over a standardized interface. For its Stratix 10 FPGAs, Intel’s embedded bridges carry the Advanced Interface Bus (AIB) between its core fabric and each tile.
AIB was designed to enable modular integration on a package in much the same way PCI Express facilitates integration on a motherboard. But while PCIe drives very high speeds through few wires, AIB exploits the density of EMIB to create a wide parallel interface that operates at lower clock rates, simplifying the circuitry to transmit and receive while still achieving very low latency.
The first generation of AIB offers 2 Gb/s wire signaling, enabling Intel’s vision of heterogeneous integration with monolithic SoC-like performance. A second-generation version, expected to tape out in 2021, supports up to 6.4 Gb/s per wire, bump pitches as tight as 36 microns, lower power per bit transferred, and backward compatibility with existing AIB implementations.
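Because AIB is a wide parallel interface, raw aggregate bandwidth scales with the number of data wires. The sketch below reads the first-generation figure as a per-wire rate and uses a wire count of 100, a hypothetical number chosen only for illustration and not taken from the AIB specification. It also ignores protocol overhead.

```python
def aggregate_gbps(per_wire_gbps: float, data_wires: int) -> float:
    """Raw aggregate bandwidth in Gb/s, ignoring protocol overhead (a simplification)."""
    return per_wire_gbps * data_wires

HYPOTHETICAL_WIRES = 100  # illustrative only, not from the AIB specification

gen1 = aggregate_gbps(2.0, HYPOTHETICAL_WIRES)  # first-generation per-wire rate
gen2 = aggregate_gbps(6.4, HYPOTHETICAL_WIRES)  # second-generation peak per-wire rate

print(f"Gen 1: {gen1:.0f} Gb/s, Gen 2: {gen2:.0f} Gb/s, ratio: {gen2 / gen1:.1f}x")
```

Whatever the real wire count, the per-wire increase alone works out to a 3.2x gain at the second generation’s peak rate.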
It’s worth noting that AIB is packaging agnostic. Although Intel connects its tiles using EMIB, TSMC’s Chip-on-Wafer-on-Substrate (CoWoS) technology could carry AIB, too.
Earlier this year, Intel became a member of the Common Hardware for Interfaces, Processors, and Systems (CHIPS) Alliance, hosted by the Linux Foundation, to contribute the AIB license as an open-source standard. The idea, of course, was to encourage industry adoption and facilitate a library of AIB-equipped chiplets.
“We currently have 10 AIB-based tiles from multiple vendors that are either in-production or on power-on,” says Intel’s Nagisetty. “There are 10 more tiles in the near-term horizon from ecosystem partners including startups and university research groups.”
Foveros increases density in a third dimension
Breaking SoCs into reusable IP blocks and integrating them horizontally with high-density bridges is one of the ways Intel plans to leverage manufacturing efficiencies and continue scaling performance. The next step up, according to the company’s packaging technology roadmap, involves stacking dies on top of each other, face-to-face, using fine-pitched micro-bumps. This three-dimensional approach, which Intel calls Foveros, closes the distance between dies, using less power to move data around. Whereas Intel’s EMIB technology is rated at roughly 0.50 pJ/bit, Foveros gets that down to 0.15 pJ/bit.
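Energy per bit translates directly into interconnect power, since power equals energy per bit multiplied by bits per second. The sketch below applies the two figures above to a hypothetical 1 Tb/s die-to-die link. The link rate is our assumption for illustration, not a number from Intel.

```python
def link_power_watts(pj_per_bit: float, gbps: float) -> float:
    """Interconnect power: energy per bit times bit rate."""
    joules_per_bit = pj_per_bit * 1e-12   # picojoules -> joules
    bits_per_second = gbps * 1e9          # Gb/s -> b/s
    return joules_per_bit * bits_per_second

HYPOTHETICAL_LINK_GBPS = 1000.0  # 1 Tb/s, illustrative only

for name, pj in [("EMIB", 0.50), ("Foveros", 0.15)]:
    watts = link_power_watts(pj, HYPOTHETICAL_LINK_GBPS)
    print(f"{name}: ~{watts:.2f} W to move {HYPOTHETICAL_LINK_GBPS:.0f} Gb/s")
```

At that assumed rate, moving data costs about 0.5 W over EMIB versus 0.15 W over Foveros, a 70 percent reduction.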
Like EMIB, Foveros allows Intel to pick the best process technology for each layer of its stack. The first implementation of Foveros, code-named Lakefield, crams processing cores, memory control, and graphics into a die manufactured at 10nm. That chiplet sits on top of the base die, which includes the functions you’d typically find in a platform controller hub (audio, storage, PCIe, etc.), manufactured on a 14nm low-power process. Micro-bumps between the two pipe in power and communications through TSVs in the base die. Intel then tops the stack with LPDDR4X memory from one of its partners.
A complete Lakefield package measures just 12x12x1mm, enabling a new class of devices between laptops and smartphones. But we don’t expect Foveros to only serve low-power applications. In a 2019 HotChips Q&A session, Intel fellow Wilfred Gomes predicted the technology’s future ubiquity. “…the way we designed Foveros, we think it’ll span the entire range of the computing spectrum, from the lowest-end devices to the highest-end devices,” he said.
Scalability gives us another variable to consider
The packaging roadmap set forth during Intel’s Architecture Day 2020 plotted each technology by interconnect density (the number of microbumps per square millimeter) and power efficiency (pJ of energy expended per bit of data transferred). Beyond Foveros, Intel is pursuing die-on-wafer hybrid bonding to push both metrics even further. It expects to achieve more than 10,000 bumps/mm² and less than 0.05 pJ/bit.
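For a sense of scale, the density target can be converted back into an implied bump pitch. This is a back-of-the-envelope sketch that assumes a uniform square grid of bumps, which is our simplification and not a layout Intel has described.

```python
import math

def implied_pitch_um(bumps_per_mm2: float) -> float:
    """Pitch in microns implied by a density, assuming a uniform square grid."""
    bumps_per_mm = math.sqrt(bumps_per_mm2)
    return 1000.0 / bumps_per_mm  # 1 mm = 1000 microns

# Hybrid bonding density target quoted in the article.
print(f"~{implied_pitch_um(10_000):.0f} um implied pitch")
```

Under that assumption, exceeding 10,000 bumps/mm² implies a pitch of about 10 microns or less, well below the 36 microns on EMIB’s roadmap.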
But advanced packaging technologies can offer utility beyond higher bandwidth and lower power. A combination of EMIB and Foveros, dubbed Co-EMIB, promises scaling opportunities beyond either approach on its own. There aren’t any real-world examples of Co-EMIB yet. However, you can imagine large organic packages with embedded bridges connecting Foveros stacks that combine accelerators and memory for high-performance computing.
Intel’s Omni-Directional Interface (ODI) adds even more flexibility by linking chiplets next to each other, connecting chiplets stacked vertically, and providing power to the top die in a stack directly through copper pillars. These pillars are larger than the TSVs that run through the base die in a Foveros stack, minimizing resistance and improving power delivery. The freedom to connect dies in any direction and stack larger tiles on top of smaller ones gives Intel much-needed flexibility in layout. It certainly sounds like a promising technology for building on Foveros’ capabilities.