Intel Demonstrates High-Speed 3nm Silicon
Companies mentioned: INTC, IBM, AAPL
I find semiconductor engineering conferences particularly fun: there's IEDM for silicon devices, ISSCC for circuits, and VLSI, which kind of combines both. It's a chance for companies and academics to show off their latest research to a field of like-minded people. That includes Apple, who attend en masse every year but never seem to present anything (leave me a comment if you know where to find Apple's presentations!).
This year's ISSCC conference, held in San Francisco, has had a number of talks from Intel, AMD, TSMC, and even AI startups covering what they've been working on. However, it's the presentations about what they're building for the future that are always well attended. The show also features a demo session area, where companies with silicon to demonstrate put their hardware on display. Alongside IBM's NorthPole (which might get its own coverage), Axelera's new AI chip, Rebellions' Atomus AI chip, and others, was Intel showing off its newest 3nm in-silicon design.
This isn't a CPU core, but a SERDES connection. When chips communicate with each other on package, or with the outside world, a connection has to be made, and to make that connection fast, the parallel digital data is serialized, converted from digital to analog for transmission, and deserialized at the other end - SERDES stands for SERializer/DESerializer. PCIe is the most common interface to use SERDES connectivity, but other chip-to-chip protocols such as QPI, as well as networking, require them too. In some computers, it's the fastest signaling IP that exists, designed to transfer data off-chip. In the new generation of chiplet-based systems, it's the connections between the chips that will define what bandwidth can be achieved. Thus over the last 10 years we've seen SERDES connectivity speeds grow, both in lane counts and in absolute transfer rates.
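As a toy illustration of the concept only (real SERDES is analog circuitry, not software), here's a minimal serializer/deserializer pair in Python. The function names and 8-bit word width are my own assumptions for the sketch:

```python
# Toy sketch of what a SERDES does conceptually: parallel words are
# serialized into a stream of bits, sent one at a time, and then
# deserialized back into parallel words at the far end.

def serialize(words, width=8):
    """Flatten parallel words into a serial bit stream, MSB first."""
    for word in words:
        for shift in range(width - 1, -1, -1):
            yield (word >> shift) & 1

def deserialize(bits, width=8):
    """Reassemble a serial bit stream into parallel words."""
    words, current, count = [], 0, 0
    for bit in bits:
        current = (current << 1) | bit
        count += 1
        if count == width:
            words.append(current)
            current, count = 0, 0
    return words

data = [0xDE, 0xAD, 0xBE, 0xEF]
assert deserialize(serialize(data)) == data  # round-trips cleanly
```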
For example, PCIe per-lane transfer rates using SERDES have gone from 2.5 gigatransfers per second in PCIe 1.0 up to 32 GT/s in PCIe 5.0. PCIe then takes advantage of multiple SERDES links - x1, x2, x4, x8, x16 - to multiply out the overall bandwidth of the connection. PCIe as a protocol also introduces encoding overhead, so the transfer rate of the SERDES links is actually higher than the quoted bandwidth of PCIe, and this is common across many types of SERDES links.
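To put numbers on that encoding overhead, here's a quick back-of-the-envelope in Python. PCIe 1.0 and 2.0 use 8b/10b encoding (8 payload bits in every 10 bits transferred), while PCIe 3.0 onwards uses 128b/130b; the function name is just for illustration:

```python
# Usable PCIe bandwidth is the raw SERDES transfer rate scaled down
# by the encoding overhead, then multiplied by the number of lanes.

def pcie_bandwidth_gbps(rate_gt_s, encoded_bits, payload_bits, lanes=1):
    """Usable bandwidth in Gb/s per direction, after encoding overhead."""
    return rate_gt_s * (payload_bits / encoded_bits) * lanes

# PCIe 1.0 x1: 2.5 GT/s with 8b/10b -> 2.0 Gb/s usable
print(pcie_bandwidth_gbps(2.5, 10, 8))
# PCIe 5.0 x16: 32 GT/s with 128b/130b -> ~504 Gb/s usable
print(pcie_bandwidth_gbps(32.0, 130, 128, lanes=16))
```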
Up to this point, I've been writing assuming that the SERDES link is simply sending 1s and 0s, a traditional binary pattern. In the analog world, this is called NRZ, or Non-Return-to-Zero. It means that the signal can either be a 1 or a 0, and when developing a high-speed link, making sure you can differentiate between the two is paramount. Engineers and companies in this field often like to show 'eye diagrams', indicating the difference between a 1 and a 0 in their design. To get this diagram, they overlay thousands if not millions of cycles of the connection, showing that the 1s and 0s remain clearly distinguishable from one another.
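For a feel of what that overlay measures, here's a small numerical sketch in Python: it generates a noisy NRZ waveform, folds it into one unit interval the way a scope overlay does, and reports the vertical eye opening at the sampling point. The additive Gaussian noise model is a stand-in assumption, not how real channels behave:

```python
import numpy as np

rng = np.random.default_rng(0)
samples_per_ui = 32                      # samples per unit interval (UI)
bits = rng.integers(0, 2, size=10_000)   # random NRZ bit stream

# Ideal levels: 0 -> -1.0, 1 -> +1.0, plus additive Gaussian noise
wave = np.repeat(2.0 * bits - 1.0, samples_per_ui)
wave += rng.normal(0.0, 0.15, size=wave.size)

# Fold the waveform into single UIs, like overlaying traces on a scope
eye = wave.reshape(-1, samples_per_ui)

# Vertical eye opening at mid-UI: gap between the lowest 1 and highest 0
mid = eye[:, samples_per_ui // 2]
opening = mid[bits == 1].min() - mid[bits == 0].max()
print(f"vertical eye opening at centre of UI: {opening:.2f}")
```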
Another way of increasing the bandwidth of a link, such as an 8 gigabit-per-second link, is to encode more bits per transfer. Instead of NRZ, we now look at Pulse-Amplitude Modulation, or PAM. The simplest way to show this is an example where there are now four levels in the signal, known as PAM-4.
This signal now transmits one of four values: a 11, a 10, a 01, or a 00. So instead of transmitting a single bit of information per symbol with NRZ, we can now transmit two bits of information with PAM-4. Thus an 8 GT/s SERDES link using PAM-4 has a theoretical bandwidth of 16 Gbps, as the sketch below shows.
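Here's that two-bits-per-symbol idea in Python. The Gray-coded bit-to-level mapping below is a common textbook choice and an assumption on my part, not necessarily what any particular SERDES implements:

```python
# Two bits per PAM-4 symbol, so the same symbol rate carries twice
# the data of NRZ. Gray coding means adjacent levels differ by one
# bit, so a small amplitude error costs at most one bit.

PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}

def pam4_encode(bits):
    """Pair up bits and map each pair to one of four amplitude levels."""
    pairs = zip(bits[0::2], bits[1::2])
    return [PAM4_LEVELS[p] for p in pairs]

bits = [1, 0, 0, 1, 1, 1, 0, 0]
print(pam4_encode(bits))          # 4 symbols carry 8 bits

symbol_rate_gt_s = 8              # an 8 GT/s SERDES link
print(symbol_rate_gt_s * 2)       # -> 16 Gbps theoretical with PAM-4
```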
For higher-speed connections, I've reported on 56 Gbps and 112 Gbps links coming into the market over time. These use a mix of multi-link SERDES connections as well as variations in the encoding scheme. They find a home in networking and in transceivers attached to FPGAs, but GPU-to-GPU connectivity is also taking advantage of these high-speed links. As connections get higher in bandwidth, the required tolerances and manufacturing precision increase substantially. As a result, we often see these high-speed connections demonstrated on older process nodes first (such as 28nm or 16nm) due to cost, before they come to the denser process nodes that can offer better efficiency. Also, depending on the application, it can be easier to start with a lower transfer rate if a more complex encoding scheme can be applied.
With all this in mind, at ISSCC 2024, Intel was showing off some impressive silicon. Not only was it going for some of the fastest SERDES connection bandwidth numbers presented, but it also had it in 3nm silicon - a process node that isn't commercially available yet. On top of that, it integrated a PAM-6 encoding scheme. This chip is called Bixby Creek.
PAM-6 means encoding more bits per transfer, but it's also a lot harder to distinguish between the signal levels. If PAM-4 offers 2x the bandwidth of regular NRZ, PAM-6 should in theory offer around 2.58x. If all of that comes in the same power envelope, it offers a substantial gain in power efficiency per bit.
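That 2.58x figure falls straight out of a log2 - the number of bits a symbol can carry grows with the logarithm of the number of signal levels:

```python
from math import log2

# Bits per symbol scale with log2 of the number of signal levels,
# which is where the ~2.58x figure for PAM-6 over NRZ comes from.
for name, levels in [("NRZ", 2), ("PAM-4", 4), ("PAM-6", 6)]:
    print(f"{name}: log2({levels}) = {log2(levels):.3f} bits/symbol")
# NRZ:   1.000
# PAM-4: 2.000
# PAM-6: 2.585 -> ~2.58x the data of NRZ at the same symbol rate
```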
Intel has demonstrated 224 Gb/s at IEEE events before, with Transmit (Tx) at 1.9 picojoules per bit and Receive (Rx) at 1.4 picojoules per bit (combined 3.3 pJ/bit). When transferring terabits per second, that adds up - 1 terabit per second at 3.3 pJ/bit is 3.3 watts. This new demonstration reduces the transmit power down to 0.92 pJ per bit, roughly halving the power on the transmit side, and implements it on one of Intel's upcoming process nodes. All of this in 0.15 mm² of silicon as well.
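The power arithmetic is simply energy-per-bit times bit rate. A quick check against the figures quoted above (the function name is mine, for illustration):

```python
# pJ/bit (1e-12 J) times Gb/s (1e9 bits/s) multiplies out to watts.

def link_power_watts(bit_rate_gbps, pj_per_bit):
    """Power consumed moving bit_rate_gbps at pj_per_bit."""
    return bit_rate_gbps * 1e9 * pj_per_bit * 1e-12

print(link_power_watts(1000, 3.3))        # 1 Tb/s at 3.3 pJ/bit -> 3.3 W
print(link_power_watts(224, 1.9 + 1.4))   # old figures: ~0.74 W per lane
print(link_power_watts(224, 0.92 + 1.4))  # new Tx figure: ~0.52 W per lane
```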
What does this mean in the grand scheme of things? Any new process node needs to have various IP blocks validated in order for customers to be able to use them. For any foundry building new nodes, that means making sure both the digital logic and the analog parts of a design work together.
Intel is implementing a sort of tick-tock platform for its new nodes: on Intel 4, only high-speed logic and some SERDES will be validated, but on Intel 3, a large range of IP will be made available for customers - hence Intel 3 is the one being offered as part of the foundry. Similarly, Intel 20A focuses on high-speed logic and some SERDES, while 18A will have a full suite of IP ready and validated for customers.
Even with this, Intel will create IP on a variety of nodes, internal and external (TSMC, Samsung), for use. The question of whether it's used internally only or offered as licensable IP depends on the nature of the product. One of the lead authors of this paper and demo reached out to me to confirm that Intel develops a number of its high-speed SERDES designs at TSMC as well as at Intel.
Demonstrating high-speed connectivity is one of those needed offerings, so seeing silicon that can actually do it, at a conference, can only be a good thing!
Also, Intel Foundry Services event is tomorrow (February 21st, 2024). If you are attending, come say hi!
More Than Moore, as with other research and analyst firms, provides or has provided paid research, analysis, advising, or consulting to many high-tech companies in the industry, which may include advertising on the More Than Moore newsletter or TechTechPotato YouTube channel and related social media. The companies that fall under this banner include AMD, Applied Materials, Armari, Baidu, Facebook, IBM, Infineon, Intel, Lattice Semi, Linode, MediaTek, NordPass, ProteanTecs, Qualcomm, SiFive, Supermicro, Tenstorrent, TSMC.