Today we continue our look at massive AI servers with a look at the Supermicro SYS-821GE-TNHR. When folks discuss Supermicro’s AI server prowess, this is one of the systems that stands out in the market as an air-cooled NVIDIA server. The NVIDIA HGX H200 8-GPU platform is larger than many other systems, and that is for good reason. It is designed for lower-power density racks that are more common in most of today’s data centers.
We try to do as many videos as we can for these AI servers, and so here is one for this system:
As always, we suggest opening this video in its own tab, browser, or app for the best viewing experience. Also, given that this system is quite costly and in high demand, it was easier for George and me to go to Supermicro and record this than it was to ship the system to the STH studio. As a result, we have to say this one is sponsored. With that, let us get to it.
Supermicro SYS-821GE-TNHR External Hardware Overview
The first big feature of this server is that it is an 8U platform. We see many 6U systems and some 7U systems in the industry, but the 8U platform is that size for good reason. With a taller chassis, Supermicro can use large fans and spread out the I/O in a larger form factor.

We will go into each of these sections in more detail, but the top is the NVIDIA HGX H200 8-GPU assembly, which comes on its own front-accessible tray. Unlike some other options on the market, accessing the eight GPUs does not require removing the chassis from the rack.

In the front center, there are five fans.

These fans are all hot-swappable.

Here is another look at the module.

The bottom of the chassis has sixteen 2.5″ U.2 NVMe bays and three SATA bays as standard. If you remove the front I/O, via an optional kit, then you can add another five SATA bays.

The front I/O consists of a management port, two USB ports, and a VGA port. Having front I/O means that one can hook up a KVM cart to the front of the chassis in the cold aisle instead of working on the loud hot aisle side.
Moving to the rear, we see more fans, and also the power supplies and networking.

The five top fan modules may look like they are the same as the front ones, but they need to blow in the opposite direction.

To ensure the modules go in the right place, Supermicro has a simple keying system so that each module can only be used on the correct side of the server. This is a small feature we have never shown before, but it is one of those small refinement details that comes with making GPU servers for a long time. For some context, we reviewed the Supermicro 4028GR-TR 4U 8-way GPU SuperServer back in 2015, so these are the kind of small features present in systems that have been popular and evolving for a decade.

The middle row of fans is a bit different, and for good reason. You will see this middle fan section actually has two power supplies on either side, and two fan modules in the middle.

The fan modules are quite unique since they are meant to plug into spaces that can also be used for power supplies.

As standard, the system comes with six power supplies for 4+2 redundancy. You can optionally replace the two PSU-sized fan modules with two more PSUs for full 4+4 redundancy.

The power supplies are 3kW units and provide both 12V and 54V power. Some other HGX servers use different power supplies to supply the different voltages. Supermicro has a single PSU designed to service both.
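To put the redundancy figures in numbers, here is a minimal Python sketch. It assumes that in an N+M configuration the first number is how many PSUs must be running to carry the load and the second is how many can fail, and it uses the 3kW nameplate rating rather than any measured draw. The function name `redundancy` and its dictionary keys are made up for illustration.

```python
PSU_KW = 3.0  # per-PSU rating quoted in the article


def redundancy(required: int, spare: int, psu_kw: float = PSU_KW) -> dict:
    """Summarize an N+M power supply configuration.

    Assumption: `required` PSUs carry the full load and `spare` PSUs
    can fail without dropping below that capacity.
    """
    installed = required + spare
    return {
        "installed": installed,                # PSUs physically in the chassis
        "capacity_kw": required * psu_kw,      # capacity with spares held in reserve
        "installed_kw": installed * psu_kw,    # nameplate total of all PSUs
        "tolerated_failures": spare,           # PSUs that can fail
    }


standard = redundancy(4, 2)  # six PSUs, the standard configuration
full = redundancy(4, 4)      # eight PSUs, fan modules swapped for PSUs

for name, cfg in (("4+2", standard), ("4+4", full)):
    print(name, cfg)
```

Under these assumptions both configurations offer the same 12kW of redundant capacity. The 4+4 option only changes how many supplies can be lost, which is what matters when the two halves sit on separate power feeds.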

Between these power supplies, we have the NIC tray.

In the video, you can see me pulling this out via handles for easy service. You do not need to remove the chassis from the rack to service the NIC tray.

In the center, we get eight low-profile slots. Here we have NVIDIA BlueField-3 SuperNICs installed because that is what was in the system before Supermicro pulled it from the lab and brought it over. For large AI clusters, Ethernet is becoming the preferred solution for its scaling. InfiniBand is another option, so many of these servers are connected via NVIDIA ConnectX-7 cards in these slots instead. In either case, the general ratio today is one NIC per GPU.

On the left of the NIC tray, we get an NVIDIA BlueField-3 DPU as well as 10Gbase-T ports. The 10Gbase-T ports are there for functions like PXE boot and management.

Here is a look at the NVIDIA BlueField-3 DPU that is installed in that top slot.

On the right side of the tray, we get optional additional NICs. Here again, we have another BlueField-3 DPU. Of course, you can configure all of the network cards as you want since there is plenty of space to do so.

All told, we have around 4.22Tbps of network bandwidth coming off of this server, or more than a 32-port 100GbE switch can handle. This is one of the driving forces behind network demand right now in the industry.
Next, let us get to the CPU and PCIe tray.
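For a sense of scale, the comparison can be sketched in a few lines of Python. This only uses the 4.22Tbps total and the 32-port 100GbE switch mentioned above. It assumes the figures are per-direction line rates and does not try to break the total down by NIC, since we did not list that breakdown here. The variable names are made up for illustration.

```python
import math

SERVER_GBPS = 4220   # ~4.22Tbps total quoted above
SWITCH_PORTS = 32    # 32-port switch used for comparison
PORT_GBPS = 100      # 100GbE per port

# Aggregate line rate of the comparison switch
switch_gbps = SWITCH_PORTS * PORT_GBPS

# How many times over the server exceeds that switch
ratio = SERVER_GBPS / switch_gbps

# Number of 100GbE ports needed to carry the server's full bandwidth
ports_needed = math.ceil(SERVER_GBPS / PORT_GBPS)

print(f"Switch capacity: {switch_gbps / 1000:.1f} Tbps")
print(f"Server / switch: {ratio:.2f}x")
print(f"100GbE ports needed: {ports_needed}")
```

Put another way, a single one of these servers would need more 100GbE ports than that entire switch has.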