PRESS RELEASE

Supermicro announces AI Inference-optimised GPU Server with up to 20 NVIDIA Tesla T4 Accelerators

Super Micro Computer has announced an addition to its line of GPU-optimised servers. The new SuperServer 6049GP-TRT provides the performance required to accelerate the diverse applications of modern AI. For maximum GPU density and performance, this 4U server supports up to 20 NVIDIA Tesla T4 Tensor Core GPUs, three terabytes of memory, and 24 hot-swappable 3.5" drives. The system also features four 2000-watt Titanium-level efficiency (2+2) redundant power supplies to help optimise power efficiency, uptime and serviceability.

‘Supermicro is innovating to address the rapidly emerging high-throughput inference market driven by technologies such as 5G, Smart Cities and IoT devices, which are generating huge amounts of data and require real-time decision making,’ said Charles Liang, president and CEO of Supermicro. ‘We see the combination of NVIDIA TensorRT and the new Turing architecture-based T4 GPU Accelerator as the ideal combination for these new demanding and latency-sensitive workloads and are aggressively leveraging them in our GPU system product line.’

‘Enterprise customers will benefit from a dramatic boost in throughput and power efficiency from the NVIDIA Tesla T4 GPUs in Supermicro's new high-density servers,’ said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. ‘With AI inference constituting an increasingly large portion of data centre workloads, these Tesla T4 GPU platforms provide incredibly efficient real-time and batch inference.’

Supermicro's performance-optimised 4U SuperServer 6049GP-TRT system can support up to 20 PCI-E NVIDIA Tesla T4 GPU accelerators, which dramatically increases the density of GPU server platforms for wide data centre deployment supporting deep learning and inference applications. As more industries deploy artificial intelligence, they will be looking for high-density servers optimised for inference. The 6049GP-TRT is the optimal platform to lead the transition from training deep learning neural networks to deploying artificial intelligence in real-world applications such as facial recognition and language translation.

Supermicro has an entire family of 4U GPU systems that support the ultra-efficient Tesla T4, which is designed to accelerate inference workloads in any scale-out server. The hardware-accelerated transcode engine in Tesla T4 delivers multiple HD video streams in real time and allows deep learning to be integrated into the video transcoding pipeline, enabling a new class of smart video applications. As deep learning shapes our world like no other computing model in history, deeper and more complex neural networks are trained on exponentially larger volumes of data. To achieve responsiveness, these models are deployed on powerful Supermicro GPU servers to deliver maximum throughput for inference workloads.
