AWS Launches UltraServers with Record-Breaking GPU Performance for AI and HPC Workloads

July 10, 2025
  • UltraServers connect multiple EC2 instances through a dedicated high-bandwidth, low-latency accelerator interconnect. Within each superchip, NVIDIA's NVLink-C2C technology colocates GPU and CPU in a single compute module.

  • The UltraServers feature up to 72 NVIDIA Blackwell GPUs interconnected through fifth-generation NVLink, delivering 360 petaflops of FP8 compute and 13.4 TB of high-bandwidth GPU memory, making them ideal for the most demanding AI workloads.

  • Each superchip in these systems provides 10 petaflops of FP8 compute and up to 372 GB of HBM3e memory, enough to support very large models, including those at trillion-parameter scale.

  • AWS announced the general availability of two high-performance GPU offerings optimized for advanced AI training and inference: P6e-GB200 UltraServers, powered by NVIDIA Grace Blackwell Superchips, and P6-B200 instances, powered by NVIDIA Blackwell GPUs.

  • The P6-B200 instances, each equipped with 8 NVIDIA Blackwell GPUs, are suited to medium and large-scale AI workloads, and their familiar 8-GPU configuration lets existing GPU applications carry over.

  • AWS offers multiple deployment options for these GPU solutions, including Amazon SageMaker HyperPod, Amazon EKS, and NVIDIA DGX Cloud, enabling flexible, managed AI environments.

  • The UltraServers integrate with AWS services such as SageMaker HyperPod, EKS, and FSx for Lustre, which provide automated provisioning, scalable orchestration, and high-throughput data access for large AI and HPC workloads.

  • The UltraServers use the AWS Nitro System for hardware security and live updates, and are deployed in third-generation EC2 UltraClusters for enhanced fault tolerance.

  • The UltraServers are now available in the Dallas Local Zone and can be reserved through EC2 Capacity Blocks. Pricing details are listed on the AWS EC2 webpage.

  • Customers can leverage these UltraServers with preconfigured Deep Learning AMIs supporting frameworks like PyTorch and JAX, ensuring seamless integration into existing workflows.

  • For networking, the UltraServers offer up to 28.8 Tbps of Elastic Fabric Adapter (EFA) bandwidth, which pairs with NVIDIA GPUDirect RDMA for low-latency GPU-to-GPU communication.

  • EFA networking is intended to support high-performance distributed training and communication at scale.
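
The per-superchip and per-UltraServer figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not an AWS tool: the function name `ultraserver_totals` is hypothetical, and it assumes two Blackwell GPUs per superchip, a detail the summary does not state. The per-GPU EFA figure is a plain division of the quoted aggregate, not a published specification.

```python
# Figures quoted in the summary above.
GPUS_PER_ULTRASERVER = 72
PFLOPS_FP8_PER_SUPERCHIP = 10
HBM3E_GB_PER_SUPERCHIP = 372
EFA_TBPS_PER_ULTRASERVER = 28.8

# Assumption (not stated in the summary): each superchip carries two Blackwell GPUs.
GPUS_PER_SUPERCHIP = 2


def ultraserver_totals():
    """Derive aggregate UltraServer figures from the per-superchip numbers."""
    superchips = GPUS_PER_ULTRASERVER // GPUS_PER_SUPERCHIP
    return {
        "superchips": superchips,
        "fp8_pflops": superchips * PFLOPS_FP8_PER_SUPERCHIP,
        # Decimal units: 1 TB = 1000 GB.
        "hbm3e_tb": superchips * HBM3E_GB_PER_SUPERCHIP / 1000,
        # Naive even split of the aggregate EFA bandwidth across GPUs.
        "efa_gbps_per_gpu": round(
            EFA_TBPS_PER_ULTRASERVER * 1000 / GPUS_PER_ULTRASERVER, 1
        ),
    }


if __name__ == "__main__":
    for name, value in ultraserver_totals().items():
        print(f"{name}: {value}")
```

Under that assumption, 36 superchips reproduce the quoted 360 petaflops, and 36 × 372 GB comes to 13,392 GB, which rounds to the quoted 13.4 TB.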

Summary based on 2 sources

