Intel’s Gaudi 3 AI accelerator will be available at Boston Limited from 30th September, offering advanced AI capabilities for enterprises and AI developers. This release positions Gaudi 3 as a competitive alternative to NVIDIA GPUs, excelling in deep learning, generative AI and high-performance computing (HPC) workloads. The Gaudi 3 platform is designed to deliver greater efficiency, performance, and scalability, ensuring it meets the ever-growing demands of AI and data centre workloads.
The Intel Gaudi 3 accelerator brings notable performance advancements compared to its predecessor, Gaudi 2. It offers 4x BF16 compute, 2x networking bandwidth and 1.5x memory bandwidth. These improvements translate into faster training times and increased inference throughput across various models, including Llama 2 and GPT-3. Gaudi 3 is projected to provide up to 50% better time to train (TTT) on these models when compared to NVIDIA’s H100.
What sets Gaudi 3 apart is its open architecture, which avoids the proprietary interconnects its competitors rely on. Enterprises gain the flexibility to integrate and scale using standard Ethernet networking equipment rather than being locked into proprietary fabrics. Gaudi 3’s architecture scales from a single node up to clusters of 1,000 nodes or more, and each accelerator is equipped with 24 x 200 GbE ports, providing substantial scale-up and scale-out capability.
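As a quick sanity check on those networking figures, the sketch below totals the aggregate Ethernet bandwidth per accelerator. It assumes, as an idealised upper bound, that all 24 ports sustain the full 200 Gb/s line rate simultaneously; real-world throughput will depend on topology and protocol overhead.

```python
# Illustrative arithmetic only: aggregate Ethernet bandwidth per Gaudi 3
# accelerator, based on the 24 x 200 GbE figure quoted above.

PORTS_PER_ACCELERATOR = 24
PORT_SPEED_GBPS = 200  # each port is 200 GbE


def aggregate_bandwidth_tbps(ports: int = PORTS_PER_ACCELERATOR,
                             speed_gbps: int = PORT_SPEED_GBPS) -> float:
    """Idealised total scale-up/scale-out bandwidth per accelerator, in Tb/s."""
    return ports * speed_gbps / 1000


print(aggregate_bandwidth_tbps())  # 4.8 Tb/s per accelerator
```

That headroom is what allows a deployment to grow from one node to a 1,000-node cluster over commodity Ethernet switching.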
Gaudi 3 offers substantial performance improvements, including an average of 50% faster training on GPT-3 (175B) models and 2x faster inference on Llama 2 models compared to NVIDIA’s H100. This acceleration provides a significant advantage in AI workloads such as large language models (LLMs), where speed and efficiency are critical.
In addition to its performance, Gaudi 3’s cost-effectiveness is another strong selling point. Enterprises can expect up to 2.3x better performance per dollar for inference throughput compared to the NVIDIA H100, and 1.9x better for training throughput. This efficiency also translates into lower energy costs, with a 40% improvement in power efficiency when inferencing Llama and Falcon models.
Book your test drive below today!
To help our clients make informed decisions about new technologies, we have opened up our research and development facilities and actively encourage customers to try the latest platforms using their own tools and, if necessary, alongside their existing hardware. Remote access is also available.