The most advanced data center GPU ever

Artificial intelligence for self-driving cars. Predicting our climate's future. A new drug to treat cancer. Some of the world's most important challenges need to be solved today, but they require tremendous amounts of computing to become reality. Today's large-scale data center relies on large numbers of interconnected commodity compute nodes, limiting the performance available to drive these important workloads. Now, more than ever, the data center must prepare for the high-performance computing and hyperscale workloads being thrust upon it.

The NVIDIA® Tesla® P100 is purpose-built as the most advanced data center accelerator ever. It taps into an innovative new GPU architecture to deliver the world’s fastest compute node capable of performance equal to hundreds of slower commodity compute nodes. Lightning-fast nodes powered by Tesla P100 accelerate time-to-solution for the world’s most important challenges that have infinite compute needs in HPC and deep learning.

The Tesla P100 features five groundbreaking technologies:

  • New Pascal Architecture: delivers 5.3 TeraFLOPS of double-precision and 10.6 TeraFLOPS of single-precision performance for HPC, and 21.2 TeraFLOPS of half-precision (FP16) performance for deep learning.
  • NVLink: the world's first high-speed GPU interconnect for multi-GPU scalability, delivering up to 5x the bandwidth of PCIe.
  • CoWoS® with HBM2: unifies data and compute in a single package, delivering up to 3x the memory bandwidth of prior-generation solutions.
  • 16nm FinFET: with 15.3 billion transistors built on 16-nanometer FinFET fabrication technology, the Pascal GPU is the largest FinFET chip ever built, engineered to deliver the fastest performance and best energy efficiency for workloads with near-infinite computing needs.
  • New AI Algorithms: new half-precision, 16-bit floating-point instructions deliver over 21 TeraFLOPS of peak performance for deep learning training.
                                                    P100 for NVLink-optimized Servers   P100 for PCIe-based Servers
Double-Precision Performance                        5.3 TeraFLOPS                       4.7 TeraFLOPS
Single-Precision Performance                        10.6 TeraFLOPS                      9.3 TeraFLOPS
Half-Precision Performance                          21.2 TeraFLOPS                      18.7 TeraFLOPS
NVIDIA NVLink™ Interconnect Bandwidth               160GB/s                             -
PCIe x16 Interconnect Bandwidth                     32GB/s                              32GB/s
CoWoS HBM2 Stacked Memory Capacity                  16GB                                16GB or 12GB
CoWoS HBM2 Stacked Memory Bandwidth                 720GB/s                             720GB/s or 540GB/s
Enhanced Programmability with Page Migration Engine Yes                                 Yes
ECC Protection for Reliability                      Yes                                 Yes
Server-Optimized for Data Center Deployment         Yes                                 Yes

* FLOPS figures are with NVIDIA GPU Boost™. Interconnect bandwidth is measured bidirectionally.
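The peak figures in the table above follow directly from the GPU's published core counts and boost clock. As a rough sketch (assuming NVIDIA's published specs for the NVLink/SXM2 variant: 3584 CUDA cores, 1792 dedicated FP64 units, and a 1480 MHz boost clock, none of which appear in the table itself), the arithmetic works out like this:

```python
# Illustrative sketch: deriving the P100's headline peak-FLOPS figures.
# The core counts and 1480 MHz boost clock are published NVIDIA specs for
# the NVLink (SXM2) P100; this is an illustration, not an official formula.

BOOST_CLOCK_HZ = 1480e6   # GPU Boost clock
FP32_CORES = 3584         # CUDA cores
FP64_CORES = 1792         # dedicated double-precision units

def peak_tflops(cores, clock_hz, ops_per_cycle=2):
    """Peak = cores x clock x ops per cycle (2 for a fused multiply-add)."""
    return cores * clock_hz * ops_per_cycle / 1e12

fp64 = peak_tflops(FP64_CORES, BOOST_CLOCK_HZ)      # ~5.3 TeraFLOPS
fp32 = peak_tflops(FP32_CORES, BOOST_CLOCK_HZ)      # ~10.6 TeraFLOPS
# FP16 packs two half-precision values per lane, so each core retires
# two FMAs per cycle, i.e. 4 ops per cycle:
fp16 = peak_tflops(FP32_CORES, BOOST_CLOCK_HZ, ops_per_cycle=4)  # ~21.2 TeraFLOPS

print(f"FP64 {fp64:.1f} | FP32 {fp32:.1f} | FP16 {fp16:.1f} TeraFLOPS")
```

Note the 2:1 ratios down the precision column: half the FP32 cores handle FP64, and packed FP16 doubles FP32 throughput, matching the 5.3 / 10.6 / 21.2 TeraFLOPS figures quoted above.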

Find your solution

Test out any of our solutions at Boston Labs

To help our clients make informed decisions about new technologies, we have opened up our research and development facilities, and we actively encourage customers to trial the latest platforms using their own tools and, where necessary, alongside their existing hardware. Remote access is also available.

Contact us