GPU computing is the use of a GPU (graphics processing unit) as a co-processor to accelerate CPUs for general-purpose scientific and engineering computing.
The GPU accelerates applications running on the CPU by offloading some of the compute-intensive and time consuming portions of the code. The rest of the application still runs on the CPU. From a user's perspective, the application runs faster because it's using the massively parallel processing power of the GPU to boost performance. This is known as "heterogeneous" or "hybrid" computing.
A CPU consists of four to eight CPU cores, while the GPU consists of hundreds of smaller cores. Together, they operate to crunch through the data in the application. This massively parallel architecture is what gives the GPU its high compute performance. There are a number of GPU-accelerated applications that provide an easy way to access high-performance computing (HPC).
Core comparison between a CPU and a GPU
Application developers harness the performance of the parallel GPU architecture using a parallel programming model invented by NVIDIA called "CUDA." All NVIDIA GPUs - GeForce®, Quadro®, and Tesla® - support the NVIDIA® CUDA® parallel-programming model.
Tesla GPUs are designed as computational accelerators or companion processors optimized for scientific and technical computing applications. The latest Tesla 20-series GPUs are based on the latest implementation of the CUDA platform called the "Fermi architecture". Fermi has key computing features such as 500+ gigaflops of IEEE standard double-precision floating-point hardware support, L1 and L2 caches, ECC memory error protection, local user-managed data caches in the form of shared memory dispersed throughout the GPU, coalesced memory accesses, and more.
Graphics chips started as fixed-function graphics pipelines. Over the years, these graphics chips became increasingly programmable, which led NVIDIA to introduce the first GPU. In the 1999-2000 timeframe, computer scientists, along with researchers in fields such as medical imaging and electromagnetics, started using GPUs to accelerate a range of scientific applications. This was the advent of the movement called GPGPU, or General Purpose GPU computing.
The challenge was that GPGPU required the use of graphics programming languages like OpenGL and Cg to program the GPU. Developers had to make their scientific applications look like graphics applications and map them into problems that drew triangles and polygons. This limited the accessibility to the tremendous performance of GPUs for science.
NVIDIA realized the potential of bringing this performance to the larger scientific community and invested in modifying the GPU to make it fully programmable for scientific applications. Plus, it added support for high-level languages like C, C++, and Fortran. This led to the CUDA parallel computing platform for the GPU.