Modern GPUs are massively parallel computers and can in some sense be thought of as vector machines. When a graphics chip is reported as having 1,536 CUDA cores (as in the Nvidia GTX 1660 Ti) or 2,304 stream processors (as in the AMD Radeon RX 580), that number is a direct measure of its parallelism: we’re talking about thousands of arithmetic operations that can all be executed in one go, in parallel. This power drives the graphics pipeline, whose job is positioning and coloring pixels on the screen, but it can also be used for general computing tasks.
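The execution model behind all those cores is “run the same small function once per element”. A minimal sketch in plain Python (which of course runs serially; a real GPU would launch thousands of these invocations at once via CUDA, OpenCL, or a shader language — the function and launcher names here are made up for illustration):

```python
# Illustrative sketch of the GPU execution model: one small "kernel"
# function is conceptually run for every element in parallel.
# Plain Python executes this serially; the point is the shape of the
# computation, not the speed.

def saxpy_kernel(i, a, x, y, out):
    # Each "thread" handles exactly one index: no loops, no branching.
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    # Stand-in for a GPU kernel launch over n threads.
    for i in range(n):
        kernel(i, *args)

n = 8
x = [float(i) for i in range(n)]
y = [1.0] * n
out = [0.0] * n
launch(saxpy_kernel, n, 2.0, x, y, out)
print(out)  # each element is computed independently: 2*x[i] + y[i]
```

Because no invocation depends on any other, the hardware is free to run as many of them simultaneously as it has cores.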
As with vector instructions (AVX, SSE, etc.), the parallelism in GPUs puts extreme pressure on memory, and GPU memory bandwidth is generally an order of magnitude above what a CPU can achieve. As a very rough rule of thumb, this means we can think of the GPU as around 10x faster than the CPU for the work it is suited to. On the other hand, GPUs respond very poorly to general computing tasks that contain things like conditional execution or branching, so if you’re thinking of just dropping the CPU and only using the GPU, that won’t work so well. Thankfully, games are one of the workloads that do extremely well on GPUs, and the development of the modern graphics processor is largely what enables the amazing game titles we have today.
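The reason branching hurts is that GPU cores execute in lockstep groups: when threads in a group disagree on a branch, the hardware runs both paths and masks out the inactive threads. A toy cost model (illustrative only, not a simulation of any real GPU) makes the effect visible:

```python
# Sketch of why branching hurts on a GPU: threads in a lockstep group
# that disagree on an if/else force the hardware to execute BOTH
# paths, masking out whichever threads are inactive on each path.
# The cycle counts below are made up for illustration.

def divergent_cost(conditions, cost_then, cost_else):
    """Cycles for one lockstep group to execute an if/else,
    given each thread's branch condition."""
    any_then = any(conditions)          # does any thread take the then-path?
    any_else = not all(conditions)      # does any thread take the else-path?
    # Each path that is taken by at least one thread costs its full
    # length for the whole group.
    return cost_then * any_then + cost_else * any_else

# All 32 threads agree: only one path is executed.
print(divergent_cost([True] * 32, 10, 50))                 # 10
# Threads disagree: the group pays for both paths.
print(divergent_cost([True] * 16 + [False] * 16, 10, 50))  # 60
```

A CPU, by contrast, predicts and executes only one path per branch, which is why branch-heavy general code stays on the CPU.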
Another property of the graphics processor is that it sits on the PCIe bus. In theory this means the bus could be another bottleneck, but in practice it rarely is. The bus is quite a bit slower than the GPU’s own memory, though, which is why most GPUs have dedicated memory at all, and why you generally want large chunks of data like textures to fit in the available GPU memory.
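A back-of-the-envelope calculation shows the gap, using typical published figures (roughly 16 GB/s for a PCIe 3.0 x16 link versus roughly 256 GB/s for the GDDR5 on a card like the RX 580 — illustrative numbers, not measurements):

```python
# Rough comparison of bus vs on-card memory bandwidth for a 1 GB
# texture. Figures are typical published peaks, not measured values:
#   PCIe 3.0 x16     : ~16 GB/s
#   RX 580's GDDR5   : ~256 GB/s

PCIE_GBPS = 16.0
VRAM_GBPS = 256.0

texture_gb = 1.0  # size of the texture in GB

over_pcie = texture_gb / PCIE_GBPS  # seconds to pull it over the bus
from_vram = texture_gb / VRAM_GBPS  # seconds to read it from GPU memory

print(f"over PCIe: {over_pcie * 1000:.1f} ms")  # 62.5 ms
print(f"from VRAM: {from_vram * 1000:.1f} ms")  # 3.9 ms
print(f"ratio: {over_pcie / from_vram:.0f}x")   # 16x
```

So a texture that lives in GPU memory can be read an order of magnitude faster than one that has to cross the bus each frame, which is exactly why you want your working set resident on the card.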
Contrast this with processors that have integrated graphics, which usually use a portion of main memory for graphics work. That works well for low-end purposes, but the 10x gain in memory bandwidth is lost, and very often this is what limits integrated GPUs from performing better than they do.
The first mainstream games computer with any sort of accelerated graphics was the Commodore Amiga in 1985. It contained a set of custom chips that operated independently of the CPU, among them a blitter unit that accelerated bitmap manipulation and “the Copper”, which had a primitive instruction set for syncing hardware with the display or driving the blitter. The S3 86C911 in 1991 was the first graphics accelerator for Windows, though it only handled 2D graphics. 3D graphics became commonplace in 1994 with the PlayStation and later the Nintendo 64; early PC cards included the S3 ViRGE, ATI Rage, and Matrox Mystique, with the 3dfx Voodoo of 1996 perhaps the most notable. Nvidia later released “the world’s first ‘GPU’” in 1999 with the GeForce 256.
Modern GPUs have roughly the processing power of supercomputers from the early 2000s.