Supercomputer on your graphics board

Many people are not aware that they already own awesome computing
power. They take a look at their CPU, compare its speed with the
machines in a supercomputing archive like Top500.org,
and after that, they’re disappointed.

But compared to CPUs (the main processors), the GPUs on modern
graphics boards are already a lot faster than their
complex-instruction-processing counterparts.

Fast GPUs (graphics processors) are needed for the latest 3D games,
which are getting more and more hype. Vendors even bundle several
GPUs into one virtual graphics board for maximum performance. For
consumers that’s called SLI (Nvidia) or CrossFire (ATI); for
professional graphics studios, Nvidia recently introduced the
Quadro Plex.

[Image: Nvidia Quadro Plex]

Now back to supercomputing. OK, GPUs are faster, so why still use
CPUs?
A GPU is limited to simple, highly optimized instructions, and it
works out of its own on-board but extremely fast memory banks. So a
CPU is still needed for complex calculations, system management,
and flexible functions, whereas a GPU can do the simple but
effective calculations.

Some smart people started to use GPUs to accelerate physics
calculations for games, making Ageia’s PhysX accelerator board
a pointless investment.

More smart people started to figure out how to use GPUs for their
own purposes.
A programmer needs an API to abstract function calls to the
hardware.
The common API for GPUs is OpenGL, and this leads to a very
unusual way of thinking about programming, because a graphics API
like OpenGL only offers graphics functions.
Let’s take a closer look at how the terms differ.

On graphics boards you talk about:
– texture (pixels in x,y space)
– drawing
– shader program
In CPU language, the same things would simply be:
– array (2-dimensional, x,y)
– computing
– algorithm / calculation formula
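
To make the mapping a bit more concrete, here is a minimal sketch in
plain Python. It runs on the CPU only and touches no GPU at all; the
names make_texture, brighten and draw are made up for this
illustration and are not part of OpenGL.

```python
# CPU-only illustration of the vocabulary above; nothing here uses a GPU.
# All function names are hypothetical and chosen for this example.

def make_texture(width, height, fill=0.0):
    """A 'texture' is just a 2-dimensional array indexed by (x, y)."""
    return [[fill for _x in range(width)] for _y in range(height)]

def brighten(value):
    """A 'shader program' is just a formula applied to one element."""
    return value * 2.0 + 1.0

def draw(shader, texture):
    """'Drawing' is just computing: apply the shader to every element."""
    return [[shader(pixel) for pixel in row] for row in texture]

texture = make_texture(4, 3, fill=0.5)
result = draw(brighten, texture)
print(result[0])  # first row of the computed 'image'
```

On a real graphics board the call to draw would run the formula for
many pixels at once, which is where the speed comes from.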

Now if we want to give the GPU work to do, we copy the input data
from main memory to the graphics board’s texture buffer, just as we
would with graphics textures for game models.
Next, we implement the algorithm as a shader program, written in
assembly. (The graphics card thinks we are animating a flickering
fire, for example, and runs the shader program on the input texture
buffer, writing the result into an output texture buffer.)
Then we transfer the output texture buffer back to the system’s
main memory, and, wonder of wonders, the calculation was done by
the GPU.
Depending on the shader program and on how well the problem can be
implemented this way, the GPU can be light-years ahead of the CPU
in calculation speed.
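
The three steps above (copy in, run the shader, copy back) can be
sketched in plain Python. This is only a CPU simulation under stated
assumptions: FakeGraphicsBoard and horizontal_average are hypothetical
names invented here, not a real OpenGL or driver API, and a real
shader would be written in a shading or assembly language instead.

```python
import copy

class FakeGraphicsBoard:
    """Hypothetical stand-in for a graphics board with its own memory (CPU only)."""

    def __init__(self):
        self.input_texture = None   # lives in the simulated board memory
        self.output_texture = None

    def upload(self, host_array):
        # Step 1: copy input data from main memory into the texture buffer.
        self.input_texture = copy.deepcopy(host_array)

    def run_shader(self, shader):
        # Step 2: run the shader once per output pixel. A real GPU would
        # process many pixels in parallel; here it is a plain loop.
        src = self.input_texture
        height, width = len(src), len(src[0])
        self.output_texture = [
            [shader(src, x, y) for x in range(width)] for y in range(height)
        ]

    def download(self):
        # Step 3: copy the output texture buffer back to main memory.
        return copy.deepcopy(self.output_texture)


def horizontal_average(src, x, y):
    """Example 'shader': mean of a pixel and its left and right neighbours."""
    row = src[y]
    left = row[max(x - 1, 0)]               # clamp at the left edge
    right = row[min(x + 1, len(row) - 1)]   # clamp at the right edge
    return (left + row[x] + right) / 3.0


board = FakeGraphicsBoard()
board.upload([[0.0, 3.0, 6.0], [9.0, 9.0, 9.0]])
board.run_shader(horizontal_average)
result = board.download()
print(result)  # [[1.0, 3.0, 5.0], [9.0, 9.0, 9.0]]
```

The shader only reads the input texture and only writes its own output
pixel. That restriction is what lets a real board run all pixels
independently of each other.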

The big strength of GPUs is of course graphical calculation, which
is why it makes sense to use them for floating-point and limit
calculations.
See here for some implemented examples:

– Lineare Algebra auf GPU’s (linear algebra on GPUs)
– GPU Tutorials
– GPGPU Implementations

Now it’s only a matter of time until distributed computing projects
like SETI@home, FAAH, HPF, HPF2 and so on run a hundred times
faster on the graphics hardware of people’s home computers.

Hallelujah!