Thursday, January 1, 2015


I spent a fair amount of my vacation time comparing several OpenCL-enabled boards. I was looking for a "cheap" board in terms of price/performance and operational costs (power consumption) for massively parallel workloads. I found that the best performing boards available (to almost anyone) are the AMD Radeon R9 295X2 and the NVIDIA GeForce GTX TITAN Z. The following table summarizes my findings.

| | Radeon R9 295X2 | GeForce GTX TITAN Z |
|---|---|---|
| Price (USD) | 1100 | 2200 |
| Single precision performance (TFLOPS) | 11.466 | 8.122 |
| Double precision performance (TFLOPS) | 1.433 | 2.66 |
| Thermal Design Power (W) | 500 | 375 |
| OpenCL version | 1.2 | 1.1 |

There are of course many more criteria, but these are the most important ones that affect my decision.

So the absolute winner is the Radeon R9 295X2. Yes, it is a little bit power hungry, but its 22.9 single-precision GFLOPS/W is slightly better than the 21.7 GFLOPS/W of the GeForce GTX TITAN Z. If you cannot afford to build a nuclear reactor in your room, the Radeon R9 290X is also a good choice with 18.78 GFLOPS/W.
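For anyone who wants to redo the arithmetic, here is a minimal sketch of how the efficiency figures can be derived from the table. The numbers are the ones listed above; the dictionary layout and the helper names (`gflops_per_watt`, `gflops_per_dollar`) are just my own illustration, and TDP is used as a rough stand-in for actual power draw.

```python
# Figures copied from the comparison table in this post.
BOARDS = {
    "Radeon R9 295X2": {
        "price_usd": 1100, "sp_tflops": 11.466, "dp_tflops": 1.433, "tdp_w": 500,
    },
    "GeForce GTX TITAN Z": {
        "price_usd": 2200, "sp_tflops": 8.122, "dp_tflops": 2.66, "tdp_w": 375,
    },
}


def gflops_per_watt(tflops, tdp_w):
    """Peak GFLOPS divided by thermal design power (an approximation of real draw)."""
    return tflops * 1000.0 / tdp_w


def gflops_per_dollar(tflops, price_usd):
    """Peak GFLOPS divided by the purchase price in USD."""
    return tflops * 1000.0 / price_usd


def summarize(boards):
    """Return a dict of derived metrics per board, rounded for display."""
    summary = {}
    for name, b in boards.items():
        summary[name] = {
            "sp_gflops_per_w": round(gflops_per_watt(b["sp_tflops"], b["tdp_w"]), 1),
            "dp_gflops_per_w": round(gflops_per_watt(b["dp_tflops"], b["tdp_w"]), 1),
            "sp_gflops_per_usd": round(gflops_per_dollar(b["sp_tflops"], b["price_usd"]), 2),
            "dp_gflops_per_usd": round(gflops_per_dollar(b["dp_tflops"], b["price_usd"]), 2),
        }
    return summary


if __name__ == "__main__":
    for name, metrics in summarize(BOARDS).items():
        print(name)
        for key, value in metrics.items():
            print(f"  {key}: {value}")
```

The single-precision GFLOPS/W values it prints should match the 22.9 and 21.7 quoted above. The per-dollar figures depend entirely on the prices I used, which change quickly, so treat them as a snapshot.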
Just a side note, but I really don't know what to think about the GeForce GTX TITAN Z. Its FP64 performance is remarkable, but the board is a bit overpriced. It is also an open question how the board performs under OpenCL, since NVIDIA's support for CUDA is far better than its support for OpenCL.