Portable CUDA on NVIDIA, AMD, Intel GPUs and CPUs

I’ve just uploaded a new version of the open source CUDAfy.NET SDK that targets Linux.  For those that do not know, CUDAfy.NET is an easy to use .NET wrapper that brings the NVIDIA CUDA programming model and the power of GPGPU to the world of C#, VB, F# and other .NET users.  Anyone with a brief understanding of the CUDA runtime model will be able to pick up and run with CUDAfy within a few minutes.

One criticism of CUDA is that it only targets NVIDIA GPUs.  OpenCL is often regarded as a better alternative, being supported by both NVIDIA and AMD on their GPUs, but also on CPUs by Intel and AMD.  That should have been the end of the story for CUDA if it were not for the fact that OpenCL remains even in its 1.2 version, a bit of a dog’s breakfast in comparison to CUDA.  I guess that is what it gets by trying to be all things to all men.  Although in theory platform agnostic to get the best out of it you need to take care of a number of often vendor specific issues and if you need higher level maths libraries such as FFTs and BLAS then it gets tougher.  Ultimately if you’ve been reared on CUDA moving to OpenCL is not a nice proposition.

CUDAfy .NET brings together a rare mix of bed fellows

Because CUDAfy is a wrapper we were able to overcome this limitation.  The same code with some small restrictions can now target either OpenCL or CUDA.  The restrictions include no FFT, BLAS, SPARSE and RAND libraries, and no functions in structures. Other than this you can use either OpenCL or CUDA notation for accessing thread, block and grid id and dimensions, same launch parameters, same handling of shared, constant and global memory, and same host interface (the methods you use for copying data etc).  Now with the release of the CUDAfy Linux library you can run the same application on both Windows and Linux, 32-bit or 64-bit, and use AMD GPUs, NVIDIA GPUs, Intel CPU/GPUs, AMD APU/CPU/GPUs and also some Altera FPGA cards such as those from BittWare and Nallatech.  To run such an app you need only have the relevant device drivers and OpenCL libraries installed, and you’ll need Mono installed on Linux.  If you have an NVIDIA or AMD GPU in the system and up to date drivers then it should just work.  For developing you’ll need the relevant CUDA or OpenCL SDK.  NVIDIA, Intel, AMD and Altera all have theirs.  An advantage of OpenCL is that the compiler is built in, you do not need Visual C++ or gcc on the machine to make new GPU functions.

The aim we have in mind is to cater for the increasing number of embedded or mobile devices with GPU capabilities.  NVIDIA is yet to support CUDA on their Tegra, but there are already ARM and other devices that have OpenCL libraries.  Examples include the Google Nexus 4 phone and Nexus 10 tablet.  The embedded market may often be less visible (excuse pun) than their more obvious phone and tablet counterparts but since they are hidden in so many machines their numbers are huge.  Power consumption is always an issue which is why being able to use the GPU for general purpose processing instead of just graphics is so important.  Being able to take advantage of that in a straightforward way is therefore vital and CUDAfy .NET can be a solution for such devices running Windows or Linux.

Google Nexus 10 has OpenCL libraries onboard