These examples exercise the GPUs on the Power nodes in various ways. To build/run these examples:
A script that sets up the environment and then runs make.
Makefile for the examples.
This program returns the number of GPUs detected on a node. It should report 4 on ppc001 and ppc002. If it does not, there is a problem with your environment.
A very simple OpenACC program.
Jacobi relaxation calculation in OpenACC and OpenMP. This is from the NVIDIA workshop.
Timer code for laplace2d.c.
A matrix multiply in CUDA from
cuFFT library example
C and Fortran CUDA programs. The CPU code reads the grid and block dimensions and then launches the kernel. The number of threads for the kernel is the product of the grid and block dimensions. The kernel fills in an array of length 6*(# threads), six entries per thread: the first is the thread number, followed by blockIdx.x, blockIdx.y, threadIdx.x, threadIdx.y, and threadIdx.z. Finally, the CPU prints this array. The file "input" is for this program.
Input for testinput.cu and testinput.f90
A script for running the examples.