Tuesday, October 27, 2009

My first project using CUDA is already behind me. The code itself was not difficult to write. The worst part was getting Visual Studio to suggest code as I type (which I have not managed so far - IntelliSense only suggests the non-CUDA variables and functions). Another thing that irritated me was the paths to the header files and libraries. I created the project in My Documents\Visual Studio 2008\Projects and tried to add the path to the CUDA SDK in the project settings. The compiler kept failing with errors that it cannot find the file cuda.h, even though everything seemed fine to me. In the end, the following settings turned out to be correct (for the Debug configuration):

- Linker -> General -> Additional Library Directories:

"C: \\ CUDA \\ lib" "C: \\ Documents and Settings \\ Yelonek \\ Local Settings \\ Application Data \\ NVIDIA Corporation \\ NVIDIA CUDA SDK \\ common \\ lib "
(so strange path here, because I chose" Install only for me "when you install and does not want me to loosen)

- Linker -> Input -> Additional Dependencies:

cudart.lib cutil32d.lib

The next problem that appeared is that VS does not colour the syntax of .cu files. You can work around this as follows: in the directory NVIDIA CUDA SDK\doc\syntax_highlighting there is a usertype.dat file, which should be placed in the directory Microsoft Visual Studio 8\Common7\IDE. Then in VS go to Tools -> Options, expand Text Editor in the list on the left and choose File Extension. Enter cu in the Extension box, set the editor to Microsoft Visual C++ and click Add. After accepting the changes and restarting the program you can finally enjoy coloured syntax when viewing .cu files.
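(A side note: usertype.dat is simply a plain-text list of the identifiers the editor should colour, one per line. An illustrative excerpt of what to expect inside - the actual file shipped with the SDK may contain more or slightly different entries:)

__device__
__global__
__host__
__constant__
__shared__
blockIdx
blockDim
threadIdx
gridDim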
There is a slightly more convenient way to write programs. This trick can be found on other sites and blogs, but I will describe it here too. The idea is that we write the program in a .c file, and when the project is built the .c file is copied over the .cu file, and only the .cu file is compiled by nvcc. You can do this by defining your own rules for building files. Right-click the project in the Solution Explorer and select Custom Build Rules... Create a new rule file (New Rule File), give it the name that will be displayed in VS, the file name and the path where you want to save it. Then add a new rule (Add Build Rule). The first rule copies the .c file to a .cu file:


Command Line:
copy $(InputFileName) $(InputName).cu

Execution Description:
$(InputFileName) ------> $(InputName).cu

File Extensions:
*.c

Name:
Change the c file extension to cu

Outputs:
$(InputName).cu

Show Only Rule Properties: True

Supports File Batching: False


The second rule compiles the .cu files using nvcc:

Additional Dependencies:
"$(CUDA_INC_PATH)";"../../common/inc"

Command Line:
"$(CUDA_BIN_PATH)\nvcc.exe" -ccbin "$(VCInstallDir)bin" -c -D_DEBUG -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/Od,/Zi,/RTC1,/MTd -I"$(CUDA_INC_PATH)" -I../../common/inc -o $(ConfigurationName)\$(InputName).obj $(InputFileName)
Display Name:
NVIDIA C for CUDA compilation

Execution Description:
Invoking NVCC

Name:
CUDA file compilation

Outputs:
$(ConfigurationName)\$(InputName).obj

Show Only Rule Properties: True

Supports File Batching: False

Now create a file main.c, where we will write the application code. Copy it (only this first time), call the copy main.cu and add both files to the project. In the properties of main.c we change the Tool from C/C++ Compiler to "Change the c file extension to cu", and in the properties of main.cu we choose "NVIDIA C for CUDA compilation". From now on you can build the project with F6 without any problems.
Keep in mind that changes must be made in the .c file, not the .cu! If the project fails to compile, you will get errors as usual, but double-clicking an error will take you to the .cu file, not the .c!! If you forget this, edit the .cu file and then build the project, everything will appear to work: the .c file is unchanged, so VS does not build it again (and therefore does not copy the .c file over the .cu file). Now suppose you close the project, open it again and edit the .c file. What happens? The .cu file gets overwritten, the code we wrote in it disappears, and the compiler errors we fought before closing come back. You can lose a couple of hours of work this way, so I advise you to be careful.
Finally, the sample code:
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

// Data structure of a single agent
struct Agent { float x; float y; };

// Kernel which will run on the card
__global__ void fitness(Agent *agents, int N, float *results)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
        results[idx] = 2 * agents[idx].x * agents[idx].x - agents[idx].x + 3;
}

// Function that will run on the host
int main(void)
{
    const int N = 10000;                                 // Number of agents
    Agent *host_agents = NULL, *device_agents = NULL;    // Agent arrays on the host and on the card
    size_t size_agents = N * sizeof(Agent);              // Size of the agent array in bytes
    float *host_results = NULL, *device_results = NULL;  // Result arrays on the host and on the card
    size_t size_results = N * sizeof(float);             // Size of the result array in bytes

    // Memory allocation on the host and on the card
    host_agents = (Agent *)malloc(size_agents);
    cudaMalloc((void **)&device_agents, size_agents);
    host_results = (float *)malloc(size_results);
    cudaMalloc((void **)&device_results, size_results);

    // Write the initial values into the array
    for (int i = 0; i < N; i++) {
        host_agents[i].x = (float)(rand() % 100);
        host_agents[i].y = (float)(rand() % 100);
    }

    // Copy the array from the host to the card
    cudaMemcpy(device_agents, host_agents, size_agents, cudaMemcpyHostToDevice);

    // Call the kernel
    int block_size = 4;
    int n_blocks = N / block_size + (N % block_size == 0 ? 0 : 1);
    fitness<<<n_blocks, block_size>>>(device_agents, N, device_results);

    // Fetch the results from the card and store them in the array on the host
    cudaMemcpy(host_results, device_results, size_results, cudaMemcpyDeviceToHost);

    // Print the results
    for (int i = 0; i < N; i++)
        printf("%05d: f(%04.4f, %04.4f) = %04.4f\n", i, host_agents[i].x, host_agents[i].y, host_results[i]);

    // Cleanup
    free(host_agents);
    free(host_results);
    cudaFree(device_agents);
    cudaFree(device_results);
    system("pause");
    return 0;
}
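One thing the sample skips entirely is error checking, and since kernel launches are asynchronous they fail silently. Below is a minimal sketch of a helper I would add myself - it is not part of the original sample (the SDK's cutil provides similar CUT_CHECK_ERROR / CUDA_SAFE_CALL macros):

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Print the last CUDA error (if any) and abort - a hand-rolled
// substitute for the SDK's CUT_CHECK_ERROR macro.
void checkCudaError(const char *msg)
{
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error (%s): %s\n", msg, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Usage after the kernel launch in the sample above:
//   fitness<<<n_blocks, block_size>>>(device_agents, N, device_results);
//   cudaThreadSynchronize();   // wait for the asynchronous kernel to finish
//   checkCudaError("fitness kernel");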




Tuesday, October 20, 2009


Launching NVIDIA CUDA

I was recently given the task of using CUDA to optimize an algorithm that searches a solution space using a "swarm". The advantage of CUDA is that it uses the GPU to perform parallel calculations on a large number of variables (at least that is how I understand it at the moment).
I thought that I could complete this project on Ubuntu 8.04.
I went to NVIDIA's website, where at the moment there are drivers in version 185, and CUDA Toolkit 2.2.
Unfortunately, I failed - I could not install the drivers. After every installation X started up in low-quality graphics mode. I read on the internet that EnvyNG installs the NVIDIA drivers together with the libcuda library. With this program I managed to install the 173 drivers. Then I installed the CUDA Toolkit and the CUDA SDK and started building the sample projects. After make I got a message that the libglut library could not be found, so I installed it from the repository: `sudo apt-get install libglut3 libglut3-dev`. Another problem was the threadMigration project - there it was enough to simply rename its makefile to Makefile; without that, the project refused to build for me. In the end, all the projects were built and landed in the directory /home/daniel/NVIDIA_CUDA_SDK/bin/linux/release/. There I first ran ./deviceQuery, with the following result:

CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: "GeForce 8400M GS"
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         1
  Total amount of global memory:                 267714560 bytes
  Number of multiprocessors:                     16
  Number of cores:                               128
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    0.80 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     No
  Integrated:                                    Yes
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit...


Which means that I have a "CUDA capable" device. The next test checks how the device communicates with the system:

./bandwidthTest


Running on......
      device 0: GeForce 8400M GS

Quick Mode
Host to Device Bandwidth for Pageable memory
Transfer Size (Bytes)    Bandwidth(MB/s)
33554432                 856.9

Quick Mode
Device to Host Bandwidth for Pageable memory
Transfer Size (Bytes)    Bandwidth(MB/s)
33554432                 895.4

Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)    Bandwidth(MB/s)
33554432                 4214.5

&&&& Test PASSED

Press ENTER to exit...



So in theory all is well, but I could not run a single example. The most common error is "cudaSafeCall() Runtime API error in file , line 51: feature is not yet implemented." It follows that CUDA 2.2 is simply unable to work on the drivers I currently have, and I will have to keep fighting with the faulty ones from NVIDIA.

EDIT:
Newsflash: I managed to install the drivers under Windows XP - choosing the drivers for notebooks did the trick. This makes it possible to use CUDA version 2.2.