
atomicAdd in CUDA

<<

Sai

Newbie

Posts: 10

Joined: Mon Aug 18, 2008 9:32 am

Posted: Wed Nov 05, 2008 4:52 pm

atomicAdd in CUDA

I want to use the atomicAdd() function in my kernel. Apparently I must add '-arch sm_11' to the command line when compiling the .cu file. To do so in VS 2005, I right-clicked on my .cu file -> Command Line and added -arch sm_11 to the additional options. I also did this for the project. When I compile my code, I get the error

identifier "atomicAdd" is undefined

I'm using a GTX280 card so I guess the atomic functions are supported. I don't have any other errors while compiling. Does anyone know what I'm doing wrong here? I hope I don't have to add some custom build rules ...
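
For reference, here is a minimal sketch of the kind of code involved. The kernel name countFlagged and the host helper countFlaggedOnDevice are made up for illustration and are not from the course code. atomicAdd() on a 32-bit int in global memory needs compute capability 1.1 or higher, so if the -arch sm_11 option never reaches nvcc, a kernel like this one would be expected to fail with the "undefined" error above.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Build with something like: nvcc -arch sm_11 -c count.cu
// (sm_11 is the 2008-era value from this thread; newer toolkits need a newer -arch.)

// Hypothetical kernel: every thread whose flag is non-zero adds 1 to a shared counter.
// atomicAdd makes the read-modify-write on *counter safe across all threads.
__global__ void countFlagged(const int* flags, int n, int* counter)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && flags[i] != 0) {
        atomicAdd(counter, 1);
    }
}

// Hypothetical host helper: returns the number of non-zero flags, or -1 on a CUDA error.
int countFlaggedOnDevice(const int* hostFlags, int n)
{
    assert(n > 0);
    int* dFlags = NULL;
    int* dCounter = NULL;
    int result = -1;

    if (cudaMalloc((void**)&dFlags, n * sizeof(int)) == cudaSuccess &&
        cudaMalloc((void**)&dCounter, sizeof(int)) == cudaSuccess &&
        cudaMemcpy(dFlags, hostFlags, n * sizeof(int), cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemset(dCounter, 0, sizeof(int)) == cudaSuccess)
    {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        countFlagged<<<blocks, threads>>>(dFlags, n, dCounter);

        // This copy waits for the kernel to finish before reading the counter.
        if (cudaMemcpy(&result, dCounter, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
            result = -1;
        }
    }

    cudaFree(dFlags);   // cudaFree(NULL) is a harmless no-op
    cudaFree(dCounter);
    return result;
}
```

Calling countFlaggedOnDevice on a small array of 0/1 flags should return the number of ones.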
<<

BSme964uw

Newbie

Posts: 15

Joined: Mon Aug 18, 2008 9:32 am

Posted: Thu Nov 06, 2008 9:54 am

Re: atomicAdd in CUDA

Hi Sai,

I ran into the same problem. I added the command below as a Custom Build Step for the collide.cu file. Under "Configuration Properties" -> "Tool", I use "Custom Build Step" instead of "CUDA".

nvcc.exe -ccbin "C:\Program Files\Microsoft Visual Studio 8\VC\bin"  -c -arch sm_11 -D_CONSOLE -Xcompiler "/EHsc /W3 /nologo /Wp64 /O2 /Zi  /MT " -I"C:\CUDA\include" -I"C:\Program Files\NVIDIA Corporation\NVIDIA CUDA SDK\common\inc" -o Release\collide.obj collide.cu

Brandon
<<

Sai

Newbie

Posts: 10

Joined: Mon Aug 18, 2008 9:32 am

Posted: Thu Nov 06, 2008 10:26 am

Re: atomicAdd in CUDA

Hi Brandon,

Thanks for your suggestion. I tried it and the code compiled, but I got a whole bunch of linking errors. Instead, I downloaded the CUDA custom build rules from

http://forums.nvidia.com/index.php?showtopic=30273

and edited the rules file (Dan helped with this one) to include -arch sm_11. I then built with the "CUDA" tool again, and it seems to work now.


-Sai
<<

Dan Negrut

Global Moderator

Posts: 833

Joined: Wed Sep 03, 2008 12:24 pm

Posted: Fri Nov 07, 2008 7:17 am

Re: atomicAdd in CUDA

Just a heads up for all of you struggling to get atomicAdd to work. There is another way to figure out how many contacts you'll have and to set aside the right amount of memory to store the contact information.
This approach is based on a prefix scan, and it's actually faster than the atomicAdd version that everybody seems to have embraced. The idea is this:
- Set aside an array about 10 times larger than the number of bodies in the system. This is where the collision information will be stored (r_A, r_B, normal info, etc.). Let's call this array C.
- Set aside an array of ints, call it Z, with as many entries as there are bodies in the system.
- Run a kernel in which each body (that is, the thread associated with that body) stores in its entry of Z the number of contacts it experiences. Count only contacts with bodies of higher index than its own to avoid duplicates.
- Run an exclusive scan on Z to get the offset at which each body will write its collision data in C.
- Run another kernel that actually populates C with the collision info.
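
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the Contact struct, the 1D equal-radius contact test in inContact, and the helper findContacts are all made up, and the scan uses thrust::exclusive_scan from Thrust, which ships with later CUDA toolkits than the one discussed in this thread. The sketch also sizes C from the scan total instead of the fixed "about 10 times" estimate; a preallocated buffer would work the same way.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/scan.h>

// Hypothetical contact record; a real one would also hold r_A, r_B, the normal, etc.
struct Contact { int a; int b; };

// Hypothetical contact test: 1D bodies of equal radius touch when their
// centers are closer than two radii.
__device__ bool inContact(const float* x, int i, int j, float radius)
{
    return fabsf(x[i] - x[j]) < 2.0f * radius;
}

// Pass 1: thread i counts its contacts with higher-index bodies and stores the count in Z[i].
__global__ void countContacts(const float* x, int n, float radius, int* Z)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int count = 0;
    for (int j = i + 1; j < n; ++j)
        if (inContact(x, i, j, radius)) ++count;
    Z[i] = count;
}

// Pass 2: thread i repeats the same test and writes its contacts into C,
// starting at the offset the exclusive scan assigned to it.
__global__ void fillContacts(const float* x, int n, float radius,
                             const int* offsets, Contact* C)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int out = offsets[i];
    for (int j = i + 1; j < n; ++j) {
        if (inContact(x, i, j, radius)) {
            C[out].a = i;
            C[out].b = j;
            ++out;
        }
    }
}

// Hypothetical host driver: count, scan, then fill. No atomics are needed.
std::vector<Contact> findContacts(const std::vector<float>& hostX, float radius)
{
    int n = static_cast<int>(hostX.size());
    if (n == 0) return std::vector<Contact>();

    thrust::device_vector<float> x(hostX.begin(), hostX.end());
    thrust::device_vector<int> Z(n);
    thrust::device_vector<int> offsets(n);
    int threads = 128;
    int blocks = (n + threads - 1) / threads;

    countContacts<<<blocks, threads>>>(thrust::raw_pointer_cast(x.data()), n, radius,
                                       thrust::raw_pointer_cast(Z.data()));

    // Exclusive scan: offsets[i] = Z[0] + ... + Z[i-1], so offsets[0] == 0.
    thrust::exclusive_scan(Z.begin(), Z.end(), offsets.begin());

    // Total contacts = last offset + last count.
    int lastOffset = offsets[n - 1];
    int lastCount = Z[n - 1];
    int total = lastOffset + lastCount;

    thrust::device_vector<Contact> C(total);
    if (total > 0) {
        fillContacts<<<blocks, threads>>>(thrust::raw_pointer_cast(x.data()), n, radius,
                                          thrust::raw_pointer_cast(offsets.data()),
                                          thrust::raw_pointer_cast(C.data()));
    }

    std::vector<Contact> result(total);
    thrust::copy(C.begin(), C.end(), result.begin());  // device-to-host copy
    return result;
}
```

Because each thread writes only to its own slice of C, the contacts come out ordered by the lower body index, which the atomicAdd version does not guarantee.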

Stop by if you want to talk about this further. You don't have to follow this idea, but I wanted to suggest an approach that doesn't rely on atomicAdd. atomicAdd slows down execution and also seems to cause compile problems for some of you.

Dan

Return to ME964 Fall 2008: High Performance Computing
