ME759: Midterm Project, SPH Default Project

<<

ArmanP

Newbie

Posts: 28

Joined: Wed Feb 16, 2011 12:02 pm

Unread post Mon Dec 09, 2013 7:49 pm

Re: ME759: Midterm Project, SPH Default Project

The most important variables in the code:

posRadD: array of float3; each element is the position (x, y, z) of a marker.
A trailing D means the data lives in device memory; a trailing H means host memory. So posRadD is on the device and posRadH is on the host. The same rule applies to the names of the other data.
velMasD: array of float4; the first three components of each float4 are the velocity of the associated marker, and the last component is the marker's representative mass (the same fixed float for all markers).
rhoPresMuD: array of float4; each float4 is (rho, pressure, mu, type) = (density, pressure, viscosity, type) of the marker. type is -1 for fluid, 0 for boundary, and 1, 2, 3, 4, ... for rigid markers.
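
To make the type encoding in rhoPresMuD concrete, here is a minimal sketch that counts fluid, boundary, and rigid markers. The names MarkerKind, markerKind, countKinds, and countMarkerKinds are hypothetical helpers made up for illustration; they are not part of the project code. The only thing taken from the description above is that the fourth component of each float4 holds the type.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical categories decoded from the fourth component of rhoPresMu.
enum MarkerKind { FLUID = 0, BOUNDARY = 1, RIGID = 2 };

// The type is stored as a float, so compare against thresholds, not with ==.
__host__ __device__ inline MarkerKind markerKind(float4 rhoPresMu) {
    if (rhoPresMu.w < -0.5f) return FLUID;     // type == -1
    if (rhoPresMu.w < 0.5f) return BOUNDARY;   // type == 0
    return RIGID;                              // type == 1, 2, 3, ...
}

// One thread per marker; each thread increments the counter of its category.
__global__ void countKinds(const float4* rhoPresMuD, int n, int* countsD) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&countsD[markerKind(rhoPresMuD[i])], 1);
}

// Returns {number of fluid, number of boundary, number of rigid} markers.
std::vector<int> countMarkerKinds(const std::vector<float4>& rhoPresMuH) {
    const int n = static_cast<int>(rhoPresMuH.size());
    std::vector<int> countsH(3, 0);
    if (n == 0) return countsH;

    float4* rhoPresMuD = nullptr;
    int* countsD = nullptr;
    cudaMalloc(&rhoPresMuD, n * sizeof(float4));
    cudaMalloc(&countsD, 3 * sizeof(int));
    cudaMemcpy(rhoPresMuD, rhoPresMuH.data(), n * sizeof(float4), cudaMemcpyHostToDevice);
    cudaMemset(countsD, 0, 3 * sizeof(int));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    countKinds<<<blocks, threads>>>(rhoPresMuD, n, countsD);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(countsH.data(), countsD, 3 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(rhoPresMuD);
    cudaFree(countsD);
    return countsH;
}
```

Calling countMarkerKinds on a host copy of the array gives a quick sanity check that the marker counts match what you expect from your setup.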

sortedPosRad, sortedVelMas, sortedRhoPreMu: the same data as the arrays defined above, but sorted according to the bin each marker falls in.
cellStart: start location of the sequence of markers that share the same bin.
cellEnd: end location of the sequence of markers that share the same bin.
derivVelRhoD: array of float4; the first three components of each float4 are the marker acceleration (time derivative of the velocity), and the last component is the time derivative of the density.
paramsH, paramsD: useful parameters such as domain size, grid size, ...
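
To illustrate what cellStart and cellEnd hold, here is a minimal sketch of one common way to build them once the markers are sorted by bin index: each thread compares its bin with its neighbors' bins and records the boundaries. This is an assumption about the general technique, not a copy of the project code. In particular, the names findCellBounds, computeCellBounds, and EMPTY_CELL are hypothetical, and the sketch stores the end location as one past the last marker of the bin, which may differ from the convention used in the project.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical sentinel for bins that contain no markers.
const unsigned int EMPTY_CELL = 0xffffffffu;

// sortedBinD[i] is the bin index of the i-th marker after sorting by bin.
// Assumes every bin index is smaller than the number of bins.
__global__ void findCellBounds(const unsigned int* sortedBinD, int n,
                               unsigned int* cellStartD, unsigned int* cellEndD) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int bin = sortedBinD[i];
    // First marker of a bin: the previous marker belongs to a different bin.
    if (i == 0 || bin != sortedBinD[i - 1]) cellStartD[bin] = i;
    // Last marker of a bin: store one past its location (exclusive end).
    if (i == n - 1 || bin != sortedBinD[i + 1]) cellEndD[bin] = i + 1;
}

void computeCellBounds(const std::vector<unsigned int>& sortedBinH, int numBins,
                       std::vector<unsigned int>& cellStartH,
                       std::vector<unsigned int>& cellEndH) {
    const int n = static_cast<int>(sortedBinH.size());
    cellStartH.assign(numBins, EMPTY_CELL);
    cellEndH.assign(numBins, EMPTY_CELL);
    if (n == 0 || numBins == 0) return;

    unsigned int* sortedBinD = nullptr;
    unsigned int* cellStartD = nullptr;
    unsigned int* cellEndD = nullptr;
    cudaMalloc(&sortedBinD, n * sizeof(unsigned int));
    cudaMalloc(&cellStartD, numBins * sizeof(unsigned int));
    cudaMalloc(&cellEndD, numBins * sizeof(unsigned int));
    cudaMemcpy(sortedBinD, sortedBinH.data(), n * sizeof(unsigned int), cudaMemcpyHostToDevice);
    // Setting every byte to 0xff marks all bins as EMPTY_CELL.
    cudaMemset(cellStartD, 0xff, numBins * sizeof(unsigned int));
    cudaMemset(cellEndD, 0xff, numBins * sizeof(unsigned int));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    findCellBounds<<<blocks, threads>>>(sortedBinD, n, cellStartD, cellEndD);

    cudaMemcpy(cellStartH.data(), cellStartD, numBins * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaMemcpy(cellEndH.data(), cellEndD, numBins * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(sortedBinD);
    cudaFree(cellStartD);
    cudaFree(cellEndD);
}
```

With these two arrays, a neighbor search only has to loop over the locations from cellStart to cellEnd of each nearby bin in the sorted arrays.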

referenceArray: an array of int2. The first element holds the start and end locations of the fluid markers in posRadD, ...; the second element holds the start and end of the boundary markers; the third, fourth, fifth, ... elements hold the start and end of the rigid markers of rigid bodies 1, 2, 3, ... (you don't need to worry about the rigid stuff).


You do not need to worry about the following variables, since they hold rigid body data:
posRigidD, posRigidCumulativeD, velMassRigidD, qD, AD1, AD2, AD3, omegaLRF_D, rigidIdentifierD, rigidSPH_MeshPos_LRF_D, jD1, jD2, jInvD1, jInvD2
<<

ArmanP

Newbie

Posts: 28

Joined: Wed Feb 16, 2011 12:02 pm

Unread post Tue Dec 10, 2013 10:52 am

Re: ME759: Midterm Project, SPH Default Project

Answer to this question about timing:
"
For 1000 steps :
step: 0, step Time: 2855.723633
total Time: 41119.425781
step: 0, step Time: 2896.286621
total Time: 37224.222656
The difference is so big (more than 10%) that I wonder which one I should use. This happens so often that I never know whether a performance improvement is due to my change or is just another fluctuation.
Is there a way to make the timing more reliable?
"

Answer:
Four actions you can take:
1 - For timing purposes, comment out the "PrintToFile" function call in collideSphereSphere.cu. This function writes data to the hard drive.
2 - Time the code after all initialization, i.e. move these two lines:
GpuTimer myTotalTime;
myTotalTime.Start();
from the beginning of the "cudaCollisions" function to right before the computation for loop starts ( for (int tStep = 0; tStep < stepEnd + 1; tStep++) )

3 - Do not rely on the timing of a single computation. Run the code for, say, 1000 steps and calculate the average time per step.
4 - Make sure you always run your code on the same GPU, and make sure that GPU is idle before you start a timing run. As I mentioned two posts ago, you can check GPU usage with the nvidia-smi command and you can select the appropriate GPU for your run.
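
Actions 2 and 3 can be sketched as follows. This is a minimal example under stated assumptions, not the project code: dummyStep is a hypothetical stand-in for one simulation step, averageStepTimeMs is a made-up name, and the timing uses plain CUDA events instead of the GpuTimer class. The untimed warm-up launch is my own addition, so that one-time startup cost does not end up in the measurement.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the work done in one simulation step.
__global__ void dummyStep(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 0.999f + 1.0f;
}

// Returns the average time per step in milliseconds.
// Initialization and the warm-up launch are deliberately left out of the timing.
float averageStepTimeMs(int numSteps, int n) {
    float* dataD = nullptr;
    cudaMalloc(&dataD, n * sizeof(float));      // initialization: not timed
    cudaMemset(dataD, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    dummyStep<<<blocks, threads>>>(dataD, n);   // warm-up launch: not timed
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Start the timer right before the computation loop.
    cudaEventRecord(start);
    for (int tStep = 0; tStep < numSteps; tStep++) {
        dummyStep<<<blocks, threads>>>(dataD, n);
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);                 // wait until all steps are done

    float totalMs = 0.0f;
    cudaEventElapsedTime(&totalMs, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dataD);
    return totalMs / numSteps;                  // average over all steps
}
```

Comparing the averages of several such runs, before and after a change, is much more informative than comparing two single step times.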
<<

ArmanP

Newbie

Posts: 28

Joined: Wed Feb 16, 2011 12:02 pm

Unread post Tue Dec 10, 2013 11:39 am

Re: ME759: Midterm Project, SPH Default Project

Validation Benchmark:
The code generates a file called "dataRigidCenterVsTimeAndDistance.txt". You have to compare it with "Reference_DataRigidCenterVsTimeAndDistance.txt". To help you with the comparison, I wrote a MATLAB file, "mainParticleTrajectory.m", which reads the two .txt files and generates a comparison plot. You should include this plot in your report. All you need to do is replace "dataRigidCenterVsTimeAndDistance.txt" with the one your code generates and run the MATLAB script.
All of these files are checked into the same Bitbucket folder.
<<

ArmanP

Newbie

Posts: 28

Joined: Wed Feb 16, 2011 12:02 pm

Unread post Wed Dec 11, 2013 4:32 pm

Re: ME759: Midterm Project, SPH Default Project

As an option for speeding up the simulation, you can look into CUDA streams. The idea is to keep the GPU busy with other work while the code would otherwise be stalled on unavoidable memory transactions.
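
Here is a minimal sketch of the stream idea, not taken from the project code. The array is split into chunks, and each chunk gets its own stream for copy in, kernel, and copy out, so the copy of one chunk can overlap with the kernel of another. The names scale and scaleWithStreams are hypothetical. Note that the host buffer is allocated with cudaMallocHost: asynchronous copies need pinned host memory to actually overlap with kernels.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

// Hypothetical kernel: multiplies each entry by a factor.
__global__ void scale(float* d, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= factor;
}

// Scales the input on the GPU, one chunk per stream.
std::vector<float> scaleWithStreams(const std::vector<float>& input,
                                    float factor, int numStreams) {
    const int n = static_cast<int>(input.size());
    std::vector<float> output(n);
    if (n == 0 || numStreams < 1) return output;

    float* pinnedH = nullptr;
    float* dataD = nullptr;
    cudaMallocHost(&pinnedH, n * sizeof(float));   // pinned host memory
    cudaMalloc(&dataD, n * sizeof(float));
    std::copy(input.begin(), input.end(), pinnedH);

    std::vector<cudaStream_t> streams(numStreams);
    for (auto& s : streams) cudaStreamCreate(&s);

    const int chunk = (n + numStreams - 1) / numStreams;
    for (int k = 0; k < numStreams; k++) {
        const int offset = k * chunk;
        const int count = std::min(chunk, n - offset);
        if (count <= 0) break;
        const size_t bytes = count * sizeof(float);
        // Work queued in the same stream runs in order; different streams may overlap.
        cudaMemcpyAsync(dataD + offset, pinnedH + offset, bytes,
                        cudaMemcpyHostToDevice, streams[k]);
        scale<<<(count + 255) / 256, 256, 0, streams[k]>>>(dataD + offset, count, factor);
        cudaMemcpyAsync(pinnedH + offset, dataD + offset, bytes,
                        cudaMemcpyDeviceToHost, streams[k]);
    }
    cudaDeviceSynchronize();                       // wait for all streams

    std::copy(pinnedH, pinnedH + n, output.begin());
    for (auto& s : streams) cudaStreamDestroy(s);
    cudaFree(dataD);
    cudaFreeHost(pinnedH);
    return output;
}
```

Whether this pays off depends on how much time the code spends in transfers compared to kernels, so time it both ways before keeping it.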
<<

f13-759-agola

Newbie

Posts: 35

Joined: Fri Sep 13, 2013 11:30 am

Unread post Fri Dec 13, 2013 6:00 pm

Re: ME759: Midterm Project, SPH Default Project

As mentioned on the previous page, I was using the average time to compare performance, with stepEnd set to 1000.
After implementing an optimization, I got this performance:
average Time : 120.780106
If I re-execute the program immediately, my average time drops to
average Time : 86.817863
average Time : 87.133163
I believe this is due to data caching. I should not rely on the average times in the 80s: if my data size increases, the cache will evict things much more frequently, which would make these values inconsistent as well. To get a true performance comparison, I want to make sure that I use a clean system state as the baseline.

Since I am a non-sudo user, I am unable to use drop_caches.

Should I just change my execution GPU? Is there any better way to get a true idea of the performance?
<<

Dan Negrut

Global Moderator

Posts: 833

Joined: Wed Sep 03, 2008 12:24 pm

Unread post Sat Dec 14, 2013 10:27 am

Re: ME759: Midterm Project, SPH Default Project

Anjali - if drop_caches gives you a difference in timing, it means you are not placing the timing calls at the right locations in your code.
Hitting or missing the cache should be relevant only on the host side, and you are timing something on the device side. Make sure you place the timing functions right before and right after the kernel call. If the timing is done this way, what you measure should not change whether data is cached or not, since a kernel call doesn't talk to the host cache (only the kernel call arguments might or might not be cached, and for your code they will very likely be cached).

A couple of things:
- don't write anything to disk when you run your timing. If you do I/O, all your changes will get lost in the noise
- do the timing inside the for loop; place the relevant calls right before and right after each kernel call
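
A minimal sketch of timing each kernel call inside the for loop is given next. The kernels forceKernel and updateKernel are hypothetical placeholders for the real kernels, and KernelTimes and timeKernelsInLoop are made-up names. The point is only where the event calls go: right before and right after each kernel launch, with the elapsed time accumulated per kernel.

```cuda
#include <cuda_runtime.h>

// Hypothetical placeholders for two kernels called in every time step.
__global__ void forceKernel(float* d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}
__global__ void updateKernel(float* d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 0.5f;
}

// Accumulated device time per kernel, in milliseconds.
struct KernelTimes { float forceMs; float updateMs; };

KernelTimes timeKernelsInLoop(int numSteps, int n) {
    KernelTimes totals = {0.0f, 0.0f};
    float* dataD = nullptr;
    cudaMalloc(&dataD, n * sizeof(float));
    cudaMemset(dataD, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    for (int tStep = 0; tStep < numSteps; tStep++) {
        float ms = 0.0f;

        cudaEventRecord(start);                      // right before the kernel
        forceKernel<<<blocks, threads>>>(dataD, n);
        cudaEventRecord(stop);                       // right after the kernel
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
        totals.forceMs += ms;

        cudaEventRecord(start);
        updateKernel<<<blocks, threads>>>(dataD, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
        totals.updateMs += ms;
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dataD);
    return totals;
}
```

Dividing each total by the number of steps gives a per-kernel average, which also tells you which kernel is worth optimizing first.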

I hope this helps.
Dan
<<

f13-759-asekar

Newbie

Posts: 9

Joined: Mon Sep 09, 2013 9:12 am

Unread post Thu Jan 16, 2014 8:02 pm

Re: ME759: Midterm Project, SPH Default Project

Dear Professor,
It would be really useful if the SPH project groups' collective optimization results and observations were posted.


Thanks
Ajay
<<

Dan Negrut

Global Moderator

Posts: 833

Joined: Wed Sep 03, 2008 12:24 pm

Unread post Thu Jan 16, 2014 11:39 pm

Re: ME759: Midterm Project, SPH Default Project

I'll talk with Arman; he will summarize the outcome.
It's not going to happen too soon, since several students working on this haven't submitted their work yet.
Have a great semester,
Dan
<<

xiudongwu

Newbie

Posts: 15

Joined: Fri Sep 04, 2015 12:51 pm

Unread post Sat Dec 12, 2015 11:15 am

Re: ME759: Midterm Project, SPH Default Project

good
Return to ME759 Fall 2013: High Performance Computing