
HW 11 Results

alvarolinares
Newbie
Posts: 15
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Tue Dec 08, 2015 6:41 pm

HW 11 Results

Problem 3:
Execution configuration: 16 processors in 1 node.
Timing: 0.011 s

Problem 4:
Max array size handled: 67108864 (2^26) elements.
CUDA implementation: 0.132 s
Handcrafted implementation (#nodes = 1; processors/node = 16): 0.038 s
MPI_Reduce (#nodes = 1; processors/node = 16): 0.0031 s
Attachment: problem1.png (Problem 1 plot)

yuelinpeng
Newbie
Posts: 7
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 2:40 pm

Re: HW 11 Results

1) Max array size handled: 16777216
2) CUDA implementation: 843.328 ms
3) Handcrafted implementation (NP=4): 819.03 ms
4) MPI_Reduce (NP=8): 121.496 ms

pkrishna
Newbie
Posts: 9
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 4:48 pm

Re: HW 11 Results

Problem 1
Attachment: prob1.png

chadbustard
Newbie
Posts: 10
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 5:23 pm

Re: HW 11 Results

Problem 1
Attachment: CS759_HW11_Prob1_Plot.png

chadbustard
Newbie
Posts: 10
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 5:26 pm

Re: HW 11 Results

Problem 3:
Best configuration tested: 8 nodes, 8 processors per node
Time: 4.5 ms

mikkelnielsen
Newbie
Posts: 25
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 6:18 pm

Re: HW 11 Results

Problem 3:

Two nodes, eight cores on each node.
Solution: 32.1208
Time to compute: 0.0261478 s

lucasjacobson
Newbie
Posts: 28
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 8:36 pm

Re: HW 11 Results

Problem 1:

[Problem 1 plot attachment]

Problem 3:

Nodes  Tasks/node  Time [ms]  Speedup
1      1           0.275060   1.00
1      2           0.113983   2.41
1      4           0.042005   6.55
1      8           0.024726   11.12
2      1           0.146971   1.87
2      2           0.048006   5.73
2      4           0.024114   11.41
2      8           0.012307   22.35

chandanahosamanekabbali
Newbie
Posts: 9
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 8:56 pm

Re: HW 11 Results

Nodes  Cores  Time (ms)
1      1      122
1      4      42
1      8      24
1      16     12
2      4      33
2      16     13
4      2      35
4      8      13

The times reported are averaged over 10 runs.
The lowest time obtained is for 1 node with 16 cores.
However, 32 processes distributed as 2 nodes × 16 cores or as 4 nodes × 8 cores take almost the same time.

jonathoncrandallmagana
Newbie
Posts: 14
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 10:39 pm

Re: HW 11 Results

Problem 3
2 nodes, 8 cores on each node gave 0.016662 seconds

Problem 4
Max array size handled: 50000000
CUDA Implementation: 0.014275
Handcrafted implementation (NP=16): 0.115756
MPI_Reduce (NP=16): 0.115756

saketsaurabh
Newbie
Posts: 15
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 10:53 pm

Re: HW 11 Results

Problem 3:

Best Configuration:
Number of nodes = 2, number of cores per node = 8
Time taken = 16.669 ms
Integral Value = 32.121041


Problem 1

Plot attachment: hw11-p1-plot.png

chandanahosamanekabbali
Newbie
Posts: 9
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 11:00 pm

Re: HW 11 Results

Question 4:
Max array size handled: 67,000,000
CUDA Implementation: 110.396576 ms
Handcrafted implementation (NP= 1 node, 16 cores): 72.391987 ms
MPI_Reduce (NP= 1 node, 16 cores): 110.695124 ms

kaziahmed
Newbie
Posts: 17
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Wed Dec 09, 2015 11:48 pm

Re: HW 11 Results

Problem 3:

2 nodes, 8 tasks per node, 16 tasks
Result: 32.1210406663591
Time: 58 ms


Problem 4:

2 nodes, 8 tasks per node, 16 tasks
Elements: 100663296
Inclusive GPU: 314.796143 ms
Exclusive GPU: 6.109664 ms
MPI Manual (uses Gather): 178 ms
MPI with Reduce: 142 ms

mikkelnielsen
Newbie
Posts: 25
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Thu Dec 10, 2015 12:41 am

Re: HW 11 Results

Problem 1 results image
Attachment: problem1.png

xiluwang
Newbie
Posts: 5
Joined: Tue Sep 08, 2015 6:03 pm
Posted: Thu Dec 10, 2015 12:58 am

Re: HW 11 Results

Problem 3:

I compared: one node and one core, one node and four cores, one node and eight cores, two nodes and four cores, and four nodes and two cores.

"One node and eight cores" has the best performance; its run time is 0.027755 s.

Problem 4:

Max array size handled: 200000000
Handcrafted implementation (one node and eight cores):
Run Time: 0.108948 seconds
Reduction Results: 204296
MPI Reduce (one node and eight cores):
Run Time: 0.408144 seconds
Reduction Results: 204294

The results seem slightly different, but I do not have time to check my code; it is 11:58 now :)

xiudongwu
Newbie
Posts: 15
Joined: Fri Sep 04, 2015 12:51 pm
Posted: Mon Dec 14, 2015 9:27 pm

Re: HW 11 Results

Problem 1:

Using MPI_Bcast:

Size         Time (s)
1            0.002167
2            0.000287
4            0.000244
8            0.000247
16           0.000274
32           0.000321
64           0.000333
128          0.000199
256          0.000359
512          0.000280
1024         0.001106
2048         0.002525
4096         0.001655
8192         0.002242
16384        0.002710
32768        0.002717
65536        0.004795
131072       0.007603
262144       0.011903
524288       0.008264
1048576      0.011820
2097152      0.020199
4194304      0.038020
8388608      0.134117
16777216     0.176821
33554432     0.349596
67108864     0.694659
134217728    1.388323
268435456    2.784297
536870912    5.520306
1073741824   11.005817

Using a for loop of point-to-point sends:

Size         Time (s)
1            0.046940
2            0.063970
4            0.051665
8            0.060996
16           0.055989
32           0.045704
64           0.060200
128          0.066070
256          0.054999
512          0.054986
1024         0.036941
2048         0.061041
4096         0.044572
8192         0.058414
16384        0.044984
32768        0.058997
65536        0.034000
131072       0.054359
262144       0.050843
524288       0.064418
1048576      0.086885
2097152      0.181849
4194304      0.295930
8388608      0.590581
16777216     1.194017
33554432     2.375510
67108864     4.729000
134217728    9.420702
268435456    18.829221
536870912    37.606320
1073741824   76.177518

At the largest sizes, the for loop needs almost 7× as much time as MPI_Bcast, because the root must perform 15 separate point-to-point sends (one per non-root rank), while MPI_Bcast distributes the work among the ranks.
Return to ME759 Fall 2015: High Performance Computing