Results for MPI performance tests on sp2

memcpy

Determining delivered memory performance

	mpicc -o memcpy -O memcpy.c
Size (bytes)	Time (sec)	Rate (MB/sec)
4	0.000000	18.985397
8	0.000000	37.940945
16	0.000000	76.032616
32	0.000000	70.972320
64	0.000001	109.090656
128	0.000001	149.134097
256	0.000001	182.662942
512	0.000002	205.801349
1024	0.000005	219.709263
2048	0.000009	227.400055
4096	0.000018	231.432652
8192	0.000035	233.504672
16384	0.000070	234.559777
32768	0.000139	235.082110
65536	0.000278	235.345954
131072	0.000574	228.205962
262144	0.001592	164.712457
524288	0.003164	165.700240
1048576	0.006343	165.313615
2097152	0.012857	163.112050
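
The rates in the table above are consistent with bytes copied divided by elapsed time, with 1 MB taken as 10^6 bytes (for example, 2097152 bytes in 0.012857 sec gives about 163.1 MB/sec). The source of memcpy.c is not shown here, so the following is only a minimal sketch of how such a measurement might be made. The names wall_seconds and memcpy_rate_mb, the repetition count, and the use of the C11 timespec_get timer are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Wall-clock time in seconds, using the C11 timespec_get call. */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Hypothetical helper: copy nbytes between two heap buffers reps times
 * and return the average rate in MB/sec (1 MB = 10^6 bytes).
 * Returns -1.0 if allocation fails and 0.0 if the run was too short
 * for the timer to resolve. */
double memcpy_rate_mb(size_t nbytes, int reps)
{
    assert(nbytes > 0 && reps > 0);

    unsigned char *src = malloc(nbytes);
    unsigned char *dst = malloc(nbytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 1, nbytes);
    memset(dst, 0, nbytes);   /* touch both buffers before timing */

    volatile unsigned char sink = 0;
    double t0 = wall_seconds();
    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, nbytes);
        /* Read one copied byte so the compiler is less likely to
         * discard the copy as dead code. */
        sink ^= dst[(size_t)i % nbytes];
    }
    double elapsed = wall_seconds() - t0;
    (void)sink;

    free(src);
    free(dst);
    if (elapsed <= 0.0)
        return 0.0;
    return (double)nbytes * (double)reps / elapsed / 1.0e6;
}
```

Calling memcpy_rate_mb over sizes from 4 bytes to 2 MB would produce a table of the same shape as the one above. Averaging over many repetitions also avoids the 0.000000 entries that appear for the smallest sizes when a single copy is shorter than the printed precision.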

Determining delivered memory performance with unaligned data

	mpicc -o memcpy memcpy.c
Size (bytes)	Time (sec)	Rate (MB/sec)
4	0.000000	15.642934
8	0.000000	29.534735
16	0.000000	56.050473
32	0.000001	56.050278
64	0.000001	86.870124
128	0.000001	119.811600
256	0.000002	147.840661
512	0.000003	167.424303
1024	0.000006	179.292242
2048	0.000011	185.893654
4096	0.000022	189.369790
8192	0.000043	191.148317
16384	0.000085	192.053909
32768	0.000170	192.514659
65536	0.000340	192.743488
131072	0.000698	187.672443
262144	0.001854	141.378498
524288	0.003685	142.291696
1048576	0.007365	142.364150
2097152	0.014943	140.341091

pingpong

Benchmarking point to point performance

	mpicc -o pingpong -O pingpong.c
Kind		n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000049	0.163185
Send/Recv	2	0.000049	0.326268
Send/Recv	4	0.000049	0.652744
Send/Recv	8	0.000050	1.292506
Send/Recv	16	0.000050	2.537236
Send/Recv	32	0.000067	3.810393
Send/Recv	64	0.000080	6.409815
Send/Recv	128	0.000099	10.291640
Send/Recv	256	0.000138	14.880116
Send/Recv	512	0.000212	19.362990
Send/Recv	1024	0.000415	19.721938
Send/Recv	2048	0.000674	24.319430
Send/Recv	4096	0.001163	28.174805
Send/Recv	8192	0.002101	31.187940
Send/Recv	16384	0.003990	32.849508
Send/Recv	32768	0.007872	33.300549
Send/Recv	65536	0.015528	33.764963
Send/Recv	131072	0.030608	34.258009
Send/Recv	262144	0.061561	34.066513
Send/Recv	524288	0.122536	34.229035
Send/Recv	1048576	0.244839	34.261775
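
One common way to summarize a table like this is a linear cost model, T(bytes) = latency + bytes / bandwidth. The log itself does not state a model, so the sketch below is an assumption: it fits the two parameters through two rows of the table. It also assumes that n counts 8-byte doubles, which is consistent with the printed rates (8 bytes in 0.000049 sec is about 0.163 MB/sec). The names LinkModel, fit_link_model and predict_time are hypothetical.

```c
#include <assert.h>

/* Parameters of the assumed model T(bytes) = latency + bytes / bandwidth. */
typedef struct {
    double latency_sec;          /* startup cost per message */
    double bandwidth_bytes_sec;  /* asymptotic transfer rate */
} LinkModel;

/* Fit the model exactly through two (message size, time) measurements. */
LinkModel fit_link_model(double bytes_a, double time_a,
                         double bytes_b, double time_b)
{
    assert(bytes_b > bytes_a && time_b > time_a);

    LinkModel m;
    double sec_per_byte = (time_b - time_a) / (bytes_b - bytes_a);
    m.bandwidth_bytes_sec = 1.0 / sec_per_byte;
    m.latency_sec = time_a - sec_per_byte * bytes_a;
    return m;
}

/* Time the model predicts for a message of the given size. */
double predict_time(LinkModel m, double bytes)
{
    return m.latency_sec + bytes / m.bandwidth_bytes_sec;
}
```

Fitting through the n = 1 and n = 1048576 rows, fit_link_model(8.0, 0.000049, 8388608.0, 0.244839), gives a latency a little under 49 microseconds and a bandwidth a little over 34 MB/sec. A two-point fit is crude: the measured time at n = 1024 (0.000415 sec) is well above what this fit predicts, so mid-sized messages are not described well by a single straight line.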

Benchmarking point to point performance with nonblocking operations

	mpicc -o pingpong -O pingpong.c
Kind		n	time (sec)	Rate (MB/sec)
Isend/Irecv	1	0.000063	0.126050
Isend/Irecv	2	0.000063	0.253398
Isend/Irecv	4	0.000063	0.505098
Isend/Irecv	8	0.000063	1.008824
Isend/Irecv	16	0.000064	1.985775
Isend/Irecv	32	0.000082	3.126286
Isend/Irecv	64	0.000089	5.728778
Isend/Irecv	128	0.000109	9.356785
Isend/Irecv	256	0.000148	13.874555
Isend/Irecv	512	0.000225	18.211533
Isend/Irecv	1024	0.000424	19.337289
Isend/Irecv	2048	0.000685	23.932224
Isend/Irecv	4096	0.001170	28.000258
Isend/Irecv	8192	0.002120	30.913571
Isend/Irecv	16384	0.004027	32.552037
Isend/Irecv	32768	0.007897	33.195338
Isend/Irecv	65536	0.015564	33.686862
Isend/Irecv	131072	0.030882	33.954580
Isend/Irecv	262144	0.061660	34.011754
Isend/Irecv	524288	0.122753	34.168762
Isend/Irecv	1048576	0.245075	34.228686

Benchmarking point to point performance with nonblocking operations, head-to-head

	mpicc -o pingpong -O pingpong.c
Kind				n	time (sec)	Rate (MB/sec)
head-to-head Isend/Irecv	1	0.000080	0.201214
head-to-head Isend/Irecv	2	0.000079	0.402765
head-to-head Isend/Irecv	4	0.000080	0.795391
head-to-head Isend/Irecv	8	0.000080	1.595369
head-to-head Isend/Irecv	16	0.000082	3.104092
head-to-head Isend/Irecv	32	0.000098	5.212693
head-to-head Isend/Irecv	64	0.000112	9.125339
head-to-head Isend/Irecv	128	0.000135	15.123161
head-to-head Isend/Irecv	256	0.000175	23.389012
head-to-head Isend/Irecv	512	0.000233	35.109834
head-to-head Isend/Irecv	1024	0.000542	30.207881
head-to-head Isend/Irecv	2048	0.000898	36.478814
head-to-head Isend/Irecv	4096	0.001617	40.541285
head-to-head Isend/Irecv	8192	0.003017	43.446280
head-to-head Isend/Irecv	16384	0.005857	44.756812
head-to-head Isend/Irecv	32768	0.011543	45.419346
head-to-head Isend/Irecv	65536	0.022918	45.754080
head-to-head Isend/Irecv	131072	0.045612	45.978504
head-to-head Isend/Irecv	262144	0.090735	46.225714
head-to-head Isend/Irecv	524288	0.181509	46.216036
head-to-head Isend/Irecv	1048576	0.363857	46.109399

Benchmarking point to point performance with unaligned data

	mpicc -o pingpong -O pingpong.c
Kind char		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000049	0.020405
Send/Recv		2	0.000049	0.040865
Send/Recv		4	0.000049	0.081714
Send/Recv		8	0.000049	0.163980
Send/Recv		16	0.000049	0.326449
Send/Recv		32	0.000049	0.647599
Send/Recv		64	0.000050	1.280320
Send/Recv		128	0.000052	2.482080
Send/Recv		256	0.000070	3.666092
Send/Recv		512	0.000091	5.641870
Send/Recv		1024	0.000104	9.811954
Send/Recv		2048	0.000148	13.874161
Send/Recv		4096	0.000206	19.908861
Send/Recv		8192	0.000418	19.610990
Send/Recv		16384	0.000672	24.383221
Send/Recv		32768	0.001153	28.411150
Send/Recv		65536	0.002094	31.289756
Send/Recv		131072	0.004028	32.542137
Send/Recv		262144	0.007851	33.387761
Send/Recv		524288	0.015546	33.725867
Send/Recv		1048576	0.030873	33.964575
Kind double		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000049	0.162870
Send/Recv		2	0.000049	0.326104
Send/Recv		4	0.000049	0.652024
Send/Recv		8	0.000050	1.292666
Send/Recv		16	0.000050	2.539784
Send/Recv		32	0.000067	3.812383
Send/Recv		64	0.000079	6.441872
Send/Recv		128	0.000099	10.379945
Send/Recv		256	0.000140	14.612915
Send/Recv		512	0.000209	19.632131
Send/Recv		1024	0.000415	19.733221
Send/Recv		2048	0.000675	24.264503
Send/Recv		4096	0.001157	28.331317
Send/Recv		8192	0.002113	31.012682
Send/Recv		16384	0.004060	32.283644
Send/Recv		32768	0.007947	32.984461
Send/Recv		65536	0.015460	33.912411
Send/Recv		131072	0.030828	34.014154
Send/Recv		262144	0.061546	34.074490
Send/Recv		524288	0.122700	34.183435
Send/Recv		1048576	0.245707	34.140733
Kind int		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000049	0.081716
Send/Recv		2	0.000049	0.163345
Send/Recv		4	0.000049	0.326250
Send/Recv		8	0.000049	0.651736
Send/Recv		16	0.000049	1.295562
Send/Recv		32	0.000050	2.539622
Send/Recv		64	0.000068	3.788757
Send/Recv		128	0.000079	6.446187
Send/Recv		256	0.000101	10.174293
Send/Recv		512	0.000147	13.948569
Send/Recv		1024	0.000210	19.493159
Send/Recv		2048	0.000414	19.793411
Send/Recv		4096	0.000675	24.273492
Send/Recv		8192	0.001151	28.465448
Send/Recv		16384	0.002107	31.109846
Send/Recv		32768	0.004033	32.501992
Send/Recv		65536	0.007845	33.413614
Send/Recv		131072	0.015422	33.995393
Send/Recv		262144	0.030761	34.087409
Send/Recv		524288	0.061300	34.211198
Send/Recv		1048576	0.122871	34.135771
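
The three tables above report n in elements of the named type, so the same n means a different message size in each. The printed rates are consistent with element sizes of 1, 4 and 8 bytes for char, int and double, and with 1 MB = 10^6 bytes. A minimal sketch of that conversion follows; the function name rate_mb_per_sec is hypothetical, and the element sizes are inferred from the numbers rather than stated in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Convert an element count, an element size in bytes and an elapsed
 * time into a rate in MB/sec (1 MB = 10^6 bytes). */
double rate_mb_per_sec(size_t count, size_t elem_bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)count * (double)elem_bytes / seconds / 1.0e6;
}
```

For example, rate_mb_per_sec(1048576, 1, 0.030873) reproduces the last char row to within the rounding of the printed time. Comparing rows with equal byte counts (char n = 8192, int n = 2048, double n = 1024) shows nearly identical times of 0.000418, 0.000414 and 0.000415 sec, so in these runs the time tracks the message size in bytes rather than the element type.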

Benchmarking point to point performance with contention

	mpicc -o pingpong -O pingpong.c
Kind (np=2)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000049	0.163598
Send/Recv	2	0.000049	0.325285
Send/Recv	4	0.000049	0.649081
Send/Recv	8	0.000049	1.293648
Send/Recv	16	0.000050	2.540597
Send/Recv	32	0.000067	3.795790
Send/Recv	64	0.000080	6.430141
Send/Recv	128	0.000099	10.344181
Send/Recv	256	0.000141	14.515384
Send/Recv	512	0.000214	19.113392
Send/Recv	1024	0.000419	19.574669
Send/Recv	2048	0.000673	24.352868
Send/Recv	4096	0.001158	28.301036
Send/Recv	8192	0.002095	31.279488
Send/Recv	16384	0.003998	32.780395
Send/Recv	32768	0.007843	33.425277
Send/Recv	65536	0.015587	33.635615
Send/Recv	131072	0.030801	34.043846
Send/Recv	262144	0.061414	34.147548
Send/Recv	524288	0.122301	34.294932
Send/Recv	1048576	0.245985	34.102155
Kind (np=4)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000049	0.162797
Send/Recv	2	0.000049	0.326176
Send/Recv	4	0.000049	0.648283
Send/Recv	8	0.000050	1.287263
Send/Recv	16	0.000051	2.532530
Send/Recv	32	0.000068	3.771057
Send/Recv	64	0.000080	6.398667
Send/Recv	128	0.000101	10.167918
Send/Recv	256	0.000139	14.754160
Send/Recv	512	0.000213	19.189511
Send/Recv	1024	0.000414	19.770128
Send/Recv	2048	0.000672	24.376421
Send/Recv	4096	0.001157	28.311429
Send/Recv	8192	0.002118	30.936557
Send/Recv	16384	0.004036	32.472701
Send/Recv	32768	0.007879	33.272494
Send/Recv	65536	0.015539	33.740436
Send/Recv	131072	0.030841	33.999444
Send/Recv	262144	0.061530	34.083365
Send/Recv	524288	0.122533	34.230055
Send/Recv	1048576	0.245507	34.168519
Kind (np=8)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000049	0.162823
Send/Recv	2	0.000049	0.324955
Send/Recv	4	0.000049	0.647729
Send/Recv	8	0.000050	1.289964
Send/Recv	16	0.000051	2.532763
Send/Recv	32	0.000068	3.769735
Send/Recv	64	0.000080	6.391811
Send/Recv	128	0.000100	10.190503
Send/Recv	256	0.000140	14.612045
Send/Recv	512	0.000214	19.167057
Send/Recv	1024	0.000416	19.705337
Send/Recv	2048	0.000679	24.144271
Send/Recv	4096	0.001160	28.256194
Send/Recv	8192	0.002113	31.018738
Send/Recv	16384	0.004039	32.450893
Send/Recv	32768	0.007884	33.249863
Send/Recv	65536	0.015517	33.787185
Send/Recv	131072	0.030881	33.954855
Send/Recv	262144	0.061674	34.004033
Send/Recv	524288	0.122890	34.130441
Send/Recv	1048576	0.246030	34.095821
Kind (np=16)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000051	0.158407
Send/Recv	2	0.000051	0.312505
Send/Recv	4	0.000052	0.620784
Send/Recv	8	0.000052	1.236058
Send/Recv	16	0.000054	2.368169
Send/Recv	32	0.000069	3.687904
Send/Recv	64	0.000083	6.180154
Send/Recv	128	0.000102	10.073428
Send/Recv	256	0.000143	14.369410
Send/Recv	512	0.000222	18.472299
Send/Recv	1024	0.000442	18.517701
Send/Recv	2048	0.000730	22.431159
Send/Recv	4096	0.001251	26.199992
Send/Recv	8192	0.002312	28.349394
Send/Recv	16384	0.004411	29.717414
Send/Recv	32768	0.008621	30.408624
Send/Recv	65536	0.017136	30.595906
Send/Recv	131072	0.034046	30.799243
Send/Recv	262144	0.067761	30.949043
Send/Recv	524288	0.135505	30.953037
Send/Recv	1048576	0.272360	30.799719
Kind (np=32)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000051	0.157621
Send/Recv	2	0.000051	0.313361
Send/Recv	4	0.000052	0.620078
Send/Recv	8	0.000052	1.224519
Send/Recv	16	0.000055	2.337237
Send/Recv	32	0.000070	3.659166
Send/Recv	64	0.000083	6.169294
Send/Recv	128	0.000106	9.655172
Send/Recv	256	0.000155	13.232467
Send/Recv	512	0.000323	12.697823
Send/Recv	1024	0.000498	16.460956
Send/Recv	2048	0.000826	19.834152
Send/Recv	4096	0.001513	21.655844
Send/Recv	8192	0.002810	23.319618
Send/Recv	16384	0.005436	24.110849
Send/Recv	32768	0.010479	25.015531
Send/Recv	65536	0.021049	24.907710
Send/Recv	131072	0.041756	25.112217
Send/Recv	262144	0.083295	25.177255
Send/Recv	524288	0.166516	25.188585
Send/Recv	1048576	0.333755	25.134057
Kind (np=64)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000055	0.145556
Send/Recv	2	0.000055	0.290432
Send/Recv	4	0.000056	0.573316
Send/Recv	8	0.000056	1.142998
Send/Recv	16	0.000056	2.279837
Send/Recv	32	0.000072	3.545726
Send/Recv	64	0.000086	5.958627
Send/Recv	128	0.000105	9.770160
Send/Recv	256	0.000225	9.104245
Send/Recv	512	0.000311	13.161954
Send/Recv	1024	0.000469	17.453926
Send/Recv	2048	0.000832	19.683140
Send/Recv	4096	0.001558	21.038507
Send/Recv	8192	0.002971	22.061258
Send/Recv	16384	0.006034	21.721566
Send/Recv	32768	0.011912	22.005838
Send/Recv	65536	0.023890	21.945896
Send/Recv	131072	0.047515	22.068426
Send/Recv	262144	0.095488	21.962383
Send/Recv	524288	0.187142	22.412377
Send/Recv	1048576	0.377694	22.210091

barrier

Benchmarking collective barrier

	mpicc -o barrier -O barrier.c
Kind	np	time (sec)
Barrier	1	0.000009
Barrier	2	0.000082
Barrier	4	0.000147
Barrier	8	0.000217
Barrier	16	0.000297
Barrier	32	0.000391
Barrier	64	0.000593

Benchmarking collective Allreduce

	mpicc -o barrier -O barrier.c
Kind		np	time (sec)
Allreduce	1	0.000019
Allreduce	2	0.000105
Allreduce	4	0.000179
Allreduce	8	0.000259
Allreduce	16	0.000357
Allreduce	32	0.000468
Allreduce	64	0.000670

vector

Comparing the performance of MPI vector datatypes

	mpicc -o vector -O vector.c
Kind	n	stride	time (sec)	Rate (MB/sec)
Vector	1000	24	0.002025	3.951277
Struct	1000	24	0.002662	3.005806
User	1000	24	0.000532	15.029694
User(add)	1000	24	0.000532	15.042305
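
The source of vector.c is not included in this log. Assuming the "User" rows refer to user code that copies the strided elements into a contiguous buffer before sending, instead of describing the layout with an MPI derived datatype, the packing step might look like the sketch below. The names pack_strided and unpack_strided are hypothetical, and the stride here is counted in elements, which may not match the units of the stride column.

```c
#include <assert.h>
#include <stddef.h>

/* Gather n doubles spaced `stride` elements apart into a contiguous
 * buffer, ready to be sent as one plain message. */
void pack_strided(const double *src, size_t n, size_t stride, double *dst)
{
    assert(src != NULL && dst != NULL && stride > 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i * stride];
}

/* Inverse operation on the receiving side: scatter a contiguous
 * buffer back into strided storage. */
void unpack_strided(const double *src, size_t n, size_t stride, double *dst)
{
    assert(src != NULL && dst != NULL && stride > 0);
    for (size_t i = 0; i < n; i++)
        dst[i * stride] = src[i];
}
```

In the table above, the User rows are roughly four to five times faster than the Vector and Struct rows for n = 1000, which is the kind of gap a hand-written copy loop followed by a contiguous send could produce on this system.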

circulate

Pipelining pitfalls

	mpicc -c -O  circulate.c
	mpicc -o circulate -O circulate.o -lm
For n = 20000, m = 20000, T_comm = 0.006650, T_compute = 0.024686, sum = 0.031336, T_both = 0.031700
For n = 500, m = 500, T_comm = 0.000145, T_compute = 0.000631, sum = 0.000776, T_both = 0.000787
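
In both lines above T_both is slightly larger than sum = T_comm + T_compute, which suggests that communication and computation did not overlap in these runs. One way to put a number on this is the fraction of the shorter phase that was hidden. The metric and the name overlap_fraction below are assumptions for illustration and are not part of circulate.c.

```c
#include <assert.h>

/* Fraction of the shorter phase hidden by overlap:
 *   1.0       the shorter phase was completely hidden
 *   0.0       no overlap, T_both equals T_comm + T_compute
 *   negative  the combined run was slower than the two phases run separately */
double overlap_fraction(double t_comm, double t_compute, double t_both)
{
    assert(t_comm > 0.0 && t_compute > 0.0);
    double shorter = (t_comm < t_compute) ? t_comm : t_compute;
    return (t_comm + t_compute - t_both) / shorter;
}
```

For the n = 20000 line, overlap_fraction(0.006650, 0.024686, 0.031700) is slightly negative, and the same holds for the n = 500 line.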

3way

Exploring the cost of synchronization delays

	mpicc -c -O  bad.c
	mpicc -o bad -O bad.o  -lm
[2] Litsize = 8, Time for first send = 0.000049, for second = 0.000033
[2] Litsize = 9, Time for first send = 0.000049, for second = 0.000032
[2] Litsize = 511, Time for first send = 0.000179, for second = 0.000146
[2] Litsize = 512, Time for first send = 0.000170, for second = 0.000139
[2] Litsize = 513, Time for first send = 0.000348, for second = 0.001785

jacobi

Jacobi Iteration - Example Parallel Mesh

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
send/recv: 25 iterations in 0.013504 secs (2.073418 MFlops); diffnorm 0.036615, m=7 n=34 np=16
send/recv: 25 iterations in 0.864316 secs (26.538430 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
send/recv: 25 iterations in 0.017809 secs (3.144473 MFlops); diffnorm 0.055291, m=7 n=66 np=32
send/recv: 25 iterations in 1.760595 secs (26.056649 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
send/recv: 25 iterations in 0.025680 secs (4.361456 MFlops); diffnorm 0.080560, m=7 n=130 np=64
send/recv: 25 iterations in 3.492929 secs (26.267467 MFlops); diffnorm 0.474303, m=4098 n=130 np=64
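
The MFlops figures in the Jacobi output are consistent with counting 7 floating-point operations per interior mesh point per iteration over an (m-2) by (n-2) interior, summed over all processes. This formula is inferred from the printed numbers and is not taken from jacobi.c, so treat the sketch below as an assumption; the name jacobi_mflops is hypothetical.

```c
#include <assert.h>

/* Aggregate MFlops for a run, assuming 7 flops per interior point per
 * iteration on an (m-2) x (n-2) interior. The constant 7 is inferred
 * from the log, not read from the benchmark source. */
double jacobi_mflops(int m, int n, int iterations, double seconds)
{
    assert(m > 2 && n > 2 && iterations > 0 && seconds > 0.0);
    double interior = (double)(m - 2) * (double)(n - 2);
    double flops = 7.0 * interior * (double)iterations;
    return flops / seconds / 1.0e6;
}
```

For example, jacobi_mflops(4098, 34, 25, 0.864316) matches the 26.54 MFlops reported for m=4098 n=34 np=16. Because the operation count depends only on m, n and the iteration count, differences in MFlops between the Jacobi variants at the same m, n and np come entirely from the elapsed time.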

Jacobi Iteration - Shift up and down

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
shift/sendrecv: 25 iterations in 0.016899 secs (1.656886 MFlops); diffnorm 0.036615, m=7 n=34 np=16
shift/sendrecv: 25 iterations in 0.149310 secs (153.623592 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
shift/sendrecv: 25 iterations in 0.018448 secs (3.035477 MFlops); diffnorm 0.055291, m=7 n=66 np=32
shift/sendrecv: 25 iterations in 0.177332 secs (258.697083 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
shift/sendrecv: 25 iterations in 0.047895 secs (2.338469 MFlops); diffnorm 0.080560, m=7 n=130 np=64
shift/sendrecv: 25 iterations in 0.171299 secs (535.616832 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Exchange head-to-head

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
head-to-head sendrecv: 25 iterations in 0.014439 secs (1.939253 MFlops); diffnorm 0.036615, m=7 n=34 np=16
head-to-head sendrecv: 25 iterations in 0.149807 secs (153.114340 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
head-to-head sendrecv: 25 iterations in 0.019578 secs (2.860408 MFlops); diffnorm 0.055291, m=7 n=66 np=32
head-to-head sendrecv: 25 iterations in 0.180843 secs (253.674708 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
head-to-head sendrecv: 25 iterations in 0.033262 secs (3.367246 MFlops); diffnorm 0.080560, m=7 n=130 np=64
head-to-head sendrecv: 25 iterations in 0.242692 secs (378.052490 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Nonblocking send/recv

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
irecv/isend: 25 iterations in 0.035451 secs (0.789819 MFlops); diffnorm 0.036615, m=7 n=34 np=16
irecv/isend: 25 iterations in 0.150000 secs (152.917384 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
irecv/isend: 25 iterations in 0.041119 secs (1.361916 MFlops); diffnorm 0.055291, m=7 n=66 np=32
irecv/isend: 25 iterations in 0.175528 secs (261.355491 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
irecv/isend: 25 iterations in 0.046269 secs (2.420634 MFlops); diffnorm 0.080560, m=7 n=130 np=64
irecv/isend: 25 iterations in 0.269917 secs (339.921231 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Nonblocking send/recv for receiver pull

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
isend/irecv: 25 iterations in 0.014410 secs (1.943095 MFlops); diffnorm 0.036615, m=7 n=34 np=16
isend/irecv: 25 iterations in 0.140435 secs (163.332241 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
isend/irecv: 25 iterations in 0.016638 secs (3.365734 MFlops); diffnorm 0.055291, m=7 n=66 np=32
isend/irecv: 25 iterations in 0.168898 secs (271.615263 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
isend/irecv: 25 iterations in 0.053518 secs (2.092769 MFlops); diffnorm 0.080560, m=7 n=130 np=64
isend/irecv: 25 iterations in 0.156072 secs (587.873865 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Synchronous send

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
ssend/irecv: 25 iterations in 0.019422 secs (1.441638 MFlops); diffnorm 0.036615, m=7 n=34 np=16
ssend/irecv: 25 iterations in 0.146725 secs (156.330044 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
ssend/irecv: 25 iterations in 0.022631 secs (2.474528 MFlops); diffnorm 0.055291, m=7 n=66 np=32
ssend/irecv: 25 iterations in 0.175759 secs (261.012028 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
ssend/irecv: 25 iterations in 0.209149 secs (0.535504 MFlops); diffnorm 0.080560, m=7 n=130 np=64
ssend/irecv: 25 iterations in 0.175688 secs (522.234202 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Ready send

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
rsend: 25 iterations in 0.013047 secs (2.146100 MFlops); diffnorm 0.036615, m=7 n=34 np=16
rsend: 25 iterations in 0.224038 secs (102.382460 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
rsend: 25 iterations in 0.016684 secs (3.356424 MFlops); diffnorm 0.055291, m=7 n=66 np=32
rsend: 25 iterations in 0.174061 secs (263.557913 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
rsend: 25 iterations in 0.023240 secs (4.819225 MFlops); diffnorm 0.080560, m=7 n=130 np=64
rsend: 25 iterations in 0.152655 secs (601.031674 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Overlapping communication

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
isend/overlap: 25 iterations in 0.014017 secs (1.997621 MFlops); diffnorm 0.036615, m=7 n=34 np=16
isend/overlap: 25 iterations in 0.152737 secs (150.176782 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
isend/overlap: 25 iterations in 0.017677 secs (3.167972 MFlops); diffnorm 0.055291, m=7 n=66 np=32
isend/overlap: 25 iterations in 0.178825 secs (256.537017 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
isend/overlap: 25 iterations in 0.027328 secs (4.098297 MFlops); diffnorm 0.080560, m=7 n=130 np=64
isend/overlap: 25 iterations in 0.220125 secs (416.810591 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Overlapping communication (sends first)

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
send first/overlap: 25 iterations in 0.013883 secs (2.016837 MFlops); diffnorm 0.036615, m=7 n=34 np=16
send first/overlap: 25 iterations in 0.142112 secs (161.405515 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
send first/overlap: 25 iterations in 0.017231 secs (3.250004 MFlops); diffnorm 0.055291, m=7 n=66 np=32
send first/overlap: 25 iterations in 0.170886 secs (268.455504 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
send first/overlap: 25 iterations in 0.034886 secs (3.210418 MFlops); diffnorm 0.080560, m=7 n=130 np=64
send first/overlap: 25 iterations in 0.178609 secs (513.694662 MFlops); diffnorm 0.474303, m=4098 n=130 np=64

Jacobi Iteration - Persistent send/recv

	mpicc -c -O  jacobi.c
	mpicc -c -O  cmdline.c
	mpicc -c -O  setupmesh.c
	mpicc -c -O  exchng.c
	mpicc -o jacobi -O jacobi.o cmdline.o setupmesh.o exchng.o -lm
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
qsub failed: qsub: Job exceeds queue resource limits
mpirun aborting
persistent send/recv: 25 iterations in 0.013110 secs (2.135713 MFlops); diffnorm 0.036615, m=7 n=34 np=16
persistent send/recv: 25 iterations in 0.150196 secs (152.718011 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
persistent send/recv: 25 iterations in 0.016714 secs (3.350575 MFlops); diffnorm 0.055291, m=7 n=66 np=32
persistent send/recv: 25 iterations in 0.177111 secs (259.019997 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
persistent send/recv: 25 iterations in 0.024360 secs (4.597654 MFlops); diffnorm 0.080560, m=7 n=130 np=64
persistent send/recv: 25 iterations in 0.179437 secs (511.324535 MFlops); diffnorm 0.474303, m=4098 n=130 np=64