Results for MPI performance tests on anlspx (nopoll)

memcpy

Determining delivered memory performance

Target memcpy is up to date.
Size (bytes)	Time (sec)	Rate (MB/sec)
4	0.000000	15.613092
8	0.000000	31.221766
16	0.000000	55.669211
32	0.000001	46.609181
64	0.000001	66.798034
128	0.000002	85.263756
256	0.000003	98.939044
512	0.000005	107.566209
1024	0.000009	112.468054
2048	0.000018	115.090994
4096	0.000035	116.448907
8192	0.000070	117.139258
16384	0.000143	114.196600
32768	0.000563	58.247751
65536	0.001122	58.404997
131072	0.002246	58.348876
262144	0.004512	58.102506
524288	0.009135	57.392694
1048576	0.018317	57.246056
2097152	0.036738	57.084389
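
The Rate column is consistent with size divided by time, taking 1 MB as 10^6 bytes. That relation is inferred from the numbers above, not from the benchmark source, and the helper name below is hypothetical. Times are printed to only six decimal places, so recomputed rates for the smaller sizes will differ somewhat from the reported ones, and rows with a printed time of 0.000000 cannot be recomputed at all. A minimal Python sketch of the check:

```python
def rate_mb_per_sec(size_bytes, time_sec):
    """Transfer rate in MB/sec, assuming 1 MB = 10**6 bytes."""
    return size_bytes / time_sec / 1e6

# (size in bytes, time in sec, reported rate) copied from the aligned memcpy table
rows = [
    (16384, 0.000143, 114.196600),
    (32768, 0.000563, 58.247751),
    (2097152, 0.036738, 57.084389),
]

for size, time_sec, reported in rows:
    recomputed = rate_mb_per_sec(size, time_sec)
    print(f"{size:8d} bytes: recomputed {recomputed:8.2f}, reported {reported:8.2f}")
```

The same rows show the delivered rate dropping by roughly half between 16384 and 32768 bytes.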

Determining delivered memory performance with unaligned data

Target memcpy is up to date.
Size (bytes)	Time (sec)	Rate (MB/sec)
4	0.000000	9.600770
8	0.000000	18.491915
16	0.000000	32.223567
32	0.000001	35.664054
64	0.000001	53.442179
128	0.000002	70.932362
256	0.000003	84.811184
512	0.000005	94.007348
1024	0.000010	99.397056
2048	0.000020	102.330468
4096	0.000039	103.861556
8192	0.000078	104.644125
16384	0.000160	102.084971
32768	0.000494	66.357171
65536	0.000986	66.483666
131072	0.001971	66.499975
262144	0.003946	66.438735
524288	0.007998	65.555461
1048576	0.016075	65.231249
2097152	0.032440	64.647949

pingpong

Benchmarking point to point performance

Target pingpong is up to date.
Kind		n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000113	0.071106
Send/Recv	2	0.000112	0.142856
Send/Recv	4	0.000114	0.280479
Send/Recv	8	0.000356	0.179938
Send/Recv	16	0.000296	0.432699
Send/Recv	32	0.000239	1.072867
Send/Recv	64	0.000220	2.329859
Send/Recv	128	0.000224	4.572158
Send/Recv	256	0.000282	7.254372
Send/Recv	512	0.000386	10.606936
Send/Recv	1024	0.000586	13.985489
Send/Recv	2048	0.001079	15.186716
Send/Recv	4096	0.001693	19.350134
Send/Recv	8192	0.002841	23.071283
Send/Recv	16384	0.004899	26.757443
Send/Recv	32768	0.008930	29.356294
Send/Recv	65536	0.016871	31.076124
Send/Recv	131072	0.033063	31.714904
Send/Recv	262144	0.065363	32.084886
Send/Recv	524288	0.129922	32.283269
Send/Recv	1048576	0.259748	32.295197
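
One common way to summarize a table like this is a two-parameter model, time = latency + bytes / bandwidth. The sketch below is an illustration under stated assumptions, not part of the benchmark: it assumes n counts 8-byte items (which matches the reported rates, for example 1048576 * 8 bytes in 0.259748 sec is about 32.3 MB/sec), takes the n = 1 time as the latency, and takes the slope between the two largest messages as the bandwidth. All names are hypothetical.

```python
BYTES_PER_ITEM = 8  # assumption: n counts 8-byte doubles

# (n, time in sec) copied from the blocking Send/Recv table
smallest = (1, 0.000113)
large_a = (524288, 0.129922)
large_b = (1048576, 0.259748)

# Startup cost: approximated by the time of the smallest message.
latency_sec = smallest[1]

# Asymptotic bandwidth: slope between the two largest messages, in MB/sec.
delta_bytes = (large_b[0] - large_a[0]) * BYTES_PER_ITEM
bandwidth_mb = delta_bytes / (large_b[1] - large_a[1]) / 1e6

def predicted_time(n):
    """Predicted transfer time in seconds for a message of n items."""
    return latency_sec + n * BYTES_PER_ITEM / (bandwidth_mb * 1e6)

print(f"latency ~ {latency_sec * 1e6:.0f} us, bandwidth ~ {bandwidth_mb:.1f} MB/sec")
print(f"n=65536: predicted {predicted_time(65536):.6f} sec, measured 0.016871 sec")
```

The model ignores the jump in time between n = 4 and n = 8 visible in the table, so it should be read as a rough summary only.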

Benchmarking point to point performance with nonblocking operations

Target pingpong is up to date.
Kind		n	time (sec)	Rate (MB/sec)
Isend/Irecv	1	0.000124	0.064615
Isend/Irecv	2	0.000123	0.130093
Isend/Irecv	4	0.000125	0.255099
Isend/Irecv	8	0.000352	0.181587
Isend/Irecv	16	0.000299	0.427617
Isend/Irecv	32	0.000255	1.005344
Isend/Irecv	64	0.000223	2.292965
Isend/Irecv	128	0.000234	4.370932
Isend/Irecv	256	0.000294	6.973398
Isend/Irecv	512	0.000393	10.415106
Isend/Irecv	1024	0.000597	13.728842
Isend/Irecv	2048	0.001090	15.030847
Isend/Irecv	4096	0.001777	18.437864
Isend/Irecv	8192	0.002806	23.355353
Isend/Irecv	16384	0.004897	26.766801
Isend/Irecv	32768	0.008954	29.275154
Isend/Irecv	65536	0.017182	30.514504
Isend/Irecv	131072	0.033315	31.474733
Isend/Irecv	262144	0.065434	32.049668
Isend/Irecv	524288	0.130724	32.085095
Isend/Irecv	1048576	0.260616	32.187671

Benchmarking point to point performance with nonblocking operations, head-to-head

Target pingpong is up to date.
Kind				n	time (sec)	Rate (MB/sec)
head-to-head Isend/Irecv	1	0.000292	0.054800
head-to-head Isend/Irecv	2	0.000294	0.108787
head-to-head Isend/Irecv	4	0.000296	0.216286
head-to-head Isend/Irecv	8	0.000568	0.225397
head-to-head Isend/Irecv	16	0.000541	0.472838
head-to-head Isend/Irecv	32	0.000551	0.928446
head-to-head Isend/Irecv	64	0.000514	1.992031
head-to-head Isend/Irecv	128	0.000573	3.571522
head-to-head Isend/Irecv	256	0.000611	6.704130
head-to-head Isend/Irecv	512	0.000530	15.471194
head-to-head Isend/Irecv	1024	0.000956	17.137628
head-to-head Isend/Irecv	2048	0.001948	16.823298
head-to-head Isend/Irecv	4096	0.003492	18.765319
head-to-head Isend/Irecv	8192	0.005022	26.098782
head-to-head Isend/Irecv	16384	0.008555	30.641662
head-to-head Isend/Irecv	32768	0.016558	31.663680
head-to-head Isend/Irecv	65536	0.031551	33.233819
head-to-head Isend/Irecv	131072	0.061299	34.211791
head-to-head Isend/Irecv	262144	0.120522	34.801192
head-to-head Isend/Irecv	524288	0.239733	34.991432
head-to-head Isend/Irecv	1048576	0.477629	35.126065
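
The head-to-head rates appear to count the bytes moved in both directions: with n taken as 8-byte items, 2 * n * 8 / time reproduces the Rate column. This is inferred from the numbers, not from the benchmark source, and the function names are hypothetical. A short Python sketch that separates the combined rate from the per-direction rate:

```python
BYTES_PER_ITEM = 8  # assumption: n counts 8-byte doubles

def head_to_head_rate(n, time_sec):
    """Rate in MB/sec counting the bytes sent in both directions."""
    return 2 * n * BYTES_PER_ITEM / time_sec / 1e6

def one_direction_rate(n, time_sec):
    """Rate in MB/sec counting only one direction of the exchange."""
    return n * BYTES_PER_ITEM / time_sec / 1e6

# Largest message from the head-to-head table
n, time_sec, reported = 1048576, 0.477629, 35.126065

print(f"both directions: {head_to_head_rate(n, time_sec):.2f} MB/sec (reported {reported:.2f})")
print(f"one direction:   {one_direction_rate(n, time_sec):.2f} MB/sec")
```

Read this way, each direction of the largest head-to-head exchange runs at about half the combined figure, which is lower than the roughly 32 MB/sec of the one-way Isend/Irecv test at the same n.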

Benchmarking point to point performance with unaligned data

Target pingpong is up to date.
Kind char		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000112	0.008947
Send/Recv		2	0.000111	0.017974
Send/Recv		4	0.000110	0.036210
Send/Recv		8	0.000111	0.071843
Send/Recv		16	0.000112	0.143494
Send/Recv		32	0.000113	0.282777
Send/Recv		64	0.000317	0.201682
Send/Recv		128	0.000281	0.454894
Send/Recv		256	0.000226	1.134039
Send/Recv		512	0.000184	2.779777
Send/Recv		1024	0.000221	4.637421
Send/Recv		2048	0.000282	7.268531
Send/Recv		4096	0.000395	10.381119
Send/Recv		8192	0.000627	13.071383
Send/Recv		16384	0.001111	14.753384
Send/Recv		32768	0.001689	19.396668
Send/Recv		65536	0.002962	22.127271
Send/Recv		131072	0.004974	26.353215
Send/Recv		262144	0.008971	29.220618
Send/Recv		524288	0.017196	30.489439
Send/Recv		1048576	0.033199	31.584685
Kind double		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000112	0.071378
Send/Recv		2	0.000112	0.142912
Send/Recv		4	0.000114	0.280235
Send/Recv		8	0.000341	0.187649
Send/Recv		16	0.000302	0.424163
Send/Recv		32	0.000250	1.022904
Send/Recv		64	0.000226	2.261026
Send/Recv		128	0.000223	4.598335
Send/Recv		256	0.000282	7.253088
Send/Recv		512	0.000397	10.323556
Send/Recv		1024	0.000642	12.755651
Send/Recv		2048	0.001111	14.742430
Send/Recv		4096	0.001752	18.705868
Send/Recv		8192	0.002915	22.478669
Send/Recv		16384	0.004850	27.022926
Send/Recv		32768	0.008952	29.282062
Send/Recv		65536	0.016969	30.897586
Send/Recv		131072	0.033306	31.483238
Send/Recv		262144	0.065622	31.958215
Send/Recv		524288	0.130447	32.153310
Send/Recv		1048576	0.259607	32.312778
Kind int		n	time (sec)	Rate (MB/sec)
Send/Recv		1	0.000111	0.036019
Send/Recv		2	0.000112	0.071523
Send/Recv		4	0.000112	0.143276
Send/Recv		8	0.000114	0.281623
Send/Recv		16	0.000340	0.188451
Send/Recv		32	0.000308	0.416028
Send/Recv		64	0.000239	1.072334
Send/Recv		128	0.000191	2.683564
Send/Recv		256	0.000220	4.659045
Send/Recv		512	0.000284	7.211904
Send/Recv		1024	0.000399	10.263735
Send/Recv		2048	0.000634	12.915787
Send/Recv		4096	0.001104	14.834701
Send/Recv		8192	0.001717	19.084449
Send/Recv		16384	0.002696	24.308267
Send/Recv		32768	0.004917	26.656362
Send/Recv		65536	0.009024	29.049926
Send/Recv		131072	0.017124	30.618017
Send/Recv		262144	0.033281	31.506509
Send/Recv		524288	0.065548	31.993916
Send/Recv		1048576	0.130628	32.108758

Benchmarking point to point performance with contention

Target pingpong is up to date.
Kind (np=2)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000115	0.069828
Send/Recv	2	0.000115	0.139731
Send/Recv	4	0.000116	0.274702
Send/Recv	8	0.000322	0.198944
Send/Recv	16	0.000313	0.408505
Send/Recv	32	0.000244	1.047987
Send/Recv	64	0.000209	2.452342
Send/Recv	128	0.000225	4.546060
Send/Recv	256	0.000285	7.184914
Send/Recv	512	0.000395	10.377175
Send/Recv	1024	0.000611	13.411372
Send/Recv	2048	0.001088	15.056573
Send/Recv	4096	0.001749	18.735811
Send/Recv	8192	0.002798	23.420980
Send/Recv	16384	0.004907	26.708847
Send/Recv	32768	0.008946	29.304034
Send/Recv	65536	0.017120	30.624456
Send/Recv	131072	0.033511	31.290677
Send/Recv	262144	0.065813	31.865467
Send/Recv	524288	0.130850	32.054178
Send/Recv	1048576	0.261148	32.122086
Kind (np=4)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000112	0.071461
Send/Recv	2	0.000112	0.143366
Send/Recv	4	0.000114	0.281136
Send/Recv	8	0.000337	0.189959
Send/Recv	16	0.000306	0.418062
Send/Recv	32	0.000258	0.992741
Send/Recv	64	0.000217	2.357736
Send/Recv	128	0.000221	4.629930
Send/Recv	256	0.000282	7.255014
Send/Recv	512	0.000382	10.724268
Send/Recv	1024	0.000593	13.813920
Send/Recv	2048	0.001095	14.962045
Send/Recv	4096	0.001778	18.428791
Send/Recv	8192	0.002671	24.534751
Send/Recv	16384	0.004856	26.992735
Send/Recv	32768	0.008814	29.742745
Send/Recv	65536	0.017050	30.750863
Send/Recv	131072	0.033173	31.609452
Send/Recv	262144	0.065441	32.046448
Send/Recv	524288	0.130619	32.110955
Send/Recv	1048576	0.260107	32.250657
Kind (np=8)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000111	0.072170
Send/Recv	2	0.000111	0.144312
Send/Recv	4	0.000114	0.280537
Send/Recv	8	0.000335	0.191142
Send/Recv	16	0.000294	0.435595
Send/Recv	32	0.000222	1.150887
Send/Recv	64	0.000214	2.397228
Send/Recv	128	0.000223	4.588328
Send/Recv	256	0.000283	7.236003
Send/Recv	512	0.000382	10.731995
Send/Recv	1024	0.000601	13.625513
Send/Recv	2048	0.001088	15.063322
Send/Recv	4096	0.001760	18.613819
Send/Recv	8192	0.002895	22.635697
Send/Recv	16384	0.004823	27.175460
Send/Recv	32768	0.008927	29.363981
Send/Recv	65536	0.016978	30.880638
Send/Recv	131072	0.032977	31.796915
Send/Recv	262144	0.065325	32.103182
Send/Recv	524288	0.130097	32.239892
Send/Recv	1048576	0.259485	32.327924
Kind (np=16)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000111	0.071839
Send/Recv	2	0.000112	0.143416
Send/Recv	4	0.000115	0.277887
Send/Recv	8	0.000346	0.184755
Send/Recv	16	0.000296	0.433012
Send/Recv	32	0.000248	1.032221
Send/Recv	64	0.000205	2.498668
Send/Recv	128	0.000221	4.625299
Send/Recv	256	0.000284	7.219210
Send/Recv	512	0.000389	10.519421
Send/Recv	1024	0.000604	13.569370
Send/Recv	2048	0.001086	15.092809
Send/Recv	4096	0.001732	18.914255
Send/Recv	8192	0.002885	22.716511
Send/Recv	16384	0.004845	27.050812
Send/Recv	32768	0.009027	29.039509
Send/Recv	65536	0.017389	30.149883
Send/Recv	131072	0.033829	30.996673
Send/Recv	262144	0.066899	31.348032
Send/Recv	524288	0.133754	31.358298
Send/Recv	1048576	0.267101	31.406183
Kind (np=32)	n	time (sec)	Rate (MB/sec)
Send/Recv	1	0.000113	0.070770
Send/Recv	2	0.000114	0.140738
Send/Recv	4	0.000116	0.276265
Send/Recv	8	0.000346	0.184863
Send/Recv	16	0.000285	0.449718
Send/Recv	32	0.000256	1.000072
Send/Recv	64	0.000224	2.285102
Send/Recv	128	0.000222	4.609387
Send/Recv	256	0.000285	7.193642
Send/Recv	512	0.000392	10.449645
Send/Recv	1024	0.000624	13.137414
Send/Recv	2048	0.001495	10.960206
Send/Recv	4096	0.003027	10.826804
Send/Recv	8192	0.005712	11.473942
Send/Recv	16384	0.011345	11.553665
Send/Recv	32768	0.022138	11.841620
Send/Recv	65536	0.044420	11.802869
Send/Recv	131072	0.087207	12.023980
Send/Recv	262144	0.178001	11.781714
Send/Recv	524288	0.355826	11.787521
Send/Recv	1048576	0.709474	11.823697

barrier

Benchmarking collective barrier

Target barrier is up to date.
Kind	np	time (sec)
Barrier	1	0.000002
Barrier	2	0.000266
Barrier	4	0.000699
Barrier	8	0.001198
Barrier	16	0.001706
Barrier	32	0.002223
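
For np >= 2 the barrier time grows by a similar amount each time np doubles, which is the pattern a barrier with about log2(np) communication stages would produce. Whether this MPI implementation actually uses such an algorithm is an assumption; the output does not say. A small Python sketch that computes the cost per doubling from the table above:

```python
# Barrier times in seconds, keyed by process count, copied from the table above
barrier = {1: 0.000002, 2: 0.000266, 4: 0.000699, 8: 0.001198, 16: 0.001706, 32: 0.002223}

# Skip np = 1, which involves no communication.
counts = sorted(p for p in barrier if p > 1)

# Extra time for each doubling of np: 2->4, 4->8, 8->16, 16->32
steps = [barrier[b] - barrier[a] for a, b in zip(counts, counts[1:])]
avg_step = sum(steps) / len(steps)

for (a, b), step in zip(zip(counts, counts[1:]), steps):
    print(f"np {a:2d} -> {b:2d}: +{step * 1e6:.0f} us")
print(f"average cost per doubling: {avg_step * 1e6:.0f} us")
```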

Benchmarking collective Allreduce

Target barrier is up to date.
Kind		np	time (sec)
Allreduce	1	0.000017
Allreduce	2	0.000287
Allreduce	4	0.000537
Allreduce	8	0.000814
Allreduce	16	0.001102
Allreduce	32	0.001469

vector

Comparing the performance of MPI vector datatypes

Target vector is up to date.
Kind	n	stride	time (sec)	Rate (MB/sec)
Vector	1000	24	0.002021	3.958464
Struct	1000	24	0.011546	0.692904
User	1000	24	0.001412	5.666553
User(add)	1000	24	0.001413	5.660011

circulate

Pipelining pitfalls

Target circulate is up to date.
For n = 20000, m = 20000, T_comm = 0.012439, T_compute = 0.035924, sum = 0.048363, T_both = 0.042928
For n = 500, m = 500, T_comm = 0.000311, T_compute = 0.000885, sum = 0.001196, T_both = 0.001459
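
In both lines, sum equals T_comm + T_compute. Comparing it with T_both shows whether combining communication and computation paid off. Reading T_both as the time with the two phases overlapped is an interpretation of the labels, not something the output states. A minimal Python sketch of the comparison:

```python
# Values copied from the two circulate lines above (times in seconds)
runs = [
    {"n": 20000, "t_comm": 0.012439, "t_compute": 0.035924, "t_both": 0.042928},
    {"n": 500, "t_comm": 0.000311, "t_compute": 0.000885, "t_both": 0.001459},
]

for run in runs:
    total = run["t_comm"] + run["t_compute"]
    # Positive: T_both is faster than doing the phases back to back.
    # Negative: T_both is slower than the plain sum.
    run["saved_fraction"] = (total - run["t_both"]) / total
    print(f"n = {run['n']:5d}: sum = {total:.6f}, T_both = {run['t_both']:.6f}, "
          f"saved = {run['saved_fraction']:+.1%}")
```

For n = 20000, T_both is below the sum; for n = 500 it is above the sum, so the combined version is slower than running the two phases separately.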

3way

Exploring the cost of synchronization delays

Target bad is up to date.
[2] Litsize = 8, Time for first send = 0.000301, for second = 0.000119
[2] Litsize = 9, Time for first send = 0.000298, for second = 0.000118
[2] Litsize = 511, Time for first send = 0.000487, for second = 0.000343
[2] Litsize = 512, Time for first send = 0.000482, for second = 0.000423
[2] Litsize = 513, Time for first send = 0.001217, for second = 0.000347

jacobi

Jacobi Iteration - Example Parallel Mesh

Target jacobi is up to date.
send/recv: 6 iterations in 0.002572 secs (0.163302 MFlops); diffnorm 0.008134, m=7 n=4 np=1
send/recv: 7 iterations in 0.021552 secs (18.625460 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
send/recv: 24 iterations in 0.031917 secs (0.210547 MFlops); diffnorm 0.009895, m=7 n=10 np=4
send/recv: 25 iterations in 0.354430 secs (16.179230 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
send/recv: 25 iterations in 0.055101 secs (0.508158 MFlops); diffnorm 0.036615, m=7 n=34 np=16
send/recv: 25 iterations in 0.414272 secs (55.368390 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
send/recv: 25 iterations in 0.068831 secs (0.813589 MFlops); diffnorm 0.055291, m=7 n=66 np=32
send/recv: 25 iterations in 0.415681 secs (110.361452 MFlops); diffnorm 0.470684, m=4098 n=66 np=32
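
The MFlops figures are consistent with counting 7 floating-point operations per interior mesh point per iteration, with (m - 2) * (n - 2) interior points. Both the constant and the formula are inferred from the reported numbers, not taken from the jacobi source, and the function name is hypothetical. A minimal Python sketch of that reconstruction:

```python
FLOPS_PER_POINT = 7  # inferred from the reported numbers, not from the source

def jacobi_mflops(iterations, m, n, time_sec):
    """Estimated MFlops for a run on an m-by-n mesh with a one-point boundary."""
    interior_points = (m - 2) * (n - 2)
    return iterations * interior_points * FLOPS_PER_POINT / time_sec / 1e6

# (iterations, m, n, time in sec, reported MFlops) copied from the send/recv lines
cases = [
    (25, 4098, 10, 0.354430, 16.179230),
    (25, 4098, 66, 0.415681, 110.361452),
]

for iterations, m, n, time_sec, reported in cases:
    estimate = jacobi_mflops(iterations, m, n, time_sec)
    print(f"m={m} n={n}: estimated {estimate:.3f} MFlops, reported {reported:.3f}")
```

In these runs n equals 2 * np + 2, so the total work grows with the number of processes while the work per process stays fixed.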

Jacobi Iteration - Shift up and down

Target jacobi is up to date.
shift/sendrecv: 6 iterations in 0.002747 secs (0.152904 MFlops); diffnorm 0.008134, m=7 n=4 np=1
shift/sendrecv: 7 iterations in 0.021888 secs (18.339139 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
shift/sendrecv: 24 iterations in 0.033650 secs (0.199700 MFlops); diffnorm 0.009895, m=7 n=10 np=4
shift/sendrecv: 25 iterations in 0.329892 secs (17.382638 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
shift/sendrecv: 25 iterations in 0.053200 secs (0.526313 MFlops); diffnorm 0.036615, m=7 n=34 np=16
shift/sendrecv: 25 iterations in 0.372010 secs (61.658632 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
shift/sendrecv: 25 iterations in 0.068540 secs (0.817045 MFlops); diffnorm 0.055291, m=7 n=66 np=32
shift/sendrecv: 25 iterations in 0.388944 secs (117.948168 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Exchange head-to-head

Target jacobi is up to date.
head-to-head sendrecv: 6 iterations in 0.002780 secs (0.151052 MFlops); diffnorm 0.008134, m=7 n=4 np=1
head-to-head sendrecv: 7 iterations in 0.022213 secs (18.070615 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
head-to-head sendrecv: 24 iterations in 0.036451 secs (0.184357 MFlops); diffnorm 0.009895, m=7 n=10 np=4
head-to-head sendrecv: 25 iterations in 0.314373 secs (18.240771 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
head-to-head sendrecv: 25 iterations in 0.058582 secs (0.477964 MFlops); diffnorm 0.036615, m=7 n=34 np=16
head-to-head sendrecv: 25 iterations in 0.370915 secs (61.840575 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
head-to-head sendrecv: 25 iterations in 0.068585 secs (0.816505 MFlops); diffnorm 0.055291, m=7 n=66 np=32
head-to-head sendrecv: 25 iterations in 0.367531 secs (124.819888 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Nonblocking send/recv

Target jacobi is up to date.
irecv/isend: 6 iterations in 0.002634 secs (0.159471 MFlops); diffnorm 0.008134, m=7 n=4 np=1
irecv/isend: 7 iterations in 0.021806 secs (18.407870 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
irecv/isend: 24 iterations in 0.031714 secs (0.211891 MFlops); diffnorm 0.009895, m=7 n=10 np=4
irecv/isend: 25 iterations in 0.307855 secs (18.626934 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
irecv/isend: 25 iterations in 0.051460 secs (0.544115 MFlops); diffnorm 0.036615, m=7 n=34 np=16
irecv/isend: 25 iterations in 0.350507 secs (65.441241 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
irecv/isend: 25 iterations in 0.060896 secs (0.919604 MFlops); diffnorm 0.055291, m=7 n=66 np=32
irecv/isend: 25 iterations in 0.368205 secs (124.591481 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Nonblocking send/recv for receiver pull

Target jacobi is up to date.
isend/irecv: 6 iterations in 0.002911 secs (0.144278 MFlops); diffnorm 0.008134, m=7 n=4 np=1
isend/irecv: 7 iterations in 0.021792 secs (18.420055 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
isend/irecv: 24 iterations in 0.033644 secs (0.199736 MFlops); diffnorm 0.009895, m=7 n=10 np=4
isend/irecv: 25 iterations in 0.332836 secs (17.228921 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
isend/irecv: 25 iterations in 0.071952 secs (0.389151 MFlops); diffnorm 0.036615, m=7 n=34 np=16
isend/irecv: 25 iterations in 0.371885 secs (61.679361 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
isend/irecv: 25 iterations in 0.062465 secs (0.896496 MFlops); diffnorm 0.055291, m=7 n=66 np=32
isend/irecv: 25 iterations in 0.406233 secs (112.928350 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Synchronous send

Target jacobi is up to date.
ssend/irecv: 6 iterations in 0.006455 secs (0.065066 MFlops); diffnorm 0.008134, m=7 n=4 np=1
ssend/irecv: 7 iterations in 0.021829 secs (18.388538 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
ssend/irecv: 24 iterations in 0.063690 secs (0.105512 MFlops); diffnorm 0.009895, m=7 n=10 np=4
ssend/irecv: 25 iterations in 0.296673 secs (19.329001 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
ssend/irecv: 25 iterations in 0.080165 secs (0.349281 MFlops); diffnorm 0.036615, m=7 n=34 np=16
ssend/irecv: 25 iterations in 0.343747 secs (66.728064 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
ssend/irecv: 25 iterations in 0.088993 secs (0.629261 MFlops); diffnorm 0.055291, m=7 n=66 np=32
ssend/irecv: 25 iterations in 0.354813 secs (129.293886 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Ready send

Target jacobi is up to date.
rsend: 6 iterations in 0.000314 secs (1.336834 MFlops); diffnorm 0.008134, m=7 n=4 np=1
rsend: 7 iterations in 0.021740 secs (18.464093 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
rsend: 24 iterations in 0.030727 secs (0.218701 MFlops); diffnorm 0.009895, m=7 n=10 np=4
rsend: 25 iterations in 0.335406 secs (17.096892 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
rsend: 25 iterations in 0.052040 secs (0.538049 MFlops); diffnorm 0.036615, m=7 n=34 np=16
rsend: 25 iterations in 0.374883 secs (61.186044 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
rsend: 25 iterations in 0.062123 secs (0.901438 MFlops); diffnorm 0.055291, m=7 n=66 np=32
rsend: 25 iterations in 0.382195 secs (120.030741 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Overlapping communication

Target jacobi is up to date.
isend/overlap: 6 iterations in 0.002704 secs (0.155321 MFlops); diffnorm 0.008134, m=7 n=4 np=1
isend/overlap: 7 iterations in 0.021670 secs (18.523908 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
isend/overlap: 24 iterations in 0.033490 secs (0.200655 MFlops); diffnorm 0.009895, m=7 n=10 np=4
isend/overlap: 25 iterations in 0.301209 secs (19.037947 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
isend/overlap: 25 iterations in 0.062912 secs (0.445069 MFlops); diffnorm 0.036615, m=7 n=34 np=16
isend/overlap: 25 iterations in 0.356756 secs (64.294906 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
isend/overlap: 25 iterations in 0.062354 secs (0.898095 MFlops); diffnorm 0.055291, m=7 n=66 np=32
isend/overlap: 25 iterations in 0.369554 secs (124.136813 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Overlapping communication (sends first)

Target jacobi is up to date.
send first/overlap: 6 iterations in 0.002756 secs (0.152399 MFlops); diffnorm 0.008134, m=7 n=4 np=1
send first/overlap: 7 iterations in 0.021358 secs (18.794137 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
send first/overlap: 24 iterations in 0.032777 secs (0.205023 MFlops); diffnorm 0.009895, m=7 n=10 np=4
send first/overlap: 25 iterations in 0.331630 secs (17.291538 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
send first/overlap: 25 iterations in 0.052683 secs (0.531483 MFlops); diffnorm 0.036615, m=7 n=34 np=16
send first/overlap: 25 iterations in 0.368361 secs (62.269384 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
send first/overlap: 25 iterations in 0.065713 secs (0.852188 MFlops); diffnorm 0.055291, m=7 n=66 np=32
send first/overlap: 25 iterations in 0.380592 secs (120.536348 MFlops); diffnorm 0.470684, m=4098 n=66 np=32

Jacobi Iteration - Persistent send/recv

Target jacobi is up to date.
persistent send/recv: 6 iterations in 0.000302 secs (1.392919 MFlops); diffnorm 0.008134, m=7 n=4 np=1
persistent send/recv: 7 iterations in 0.021310 secs (18.836846 MFlops); diffnorm 0.006899, m=4098 n=4 np=1
persistent send/recv: 24 iterations in 0.030315 secs (0.221672 MFlops); diffnorm 0.009895, m=7 n=10 np=4
persistent send/recv: 25 iterations in 0.301785 secs (19.001604 MFlops); diffnorm 0.138820, m=4098 n=10 np=4
persistent send/recv: 25 iterations in 0.051264 secs (0.546197 MFlops); diffnorm 0.036615, m=7 n=34 np=16
persistent send/recv: 25 iterations in 0.363528 secs (63.097277 MFlops); diffnorm 0.468864, m=4098 n=34 np=16
persistent send/recv: 25 iterations in 0.061367 secs (0.912542 MFlops); diffnorm 0.055291, m=7 n=66 np=32
persistent send/recv: 25 iterations in 0.358814 secs (127.852233 MFlops); diffnorm 0.470684, m=4098 n=66 np=32