ffc team mailing list archive
Message #00378
Benchmark results for new BLAS mode
Here are the benchmark results/evaluation of the new BLAS mode as
promised. Full log files attached.
The test compares FFC default mode with FFC BLAS mode for Poisson
in 2D and 3D for q = 1, 2, ..., 8 and different levels of optimization
during the compilation with g++: -O0, -O1, -O2.
Here are some conclusions:
- Optimization (-O1, -O2) has little effect on run-time performance
in FFC default mode. These are numbers for Poisson degree 5 in 3D:
     | Run-time (s) | Compile-time, g++ (s)
-------------------------------------------
-O0  | 3.417e-05    | 4.596e+00
-O1  | 3.116e-05    | 5.725e+01
-O2  | 3.788e-05    | 6.929e+01
With -O1, the code runs about 10% faster, and with -O2 the code
is actually slower, possibly due to the increased size of the object
code. On the other hand, the level of optimization has quite a big
influence on the compile-time (g++). Compile-time increases by
more than a factor of 10 when going from -O0 to -O1, and even more
when going to -O2.
Compiling with optimization also increases the memory usage for g++
and I can't go beyond q = 5 for Poisson on my 2GB machine with -O2
and g++ 4.0.2.
Conclusion: -O1 may improve the run-time performance, but only
slightly and with a big penalty in compile-time.
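The figures quoted for default mode can be rechecked from the table. The following is a small Python sketch that recomputes the run-time change and the g++ compile-time factor relative to -O0. The variable and function names are only illustrative and are not part of FFC.

```python
# Timings (seconds) for Poisson degree 5 in 3D, FFC default mode,
# copied from the table above: flag -> (run-time, g++ compile-time).
timings = {
    "-O0": (3.417e-05, 4.596e+00),
    "-O1": (3.116e-05, 5.725e+01),
    "-O2": (3.788e-05, 6.929e+01),
}

def relative_to_baseline(timings, baseline="-O0"):
    """Return flag -> (run-time change in percent, compile-time factor)."""
    base_run, base_compile = timings[baseline]
    result = {}
    for flag, (run, compile_time) in timings.items():
        run_change = 100.0 * (run - base_run) / base_run  # negative = faster
        compile_factor = compile_time / base_compile
        result[flag] = (run_change, compile_factor)
    return result

summary = relative_to_baseline(timings)
for flag, (run_change, compile_factor) in summary.items():
    print(f"{flag}: run-time {run_change:+.1f}%, compile-time x{compile_factor:.1f}")
```

With these numbers, -O1 comes out roughly 9% faster at about 12 times the compile-time, and -O2 roughly 11% slower at about 15 times the compile-time.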
- Optimization (-O1, -O2) has no effect on run-time performance in
BLAS mode, at least not for Poisson, where the computation of the
geometry tensor is only a small contribution to the overall work.
These are numbers for Poisson degree 5 in 3D:
     | Run-time (s) | Compile-time, g++ (s)
-------------------------------------------
-O0  | 6.787e-05    | 1.070e+00
-O1  | 6.817e-05    | 1.305e+00
-O2  | 6.805e-05    | 1.448e+00
Also note that compile-time (g++) is much the same across the
different optimization levels in BLAS mode.
Conclusion: the level of optimization has little effect in BLAS
mode.
- Comparing now FFC default mode with FFC BLAS mode, we have the
following timings for Poisson degree 8 in 3D:
         | BLAS mode | Default mode
-----------------------------------
FFC      | 6.039e+01 | 6.807e+01
g++      | 1.383e+00 | 9.053e+01
Run-time | 5.907e-04 | 1.333e-03
Compile-time with FFC is about 1 minute in both modes.
Compile-time with g++ is about a factor of 65 better in BLAS mode.
Run-time is more than a factor of 2 better in BLAS mode.
When examining the influence of the order q, one finds that BLAS
mode is always as fast as or faster than default mode in terms of
g++ compile-time. In 3D, the difference is significant for q > 3.
Concerning run-time performance, FFC default mode is faster than
BLAS mode for small q, and BLAS is faster for large q. For Poisson
in 3D, the break-even point is at q = 6.
Conclusion: the FFC BLAS mode will in some cases generate faster
code, but it's first and foremost an option to consider for
decreasing the compile-time (with g++).
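The comparison can be recomputed from the attached numbers. The sketch below, in Python, takes the 3D run-times from the "evaluating form" table of the first attached log (2005-10-10-00-00, whose degree 8 row matches the table above) and the degree 8 g++ compile-times from the table above. The variable names are only illustrative.

```python
# Run-time per form evaluation (seconds) for Poisson in 3D, q = 1..8,
# copied from the "evaluating form" table of log 2005-10-10-00-00
# (columns "blas" and "default").
blas_runtime = [9.200e-07, 2.710e-06, 9.130e-06, 2.682e-05,
                6.787e-05, 1.532e-04, 3.116e-04, 5.907e-04]
default_runtime = [1.300e-07, 5.700e-07, 2.520e-06, 1.057e-05,
                   3.417e-05, 1.504e-04, 6.142e-04, 1.333e-03]

# Ratio default/BLAS per degree; a value above 1 means BLAS mode is faster.
speedup = {q: d / b for q, (b, d) in
           enumerate(zip(blas_runtime, default_runtime), start=1)}

# Smallest degree at which BLAS mode is strictly faster.
first_blas_win = min(q for q, s in speedup.items() if s > 1.0)

# g++ compile-time (seconds) for degree 8 in 3D: default mode / BLAS mode.
gxx_factor = 9.053e+01 / 1.383e+00

for q, s in speedup.items():
    print(f"q = {q}: default/BLAS run-time ratio {s:.2f}")
print("BLAS mode first strictly faster at q =", first_blas_win)
print(f"g++ compile-time factor at q = 8: {gxx_factor:.1f}")
```

On this data the two modes are within about 2% of each other at q = 6, BLAS mode is clearly ahead from q = 7, and the g++ compile-time ratio at q = 8 is roughly 65.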
- Overall conclusions:
It's unnecessary to compile with -O1 or -O2. We have -O2 by
default in DOLFIN, but should probably consider changing to
-O0, at least for the compilation of forms. There is probably
a huge improvement in compile-time for Navier-Stokes if we switch
to -O0.
BLAS mode can be an option to reduce compile-time and possibly to
improve the run-time for high-order forms.
- Finally, note that these benchmarks are for Poisson, where the main
part of the work is the computation of the element tensor (doing the
tensor product). For other forms, computing the geometry tensor can
dominate, and then BLAS mode won't help, but compiling with -O0 may.
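To make the split between the two tensors concrete, here is a rough Python sketch. It assumes the usual tensor representation, in which the element tensor is the contraction of a precomputed reference tensor A0 with a geometry tensor G that depends only on the cell, written here as a matrix-vector product. The P1 Poisson data on a triangle and all function names are illustrative assumptions and are not code generated by FFC.

```python
# Gradients of the three linear (P1) basis functions on the reference
# triangle with vertices (0,0), (1,0), (0,1); its area is 1/2.
GRADS = [(-1.0, -1.0), (1.0, 0.0), (0.0, 1.0)]

def reference_tensor():
    """A0 as a 9 x 4 matrix: row (i, j), column (a, b), precomputed once."""
    return [[0.5 * GRADS[i][a] * GRADS[j][b]
             for a in range(2) for b in range(2)]
            for i in range(3) for j in range(3)]

def geometry_tensor(v0, v1, v2):
    """G as a vector of length 4: G[a, b] = |det J| * (J^-1 J^-T)[a, b]."""
    # Jacobian of the affine map from the reference cell to the cell.
    j00, j01 = v1[0] - v0[0], v2[0] - v0[0]
    j10, j11 = v1[1] - v0[1], v2[1] - v0[1]
    det = j00 * j11 - j01 * j10
    inv = [[j11 / det, -j01 / det], [-j10 / det, j00 / det]]
    return [abs(det) * sum(inv[a][c] * inv[b][c] for c in range(2))
            for a in range(2) for b in range(2)]

def element_tensor(A0, G):
    """Matrix-vector product A0 * G, reshaped to the 3 x 3 element matrix."""
    flat = [sum(a0 * g for a0, g in zip(row, G)) for row in A0]
    return [flat[3 * i:3 * i + 3] for i in range(3)]

A0 = reference_tensor()
G = geometry_tensor((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
A = element_tensor(A0, G)
for row in A:
    print(row)
```

In this picture, element_tensor is the part a level 2 BLAS matrix-vector routine can take over, while geometry_tensor stays as ordinary code. For Poisson the geometry tensor has only a handful of entries, which fits the observation that the contraction dominates the work.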
/Anders
Benchmark results: 2005-10-10-00-00 (timings match the -O0 figures quoted in the message)
Columns: blas blas_a blas_g default default_a default_g
blas - FFC blas mode, evaluating element and geometry tensors
blas_a - FFC blas mode, evaluating only element tensor
blas_g - FFC blas mode, evaluating only geometry tensor
default - FFC default mode, evaluating element and geometry tensors
default_a - FFC default mode, evaluating only element tensor
default_g - FFC default mode, evaluating only geometry tensor
Timings in seconds
Benchmark results (compiling with FFC)
--------------------------------------
Poisson_2D_1: 1.055e-02 1.037e-02 1.075e-02 1.145e-02 1.127e-02 1.126e-02
Poisson_2D_2: 2.958e-02 2.937e-02 2.941e-02 3.339e-02 3.328e-02 3.334e-02
Poisson_2D_3: 7.361e-02 7.331e-02 7.329e-02 8.940e-02 8.565e-02 8.526e-02
Poisson_2D_4: 1.600e-01 1.661e-01 1.595e-01 1.888e-01 1.890e-01 1.880e-01
Poisson_2D_5: 3.068e-01 3.048e-01 3.063e-01 3.672e-01 3.649e-01 3.634e-01
Poisson_2D_6: 5.558e-01 5.474e-01 5.512e-01 6.572e-01 6.541e-01 6.592e-01
Poisson_2D_7: 9.120e-01 9.106e-01 9.122e-01 1.094e+00 1.084e+00 1.086e+00
Poisson_2D_8: 1.468e+00 1.478e+00 1.491e+00 1.776e+00 1.751e+00 1.753e+00
Poisson_3D_1: 2.977e-02 2.977e-02 2.987e-02 3.289e-02 3.283e-02 3.296e-02
Poisson_3D_2: 1.546e-01 1.549e-01 1.550e-01 1.782e-01 1.777e-01 1.779e-01
Poisson_3D_3: 6.197e-01 6.130e-01 6.138e-01 7.115e-01 7.086e-01 7.153e-01
Poisson_3D_4: 1.956e+00 1.975e+00 1.958e+00 2.260e+00 2.287e+00 2.263e+00
Poisson_3D_5: 5.251e+00 5.317e+00 5.295e+00 6.082e+00 6.081e+00 6.074e+00
Poisson_3D_6: 1.297e+01 1.331e+01 1.285e+01 1.474e+01 1.473e+01 1.484e+01
Poisson_3D_7: 2.843e+01 2.940e+01 2.850e+01 3.247e+01 3.280e+01 3.324e+01
Poisson_3D_8: 6.039e+01 6.124e+01 6.142e+01 6.807e+01 6.816e+01 6.889e+01
Benchmark results (compiling with gcc)
--------------------------------------
Poisson_2D_1: 9.197e-01 9.245e-01 9.192e-01 9.144e-01 9.193e-01 9.106e-01
Poisson_2D_2: 9.314e-01 9.217e-01 9.250e-01 9.369e-01 9.265e-01 9.158e-01
Poisson_2D_3: 9.448e-01 9.357e-01 9.371e-01 9.811e-01 9.711e-01 9.298e-01
Poisson_2D_4: 9.576e-01 9.474e-01 9.511e-01 1.069e+00 1.060e+00 9.437e-01
Poisson_2D_5: 9.641e-01 9.614e-01 9.706e-01 1.209e+00 1.207e+00 9.627e-01
Poisson_2D_6: 9.817e-01 9.856e-01 9.815e-01 1.462e+00 1.456e+00 9.738e-01
Poisson_2D_7: 1.001e+00 9.969e-01 1.008e+00 1.829e+00 1.835e+00 9.965e-01
Poisson_2D_8: 1.019e+00 1.025e+00 1.020e+00 2.379e+00 2.377e+00 1.015e+00
Poisson_3D_1: 9.264e-01 9.189e-01 9.256e-01 9.237e-01 9.234e-01 9.165e-01
Poisson_3D_2: 9.456e-01 9.293e-01 9.373e-01 9.942e-01 9.810e-01 9.296e-01
Poisson_3D_3: 9.732e-01 9.609e-01 9.662e-01 1.269e+00 1.257e+00 9.566e-01
Poisson_3D_4: 1.014e+00 1.005e+00 1.019e+00 2.120e+00 2.115e+00 1.004e+00
Poisson_3D_5: 1.070e+00 1.061e+00 1.063e+00 4.596e+00 4.583e+00 1.054e+00
Poisson_3D_6: 1.152e+00 1.142e+00 1.142e+00 1.085e+01 1.076e+01 1.142e+00
Poisson_3D_7: 1.247e+00 1.248e+00 1.249e+00 2.689e+01 2.754e+01 1.238e+00
Poisson_3D_8: 1.383e+00 1.373e+00 1.418e+00 9.053e+01 9.392e+01 1.380e+00
Benchmark results (evaluating form)
-----------------------------------
Poisson_2D_1: 4.600e-07 7.300e-07 4.000e-08 5.000e-08 5.000e-08 3.000e-08
Poisson_2D_2: 8.100e-07 8.900e-07 3.000e-08 1.300e-07 1.300e-07 3.000e-08
Poisson_2D_3: 2.120e-06 1.740e-06 4.000e-08 3.600e-07 3.600e-07 3.000e-08
Poisson_2D_4: 4.950e-06 3.640e-06 4.000e-08 9.100e-07 9.200e-07 3.000e-08
Poisson_2D_5: 9.850e-06 8.260e-06 5.000e-08 1.820e-06 1.810e-06 2.000e-08
Poisson_2D_6: 1.742e-05 1.408e-05 4.000e-08 4.470e-06 4.420e-06 3.000e-08
Poisson_2D_7: 2.318e-05 2.863e-05 5.000e-08 7.810e-06 7.790e-06 3.000e-08
Poisson_2D_8: 3.643e-05 4.416e-05 5.000e-08 1.266e-05 1.267e-05 3.000e-08
Poisson_3D_1: 9.200e-07 9.200e-07 9.000e-08 1.300e-07 8.000e-08 8.000e-08
Poisson_3D_2: 2.710e-06 2.520e-06 9.000e-08 5.700e-07 5.400e-07 7.000e-08
Poisson_3D_3: 9.130e-06 9.180e-06 1.000e-07 2.520e-06 2.500e-06 7.000e-08
Poisson_3D_4: 2.682e-05 2.687e-05 1.200e-07 1.057e-05 1.053e-05 8.000e-08
Poisson_3D_5: 6.787e-05 6.792e-05 1.500e-07 3.417e-05 3.244e-05 8.000e-08
Poisson_3D_6: 1.532e-04 1.533e-04 2.200e-07 1.504e-04 1.774e-04 8.000e-08
Poisson_3D_7: 3.116e-04 3.118e-04 3.700e-07 6.142e-04 6.106e-04 8.000e-08
Poisson_3D_8: 5.907e-04 5.907e-04 6.200e-07 1.333e-03 1.335e-03 7.000e-08
Lines of code (columns: lines, words, bytes, file)
--------------------------------------------------
178 450 4017 Poisson_2D_1_blas_a.h
186 478 4268 Poisson_2D_1_blas_g.h
188 498 4423 Poisson_2D_1_blas.h
187 486 4544 Poisson_2D_1_default_a.h
176 449 4047 Poisson_2D_1_default_g.h
187 494 4698 Poisson_2D_1_default.h
198 530 4897 Poisson_2D_2_blas_a.h
206 558 5148 Poisson_2D_2_blas_g.h
208 578 5303 Poisson_2D_2_blas.h
234 701 7470 Poisson_2D_2_default_a.h
196 529 4927 Poisson_2D_2_default_g.h
234 709 7624 Poisson_2D_2_default.h
232 698 6851 Poisson_2D_3_blas_a.h
240 726 7102 Poisson_2D_3_blas_g.h
242 746 7257 Poisson_2D_3_blas.h
332 1309 16246 Poisson_2D_3_default_a.h
230 697 6881 Poisson_2D_3_default_g.h
332 1317 16400 Poisson_2D_3_default.h
262 846 8463 Poisson_2D_4_blas_a.h
270 874 8714 Poisson_2D_4_blas_g.h
272 894 8869 Poisson_2D_4_blas.h
487 2490 34036 Poisson_2D_4_default_a.h
260 845 8493 Poisson_2D_4_default_g.h
487 2498 34190 Poisson_2D_4_default.h
298 1018 10341 Poisson_2D_5_blas_a.h
306 1046 10592 Poisson_2D_5_blas_g.h
308 1066 10747 Poisson_2D_5_blas.h
739 4414 63294 Poisson_2D_5_default_a.h
296 1017 10371 Poisson_2D_5_default_g.h
739 4422 63448 Poisson_2D_5_default.h
340 1218 12517 Poisson_2D_6_blas_a.h
348 1246 12768 Poisson_2D_6_blas_g.h
350 1266 12923 Poisson_2D_6_blas.h
1124 7537 111102 Poisson_2D_6_default_a.h
338 1217 12547 Poisson_2D_6_default_g.h
1124 7545 111256 Poisson_2D_6_default.h
388 1446 14971 Poisson_2D_7_blas_a.h
396 1474 15222 Poisson_2D_7_blas_g.h
398 1494 15377 Poisson_2D_7_blas.h
1684 12237 183694 Poisson_2D_7_default_a.h
386 1445 15001 Poisson_2D_7_default_g.h
1684 12245 183848 Poisson_2D_7_default.h
442 1702 17707 Poisson_2D_8_blas_a.h
450 1730 17958 Poisson_2D_8_blas_g.h
452 1750 18113 Poisson_2D_8_blas.h
2467 19122 290619 Poisson_2D_8_default_a.h
440 1701 17737 Poisson_2D_8_default_g.h
2467 19130 290773 Poisson_2D_8_default.h
184 478 4453 Poisson_3D_1_blas_a.h
197 549 5176 Poisson_3D_1_blas_g.h
199 569 5331 Poisson_3D_1_blas.h
205 581 5821 Poisson_3D_1_default_a.h
187 525 4960 Poisson_3D_1_default_g.h
205 617 6337 Poisson_3D_1_default.h
222 642 6421 Poisson_3D_2_blas_a.h
235 713 7144 Poisson_3D_2_blas_g.h
237 733 7299 Poisson_3D_2_blas.h
327 1519 19786 Poisson_3D_2_default_a.h
225 689 6928 Poisson_3D_2_default_g.h
327 1555 20302 Poisson_3D_2_default.h
298 1016 11177 Poisson_3D_3_blas_a.h
311 1087 11900 Poisson_3D_3_blas_g.h
313 1107 12055 Poisson_3D_3_blas.h
703 5367 78687 Poisson_3D_3_default_a.h
301 1063 11684 Poisson_3D_3_default_g.h
703 5403 79203 Poisson_3D_3_default.h
400 1562 17881 Poisson_3D_4_blas_a.h
413 1633 18604 Poisson_3D_4_blas_g.h
415 1653 18759 Poisson_3D_4_blas.h
1630 16650 252862 Poisson_3D_4_default_a.h
403 1609 18388 Poisson_3D_4_default_g.h
1630 16686 253378 Poisson_3D_4_default.h
526 2236 26005 Poisson_3D_5_blas_a.h
539 2307 26728 Poisson_3D_5_blas_g.h
541 2327 26883 Poisson_3D_5_blas.h
3667 45107 695453 Poisson_3D_5_default_a.h
529 2283 26512 Poisson_3D_5_default_g.h
3667 45143 695969 Poisson_3D_5_default.h
694 3128 36789 Poisson_3D_6_blas_a.h
707 3199 37512 Poisson_3D_6_blas_g.h
709 3219 37667 Poisson_3D_6_blas.h
7755 105953 1643270 Poisson_3D_6_default_a.h
697 3175 37296 Poisson_3D_6_default_g.h
7755 105989 1643786 Poisson_3D_6_default.h
910 4272 50663 Poisson_3D_7_blas_a.h
923 4343 51386 Poisson_3D_7_blas_g.h
925 4363 51541 Poisson_3D_7_blas.h
15315 224559 3496931 Poisson_3D_7_default_a.h
913 4319 51170 Poisson_3D_7_default_g.h
15315 224595 3497447 Poisson_3D_7_default.h
1180 5698 67949 Poisson_3D_8_blas_a.h
1193 5769 68672 Poisson_3D_8_blas_g.h
1195 5789 68827 Poisson_3D_8_blas.h
28410 449924 7023698 Poisson_3D_8_default_a.h
1183 5745 68456 Poisson_3D_8_default_g.h
28410 449960 7024214 Poisson_3D_8_default.h
157916 1906296 29087154 total
Benchmark results: 2005-10-10-08-08 (timings match the -O1 figures quoted in the message)
Columns: blas blas_a blas_g default default_a default_g
blas - FFC blas mode, evaluating element and geometry tensors
blas_a - FFC blas mode, evaluating only element tensor
blas_g - FFC blas mode, evaluating only geometry tensor
default - FFC default mode, evaluating element and geometry tensors
default_a - FFC default mode, evaluating only element tensor
default_g - FFC default mode, evaluating only geometry tensor
Timings in seconds
Benchmark results (compiling with FFC)
--------------------------------------
Poisson_2D_1: 1.061e-02 1.038e-02 1.056e-02 1.133e-02 1.127e-02 1.129e-02
Poisson_2D_2: 2.950e-02 2.956e-02 2.941e-02 3.353e-02 3.332e-02 3.332e-02
Poisson_2D_3: 7.428e-02 7.348e-02 7.363e-02 8.940e-02 9.257e-02 8.550e-02
Poisson_2D_4: 1.590e-01 1.594e-01 1.597e-01 1.896e-01 1.886e-01 1.882e-01
Poisson_2D_5: 3.073e-01 3.071e-01 3.077e-01 3.650e-01 3.673e-01 3.707e-01
Poisson_3D_1: 2.986e-02 2.968e-02 2.984e-02 3.307e-02 3.289e-02 3.297e-02
Poisson_3D_2: 1.552e-01 1.562e-01 1.554e-01 1.776e-01 1.779e-01 1.774e-01
Poisson_3D_3: 6.121e-01 6.111e-01 6.208e-01 7.126e-01 7.123e-01 7.082e-01
Poisson_3D_4: 1.982e+00 1.964e+00 1.956e+00 2.277e+00 2.289e+00 2.265e+00
Poisson_3D_5: 5.340e+00 5.238e+00 5.275e+00 6.096e+00 6.097e+00 6.110e+00
Benchmark results (compiling with gcc)
--------------------------------------
Poisson_2D_1: 9.836e-01 9.721e-01 9.888e-01 9.712e-01 9.650e-01 9.690e-01
Poisson_2D_2: 9.943e-01 9.835e-01 1.015e+00 1.006e+00 9.877e-01 9.824e-01
Poisson_2D_3: 1.024e+00 1.010e+00 1.028e+00 1.082e+00 1.036e+00 1.007e+00
Poisson_2D_4: 1.054e+00 1.052e+00 1.050e+00 1.332e+00 1.147e+00 1.032e+00
Poisson_2D_5: 1.070e+00 1.065e+00 1.067e+00 1.881e+00 1.330e+00 1.048e+00
Poisson_3D_1: 9.990e-01 9.748e-01 1.005e+00 9.905e-01 9.753e-01 9.757e-01
Poisson_3D_2: 1.022e+00 9.997e-01 1.028e+00 1.090e+00 1.034e+00 9.991e-01
Poisson_3D_3: 1.084e+00 1.053e+00 1.077e+00 1.900e+00 1.323e+00 1.047e+00
Poisson_3D_4: 1.195e+00 1.164e+00 1.185e+00 7.175e+00 2.240e+00 1.155e+00
Poisson_3D_5: 1.305e+00 1.279e+00 1.309e+00 5.725e+01 4.511e+00 1.274e+00
Benchmark results (evaluating form)
-----------------------------------
Poisson_2D_1: 3.800e-07 7.100e-07 2.000e-08 3.000e-08 1.000e-08 1.000e-08
Poisson_2D_2: 7.900e-07 8.800e-07 2.000e-08 8.000e-08 3.000e-08 1.000e-08
Poisson_2D_3: 2.080e-06 1.720e-06 2.000e-08 3.000e-07 1.000e-07 1.000e-08
Poisson_2D_4: 4.960e-06 3.630e-06 3.000e-08 9.500e-07 2.200e-07 1.000e-08
Poisson_2D_5: 9.760e-06 8.230e-06 2.000e-08 2.150e-06 4.300e-07 1.000e-08
Poisson_3D_1: 9.800e-07 9.400e-07 4.000e-08 7.000e-08 2.000e-08 0.000e+00
Poisson_3D_2: 2.600e-06 2.550e-06 5.000e-08 4.500e-07 1.100e-07 1.000e-08
Poisson_3D_3: 9.160e-06 9.130e-06 5.000e-08 2.400e-06 3.900e-07 1.000e-08
Poisson_3D_4: 2.698e-05 2.687e-05 7.000e-08 9.590e-06 1.160e-06 0.000e+00
Poisson_3D_5: 6.817e-05 6.804e-05 1.100e-07 3.116e-05 2.930e-06 0.000e+00
Lines of code (columns: lines, words, bytes, file)
--------------------------------------------------
178 450 4017 Poisson_2D_1_blas_a.h
186 478 4268 Poisson_2D_1_blas_g.h
188 498 4423 Poisson_2D_1_blas.h
187 486 4544 Poisson_2D_1_default_a.h
176 449 4047 Poisson_2D_1_default_g.h
187 494 4698 Poisson_2D_1_default.h
198 530 4897 Poisson_2D_2_blas_a.h
206 558 5148 Poisson_2D_2_blas_g.h
208 578 5303 Poisson_2D_2_blas.h
234 701 7470 Poisson_2D_2_default_a.h
196 529 4927 Poisson_2D_2_default_g.h
234 709 7624 Poisson_2D_2_default.h
232 698 6851 Poisson_2D_3_blas_a.h
240 726 7102 Poisson_2D_3_blas_g.h
242 746 7257 Poisson_2D_3_blas.h
332 1309 16246 Poisson_2D_3_default_a.h
230 697 6881 Poisson_2D_3_default_g.h
332 1317 16400 Poisson_2D_3_default.h
262 846 8463 Poisson_2D_4_blas_a.h
270 874 8714 Poisson_2D_4_blas_g.h
272 894 8869 Poisson_2D_4_blas.h
487 2490 34036 Poisson_2D_4_default_a.h
260 845 8493 Poisson_2D_4_default_g.h
487 2498 34190 Poisson_2D_4_default.h
298 1018 10341 Poisson_2D_5_blas_a.h
306 1046 10592 Poisson_2D_5_blas_g.h
308 1066 10747 Poisson_2D_5_blas.h
739 4414 63294 Poisson_2D_5_default_a.h
296 1017 10371 Poisson_2D_5_default_g.h
739 4422 63448 Poisson_2D_5_default.h
184 478 4453 Poisson_3D_1_blas_a.h
197 549 5176 Poisson_3D_1_blas_g.h
199 569 5331 Poisson_3D_1_blas.h
205 581 5821 Poisson_3D_1_default_a.h
187 525 4960 Poisson_3D_1_default_g.h
205 617 6337 Poisson_3D_1_default.h
222 642 6421 Poisson_3D_2_blas_a.h
235 713 7144 Poisson_3D_2_blas_g.h
237 733 7299 Poisson_3D_2_blas.h
327 1519 19786 Poisson_3D_2_default_a.h
225 689 6928 Poisson_3D_2_default_g.h
327 1555 20302 Poisson_3D_2_default.h
298 1016 11177 Poisson_3D_3_blas_a.h
311 1087 11900 Poisson_3D_3_blas_g.h
313 1107 12055 Poisson_3D_3_blas.h
703 5367 78687 Poisson_3D_3_default_a.h
301 1063 11684 Poisson_3D_3_default_g.h
703 5403 79203 Poisson_3D_3_default.h
400 1562 17881 Poisson_3D_4_blas_a.h
413 1633 18604 Poisson_3D_4_blas_g.h
415 1653 18759 Poisson_3D_4_blas.h
1630 16650 252862 Poisson_3D_4_default_a.h
403 1609 18388 Poisson_3D_4_default_g.h
1630 16686 253378 Poisson_3D_4_default.h
526 2236 26005 Poisson_3D_5_blas_a.h
539 2307 26728 Poisson_3D_5_blas_g.h
541 2327 26883 Poisson_3D_5_blas.h
3667 45107 695453 Poisson_3D_5_default_a.h
529 2283 26512 Poisson_3D_5_default_g.h
3667 45143 695969 Poisson_3D_5_default.h
28449 196792 2775747 total
Benchmark results: 2005-10-10-08-19 (timings match the -O2 figures quoted in the message)
Columns: blas blas_a blas_g default default_a default_g
blas - FFC blas mode, evaluating element and geometry tensors
blas_a - FFC blas mode, evaluating only element tensor
blas_g - FFC blas mode, evaluating only geometry tensor
default - FFC default mode, evaluating element and geometry tensors
default_a - FFC default mode, evaluating only element tensor
default_g - FFC default mode, evaluating only geometry tensor
Timings in seconds
Benchmark results (compiling with FFC)
--------------------------------------
Poisson_2D_1: 1.059e-02 1.050e-02 1.068e-02 1.131e-02 1.127e-02 1.126e-02
Poisson_2D_2: 2.945e-02 2.939e-02 2.962e-02 3.349e-02 3.343e-02 3.332e-02
Poisson_2D_3: 7.369e-02 7.341e-02 7.353e-02 9.019e-02 8.545e-02 8.509e-02
Poisson_2D_4: 1.587e-01 1.596e-01 1.599e-01 1.888e-01 1.901e-01 1.881e-01
Poisson_2D_5: 3.078e-01 3.073e-01 3.078e-01 3.663e-01 3.673e-01 3.722e-01
Poisson_3D_1: 2.996e-02 2.980e-02 2.984e-02 3.312e-02 3.300e-02 3.277e-02
Poisson_3D_2: 1.631e-01 1.554e-01 1.552e-01 1.782e-01 1.779e-01 1.775e-01
Poisson_3D_3: 6.123e-01 6.114e-01 6.152e-01 7.215e-01 7.120e-01 7.064e-01
Poisson_3D_4: 1.984e+00 1.950e+00 1.965e+00 2.272e+00 2.284e+00 2.264e+00
Poisson_3D_5: 5.352e+00 5.313e+00 5.355e+00 6.101e+00 6.095e+00 6.213e+00
Benchmark results (compiling with gcc)
--------------------------------------
Poisson_2D_1: 1.003e+00 9.863e-01 1.012e+00 9.892e-01 9.910e-01 9.896e-01
Poisson_2D_2: 1.019e+00 1.006e+00 1.019e+00 1.053e+00 1.009e+00 1.054e+00
Poisson_2D_3: 1.056e+00 1.035e+00 1.047e+00 1.129e+00 1.073e+00 1.029e+00
Poisson_2D_4: 1.097e+00 1.072e+00 1.083e+00 1.442e+00 1.226e+00 1.067e+00
Poisson_2D_5: 1.116e+00 1.109e+00 1.114e+00 2.179e+00 1.552e+00 1.093e+00
Poisson_3D_1: 1.027e+00 1.013e+00 1.031e+00 1.107e+00 9.935e-01 9.874e-01
Poisson_3D_2: 1.055e+00 1.019e+00 1.057e+00 1.202e+00 1.075e+00 1.066e+00
Poisson_3D_3: 1.149e+00 1.102e+00 1.142e+00 2.212e+00 1.520e+00 1.088e+00
Poisson_3D_4: 1.268e+00 1.293e+00 1.279e+00 8.795e+00 3.134e+00 1.235e+00
Poisson_3D_5: 1.448e+00 1.398e+00 1.423e+00 6.929e+01 7.050e+00 1.399e+00
Benchmark results (evaluating form)
-----------------------------------
Poisson_2D_1: 3.800e-07 7.200e-07 1.000e-08 3.000e-08 2.000e-08 1.000e-08
Poisson_2D_2: 7.500e-07 8.800e-07 2.000e-08 1.200e-07 4.000e-08 1.000e-08
Poisson_2D_3: 2.090e-06 1.730e-06 2.000e-08 5.100e-07 1.300e-07 0.000e+00
Poisson_2D_4: 4.960e-06 3.660e-06 3.000e-08 1.230e-06 2.800e-07 0.000e+00
Poisson_2D_5: 9.750e-06 8.220e-06 2.000e-08 2.450e-06 5.300e-07 1.000e-08
Poisson_3D_1: 9.700e-07 9.400e-07 4.000e-08 9.000e-08 2.000e-08 0.000e+00
Poisson_3D_2: 2.590e-06 2.550e-06 5.000e-08 9.800e-07 1.300e-07 1.000e-08
Poisson_3D_3: 9.150e-06 9.150e-06 5.000e-08 6.260e-06 4.700e-07 1.000e-08
Poisson_3D_4: 2.696e-05 2.682e-05 7.000e-08 1.602e-05 1.440e-06 1.000e-08
Poisson_3D_5: 6.805e-05 6.803e-05 1.100e-07 3.788e-05 3.720e-06 0.000e+00
Lines of code (columns: lines, words, bytes, file)
--------------------------------------------------
178 450 4017 Poisson_2D_1_blas_a.h
186 478 4268 Poisson_2D_1_blas_g.h
188 498 4423 Poisson_2D_1_blas.h
187 486 4544 Poisson_2D_1_default_a.h
176 449 4047 Poisson_2D_1_default_g.h
187 494 4698 Poisson_2D_1_default.h
198 530 4897 Poisson_2D_2_blas_a.h
206 558 5148 Poisson_2D_2_blas_g.h
208 578 5303 Poisson_2D_2_blas.h
234 701 7470 Poisson_2D_2_default_a.h
196 529 4927 Poisson_2D_2_default_g.h
234 709 7624 Poisson_2D_2_default.h
232 698 6851 Poisson_2D_3_blas_a.h
240 726 7102 Poisson_2D_3_blas_g.h
242 746 7257 Poisson_2D_3_blas.h
332 1309 16246 Poisson_2D_3_default_a.h
230 697 6881 Poisson_2D_3_default_g.h
332 1317 16400 Poisson_2D_3_default.h
262 846 8463 Poisson_2D_4_blas_a.h
270 874 8714 Poisson_2D_4_blas_g.h
272 894 8869 Poisson_2D_4_blas.h
487 2490 34036 Poisson_2D_4_default_a.h
260 845 8493 Poisson_2D_4_default_g.h
487 2498 34190 Poisson_2D_4_default.h
298 1018 10341 Poisson_2D_5_blas_a.h
306 1046 10592 Poisson_2D_5_blas_g.h
308 1066 10747 Poisson_2D_5_blas.h
739 4414 63294 Poisson_2D_5_default_a.h
296 1017 10371 Poisson_2D_5_default_g.h
739 4422 63448 Poisson_2D_5_default.h
184 478 4453 Poisson_3D_1_blas_a.h
197 549 5176 Poisson_3D_1_blas_g.h
199 569 5331 Poisson_3D_1_blas.h
205 581 5821 Poisson_3D_1_default_a.h
187 525 4960 Poisson_3D_1_default_g.h
205 617 6337 Poisson_3D_1_default.h
222 642 6421 Poisson_3D_2_blas_a.h
235 713 7144 Poisson_3D_2_blas_g.h
237 733 7299 Poisson_3D_2_blas.h
327 1519 19786 Poisson_3D_2_default_a.h
225 689 6928 Poisson_3D_2_default_g.h
327 1555 20302 Poisson_3D_2_default.h
298 1016 11177 Poisson_3D_3_blas_a.h
311 1087 11900 Poisson_3D_3_blas_g.h
313 1107 12055 Poisson_3D_3_blas.h
703 5367 78687 Poisson_3D_3_default_a.h
301 1063 11684 Poisson_3D_3_default_g.h
703 5403 79203 Poisson_3D_3_default.h
400 1562 17881 Poisson_3D_4_blas_a.h
413 1633 18604 Poisson_3D_4_blas_g.h
415 1653 18759 Poisson_3D_4_blas.h
1630 16650 252862 Poisson_3D_4_default_a.h
403 1609 18388 Poisson_3D_4_default_g.h
1630 16686 253378 Poisson_3D_4_default.h
526 2236 26005 Poisson_3D_5_blas_a.h
539 2307 26728 Poisson_3D_5_blas_g.h
541 2327 26883 Poisson_3D_5_blas.h
3667 45107 695453 Poisson_3D_5_default_a.h
529 2283 26512 Poisson_3D_5_default_g.h
3667 45143 695969 Poisson_3D_5_default.h
28449 196792 2775747 total