These are the differences between the selected revision and the current version of the page.
roberto.alfieri:pub:vectorization [13/06/2017 19:54] roberto.alfieri |
roberto.alfieri:pub:vectorization [14/06/2017 11:30] (current version) roberto.alfieri |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Vectorization ====== | ====== Vectorization ====== | ||
+ | |||
+ | [[ http://www.training.prace-ri.eu/uploads/tx_pracetmo/intel_mic_optimization.pdf | Vectorization and Code Optimization ]] | ||
+ | |||
Processor peak performance includes the speed-up provided by the vector instructions, | Processor peak performance includes the speed-up provided by the vector instructions, | ||
Line 16: | Line 19: | ||
- | Auto-vectorization is the easiest and most portable way to get vectorization, but not all loops can be vectorized: | + | Auto-vectorization is the easiest and most portable way to get vectorization. |
+ | |||
+ | The compiler recognizes several vectorization options. | ||
+ | |||
+ | Main vectorization options: | ||
+ | |||
+ | ^ Target ^ Intel compiler option ^ | ||
+ | ^ KNL | -xMIC-AVX512 | | ||
+ | ^ BDW | -xCORE-AVX2 | | ||
+ | ^ Disable | -no-vec | | ||
+ | |||
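As a rough illustration of how the options in the table might be used, the following sketch shows a simple loop that an auto-vectorizer can usually handle. The file name `saxpy.c`, the function `saxpy`, the `icc` driver name, and the `-O2` level are assumptions made for this example; only the three vectorization options come from the table.

```c
/* saxpy.c (hypothetical file name)
 *
 * Possible compile lines, assuming the Intel compiler driver is icc
 * and that -O2 is the optimization level in use:
 *   icc -O2 -xCORE-AVX2  -c saxpy.c    (BDW)
 *   icc -O2 -xMIC-AVX512 -c saxpy.c    (KNL)
 *   icc -O2 -no-vec      -c saxpy.c    (vectorization disabled)
 */
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The restrict qualifiers promise that x and y do not overlap,
 * which removes one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Building the same file once with a vectorization option and once with `-no-vec` is one simple way to estimate how much of the run time depends on the vector units.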
+ | Not all loops can be vectorized: | ||
Some examples: | Some examples: | ||
Linea 35: | Linea 49: | ||
for (int i = 0; i < N; i++) a[i] = foo(b[i]); | for (int i = 0; i < N; i++) a[i] = foo(b[i]); | ||
- | | + | |
- | * Loops on data that are not aligned in the memory | + | * Loops on data that are not aligned in memory |
| |
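The two cases above (a loop that calls `foo`, and data that are not aligned in memory) can sometimes be worked around as in the sketch below. It assumes that `foo` is small enough to be defined in the same file so the compiler may inline it, and that a 64-byte alignment is wanted. The body of `foo` and the helper names `alloc_aligned` and `apply_foo` are hypothetical and chosen only for illustration. `aligned_alloc` is the standard C11 allocation function.

```c
#include <assert.h>
#include <stdlib.h>

#define ALIGNMENT 64  /* bytes; assumed to match one 512-bit vector */

/* Hypothetical body for foo: defined in the same file as the loop
 * so that the compiler is allowed to inline it. */
static inline float foo(float v)
{
    return 2.0f * v + 1.0f;
}

/* Allocates room for n floats on an ALIGNMENT-byte boundary.
 * Returns NULL on failure; release the memory with free(). */
float *alloc_aligned(size_t n)
{
    /* aligned_alloc expects a size that is a multiple of the
     * alignment, so round the byte count up. */
    size_t bytes = (n * sizeof(float) + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    return aligned_alloc(ALIGNMENT, bytes);
}

/* Same loop as in the text, with non-overlapping arrays. */
void apply_foo(size_t n, float *restrict a, const float *restrict b)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = foo(b[i]);
}
```

Whether the loop is actually vectorized still depends on the compiler and on the real body of `foo`; the sketch only removes two of the obstacles listed above.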