Recommendation: Specify the expected loop trip count | Confidence: | Low |
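A minimal sketch of this recommendation, assuming the Intel C/C++ compiler: the `loop_count` pragma passes an expected trip count to the optimizer. The function name `scale` and the min/max/avg values are hypothetical, and compilers that do not recognize the pragma ignore it, possibly with a warning.

```c
#include <stddef.h>

/* Scale an array in place.
   The loop_count hint is Intel-compiler specific; the numbers below are
   illustrative assumptions, not measured trip counts. */
void scale(float *a, size_t n, float k)
{
#pragma loop_count min(8), max(1024), avg(256)
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}
```

The hint does not change the result of the loop. It only informs decisions such as whether vectorizing or unrolling is likely to pay off.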
Recommendation: Disable unrolling | Confidence: | Medium |
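One way to apply this, sketched under the assumption that the compiler honors `#pragma nounroll` (the Intel compiler does; others may ignore it). The function name `copy_ints` is hypothetical.

```c
#include <stddef.h>

/* Copy n ints. `#pragma nounroll` asks the compiler not to unroll the
   loop that immediately follows. */
void copy_ints(int *dst, const int *src, size_t n)
{
#pragma nounroll
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```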
Recommendation: Use a smaller vector length | Confidence: | Medium |
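A possible form of this recommendation uses the OpenMP `simdlen` clause, which states a preferred number of iterations per SIMD chunk. This is a sketch: the value 4 and the function name `add_arrays` are assumptions, and the pragma takes effect only when OpenMP SIMD support is enabled at compile time.

```c
#include <stddef.h>

/* Element-wise addition with a requested vector length of 4 iterations.
   Assumes c does not overlap a or b. */
void add_arrays(float *c, const float *a, const float *b, size_t n)
{
#pragma omp simd simdlen(4)
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```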
Recommendation: Align data | Confidence: | Medium |
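Aligning data usually takes two steps: allocate on an aligned boundary, then tell the compiler about it. The sketch below assumes C11 `aligned_alloc` and a 64-byte boundary. The helper names `alloc_aligned_floats` and `add_one` are hypothetical, and Intel-specific alternatives such as `_mm_malloc` are not shown.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate n floats on a 64-byte boundary. aligned_alloc requires the size
   to be a multiple of the alignment, so the byte count is rounded up. */
float *alloc_aligned_floats(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + 63) / 64 * 64;
    return aligned_alloc(64, rounded);
}

/* The aligned clause is a promise from the caller: a must really be
   64-byte aligned, otherwise behavior is undefined. */
void add_one(float *a, size_t n)
{
#pragma omp simd aligned(a : 64)
    for (size_t i = 0; i < n; i++)
        a[i] += 1.0f;
}
```

Memory returned by `alloc_aligned_floats` is released with `free`.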
Recommendation: Add data padding | Confidence: | Medium |
Windows* OS | Linux* OS |
---|---|
/Qopt-assume-safe-padding | -qopt-assume-safe-padding |
Recommendation: Collect trip counts data | Confidence: | Need More Data |
Recommendation: Force vectorized remainder | Confidence: | Medium |
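A sketch of how the remainder loop might be forced to vectorize, assuming the Intel compiler's `#pragma vector vecremainder`. Other compilers ignore this pragma. The function name `saxpy` is hypothetical.

```c
#include <stddef.h>

/* y[i] += a * x[i]. When n is not a multiple of the vector length, the
   leftover iterations form a remainder loop; the pragma asks the Intel
   compiler to vectorize that remainder as well. Assumes x and y do not
   overlap. */
void saxpy(float *y, const float *x, float a, size_t n)
{
#pragma vector vecremainder
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```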
Recommendation: Use the smallest data type | Confidence: | Low |
Recommendation: Enable inline expansion | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/Ob1 or /Ob2 | -inline-level=1 or -inline-level=2 |
Recommendation: Vectorize user function(s) inside loop | Confidence: | Low |
Target | Directive |
---|---|
Source loop | #pragma simd or #pragma omp simd |
Inner function definition or declaration | #pragma omp declare simd |
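The two directives in the table work as a pair, which the following sketch illustrates with standard OpenMP pragmas. The function names `square_plus_one` and `apply` are hypothetical, and the pragmas take effect only when OpenMP SIMD support is enabled at compile time.

```c
#include <stddef.h>

/* Inner function: declare simd asks the compiler to also generate a
   vector variant that processes several x values per call. */
#pragma omp declare simd
float square_plus_one(float x)
{
    return x * x + 1.0f;
}

/* Source loop: omp simd requests vectorization, so the call can be
   replaced by the vector variant. Assumes out and in do not overlap. */
void apply(float *out, const float *in, size_t n)
{
#pragma omp simd
    for (size_t i = 0; i < n; i++)
        out[i] = square_plus_one(in[i]);
}
```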
Recommendation: Enable inline expansion | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/Ob1 or /Ob2 | -inline-level=1 or -inline-level=2 |
Recommendation: Vectorize serialized function(s) inside loop | Confidence: | Medium |
Target | Directive |
---|---|
Source loop | #pragma simd or #pragma omp simd |
Inner function definition or declaration | #pragma omp declare simd |
Recommendation: Enable inline expansion | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/Ob1 or /Ob2 | -inline-level=1 or -inline-level=2 |
Recommendation: Use the Intel Short Vector Math Library (SVML) for vector intrinsics | Confidence: | High |
Recommendation: Use a Glibc library with vectorized SVML functions | Confidence: | Low |
Recommendation: Vectorize math function calls inside loops | Confidence: | Medium |
Windows* OS | Linux* OS |
---|---|
/Qfast-transcendentals | -fast-transcendentals |
Recommendation: Change the floating point model | Confidence: | Medium |
Windows* OS | Linux* OS |
---|---|
/fp:fast | -fp-model fast |
/fp:precise /Qfast-transcendentals | -fp-model precise -fast-transcendentals |
Recommendation: Remove system function call(s) inside loop | Confidence: | Low |
Recommendation: Move OpenMP call(s) outside the loop body | Confidence: | Low |
Target | Directive |
---|---|
Outer section | #pragma omp parallel sections |
Inner section | #pragma omp for nowait |
Recommendation: Remove OpenMP lock functions | Confidence: | Low |
Recommendation: Remove indirect call(s) inside loop | Confidence: | Low |
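One way to remove an indirect call is to hoist the dispatch out of the loop, so each loop body contains only a direct, inlinable call. The sketch below is illustrative: the `op_kind` enum and all function names are hypothetical.

```c
#include <stddef.h>

typedef enum { OP_DOUBLE, OP_NEGATE } op_kind;

static inline float op_double(float x) { return 2.0f * x; }
static inline float op_negate(float x) { return -x; }

/* Before: the call target is unknown inside the loop, which typically
   prevents inlining and vectorization. */
void transform_indirect(float *a, size_t n, float (*op)(float))
{
    for (size_t i = 0; i < n; i++)
        a[i] = op(a[i]);
}

/* After: the dispatch happens once, and each loop makes a direct call. */
void transform_direct(float *a, size_t n, op_kind kind)
{
    switch (kind) {
    case OP_DOUBLE:
        for (size_t i = 0; i < n; i++)
            a[i] = op_double(a[i]);
        break;
    case OP_NEGATE:
        for (size_t i = 0; i < n; i++)
            a[i] = op_negate(a[i]);
        break;
    }
}
```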
Recommendation: Replace call(s) to virtual method with direct call(s) | Confidence: | Low |
Target | Directive |
---|---|
Source loop | #pragma simd or #pragma omp simd |
Inner function definition or declaration | #pragma omp declare simd |
Recommendation: Confirm dependency is real | Confidence: | Need More Data |
Recommendation: Remove dependency | Confidence: | Low |
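When the dependency is a reduction (each iteration updates the same accumulator), it can often be expressed with the OpenMP `reduction` clause rather than rewritten by hand. A minimal sketch, with the hypothetical function name `sum_array`:

```c
#include <stddef.h>

/* sum is carried from one iteration to the next, which looks like a
   dependency. The reduction clause lets the compiler keep partial sums
   in vector lanes and combine them after the loop. */
float sum_array(const float *a, size_t n)
{
    float sum = 0.0f;
#pragma omp simd reduction(+ : sum)
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Because floating-point addition is not associative, a vectorized reduction may round differently from the serial loop for general inputs.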
Recommendation: Enable vectorization | Confidence: | Low |
Directive | Outcome |
---|---|
#pragma simd or #pragma omp simd | Ignores all dependencies in the loop |
#pragma ivdep | Ignores only assumed (unproven) vector dependencies, which makes it the safer choice |
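A typical case for these directives is a loop over two pointers that the compiler cannot prove are distinct. The sketch below assumes the Intel spelling `#pragma ivdep` (other compilers may ignore it) and a hypothetical function `add_into`.

```c
#include <stddef.h>

/* The compiler cannot tell whether dst and src overlap, so it may assume
   a dependency and refuse to vectorize. The pragma tells it to disregard
   that assumed dependency.
   Assumption: the caller guarantees dst and src do not overlap. If they
   do, the vectorized loop can produce wrong results. */
void add_into(float *dst, const float *src, size_t n)
{
#pragma ivdep
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Use a directive like this only after confirming that the reported dependency is not real.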
Recommendation: Decrease unroll factor | Confidence: | Low |
Recommendation: Split loop into smaller loops | Confidence: | Low |
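Splitting a loop (loop fission) can be sketched as follows. Each smaller loop touches fewer arrays and keeps fewer values live at once, which may reduce register pressure. The function names are hypothetical, and the transformation is valid here only because the output arrays `a` and `b` are assumed not to overlap each other or the inputs.

```c
#include <stddef.h>

/* Original: one loop computes two independent results per iteration. */
void update_fused(float *a, float *b, const float *x, const float *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = x[i] * x[i] + y[i];
        b[i] = x[i] - y[i] * y[i];
    }
}

/* Split: the same work done by two smaller loops. */
void update_split(float *a, float *b, const float *x, const float *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = x[i] * x[i] + y[i];
    for (size_t i = 0; i < n; i++)
        b[i] = x[i] - y[i] * y[i];
}
```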
Recommendation: Confirm inefficient memory access patterns | Confidence: | Need More Data |
Recommendation: Use SoA instead of AoS | Confidence: | Low |
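The difference between an array of structures (AoS) and a structure of arrays (SoA) can be sketched as below. The type names, function names, and the `MAX_POINTS` capacity are hypothetical.

```c
#include <stddef.h>

#define MAX_POINTS 8  /* hypothetical capacity for this sketch */

/* AoS: consecutive x values are three floats apart in memory. */
struct point_aos { float x, y, z; };

/* SoA: all x values are contiguous. */
struct points_soa {
    float x[MAX_POINTS];
    float y[MAX_POINTS];
    float z[MAX_POINTS];
};

float sum_x_aos(const struct point_aos *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].x;      /* strided access */
    return sum;
}

float sum_x_soa(const struct points_soa *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p->x[i];     /* unit-stride access */
    return sum;
}
```

With SoA, a vector load can fetch several consecutive `x` values at once, whereas the AoS layout generally needs gathers or shuffles to do the same.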
Recommendation: Reorder loops | Confidence: | Low |
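Reordering (interchanging) loops changes which index varies fastest. The sketch below assumes a row-major C array; the dimensions and function names are hypothetical.

```c
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* Inner loop walks down a column: consecutive accesses are COLS floats
   apart in memory. */
void scale_inner_rows(float m[ROWS][COLS], float k)
{
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            m[i][j] *= k;
}

/* Loops interchanged: inner loop walks along a row with unit stride. */
void scale_inner_cols(float m[ROWS][COLS], float k)
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            m[i][j] *= k;
}
```

Both functions compute the same result. The interchange is legal here because each element is updated independently of the others.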
Recommendation: Target the AVX2 ISA | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/QxCORE-AVX2 or /QaxCORE-AVX2 | -xCORE-AVX2 or -axCORE-AVX2 |
Recommendation: Target a specific ISA instead of using the xHost option | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/QxCORE-AVX2 or /QaxCORE-AVX2 | -xCORE-AVX2 or -axCORE-AVX2 |
Recommendation: Explicitly enable FMA generation when using the strict floating-point model | Confidence: | Low |
Windows* OS | Linux* OS |
---|---|
/Qfma | -fma |
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © 2016 Intel Corporation