AVX / AVX2 Instruction Programming, with Examples

Reposted article, dated 2020-08-13 12:38. Tags: AVX2, AVX

1. Check which instruction sets your CPU supports:

Look the processor up directly on Intel's website:

https://ark.intel.com/content/www/cn/zh/ark.html#@Processors

For example, this processor:

https://ark.intel.com/content/www/cn/zh/ark/products/75131/intel-core-i7-4900mq-processor-8m-cache-up-to-3-80-ghz.html

2. Test example:

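#include <immintrin.h>
#include <stdio.h>

int main(int argc, char* argv[])
{
    /* _mm256_set_epi64x takes its arguments from the highest lane down to the lowest */
    __m256i first = _mm256_set_epi64x(10, 20, 30, 40);
    __m256i second = _mm256_set_epi64x(5, 5, 5, 5);
    __m256i result = _mm256_add_epi64(first, second);  /* four 64-bit additions at once */

    /* Copy the lanes out with an unaligned store rather than casting the
       vector's address; long long is 64 bits everywhere, long is not. */
    long long values[4];
    _mm256_storeu_si256((__m256i*) values, result);

    printf("==%zu \n", sizeof(long long));
    for (int i = 0; i < 4; i++)
    {
        printf("%lld ", values[i]);  /* lowest lane first: 45 35 25 15 */
    }
    printf("\n");

    return 0;
}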

For the meaning and usage of intrinsics such as _mm256_set_epi64x() and _mm256_add_epi64(), see:

https://software.intel.com/sites/landingpage/IntrinsicsGuide

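Note: after you tick a filter in the left column, the results in the right column are not always complete. For example, the SSE instruction addss becomes vaddss on machines with AVX, but it only shows up in the search when AVX512 is ticked.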

Compile commands:

gcc -mavx2 -S -fverbose-asm fun.c  # emit annotated assembly to inspect the generated instructions

gcc -mavx2 fun.c

One more example:

#include <immintrin.h>
#include <stdio.h>

float aa[] = {10, 20, 30, 40, 50, 60, 70, 80};
float bb[] = {0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5};
float cc[] = {0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5};

int main(int argc, char* argv[])
{
    /* loadu/storeu work on unaligned addresses, so plain arrays are fine */
    __m256 first  = _mm256_loadu_ps(aa);
    __m256 second = _mm256_loadu_ps(bb);
    __m256 result = _mm256_add_ps(first, second);  /* eight float additions at once */
    _mm256_storeu_ps(cc, result);

    printf("==%zu \n", sizeof(float));
    for (int i = 0; i < 8; i++)
    {
        printf("%f\n", cc[i]);
    }

    return 0;
}

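Troubleshooting:

"AVX vector return without AVX enabled changes the ABI": -mavx2 was not passed to the compiler.

"inlining failed in call to always_inline 'xxx': target specific option mismatch": the instruction set the intrinsic needs is not enabled for the code that calls it. Check that the compile flags (for example -mavx2) enable it, and that the CPU actually supports AVX2.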

References:

https://software.intel.com/content/www/cn/zh/develop/articles/introduction-to-intel-advanced-vector-extensions.html

https://zhuanlan.zhihu.com/p/94649418

https://www.codeproject.com/Articles/874396/Crunching-Numbers-with-AVX-and-AVX

https://software.intel.com/sites/landingpage/IntrinsicsGuid
