ARM vs. X86 Cloud Servers: A Compute Performance Comparison


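Background

China's Xinchuang (information technology application innovation) industry is growing fast, and many domestic server and chip vendors have used the opportunity to launch domestically produced servers and chips. The major cloud providers have likewise launched Xinchuang cloud servers. Many people still have questions about how the CPU compute performance of the ARM and X86 architectures compares, so today we test the CPU performance of ARM and X86 cloud servers from one mainstream cloud provider.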

Tool installation

sysbench

Used to test the CPU's integer compute performance.

# Install build dependencies
yum install automake libtool gcc -y
# Download the sysbench source archive
wget https://github.com/akopytov/sysbench/archive/1.0.20.tar.gz -O sysbench-1.0.20.tar.gz
# Extract it
tar -xvf sysbench-1.0.20.tar.gz
# Run autogen.sh
cd sysbench-1.0.20
sh autogen.sh
# Generate the Makefile
./configure --without-mysql
# Build and install
make -j8 && make install
# Check the installation (version information)
sysbench --version

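UnixBench

Used to test the CPU's floating-point compute performance.

# Download
wget http://soft.vpser.net/test/unixbench/unixbench-5.1.2.tar.gz
# Extract
tar zxvf unixbench-5.1.2.tar.gz
# Enter the source directory before building
cd unixbench-5.1.2
# Configure: if you do not need the graphics tests, or are not running under a graphical desktop, comment out GRAPHICS_TEST = defined in the Makefile
make
# Install dependencies
yum install -y perl
# Run the tests
./Run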

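Running the tests: integer

Instance specifications

Both the X86 and the ARM cloud server under test have the same specification: 8 cores and 32 GB of memory (8C32G), with a 2 TB cloud disk.

CPU models

X86 cloud server: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
ARM cloud server: Phytium FT-2000+/64 @ 2.2GHz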

X86

Test: 8 threads, computing the primes up to 20000. Score: 2813.42 events per second.

[root@X86-Performance ~]# sysbench cpu --cpu-max-prime=20000  --threads=8 --time=60 run
sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 8
Initializing random number generator from current time

Prime numbers limit: 20000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  2813.42

General statistics:
    total time:                          60.0025s
    total number of events:              168818

Latency (ms):
         min:                                    2.82
         avg:                                    2.84
         max:                                   17.52
         95th percentile:                        2.86
         sum:                               479885.99

Threads fairness:
    events (avg/stddev):           21102.2500/13.03
    execution time (avg/stddev):   59.9857/0.01

[root@X86-Performance ~]#

ARM

Test: 8 threads, computing the primes up to 20000. Score: 7077.50 events per second.

[root@performance-arm ~]# sysbench cpu --cpu-max-prime=20000  --threads=8 --time=60 run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 8
Initializing random number generator from current time

Prime numbers limit: 20000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  7077.50

General statistics:
    total time:                          60.0024s
    total number of events:              424684

Latency (ms):
         min:                                    1.12
         avg:                                    1.13
         max:                                   18.34
         95th percentile:                        1.14
         sum:                               479797.10

Threads fairness:
    events (avg/stddev):           53085.5000/32.63
    execution time (avg/stddev):   59.9746/0.00

[root@performance-arm ~]#
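
To make the "events per second" figure more concrete: in sysbench's cpu test, each event roughly amounts to checking every candidate number below --cpu-max-prime for primality by trial division. The sketch below approximates that loop in plain POSIX shell. It is an illustration written for this article, not sysbench's actual source, and the function name count_primes is hypothetical. It uses a limit of 100 so that it finishes instantly; pass 20000 to match the runs above.

```shell
# count_primes LIMIT: count the primes in [3, LIMIT) by trial division.
# Approximation of the work done per sysbench cpu event (an assumption,
# not code copied from sysbench).
count_primes() {
    limit=$1
    n=0
    c=3
    while [ "$c" -lt "$limit" ]; do
        l=2
        is_prime=1
        # Try every divisor l with l*l <= c, i.e. up to sqrt(c).
        while [ $((l * l)) -le "$c" ]; do
            if [ $((c % l)) -eq 0 ]; then
                is_prime=0
                break
            fi
            l=$((l + 1))
        done
        n=$((n + is_prime))
        c=$((c + 1))
    done
    echo "$n"
}

count_primes 100    # prints 24
```

The loop consists almost entirely of integer multiplication, division with remainder, and comparison, which is why the article treats this test as a measure of integer performance.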
 
 
 

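Result analysis

The results show that, for integer computation, the ARM cloud server delivers about 2.5 times the throughput of the X86 one (7077.50 versus 2813.42 events per second). Note that the two runs used different sysbench builds (1.0.17 on X86 and 1.0.20 on ARM), as the output above shows.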
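
The ratio can be checked directly from the events-per-second figures reported by the two sysbench runs. The following is a minimal sketch using awk for the floating-point arithmetic; the helper names ratio and events_per_sec are hypothetical. The second helper is a consistency check: with 8 threads each finishing one event per average latency, the throughput should be close to 8 divided by that latency.

```shell
# ratio A B: print A divided by B, rounded to two decimal places.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# events_per_sec THREADS AVG_LATENCY_MS: throughput implied by the average latency.
events_per_sec() {
    LC_ALL=C awk -v t="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", t * 1000 / ms }'
}

ratio 7077.50 2813.42     # ARM / X86 events per second, prints 2.52
events_per_sec 8 2.84     # X86: 8 threads, 2.84 ms average latency
events_per_sec 8 1.13     # ARM: 8 threads, 1.13 ms average latency
```

The implied throughputs (about 2817 and 7080 events per second) agree with the reported 2813.42 and 7077.50, so the score and the latency figures are consistent with each other.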

Running the tests: floating point

X86

UnixBench is used to measure the CPU's score in the Double-Precision Whetstone test, once single-threaded and once with 8 threads.

1 thread (1 parallel copy): 3946.1 MWIPS

8 threads (8 parallel copies): 31546.4 MWIPS

------------------------------------------------------------------------
Benchmark Run: Wed May 19 2021 19:24:55 - 19:53:02
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       33293015.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3946.1 MWIPS (9.8 s, 7 samples)
Execl Throughput                                984.1 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        466370.4 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          119865.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1466024.1 KBps  (30.0 s, 2 samples)
Pipe Throughput                              583004.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 129953.0 lps   (10.0 s, 7 samples)
Process Creation                               3494.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2352.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2701.8 lpm   (60.0 s, 2 samples)
System Call Overhead                         495048.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   33293015.3   2852.9
Double-Precision Whetstone                       55.0       3946.1    717.5
Execl Throughput                                 43.0        984.1    228.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     466370.4   1177.7
File Copy 256 bufsize 500 maxblocks            1655.0     119865.0    724.3
File Copy 4096 bufsize 8000 maxblocks          5800.0    1466024.1   2527.6
Pipe Throughput                               12440.0     583004.5    468.7
Pipe-based Context Switching                   4000.0     129953.0    324.9
Process Creation                                126.0       3494.1    277.3
Shell Scripts (1 concurrent)                     42.4       2352.7    554.9
Shell Scripts (8 concurrent)                      6.0       2701.8   4502.9
System Call Overhead                          15000.0     495048.1    330.0
                                                                   ========
System Benchmarks Index Score                                         756.6

------------------------------------------------------------------------
Benchmark Run: Wed May 19 2021 19:53:02 - 20:21:10
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      265277164.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    31546.4 MWIPS (9.8 s, 7 samples)
Execl Throughput                              20901.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        871968.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          234891.6 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       2799968.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             4642141.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1059963.5 lps   (10.0 s, 7 samples)
Process Creation                              55490.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  33809.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   4641.0 lpm   (60.1 s, 2 samples)
System Call Overhead                        3522148.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  265277164.6  22731.5
Double-Precision Whetstone                       55.0      31546.4   5735.7
Execl Throughput                                 43.0      20901.0   4860.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     871968.8   2201.9
File Copy 256 bufsize 500 maxblocks            1655.0     234891.6   1419.3
File Copy 4096 bufsize 8000 maxblocks          5800.0    2799968.7   4827.5
Pipe Throughput                               12440.0    4642141.4   3731.6
Pipe-based Context Switching                   4000.0    1059963.5   2649.9
Process Creation                                126.0      55490.3   4404.0
Shell Scripts (1 concurrent)                     42.4      33809.9   7974.0
Shell Scripts (8 concurrent)                      6.0       4641.0   7735.0
System Call Overhead                          15000.0    3522148.0   2348.1
                                                                   ========
System Benchmarks Index Score                                        4450.0

[root@X86-Performance unixbench-5.1.2]#
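
Only the Double-Precision Whetstone rows of this long report matter for the floating-point comparison. As a minimal sketch, assuming the report has been saved as text, the raw MWIPS values can be pulled out with awk. The function name whetstone_mwips is hypothetical. The filter keeps only rows whose fourth field is MWIPS, which skips the same test's row in the index table.

```shell
# whetstone_mwips: read a UnixBench report on stdin and print the raw
# Double-Precision Whetstone scores (one per benchmark run).
whetstone_mwips() {
    awk '/^Double-Precision Whetstone/ && $4 == "MWIPS" { print $3 }'
}

# Demo on three rows copied from the report above; only the second row
# is a raw Whetstone result, the third is its index-table entry.
printf '%s\n' \
    'Dhrystone 2 using register variables       33293015.3 lps   (10.0 s, 7 samples)' \
    'Double-Precision Whetstone                     3946.1 MWIPS (9.8 s, 7 samples)' \
    'Double-Precision Whetstone                       55.0       3946.1    717.5' \
    | whetstone_mwips    # prints 3946.1
```

With a saved report (the file name report.txt is an assumption), the call would be whetstone_mwips < report.txt. UnixBench's Run script also accepts test names and a copy count, so something like ./Run -c 1 -c 8 whetstone-double should run just this test, but check the USAGE file of your UnixBench version before relying on it.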
 
 
 

ARM

UnixBench is used to measure the CPU's score in the Double-Precision Whetstone test, once single-threaded and once with 8 threads.

1 thread (1 parallel copy): 3626.3 MWIPS

8 threads (8 parallel copies): 28926.4 MWIPS

------------------------------------------------------------------------
Benchmark Run: Wed May 19 2021 18:59:02 - 19:27:08
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       22270696.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3626.3 MWIPS (9.3 s, 7 samples)
Execl Throughput                               2591.5 lps   (29.7 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        402971.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          121834.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1069823.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                              730925.1 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 101991.7 lps   (10.0 s, 7 samples)
Process Creation                               5187.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3884.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1588.8 lpm   (60.0 s, 2 samples)
System Call Overhead                         514939.2 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   22270696.0   1908.4
Double-Precision Whetstone                       55.0       3626.3    659.3
Execl Throughput                                 43.0       2591.5    602.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     402971.9   1017.6
File Copy 256 bufsize 500 maxblocks            1655.0     121834.3    736.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1069823.2   1844.5
Pipe Throughput                               12440.0     730925.1    587.6
Pipe-based Context Switching                   4000.0     101991.7    255.0
Process Creation                                126.0       5187.1    411.7
Shell Scripts (1 concurrent)                     42.4       3884.2    916.1
Shell Scripts (8 concurrent)                      6.0       1588.8   2648.0
System Call Overhead                          15000.0     514939.2    343.3
                                                                   ========
System Benchmarks Index Score                                         783.9

------------------------------------------------------------------------
Benchmark Run: Wed May 19 2021 19:27:08 - 19:55:15
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      177048367.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    28926.4 MWIPS (9.3 s, 7 samples)
Execl Throughput                              15952.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        598099.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          160373.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1793541.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                             5840652.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 904721.9 lps   (10.0 s, 7 samples)
Process Creation                              16460.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  15821.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2313.4 lpm   (60.1 s, 2 samples)
System Call Overhead                        1259178.2 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  177048367.5  15171.2
Double-Precision Whetstone                       55.0      28926.4   5259.3
Execl Throughput                                 43.0      15952.7   3709.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     598099.6   1510.4
File Copy 256 bufsize 500 maxblocks            1655.0     160373.3    969.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1793541.5   3092.3
Pipe Throughput                               12440.0    5840652.5   4695.1
Pipe-based Context Switching                   4000.0     904721.9   2261.8
Process Creation                                126.0      16460.6   1306.4
Shell Scripts (1 concurrent)                     42.4      15821.5   3731.5
Shell Scripts (8 concurrent)                      6.0       2313.4   3855.7
System Call Overhead                          15000.0    1259178.2    839.5
                                                                   ========
System Benchmarks Index Score                                        2792.1

[root@performance-arm UnixBench]#

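Result analysis

The results show that, for floating-point computation, the ARM CPU delivers about 92% of the X86 CPU's performance, which is still a respectable result.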
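
The 92% figure follows from the Whetstone scores listed above. A minimal sketch of the arithmetic, again using awk; the helper names percent and efficiency are hypothetical. The second helper asks how close the 8-copy score comes to 8 times the single-copy score on each machine.

```shell
# percent A B: print A as a percentage of B, one decimal place.
percent() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# efficiency MULTI SINGLE COPIES: multi-copy score as a percentage of
# COPIES times the single-copy score.
efficiency() {
    LC_ALL=C awk -v m="$1" -v s="$2" -v n="$3" 'BEGIN { printf "%.1f\n", 100 * m / (s * n) }'
}

percent 3626.3 3946.1         # ARM vs X86, 1 copy:   prints 91.9
percent 28926.4 31546.4       # ARM vs X86, 8 copies: prints 91.7
efficiency 31546.4 3946.1 8   # X86 scaling from 1 to 8 copies
efficiency 28926.4 3626.3 8   # ARM scaling from 1 to 8 copies
```

Both machines scale almost linearly from 1 to 8 copies (about 99.9% on X86 and 99.7% on ARM), so the roughly 8% gap is a per-core difference rather than a scaling effect.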

Tips

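Why is ARM's integer performance higher than X86's?

ARM and X86 use different instruction set architectures. ARM's design favors simple instructions, which it handles faster than X86, and that is why it leads by such a wide margin in the integer test.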

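How do the ARM and X86 instruction sets differ?

Like the author, many readers probably cannot answer this off the top of their heads, but we all know that Intel uses CISC (complex instruction set computing) while ARM uses RISC (reduced instruction set computing).

Take going to the toilet as an example: CISC and RISC would give a person different instructions. CISC issues one complex instruction: "Go to the toilet!" RISC issues a sequence of simple ones: "Stand up, walk to the bathroom, pull down your trousers, sit on the toilet, go."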

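Is the software the same for ARM and X86?

No. Software is built separately for ARM and X86, and a binary compiled for one architecture does not run on the other. You can download the build for your architecture online or offline, or obtain it from the vendor's support channel.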
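
Because software packages are architecture-specific, an install script usually has to work out which build to fetch. A minimal sketch, assuming the common uname -m values x86_64 and aarch64; the function name arch_label and the labels it returns are hypothetical and should be adapted to whatever naming your vendor uses.

```shell
# arch_label MACHINE: map a `uname -m` value to a package label.
# The labels are placeholders, not any vendor's real naming scheme.
arch_label() {
    case "$1" in
        x86_64|amd64)  echo "x86_64" ;;
        aarch64|arm64) echo "aarch64" ;;
        *)             echo "unknown" ;;
    esac
}

# Print the label for the machine this script runs on.
arch_label "$(uname -m)"
```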

 

