VASP currently offers parallelization and data distribution over bands and/or over plane wave coefficients (see NCORE and NPAR), and parallelization over k-points (no data distribution, see KPAR).
To obtain high efficiency on massively parallel systems or modern multi-core machines, it is strongly recommended to use all of these parallelization levels at the same time.
In what follows, N is the total number of compute cores and n is the number of cores per node.
- NPAR and NCORE
- The two are related: NCORE*NPAR = N/KPAR, so setting either one is enough. The VASP wiki recommends setting NPAR first, but NCORE is more convenient (NCORE is available from VASP.5.2.13 on, and is more handy than the previous parameter NPAR)
- NPAR sets the band-parallelization strategy (NPAR determines the number of bands that are treated in parallel)
- NCORE sets the orbital-parallelization strategy (NCORE determines the number of compute cores that work on an individual orbital)
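- For massively parallel runs, the VASP wiki recommends NPAR ≈ sqrt(N) or NCORE = n. If sqrt(N) is not an integer, take a nearby integer. Note that NPAR should be chosen so that NCORE = N/NPAR (taking KPAR = 1) is a factor of n, which reduces the inter-node communication overhead. If neither tag is set, VASP uses its defaults, NPAR = N (equivalently NCORE = 1), which suit small core counts (up to about 8 cores) and low communication bandwidth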
- The best value of NCORE depends on the number of atoms in the unit cell: for a unit cell of about 100 atoms, NCORE ∼ 4; for large unit cells (more than 400 atoms), NCORE ∼ 12-16
- KPAR
- The set of k-points is distributed over KPAR groups
- KPAR sets the k-point parallelization strategy (KPAR determines the number of k-points that are to be treated in parallel)
- Choose KPAR such that it is an integer divisor of N
- The data is not distributed additionally over k-points
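The constraints in the list above can be checked with a few lines of Python before a job is submitted. This is a minimal sketch, not part of VASP: the function names `ncore_candidates` and `plan_layout` are hypothetical, and the only rules encoded are the ones stated above (KPAR divides N, NCORE is a factor of n, and NCORE*NPAR = N/KPAR).

```python
def divisors(x):
    """Return all positive divisors of x in ascending order."""
    return [d for d in range(1, x + 1) if x % d == 0]


def ncore_candidates(total_cores, cores_per_node, kpar=1):
    """List NCORE values that are factors of n and also divide N/KPAR."""
    if total_cores % kpar != 0:
        raise ValueError("KPAR must be an integer divisor of N")
    cores_per_kgroup = total_cores // kpar
    return [d for d in divisors(cores_per_node) if cores_per_kgroup % d == 0]


def plan_layout(total_cores, cores_per_node, kpar, ncore):
    """Derive NPAR from N, KPAR and NCORE using NCORE*NPAR = N/KPAR."""
    if ncore not in ncore_candidates(total_cores, cores_per_node, kpar):
        raise ValueError("NCORE must be a factor of n and divide N/KPAR")
    npar = total_cores // kpar // ncore
    return {"KPAR": kpar, "NCORE": ncore, "NPAR": npar}


# Example: 2 nodes with 32 cores each, 2 k-point groups, 8 cores per orbital.
layout = plan_layout(total_cores=64, cores_per_node=32, kpar=2, ncore=8)
print(layout)  # {'KPAR': 2, 'NCORE': 8, 'NPAR': 4}
```

Only one of NCORE and NPAR needs to go into the INCAR file; the other follows from the relation above. The candidate list is a starting point for benchmarks, not a prediction of which value is fastest.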
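Summary:
The above is a general summary based on the VASP wiki; always go by your own benchmarks!
On supercomputers in particular, take the time to test! Otherwise you may burn compute hours without getting any speedup.
Start benchmarking on a single node. Do not jump straight to multi-node tests, and confirm that every added node makes the run faster than before!
Personal testing experience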
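A scaling test of this kind can be summarized with a small helper. The sketch below is an illustration only: `scaling_report` is a hypothetical name, and the wall times in the example are made-up placeholder numbers, not measurements. It takes the wall time of the same job at several node counts and reports speedup, parallel efficiency relative to the smallest run, and whether each step was faster than the previous one.

```python
def scaling_report(timings):
    """Summarize a scaling test.

    timings maps node count -> wall time in seconds for the same job.
    Returns tuples (nodes, time, speedup, efficiency, faster_than_previous).
    """
    nodes = sorted(timings)
    base_nodes = nodes[0]
    base_time = timings[base_nodes]
    report = []
    prev_time = None
    for k in nodes:
        t = timings[k]
        speedup = base_time / t
        # Efficiency: achieved speedup divided by the ideal speedup.
        efficiency = speedup / (k / base_nodes)
        faster = prev_time is None or t < prev_time
        report.append((k, t, round(speedup, 2), round(efficiency, 2), faster))
        prev_time = t
    return report


# Placeholder timings (hypothetical): 4 nodes is slower than 2 nodes here.
example = {1: 1000.0, 2: 550.0, 4: 600.0}
for row in scaling_report(example):
    print(row)
```

A row whose last field is `False` marks a node count that should not be used with the current NCORE/KPAR settings.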
VASP parallelizes over k-points and bands, which can be seen in the first lines of its output:
running on N total cores
distrk: each k-point on N/KPAR cores, KPAR groups
distr: one band on NCORE cores, NPAR groups
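Here NCORE needs to be a factor of n. Compared with NPAR, setting NCORE is more convenient, because setting NPAR directly also requires checking that NCORE = N/NPAR is a factor of n. The VASP wiki suggests NCORE ∼ 4 for unit cells of about 100 atoms and NCORE ∼ 12-16 for large unit cells (more than 400 atoms). Choose a value that fits your own system and resources.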
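To confirm that a run actually used the intended layout, the three header lines quoted above can be read back from the captured output. The following is a rough sketch under assumptions: `parse_layout` is a hypothetical helper, the regular expressions are modeled on the lines as quoted in this note, and the exact spacing or wording may differ between VASP versions. The sample text is constructed by hand for illustration.

```python
import re

# Patterns modeled on the header lines quoted above (an assumption,
# not a documented format). "on" is optional in the band line.
PATTERNS = {
    "total": re.compile(r"running on\s+(\d+)\s+total cores"),
    "kpoint": re.compile(r"distrk:\s+each k-point on\s+(\d+)\s+cores,\s+(\d+)\s+groups"),
    "band": re.compile(r"distr:\s+one band\s+(?:on\s+)?(\d+)\s+cores,\s+(\d+)\s+groups"),
}


def parse_layout(text):
    """Extract N, KPAR, NCORE and NPAR from the start of the output."""
    out = {}
    m = PATTERNS["total"].search(text)
    if m:
        out["N"] = int(m.group(1))
    m = PATTERNS["kpoint"].search(text)
    if m:
        out["cores_per_kpoint_group"] = int(m.group(1))
        out["KPAR"] = int(m.group(2))
    m = PATTERNS["band"].search(text)
    if m:
        out["NCORE"] = int(m.group(1))
        out["NPAR"] = int(m.group(2))
    return out


# Hand-written sample in the shape of the lines above.
sample = """\
 running on   64 total cores
 distrk:  each k-point on   32 cores,    2 groups
 distr:  one band on    8 cores,    4 groups
"""
layout = parse_layout(sample)
print(layout)
# Consistency check from the relation NCORE*NPAR = N/KPAR.
assert layout["NCORE"] * layout["NPAR"] == layout["N"] // layout["KPAR"]
```

If a key is missing from the result, the corresponding line was not found, which usually means the pattern needs adjusting for the output of the VASP version in use.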