SAS Optimization Techniques (1): Options for Tracking Resource Usage, and Controlling Memory Usage with BUFSIZE=, BUFNO=, SASFILE, and IBUFSIZE=


CPU time: the amount of time the central processing unit (CPU) uses to perform requested tasks such as calculations, reading and writing data, conditional logic, and iterative logic. CPU time is measured when data must be processed in the program data vector.

I/O: a measurement of the read and write operations that are performed as data and programs are copied from a storage device to memory (input) or from memory to a storage or display device (output).

 

 

1: Using SAS options to track resource usage

Prefixing an option keyword with NO turns that option off.
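As a minimal sketch of the NO prefix, the snippet below switches the FULLSTIMER system option on, runs a step, and switches it off again. The data set name work.demo and the loop size are made up for illustration, and the exact statistics written to the log depend on the host environment.

```sas
/* Ask SAS to write detailed resource statistics to the log */
options fullstimer;

/* Hypothetical throwaway data set, only here to generate some work */
data work.demo;
   do i = 1 to 100000;
      x = ranuni(0);
      output;
   end;
run;

/* Prefix the keyword with NO to turn the option off again */
options nofullstimer;
```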

 

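2: Controlling memory usage

2.1: Measuring I/O

Improvement in I/O can come at the cost of increased memory consumption.

[Figure: the relationship between I/O and the buffers, that is, the part that resides in memory]

I/O is measured in two parts: transfers from the data set into the buffers, and transfers from the buffers back to the data set.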

 

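2.1: How do you change the amount of data read in a single I/O operation?

2.1.1 Page size

In SAS, page size and buffer size mean the same thing. Increasing the size therefore naturally reduces the number of I/O operations, at the cost of higher memory consumption.

SAS applies a set of algorithms to choose a default page size, which works well for many kinds of SAS programs, but you can also change the page size manually with the BUFSIZE= option.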
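To see which page size SAS picked for an existing data set, one option is PROC CONTENTS. The sketch below uses sashelp.class only because it ships with SAS; any data set would do.

```sas
/* The engine/host section of the output lists the page size   */
/* and the number of pages for the data set.                   */
proc contents data=sashelp.class;
run;
```

Look for "Data Set Page Size" and "Number of Data Set Pages" in the output.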

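The MIN value is generally avoided, because it can cause unpredictable problems.

If you copy a data set with the COPY procedure, the original page size is not retained.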
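A minimal sketch of setting the page size by hand, both as a system option and as a data set option on an output data set. The sizes and the name work.class_big are illustrative assumptions, not recommendations.

```sas
/* System option: applies to data sets created from here on */
options bufsize=64k;

/* Data set option: applies only to this output data set */
data work.class_big (bufsize=128k);
   set sashelp.class;
run;

/* 0 hands the choice back to the default chosen by SAS */
options bufsize=0;
```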

 

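2.1.2 Number of buffers

BUFNO= controls the number of buffers that are available for reading or writing a SAS data set with each I/O transfer.

A value of 10 is recommended.

Summary: for small data sets, where possible allocate enough buffers up front to read the whole data set in one go.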
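The two ways of setting BUFNO= can be sketched as follows, using the value 10 mentioned above. The output name work.class_copy is hypothetical.

```sas
/* System option form: applies to data sets opened afterwards */
options bufno=10;

/* Data set option form: applies only to this input data set */
data work.class_copy;
   set sashelp.class (bufno=10);
run;
```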

 

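3: Using the SASFILE statement

The SASFILE statement holds a data set in memory, which cuts down on open/close operations, including the associated allocation and freeing of memory.

The SASFILE statement opens a SAS data file and allocates enough buffers to hold the entire file in memory.

Once the data file is read, the data is held in memory, and it is available to subsequent DATA and PROC steps or applications until either a SASFILE CLOSE statement is executed or the program ends, at which point the memory is freed automatically.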

 

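If there is not enough memory, SAS uses virtual memory or falls back to the default buffer allocation.

Normally SAS frees the buffers automatically at the end of each DATA or PROC step. In the following program, without SASFILE, company.sales would have to be read twice, which wastes resources.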
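/* Load the whole data set into memory once */
sasfile company.sales load;

proc print data=company.sales;
   var Customer_Age_Group;
run;

/* This step reuses the in-memory copy instead of reading from disk again */
proc tabulate data=company.sales;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group, Customer_BirthDate*(mean median);
run;

/* Release the buffers */
sasfile company.sales close;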

Summary

1: If you need to repeatedly process a SAS data file that will fit entirely in memory, use the SASFILE statement to reduce I/O and some CPU usage.

2: If you use the SASFILE statement and the SAS data file will not fit entirely in memory, the code will execute, but there might be a degradation in performance.

3: If you repeatedly process only part of a file, using SASFILE is the better choice because it improves efficiency.
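One possible reading of the third point is to extract the part you need into its own, smaller data set and hold only that in memory. The sketch below assumes this reading. The filter, the name work.teens, and the use of sashelp.class are all illustrative.

```sas
/* Build a subset small enough to fit in memory (filter is illustrative) */
data work.teens;
   set sashelp.class;
   where age >= 13;
run;

/* Hold only the subset in memory while it is processed repeatedly */
sasfile work.teens load;

proc means data=work.teens;
   var height weight;
run;

proc freq data=work.teens;
   tables sex;
run;

sasfile work.teens close;
```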

 

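4: Using IBUFSIZE= to change the index buffer size

This helps programs that use indexes heavily, but after changing the size you must re-create the index.

IBUFSIZE=0 resets the size to the system default.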
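A rough sketch of the workflow: set IBUFSIZE=, then create the index so that it is built with the new page size. The value 8192 and the data set work.class_idx are assumptions chosen for illustration. An index that already exists would first need an INDEX DELETE before the INDEX CREATE.

```sas
/* Illustrative index page size, in bytes */
options ibufsize=8192;

/* Hypothetical data set to index */
data work.class_idx;
   set sashelp.class;
run;

/* Indexes created from here on use the new page size */
proc datasets library=work nolist;
   modify class_idx;
   index create name;
quit;

/* Return to the system default */
options ibufsize=0;
```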

 

