CPU time: the amount of time the central processing unit (CPU) uses to perform requested tasks such as calculations, reading and writing data, conditional logic, and iterative logic. CPU time is measured when data must be processed in the program data vector.
I/O: a measurement of the read and write operations that are performed as data and programs are copied from a storage device to memory (input) or from memory to a storage or display device (output).
1: Using SAS system options to track resource usage
Prefixing an option keyword with NO turns that option off.
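As a minimal sketch of the NO prefix: the notes do not name a specific option, so this assumes the FULLSTIMER system option is the kind of tracking option meant. The DATA step is only a placeholder workload.

```sas
/* Assumption: FULLSTIMER is the tracking option intended; the notes do not name one. */
options fullstimer;      /* write detailed resource statistics to the log after each step */

/* Placeholder workload so that the log has something to report. */
data _null_;
    do i = 1 to 1000000;
        x = sqrt(i);
    end;
run;

options nofullstimer;    /* the NO prefix turns the option off again */
```

The statistics appear in the SAS log after the step; the exact fields reported depend on the operating environment.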
2: Controlling memory usage
2.1:Measuring I/O
Improvement in I/O can come at the cost of increased memory consumption
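(Figure: the relationship between the buffers, that is, the portion held in memory, and I/O.)
I/O is counted in two parts: transfers from the data set to the buffers, and transfers from the buffers to the data set.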
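2.1: How can the amount of data read in a single I/O operation be changed?
2.1.1 Page size
In SAS, page size and buffer size mean the same thing, so increasing the size naturally reduces the number of I/O operations, at the cost of higher memory consumption.
SAS uses an algorithm to choose a default page size, which works well when several SAS jobs run at the same time, but the page size can also be set manually.
MIN is generally avoided because it can cause unpredictable problems.
If a data set is copied with PROC COPY, the original page size is not necessarily retained (in particular when the target library uses a different engine).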
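A minimal sketch of setting the page size by hand, assuming the BUFSIZE= data set option is the mechanism meant here. The output name work.sales_big and the 30720-byte value are hypothetical, chosen only for illustration; company.sales is the data set used later in these notes.

```sas
/* Assumptions: work.sales_big is a hypothetical output name; 30720 bytes is illustrative. */
data work.sales_big(bufsize=30720);
    set company.sales;
run;

/* PROC CONTENTS reports the page size that was stored with the new data set. */
proc contents data=work.sales_big;
run;
```

The page size is fixed when the data set is created, so it is specified on the output data set. SAS may adjust the requested value, so it is worth checking the PROC CONTENTS output rather than assuming the exact number was used.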
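2.1.2 Number of buffers
BUFNO= controls the number of buffers that are available for reading or writing a SAS data set with each I/O transfer.
A value of 10 is recommended.
Summary: for a small data set, where possible allocate enough buffers to read the whole data set in one transfer.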
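A minimal sketch of the two usual ways to set the number of buffers, using the recommended value of 10. The output name work.sales_copy is hypothetical.

```sas
/* Session-wide default: applies to every data set opened afterwards. */
options bufno=10;

/* Per-data-set override on the input data set only. */
/* Assumption: work.sales_copy is a hypothetical output name. */
data work.sales_copy;
    set company.sales(bufno=10);
run;
```

Unlike the page size, the number of buffers is not stored with the data set, so it has to be specified each time the data set is processed.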
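3: Using the SASFILE statement
The SASFILE statement holds a data set in memory, which reduces open/close operations, including the freeing and allocating of memory.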
The SASFILE statement opens a SAS data file and allocates enough buffers to hold the entire file in memory
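Once the data file is read, the data is held in memory, and it is available to subsequent DATA and PROC steps or applications until either a SASFILE CLOSE statement is executed or the program ends, at which point the memory is freed automatically.
If there is not enough memory, SAS falls back on virtual memory or the default buffer allocation.
Normally SAS frees the buffers automatically at the end of each DATA or PROC step. In the following program, without SASFILE, company.sales would have to be read twice, which wastes resources.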
sasfile company.sales load;

proc print data=company.sales;
   var Customer_Age_Group;
run;

proc tabulate data=company.sales;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group, Customer_BirthDate*(mean median);
run;

sasfile company.sales close;
Summary
1: If you need to repeatedly process a SAS data file that will fit entirely in memory, use the SASFILE statement to reduce I/O and some CPU usage.
2: If you use the SASFILE statement and the SAS data file will not fit entirely in memory, the code will execute, but there might be a degradation in performance.
3: If only part of a file needs to be processed repeatedly, it is still best to use SASFILE, which improves efficiency.
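4: Using IBUFSIZE= to change the size of the index buffer
This helps programs that use indexes frequently, but the index must be rebuilt after the size is changed.
IBUFSIZE=0 resets the size to the system default.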
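A minimal sketch of the sequence described above: set the index buffer size, build the index so that it picks up the new size, then restore the default. The 8192-byte value, the copy work.sales_ix, and the choice of Customer_Age_Group as the indexed variable are all assumptions made for illustration.

```sas
/* Assumption: 8192 bytes is an illustrative value, not a recommendation. */
options ibufsize=8192;

/* Assumption: work.sales_ix is a hypothetical copy, so the original data set is left untouched. */
data work.sales_ix;
    set company.sales;
run;

/* Indexes created from here on use the new page size. */
proc datasets library=work nolist;
    modify sales_ix;
    index create Customer_Age_Group;
quit;

/* Restore the system default for indexes created later. */
options ibufsize=0;
```

An index that already exists keeps the page size it was built with, which is why it has to be deleted and re-created to benefit from a new IBUFSIZE= value.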