data me;
  put _n_= x=;
  /*******1******/
  input x;
  /* INPUT is the key step. It causes SAS to read a record of raw data into the input buffer.
     Then, following the instructions in the INPUT statement, SAS reads the data values from
     the input buffer, starting at the current pointer position, and assigns them to
     variables in the program data vector (PDV). */
  /*******2******/
  put x=; /* statements that work on the values just read go between INPUT and CARDS */
  /* At the end of each iteration of the DATA step, SAS automatically:
     1: writes the contents of the PDV to the data set
     2: returns to the top of the DATA step
     3: increments _N_ by 1
     4: resets _ERROR_ to 0
     5: resets the variables read by INPUT to missing in the PDV */
  /* Each data line below is one record. Each execution of INPUT reads one record into the
     input buffer and then assigns values to the PDV variables named in the INPUT statement. */
  /*******3******/
  cards;
1
2
3
;
run;
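Execution order of INPUT relative to the other executable statements in a DATA step
The statements execute in the order in which they are written.
1: put _n_= x=; executes and writes _N_=1 x=. to the log. INPUT executes and reads the first data line after CARDS, so x now has a value. put x=; writes x=1. At RUN, SAS writes the observation, resets x to missing in the PDV, and returns to the top of the DATA step.
2: put _n_= x=; executes and writes _N_=2 x=. (x is missing because the PDV was reset at the end of the previous iteration). INPUT executes and reads the second data line, so x now has a value. put x=; writes x=2. At RUN, SAS writes the observation, resets x to missing in the PDV, and returns to the top of the DATA step.
3: put _n_= x=; executes and writes _N_=3 x=. (x is missing because the PDV was reset at the end of the previous iteration). INPUT executes and reads the third data line, so x now has a value. put x=; writes x=3. At RUN, SAS writes the observation, resets x to missing in the PDV, and returns to the top of the DATA step.
4: put _n_= x=; executes and writes _N_=4 x=. (x is missing because the PDV was reset at the end of the previous iteration). INPUT executes and tries to read a fourth data line, but the data has been exhausted, so the DATA step terminates at once: put x=; does not execute and no fourth observation is written.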
Work done in the compilation phase
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code (binary code that the computer can execute). SAS then processes the code further and creates the following three items:
input buffer:
is a logical area in memory into which SAS reads each record of data from a raw data file when the program executes; it holds one record at a time. The input buffer is created only when the step reads raw data. When SAS reads from a SAS data set, the data is written directly to the program data vector and no input buffer is used.
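One way to look at the input buffer is the automatic _INFILE_ variable, which refers to the current contents of the buffer. The step below is only a minimal sketch: the data set name buf and the data lines are made up for illustration.

```sas
data buf;
  input x;          /* loads one record into the input buffer, then reads x from it */
  put x=;           /* the value that reached the PDV */
  put _infile_;     /* the whole record as it sits in the input buffer */
  datalines;
1 extra text
2
;
run;
```

For the first record the log should show x=1 followed by the full line, including the text that INPUT never assigned to a variable. That is the difference between the record in the buffer and the values in the PDV.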
PDV (program data vector):
is a logical area of memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. SAS assigns the values to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_:
_N_: counts the number of times the DATA step has begun to iterate.
_ERROR_: 0 means that no error has occurred in the current iteration; 1 means that an error has occurred.
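The contents of the PDV, including the two automatic variables, can be written to the log with put _all_;. The following is a sketch under assumed names (pdv_demo, x, y); the second data line is deliberately not numeric so that _ERROR_ changes.

```sas
data pdv_demo;
  input x;          /* value read from the input buffer */
  y = x * 10;       /* value created by an assignment statement */
  put _all_;        /* dumps every PDV variable, including _ERROR_ and _N_ */
  datalines;
1
abc
3
;
run;
```

For the record abc, SAS should report invalid data for x, leave x and y missing, and show _ERROR_=1 for that iteration only. The third iteration should show _ERROR_=0 again, because _ERROR_ is reset at the start of each iteration.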
descriptor information:
is information about each SAS data set, including data set attributes and variable attributes. SAS creates and maintains the descriptor information.
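The descriptor information of an existing data set can be displayed with PROC CONTENTS. This short sketch assumes that the SASHELP sample library is available in the session; any other data set name could be used instead.

```sas
/* Prints data set attributes (number of observations, creation date, ...)
   and variable attributes (name, type, length, format, ...). */
proc contents data=sashelp.class;
run;
```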
The Execution Phase
By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the execution phase of a simple DATA step is described as follows:
1. The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1.
2. SAS sets the newly created program variables to missing in the program data vector (PDV).
3. SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the PDV.
4. SAS executes any subsequent programming statements for the current record.
5. At the end of the statements, an output, a return, and a reset occur automatically: SAS writes an observation to the SAS data set, the system returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the PDV.
   NOTE: Variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
   When SAS resets the PDV:
   (1) the values of variables created by the INPUT statement are set to missing;
   (2) the values of variables created by a sum statement or named in a RETAIN statement are automatically retained;
   (3) _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
6. SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
7. The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.
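The reset in step 5 can be observed by printing the PDV at the top of each iteration, before INPUT runs. This is a sketch with made-up names (totals, x, y, total); y is created by an assignment statement and total by a sum statement.

```sas
data totals;
  put 'top of step: ' _n_= x= y= total=;   /* state of the PDV before INPUT executes */
  input x;
  y = x * 2;        /* assignment statement: y is reset to missing each iteration */
  total + x;        /* sum statement: total starts at 0 and is retained */
  datalines;
1
2
3
;
run;
```

At the top of every iteration x and y should be missing, while total should carry over the running sum (0, 1, 3, and finally 6 in the fourth iteration, which stops at INPUT because there is no more data).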
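When an iteration of the DATA step ends, only the three actions above (output, return, reset) take place. The end of an iteration does not by itself move the input pointer to the next record. A new record is loaded only when an INPUT statement executes and no record is being held. By default the current record is released as soon as the INPUT statement that read it finishes; a record held with a double trailing @@ stays in the input buffer across iterations, and SAS moves on to the next record only once the held record has been read completely.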
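The behaviour of the input pointer across iterations can be compared with two small steps that read the same single record. Both are sketches; the data set names one_per_line and held are assumptions chosen for illustration.

```sas
/* Default: the record is released when INPUT finishes,
   so the values 2 and 3 are never read. */
data one_per_line;
  input x;
  put _n_= x=;
  datalines;
1 2 3
;
run;

/* Double trailing @@: the record is held across iterations,
   and each execution of INPUT continues where the pointer stopped. */
data held;
  input x @@;
  put _n_= x=;
  datalines;
1 2 3
;
run;
```

The first step should create one observation (x=1), the second three observations (x=1, 2, 3). In both steps the end of an iteration triggers the same output, return, and reset; only the line-hold specifier changes when a new record is fetched.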