This post tries to answer a common question: what exactly happens at the back end when a user writes a program in the SAS editor and submits it? Understanding this processing helps programmers see how SAS runs their code, and sometimes makes life easier by letting them take advantage of the execution process.
SAS processes a submitted DATA step in two stages: compile and execution.
Compile Stage: In this stage SAS performs the following tasks
1. SAS sets aside an area of memory called the input buffer, which holds one record of raw data at a time while it is being read.
2. SAS works out each variable's attributes (data type, length, etc.) from the statements in the program, such as INPUT.
3. SAS checks the code for invalid syntax and other errors, and determines the variable names.
4. SAS builds the descriptor portion of the data set, which stores information about every variable: name, data type, length, label, format, informat, etc.
During the compile stage, SAS does not read any data from the input file and does not evaluate any conditional logic or loops. A reserved area of memory called the program data vector (PDV) is also created. It holds one observation at a time, with a slot for every variable plus the automatic variables _N_ (the iteration counter) and _ERROR_. SAS then scans the rest of the DATA step code, and if a new variable is created by an assignment statement, SAS determines its name and data type and adds it to the PDV and the descriptor portion.
Tip: We can declare the length of variables with a LENGTH statement placed before the INPUT statement. That length is then stored in the descriptor portion instead of the default.
Let's consider a small program for the salaries of employees.
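As a rough sketch of this tip: with list input, a character variable gets a default length of 8 bytes, so longer values are cut off unless a length is declared first. The data set name NAMES and the variable FULL_NAME below are made up for illustration.

```sas
DATA NAMES;
    /* LENGTH appears before INPUT, so the compile stage records 20 bytes */
    /* for FULL_NAME instead of the 8-byte default for list input.        */
    LENGTH FULL_NAME $ 20;
    INPUT FULL_NAME $;
    DATALINES;
CHRISTOPHER
ANNA
;
RUN;
```

Without the LENGTH statement, the first name would be stored as CHRISTOP. One side effect to keep in mind: variables are added to the data set in the order SAS first meets them, so a variable named in LENGTH comes ahead of variables that appear only later in the step.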
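DATA EMP_SAL;
/* List input: a $ after a name marks a character variable */
INPUT EMP_ID EMP_NAM $ AGE GENDER $ SALARY;
/* Assignment statement: creates a new numeric variable */
SALPERAGE = SALARY/AGE;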
DATALINES;
101 AJAY 30 M 30000
102 MANI 28 F 28500
103 SAHIL 32 M 35000
;
RUN;
After the above program is submitted, SAS first creates the descriptor portion, which stores the attributes of every variable. The descriptor portion will look like this:
Descriptor Portion:
Variable | Type | Length | Format | Informat
EMP_ID | NUMERIC | 8 bytes | | 12.
EMP_NAM | CHARACTER | 8 bytes | $8. | $8.
AGE | NUMERIC | 8 bytes | | 12.
GENDER | CHARACTER | 8 bytes | $8. | $8.
SALARY | NUMERIC | 8 bytes | | 12.
SALPERAGE | NUMERIC | 8 bytes | |
As you can see, SAS has assigned a default length of 8 bytes to each variable in the descriptor portion.
Now SAS is ready to run its second stage of processing.
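To look at the descriptor portion that SAS actually stored, one option is PROC CONTENTS. This is only a minimal sketch and assumes the EMP_SAL data set from the program above has already been created in the WORK library.

```sas
/* Print the descriptor portion of EMP_SAL: variable names, types, lengths. */
/* VARNUM lists the variables in creation order rather than alphabetically. */
PROC CONTENTS DATA=EMP_SAL VARNUM;
RUN;
```

What the listing shows for formats and informats depends on what is actually stored with each variable, so it may not match the table above column for column.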
Execution Stage: In the execution stage, all variable values are first set to missing. In SAS, a numeric missing value is denoted by a period, i.e. ".", and a character missing value is denoted by a blank space, i.e. " ". SAS resets the variables in this program to missing at the start of every iteration, i.e. every time it is about to read a new line of data. An internal counter, the automatic variable _N_, keeps track of the current iteration. SAS keeps executing until it reaches the end-of-file marker.
Program Data Vector: _N_ = 1
Variable | Type | Length | Format | Informat | Value
EMP_ID | NUMERIC | 8 bytes | | 12. | .
EMP_NAM | CHARACTER | 8 bytes | $8. | $8. | (blank)
AGE | NUMERIC | 8 bytes | | 12. | .
GENDER | CHARACTER | 8 bytes | $8. | $8. | (blank)
SALARY | NUMERIC | 8 bytes | | 12. | .
SALPERAGE | NUMERIC | 8 bytes | | | .
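To make the numeric missing value concrete, here is a small sketch. The data set name CHECK_MISSING and the variable NO_SALARY are invented for this example. A lone period in the data line is read as a missing numeric value.

```sas
DATA CHECK_MISSING;
    INPUT EMP_ID EMP_NAM $ SALARY;
    /* MISSING() returns 1 when its argument is missing, otherwise 0 */
    NO_SALARY = MISSING(SALARY);
    DATALINES;
101 AJAY 30000
102 MANI .
;
RUN;
```

Here NO_SALARY should come out as 0 for employee 101 and 1 for employee 102, because the second data line supplies a period instead of a salary.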
When SAS reads the first observation, i.e. the first line of data in the above program, the values are placed in the program data vector:
Program Data Vector: _N_ = 1
Variable | Type | Length | Format | Informat | Value
EMP_ID | NUMERIC | 8 bytes | | 12. | 101
EMP_NAM | CHARACTER | 8 bytes | $8. | $8. | AJAY
AGE | NUMERIC | 8 bytes | | 12. | 30
GENDER | CHARACTER | 8 bytes | $8. | $8. | M
SALARY | NUMERIC | 8 bytes | | 12. | 30000
SALPERAGE | NUMERIC | 8 bytes | | | .
Then it calculates the SALPERAGE variable (30000 / 30 = 1000) and puts this value in the program data vector:
Program Data Vector: _N_ = 1
Variable | Type | Length | Format | Informat | Value
EMP_ID | NUMERIC | 8 bytes | | 12. | 101
EMP_NAM | CHARACTER | 8 bytes | $8. | $8. | AJAY
AGE | NUMERIC | 8 bytes | | 12. | 30
GENDER | CHARACTER | 8 bytes | $8. | $8. | M
SALARY | NUMERIC | 8 bytes | | 12. | 30000
SALPERAGE | NUMERIC | 8 bytes | | | 1000
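One way to watch these program data vector snapshots on a real run is to write the PDV to the log with PUT _ALL_. The sketch below reuses the same program and only adds PUT statements. The label text is arbitrary.

```sas
DATA EMP_SAL;
    PUT 'Top of iteration:';
    PUT _ALL_;                 /* values have just been reset to missing */
    INPUT EMP_ID EMP_NAM $ AGE GENDER $ SALARY;
    SALPERAGE = SALARY / AGE;
    PUT 'Bottom of iteration:';
    PUT _ALL_;                 /* values as they stand after the assignment */
    DATALINES;
101 AJAY 30 M 30000
102 MANI 28 F 28500
103 SAHIL 32 M 35000
;
RUN;
```

The log lines also include the automatic variables _N_ and _ERROR_. Expect the top-of-iteration lines to appear once more than there are data lines, since the step only stops when INPUT finds nothing left to read.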
Once SAS reaches the end of the DATA step statements, it writes the contents of the program data vector to the output data set as one observation and returns to the top of the DATA step. The iteration counter _N_ is set to 2, all variable values in the program data vector are reset to missing, and SAS is ready to read the second line of data.
Program Data Vector: _N_ = 2
Variable | Type | Length | Format | Informat | Value
EMP_ID | NUMERIC | 8 bytes | | 12. | .
EMP_NAM | CHARACTER | 8 bytes | $8. | $8. | (blank)
AGE | NUMERIC | 8 bytes | | 12. | .
GENDER | CHARACTER | 8 bytes | $8. | $8. | (blank)
SALARY | NUMERIC | 8 bytes | | 12. | .
SALPERAGE | NUMERIC | 8 bytes | | | .
SAS keeps executing in this way until it reaches the end-of-file marker, i.e. there are no more data lines to read.
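The write at the end of each iteration normally happens automatically. As an illustration only, the same program can be written with an explicit OUTPUT statement, which makes that step visible in the code. Placed as the last statement before DATALINES, it should behave the same as the original program.

```sas
DATA EMP_SAL;
    INPUT EMP_ID EMP_NAM $ AGE GENDER $ SALARY;
    SALPERAGE = SALARY / AGE;
    /* Write the current contents of the PDV to EMP_SAL as one observation. */
    /* An explicit OUTPUT replaces the automatic write at the end of the    */
    /* iteration.                                                           */
    OUTPUT;
    DATALINES;
101 AJAY 30 M 30000
102 MANI 28 F 28500
103 SAHIL 32 M 35000
;
RUN;
```

Moving OUTPUT above the SALPERAGE assignment would write each observation before the ratio is calculated, which is a quick way to see that the data set only receives whatever is in the program data vector at the moment of the write.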