Conditional Statement & Loops in SAS

A conditional statement checks a condition, or a set of conditions, and decides the further flow of a program. SAS, like any other programming language, has predefined statements that let a coder test conditions on variables and then execute code accordingly.

SAS has the following conditional statements, which we will explain in detail and then compare and contrast.

IF-THEN-ELSE Statement: An IF-THEN-ELSE statement checks one or more conditions. If the 'IF' condition is true, SAS executes the statement that follows 'THEN'. If the condition is false, SAS skips the THEN statement and goes straight to the 'ELSE' statement.
Syntax of the IF-THEN-ELSE statement is

IF CONDITION THEN <EXECUTE THIS>;
ELSE <EXECUTE THIS>;

Writing an 'ELSE' statement is not necessary if nothing needs to happen when the condition is false.
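As a minimal sketch of the syntax above, the step below puts employees into two salary bands. The dataset name EMP_BAND, the variable BAND and the 35000 cut-off are assumptions made up for this illustration.

```sas
/* Hypothetical example: classify employees by salary */
data emp_band;
    input emp_id salary;
    length band $6;                  /* declared up front so no label gets truncated */
    if salary >= 35000 then band = 'Higher';
    else band = 'Lower';             /* runs only when the IF condition is false */
    datalines;
101 23000
103 35000
104 42500
;
run;

proc print data=emp_band;
run;
```

Note that a missing salary compares as smaller than any number, so such a record would fall into the ELSE branch here.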

Sometimes more than one condition has to be tested in a single statement. In such cases we can use operators like 'AND', 'OR', 'IN' etc. This is the right time to introduce the SAS operators and their applications.

Operators: These are used to compare values and to combine more than one condition at a time.

Operator              Used in    Application
OR                    IF-ELSE    Boolean. Returns TRUE when, of 2 or more conditions, one or more is true.
AND                   IF-ELSE    Boolean. Returns TRUE when, of 2 or more conditions, all of them are true.
NOT                   IF-ELSE    Boolean. Returns TRUE if the condition after NOT is false.
EQ, =                 IF-ELSE    Checks equality.
NE, ^=, ~=            IF-ELSE    Checks inequality.
LE, <=                IF-ELSE    Less than or equal to.
GE, >=                IF-ELSE    Greater than or equal to.
IN                    IF-ELSE    Multiple 'OR': checks one variable against a list of values.
IS MISSING            WHERE      Checks for a missing value.
IS NULL               WHERE      Checks for a null (missing) value.
BETWEEN ... AND ...   WHERE      Checks whether a value lies within a range, both ends included.
CONTAINS              WHERE      Checks whether a character value contains a given string.
LIKE "%...."          WHERE      Matches a character pattern; % stands for any number of characters and _ for exactly one.
=*                    WHERE      Sounds-like operator, for phonetic matches.

The operators marked WHERE are valid only in WHERE expressions; the others can be used in both IF and WHERE conditions.


Boolean operators are evaluated in a fixed order of precedence in SAS, which is as follows

                                                                     NOT > AND > OR
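To see what this precedence means in practice, here is a small sketch with made-up variables A, B and C. It writes its results to the log instead of creating a dataset.

```sas
data _null_;
    a = 1; b = 0; c = 0;
    /* AND binds tighter than OR: read as (a = 1) or ((b = 1) and (c = 1)) */
    r1 = (a = 1 or b = 1 and c = 1);
    /* Parentheses force the OR to be evaluated first */
    r2 = ((a = 1 or b = 1) and c = 1);
    /* IN replaces a chain of ORs on the same variable */
    r3 = (a in (1, 2, 3));
    put r1= r2= r3=;      /* log shows: r1=1 r2=0 r3=1 */
run;
```

When in doubt, adding parentheses as in R2 makes the intended grouping explicit.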

Nested IF statement: A nested IF statement is used to check conditions within conditions. Written with proper alignment and indentation, it makes a program a bit less complicated to understand. Below is a simple syntax for a nested IF statement.

IF <CONDITION1> AND <CONDITION2>  THEN OUTPUT;
     ELSE 
     IF <CONDITION3> OR <CONDITION4> THEN DELETE;
         ELSE
         IF <CONDITION5> THEN <ACTION>;

Here OUTPUT and DELETE are SAS statements that change the program flow: OUTPUT writes the current observation to the dataset, while DELETE drops it and returns to the start of the DATA step. A CONTINUE statement also exists, but it is valid only inside a DO loop, where it skips to the next iteration.

Select-When Statement: This conditional statement is used to test the value of a variable. The variable is given in the SELECT statement as an argument, and each WHEN lists the values to compare it with. This statement is more efficient when the conditions to be checked are mutually exclusive. Syntax is as below.

SELECT (VAR1);
WHEN (VALUE1)
              EXECUTE THIS;
WHEN (VALUE2)
              EXECUTE THIS;
.
.
.
.
OTHERWISE EXECUTE THIS;
END;
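A short sketch of a SELECT group inside a DATA step follows. The dataset GRADED, the variable REMARK and the labels are hypothetical and only serve the illustration.

```sas
data graded;
    input id grade $;
    length remark $12;
    select (grade);                              /* GRADE is compared with each WHEN list */
        when ('A', 'A+') remark = 'Excellent';
        when ('B', 'B+') remark = 'Good';
        when ('C')       remark = 'Average';
        otherwise        remark = 'Unknown';     /* catches every other value */
    end;
    datalines;
1 C
2 A+
3 B+
;
run;
```

Leaving out OTHERWISE is risky: if no WHEN matches and there is no OTHERWISE, SAS stops the step with an error.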


Where Statement: A WHERE statement is also used to test conditions, and it is mostly used for subsetting a dataset. The syntax of a WHERE statement subsetting data is given below.

DATA XYZ;
SET ABC;
WHERE CONDITION1;
RUN;
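For a runnable illustration, the sketch below subsets SASHELP.CLASS, a sample dataset that ships with SAS (assuming the SASHELP library is available in your installation). The output name TEENS and the chosen conditions are arbitrary.

```sas
data teens;
    set sashelp.class;
    /* BETWEEN and LIKE are WHERE-only operators */
    where (age between 13 and 15) and name like 'J%';
run;

proc print data=teens;
run;
```

Because WHERE filters rows before they enter the DATA step, it can refer only to variables that already exist in the input dataset.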




DO Loops: Sometimes we need a certain part of the code to run repeatedly. One way is to write the same code again and again; the simpler way is to use a 'DO' loop, which runs particular lines of code a defined number of times.
A DO statement can also group several statements so that they run together after a condition is checked. Suppose we have IF statements that can each perform only a single action after checking the same condition. This increases the length of the code and in turn decreases program efficiency.

IF <CONDITION1> THEN <ACTION1>;
IF <CONDITION1> THEN <ACTION2>;
IF <CONDITION1> THEN <ACTION3>;
IF <CONDITION1> THEN <ACTION4>;

DO Group statement: Using a DO group, we can perform multiple actions in a single conditional statement. For the above IF statements we can write,

IF <CONDITION1> THEN  DO;
<ACTION1>;
<ACTION2>;
<ACTION3>;
<ACTION4>;
END;
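Here is a small sketch of DO groups in both branches of an IF-THEN-ELSE. The bonus rates and the names EMP_FLAG, BAND and BONUS are assumptions invented for the example.

```sas
data emp_flag;
    input emp_id salary;
    length band $6;
    if salary >= 35000 then do;      /* both assignments run when the condition is true */
        band  = 'Higher';
        bonus = salary * 0.05;       /* hypothetical 5% bonus */
    end;
    else do;
        band  = 'Lower';
        bonus = salary * 0.10;       /* hypothetical 10% bonus */
    end;
    datalines;
101 23000
104 42500
;
run;
```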


Iterative DO loops: These are used when we have to run a group of statements multiple times without checking any condition on a variable. An index variable controls the number of iterations. It is incremented by a pre-defined value each time the END statement is reached, and the loop keeps iterating as long as the index value is less than or equal to the 'TO' value (for a positive increment). Its syntax is defined as

DO  N=  <initial value>  TO <terminal value>  BY <incremental value>;
STATEMENT1;
STATEMENT2;
.
.
.
END;
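As an illustration, the sketch below grows an amount over five iterations. The starting value and the 10% rate are assumptions chosen only for the example.

```sas
data growth;
    amount = 1000;                   /* hypothetical starting value */
    do year = 1 to 5 by 1;
        amount = amount * 1.10;      /* assumed 10% growth per iteration */
        output;                      /* write one observation per iteration */
    end;
run;

proc print data=growth;
run;
```

The dataset GROWTH ends up with five observations, one per value of YEAR. Without the explicit OUTPUT only the final state would be written, once, at the end of the step.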


DO WHILE/ DO UNTIL Loops: These loops are used when, in place of an index variable, we have to test a condition on a variable present in the data or on a calculated variable. Choosing between DO-WHILE and DO-UNTIL depends on when the condition applied to the loop should be checked. In a DO-UNTIL loop the condition is checked at the bottom of the loop, every time after the loop has iterated, so the loop always runs at least once and stops when the condition becomes true. A DO-WHILE loop is used when the condition needs to be checked at the beginning of the loop: the condition is tested first, and only if it is true does the loop run once before the condition is checked again. The syntax of the loops is

DO UNTIL (<VARIABLE CONDITION>);
STATEMENT1;
STATEMENT2;
.
.
.
END;


DO WHILE (<VARIABLE CONDITION>);
STATEMENT1;
STATEMENT2;
.
.
.
END;
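The difference in when the condition is checked can be seen in this small sketch, which uses a made-up variable N and writes to the log.

```sas
data _null_;
    n = 10;
    /* DO WHILE tests first: the condition is already false, so the body never runs */
    do while (n < 5);
        put 'WHILE body ran, ' n=;
        n = n + 1;
    end;
    /* DO UNTIL tests last: the body runs once even though n >= 5 is already true */
    do until (n >= 5);
        put 'UNTIL body ran, ' n=;
        n = n + 1;
    end;
run;
```

Only the line "UNTIL body ran, n=10" appears in the log.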

Formats & Informats in SAS


Formats and informats are among the most important features of SAS. A format defines the appearance of a SAS variable in the output of a dataset, and formats can also be used to group values without changing the internal data of the input dataset.

Informats are used to 'READ' a value in a particular manner: an informat tells SAS how the raw values in the input data are structured. It does not give any information about how SAS is going to write this data in the output. That is defined by formats, which tell SAS how to 'WRITE' the data.

There are many informats and formats pre-defined in SAS. A few of them are:

DOLLAR5.: This informat is used to read values like salary, revenue or profit that carry a '$' sign in the data. The '5.' after 'DOLLAR' is the width of the field, i.e. the number of columns to read. In order to write the values in the same form as in the raw data file, we have to name the format with the variable in a FORMAT statement.

COMMA6.: A COMMA informat is used to read numeric values that contain commas as thousands separators. Without an appropriate informat SAS cannot read such a value as a number: it reports invalid data in the log and stores a missing value "." in its place.

DATE9.: SAS has lots of date formats, which we will explain in detail in a future post on date variables. For now just remember that DATE9. is an informat/format used to read or write dates like 12JUN1993.

MMDDYY10.: This informat/format is used to read/write a date like 10/06/1983. Without it, the value can only be read as a character variable.

MMDDYY8.: This is used to read or write 10/06/83.
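The sketch below reads a few made-up values with these informats and prints them with formats. The dataset SALES, its variables and the data lines are assumptions for illustration; the colon before each informat lets SAS read up to the next blank instead of a fixed number of columns.

```sas
data sales;
    input amount  : dollar8.     /* strips the $ sign and the commas */
          units   : comma6.      /* strips the commas */
          sold_on : date9.;      /* stores the date as a SAS date value */
    format amount dollar10.2 units comma8. sold_on mmddyy10.;
    datalines;
$2,500 1,200 12JUN1993
$980 450 06OCT1983
;
run;

proc print data=sales;
run;
```

The informats decide how the raw text becomes a number; the FORMAT statement only decides how that number is displayed, for example 12JUN1993 is shown as 06/12/1993.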

Let's write SAS formats for a data file having employee data.

EMP_ID   GENDER   SALARY
101      M        23000
103      m        35000
104      Male     42500
110      F        32500
119      Female   39500


SAS gives the user the liberty to create user-defined formats and informats with PROC FORMAT, using its VALUE statement for formats and its INVALUE statement for informats.

We will define a format for the salary brackets of the company's employees.

PROC FORMAT;
VALUE SAL_BCKT  LOW -< 35000   = 'Less than 35k'
                35000 -< 50000 = 'btw 35 - 50k'
                50000 - HIGH   = 'Gtr than 50k'
                OTHER          = 'Missing';
RUN;

Since Gender is recorded in so many ways, we have to create a format in order to bring all the values into one standard form.

PROC FORMAT;
VALUE  $GEN   'm','M','Male'   = 'MALE'
              'f','F','Female' = 'FEMALE';
RUN;

Now when we write code to read this employee data file, we will apply these formats. Note that the INFILE statement has to come before the INPUT statement.

DATA  EMP_SAL;
INFILE " <FILE-PATH>" DSD;
INPUT EMP_ID GENDER$ SALARY;
FORMAT SALARY SAL_BCKT. GENDER $GEN.;
RUN;

Output dataset EMP_SAL contains the following values.


SAS users usually want to reuse their customised formats, and it is general practice to save all of them together in a single library, which makes all previously defined formats and informats available whenever it is assigned in a SAS session.

By default SAS stores user-defined formats in the WORK library. This can be changed to a pre-determined location using the LIBRARY= option in the PROC FORMAT statement.

PROC FORMAT  LIBRARY = mylib;

Whenever a format is used, SAS first looks among its built-in formats and then, for user-defined formats, searches the WORK library followed by a library named LIBRARY. To make SAS search another user-defined library as well, we have to set

OPTIONS FMTSEARCH=(mylib);

Now SAS will also look for formats in the mylib library. To have mylib searched first, list the libraries in the desired order, for example FMTSEARCH=(mylib work).

If a user wants to see the definitions of all the formats stored in a library, he can do so with the FMTLIB option in the PROC FORMAT statement. All the definitions will appear in the output window.

PROC FORMAT LIBRARY = mylib FMTLIB;
RUN;

A 'SELECT' statement can be used to see the definition of one particular format.
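A minimal sketch of FMTLIB with SELECT is shown below. It first defines a small two-bracket format in the WORK library so that the example stands on its own; the labels are made up.

```sas
proc format;                              /* stored in WORK.FORMATS by default */
    value sal_bckt low -< 35000 = 'Less than 35k'
                   35000 - high = '35k or more';
run;

proc format library=work fmtlib;
    select sal_bckt;                      /* list only this format's definition */
run;
```

For a character format the name in the SELECT statement carries the dollar sign, for example SELECT $GEN;.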





Steps involved in Code processing in SAS

This post is an attempt to answer questions like what exactly happens when SAS runs a program, or what happens at the back end when a user writes code in the SAS programming window and submits it. Understanding this back-end processing helps programmers know how SAS runs, and sometimes makes their life easier by letting them take advantage of the execution process.

SAS processes any submitted DATA step in 2 stages.

Compile Stage: In this stage SAS performs the following tasks
1. SAS assigns an area in memory, called the input buffer, to hold one record of raw data at a time.
2. SAS determines the various variable attributes (i.e. datatype, length etc.) from the statements in the code.
3. SAS reads the code for any invalid syntax or errors and determines the names of the variables.
4. A descriptor portion is formed, which stores all information related to the variables such as variable name, datatype, length, label, format and informat.

During the compile stage, SAS does not read any data from the input file and does not evaluate any logical or conditional statements or loops. A reserved area of memory called the program data vector (PDV) is also created; it holds the current value of every variable, along with the automatic variables _N_ and _ERROR_. As SAS reads through the code, each new variable it encounters is given a name, a datatype and a place in the PDV and in the descriptor portion.

Tip: We can declare the length of variables with a LENGTH statement before the INPUT statement. This length is then stored in the descriptor portion instead of the default length.
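A quick sketch of this tip, with a hypothetical dataset EMP_NAMES: without the LENGTH statement, the name in the first data line would be cut to 8 characters.

```sas
data emp_names;
    length emp_nam $20;          /* overrides the 8-byte default of this input style */
    input emp_id emp_nam $;
    datalines;
101 CHANDRASHEKHAR
102 MANI
;
run;

proc contents data=emp_names;    /* shows the stored type and length of each variable */
run;
```

One side effect worth knowing: EMP_NAM becomes the first variable in the dataset, because SAS meets it first during compilation.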

Let's consider a small program for the salary of employees.

DATA EMP_SAL;
INPUT EMP_ID EMP_NAM$ AGE GENDER$ SALARY;
SALPERAGE = SALARY/AGE;
DATALINES;
101 AJAY 30 M 30000
102 MANI 28 F 28500
103 SAHIL 32 M 35000
;
RUN;

After submitting the above program, SAS will first create a descriptor portion, which stores all attributes of the variables. The descriptor portion will look like this

Descriptor Portion:

Variable    Type        Length    Format   Informat
EMP_ID      NUMERIC     8 bytes   12.      12.
EMP_NAM     CHARACTER   8 bytes   $8.      $8.
AGE         NUMERIC     8 bytes   12.      12.
GENDER      CHARACTER   8 bytes   $8.      $8.
SALARY      NUMERIC     8 bytes   12.      12.
SALPERAGE   NUMERIC     8 bytes   12.      12.

As you can see, SAS has allocated the default length to each variable in the program data vector.

Now SAS is ready to run its second stage of processing.

Execution Stage: At the start of the execution stage, all variable values are set to missing. In SAS a numeric missing value is denoted by a period, i.e. ".", and a character missing value by a blank, i.e. " ". This reset happens at the beginning of every iteration of the DATA step, i.e. every time SAS is about to read a new line of data. An internal pointer keeps track of the record currently being read, and the automatic variable _N_ counts the iterations. SAS keeps executing until it reaches the end-of-file marker.

Program Data Vector: _n_ = 1

Variable    Type        Length    Format   Informat   Value
EMP_ID      NUMERIC     8 bytes   12.      12.        .
EMP_NAM     CHARACTER   8 bytes   $8.      $8.        (blank)
AGE         NUMERIC     8 bytes   12.      12.        .
GENDER      CHARACTER   8 bytes   $8.      $8.        (blank)
SALARY      NUMERIC     8 bytes   12.      12.        .
SALPERAGE   NUMERIC     8 bytes   12.      12.        .


When SAS reads the first observation, or line of data, in the above program

Program Data Vector: _n_ = 1

Variable    Type        Length    Format   Informat   Value
EMP_ID      NUMERIC     8 bytes   12.      12.        101
EMP_NAM     CHARACTER   8 bytes   $8.      $8.        AJAY
AGE         NUMERIC     8 bytes   12.      12.        30
GENDER      CHARACTER   8 bytes   $8.      $8.        M
SALARY      NUMERIC     8 bytes   12.      12.        30000
SALPERAGE   NUMERIC     8 bytes   12.      12.        .

Then it calculates the SALPERAGE variable and puts this value in the program data vector

Program Data Vector: _n_ = 1

Variable    Type        Length    Format   Informat   Value
EMP_ID      NUMERIC     8 bytes   12.      12.        101
EMP_NAM     CHARACTER   8 bytes   $8.      $8.        AJAY
AGE         NUMERIC     8 bytes   12.      12.        30
GENDER      CHARACTER   8 bytes   $8.      $8.        M
SALARY      NUMERIC     8 bytes   12.      12.        30000
SALPERAGE   NUMERIC     8 bytes   12.      12.        1000

Once SAS reaches the end of the DATA step statements, it writes the values in the program data vector to the output dataset as one observation and returns to the top of the DATA step to read the next data line. _N_ is set to 2, the values of the variables in the program data vector are reset to missing, and SAS is ready to read the second line of data.

Program Data Vector: _n_ = 2

Variable    Type        Length    Format   Informat   Value
EMP_ID      NUMERIC     8 bytes   12.      12.        .
EMP_NAM     CHARACTER   8 bytes   $8.      $8.        (blank)
AGE         NUMERIC     8 bytes   12.      12.        .
GENDER      CHARACTER   8 bytes   $8.      $8.        (blank)
SALARY      NUMERIC     8 bytes   12.      12.        .
SALPERAGE   NUMERIC     8 bytes   12.      12.        .

SAS will keep on executing until it reaches an end of file marker.
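One way to watch the program data vector change is to write it to the log with PUT _ALL_. The sketch below is the employee program from above, shortened to two data lines, with two PUT statements added for illustration.

```sas
data emp_sal;
    put 'Before INPUT: ' _all_;      /* values right after the reset to missing */
    input emp_id emp_nam $ age gender $ salary;
    salperage = salary / age;
    put 'After calc:   ' _all_;      /* values just before the observation is written */
    datalines;
101 AJAY 30 M 30000
102 MANI 28 F 28500
;
run;
```

The log also shows the automatic variables _N_ and _ERROR_. A third "Before INPUT" line appears with _N_=3, because the step only stops when the INPUT statement finds no more data to read.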

Inputting / Importing a file into SAS.


SAS has various methods to read a file into SAS as a SAS dataset. The method depends upon the type and format of the data that we are bringing in. Following are some of the methods to bring in large data files as SAS datasets

1. Infile statement
2. Column input
3. Formatted input
4. Using library engine
5. Proc import
6. Using menu options.

Infile statement: The INFILE statement is the most common and most frequently used data input method. It has various options depending upon the format of the data in the input file. Below is the syntax for a file in csv format, followed by an explanation of the options used.

DATA FILE1;
INFILE " <FILE PATH>.../CLASS5.CSV" DSD TRUNCOVER;
INPUT ID GENDER$ HEIGHT WEIGHT GRADES$;
RUN;

The INFILE statement must come before the INPUT statement, so that SAS knows which file the INPUT statement reads from. The INPUT statement names all the variables, which creates a field for each of them in the descriptor portion at the compile stage.
In the INFILE statement we have to give a path in " " (or a file reference) in place of the file path. This tells SAS where the data file that we want to fetch is located.
The DSD option tells SAS that the data is "delimiter-sensitive data". If no delimiter is mentioned in a DELIMITER= (or DLM=) option, SAS treats the file as comma-separated values. With DSD, two consecutive delimiters are read as a missing value and quotation marks around values are removed.

TRUNCOVER, MISSOVER, STOPOVER, SCANOVER and FLOWOVER are some important options, one of which we may need to specify after having a look at the csv file.

Truncover: This option is used with column input and formatted input. When a data line is shorter than the INPUT statement expects, it assigns whatever is left in the input buffer to the variable instead of going to the next line.

Missover: It sets all the remaining variables to missing if a line has fewer values than the INPUT statement has variables.

Flowover: This is the default. SAS continues reading on the next input record if it does not find values for all the variables in the current line.

Stopover: Causes the DATA step to stop processing if an INPUT statement reaches the end of the current line without finding values for all variables. It sets _error_ to 1, stops building the dataset and prints the incomplete data line in the log.
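The effect of these options is easiest to see on a short first line. The sketch below reads the same two made-up data lines twice, once with the default FLOWOVER and once with MISSOVER.

```sas
data flow;                           /* no INFILE option: FLOWOVER is the default */
    input a b c;
    datalines;
1 2
3 4 5
;
run;

data miss;
    infile datalines missover;       /* short line: remaining variables are set to missing */
    input a b c;
    datalines;
1 2
3 4 5
;
run;
```

FLOW ends up with a single observation (1, 2, 3), because SAS went to the second line to find a value for C and then discarded the rest of that line. MISS has two observations, (1, 2, .) and (3, 4, 5).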

Column input: Column input is a simple way of reading a .txt file in which every value sits in fixed columns. It has a drawback, though: it reads only standard values, so numbers containing commas or dollar signs cannot be read as numeric, and dates can be read only as character values.

Consider a txt file having data as below

----+----1----+----2----+
AJAY SINGH 35 35000 M
NEERJA     28 32000 F

Here is a program showing column input.

DATA SAL_DATA;
INFILE "<FILE-PATH.TXT>";
INPUT NAME $ 1-10 AGE 12-13 SALARY 15-19 GENDER $ 21;
RUN;

The NAME column runs from position 1 to 10, hence it is written as 1-10 in the INPUT statement. Similarly AGE is at 12-13, SALARY at 15-19 and GENDER at 21.
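To try column input without an external file, the same INPUT statement can read in-stream data. This is a sketch that assumes the values sit exactly in the columns described above.

```sas
data sal_data;
    input name $ 1-10 age 12-13 salary 15-19 gender $ 21;
    datalines;
AJAY SINGH 35 35000 M
NEERJA     28 32000 F
;
run;

proc print data=sal_data;
run;
```

Because NAME is read from fixed columns, the blank inside AJAY SINGH is kept as part of the value.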


Formatted Input:
In this method we mention informats to let SAS know what kind of values we are reading. This method can read standard as well as non-standard values such as dates or numbers with dollar signs. We have to mention only the starting position of each value, not its full extent, unlike column input where we mention both the starting and finishing positions; the width comes from the informat. Every starting position is prefixed by an "@", which is called the column pointer.


Now consider the data in the above file, but this time in a more complex form.

----+----1----+----2----+----3-
MR. AJAY SINGH 35.5 $35000.00 M
NEERJA         28   $32000.00 F


DATA SAL_DATA;
INFILE "<FILE-PATH>";
INPUT  @1  NAME   $15.
       @16 AGE    4.
       @21 SALARY DOLLAR9.
       @31 GENDER $1.;
FORMAT NAME $15. AGE 4.1 SALARY DOLLAR9.2 GENDER $1.;
RUN;

The INPUT statement could be written on one line, but to keep it readable each variable is declared on a new line. This is valid because SAS only needs a semicolon ";" at the end of every statement, and we close our INPUT statement after declaring the GENDER variable. Notice the 'AT' sign "@" before every variable name, which points to its starting position. An informat is declared after every name; it tells SAS how to read the data and is stored in the descriptor portion of the dataset.

In order to display a data value from the SAS dataset in the same way as it appears in the data file, we have to add a FORMAT statement after the INPUT statement. This is where FORMATS and INFORMATS come in, which will be described at greater length in an upcoming post. For now just remember that a SAS informat is used to read raw data into a variable, while a SAS format is used to write the value out.
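A self-contained sketch of formatted input with in-stream data follows. The column positions and widths are assumptions that match the two sample lines used here: name in columns 1-15, age in 16-19, salary in 21-29 and gender in column 31.

```sas
data sal_data;
    input @1  name   $15.
          @16 age    4.           /* reads 35.5 as well as 28 */
          @21 salary dollar9.     /* strips the $ sign */
          @31 gender $1.;
    format salary dollar10.2;     /* display only; the stored value is a plain number */
    datalines;
MR. AJAY SINGH 35.5 $35000.00 M
NEERJA         28   $32000.00 F
;
run;

proc print data=sal_data;
run;
```

The informats carry no decimal specification on purpose: a w.d informat divides a value that has no decimal point by a power of 10, so AGE 4.1 would turn 28 into 2.8.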

My First SAS Program


SAS (Statistical Analysis System) is a software tool used to apply different statistical techniques and analyses to large data. SAS is one of the market leaders in the analytics tools industry. Some of its major features are a wide-ranging function library, acceptability of its results, and data security.
This page will introduce you to SAS coding and guide you through its various aspects.
I will keep updating this page with new functions from time to time.

Let's begin by opening your SAS window.
Courtesy: SAS institute, this edition is for education purpose only.

Once you click on the SAS software icon, a window like the one above will open. This window consists of four major sub-windows.

Program Editor Window: A SAS user writes all programming code in this window.

Log Window: This section displays a log of all the errors and warnings from the compile or run stage of your program. It is imperative for a user to understand these messages and change the program accordingly.

Explorer window: This lets the user navigate from one folder to another. All datasets, formats and macros are saved in libraries, and one can open and explore these libraries from this window.

Output window: All program outputs are seen in this window. There are many configuration options to optimise the output of any procedure.

My First SAS program:

Let's write a simple program for Class-V student data. This data consists of Student ID, Height, Weight, Gender and Grades.

Every SAS program starts with a DATA step or a PROC step.
A DATA step is used for creating, editing or sub-setting a dataset.

A PROC step is a procedure applied to a dataset. A procedure can perform a task like calculating the mean of the data, creating a cross tabulation, or analysing the data.

Since we are creating a dataset, we will begin with a DATA step.

DATA CLASS5;
INPUT ID GENDER$ HEIGHT WEIGHT GRADES$;
DATALINES;
1 Male 78 33 C
2 Female 78 31 A+
3 Female 103 35 C
4 Female 71 37 C
5 Female 84 28 A
6 Male 104 34 C
7 Male 85 30 B+
...
..
.
57 Female 105 44 A+
58 Male 88 39 B+
59 Male 83 43 A
60 Male 77 28 C
;
RUN;

PROC PRINT DATA = CLASS5;
VAR HEIGHT WEIGHT;
RUN;

This is the right time to introduce the rules for naming a variable or a dataset.
1. A variable or dataset name must start with a letter or an underscore ( _ ); it cannot start with a number.
2. The length of a name cannot be more than 32 characters.
3. Spaces or dashes are not allowed.

Names like Class5, ID, Gender, Height, Weight and Grades are all valid names and can be used in SAS programming.
Class5 is a dataset name, which is why it is mentioned in the DATA statement.

SAS has simplified datatypes by classifying them into 2 categories.

First, Numeric, which represents data consisting of numbers. In our dataset, height and weight are two of the numeric variables. The default storage length in SAS is 8 bytes.

Character type data consists of letters, special characters and numbers. To tell SAS that a variable is of character type, we place a dollar sign '$' after the variable name in the INPUT statement. With this style of input the default storage length is 8 bytes, and a character variable can be up to 32,767 bytes long.
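The type and storage length that SAS has assigned can be checked with PROC CONTENTS. Here is a small sketch with a made-up dataset TYPES_DEMO.

```sas
data types_demo;
    input id gender $ height;    /* $ marks GENDER as a character variable */
    datalines;
1 Male 78
2 Female 78
;
run;

proc contents data=types_demo;   /* lists each variable with its type and length */
run;
```

The listing shows ID and HEIGHT as numeric and GENDER as character, each with a length of 8.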

DATALINES is a keyword which tells SAS that the lines following it are data, to be read into the variables declared in the INPUT statement.

A RUN statement is placed at the end of every step of a SAS program. In its absence SAS does not execute the last step; it keeps waiting for a RUN statement or the start of another step.

Pic: 2 Showing Dataset CLASS5



PROC PRINT is a procedure that prints a dataset in the Output window. Here we are generating a simple output of the HEIGHT and WEIGHT variables named in the VAR statement. Below is the output of the program.

Pic:3 Showing Output on Output window

Arithmetic Functions

MS-Excel is an important and very commonly used software tool for simple mathematical functions, visualization such as plots and graphs, and advanced functions for statistical data analysis. It has come a long way since Microsoft first introduced it in the mid-1980s (the first Windows version appeared in 1987). Since then many new functions have been introduced and the capabilities of the software have grown.

On this page we will discuss some important functions which are used by statisticians and researchers every day. We will also look into some advanced Excel features like pivot tables, data validation, and other analysis functions available in Excel-2016.

I assume that you are aware of simple drag-and-drop methods and shortcut keys, so I am skipping them for now.

Any Excel formula can be typed directly in the cell where it is to be applied or, after selecting the cell, in the formula bar; in both cases it starts with an '=' sign.

                                     
First let's begin with simple mathematical functions.

Arithmetic functions: All arithmetic functions can be performed in Excel by just selecting the values. Some of the functions are given below.

SUM function: This is used to sum all the selected cells. The first and last selected cells appear in the formula, i.e.

                                              = SUM(Cm:Cn)



Average function: It is used to find the arithmetic average (arithmetic mean) of the selected cells' values. For other kinds of means, Excel has separate functions such as GEOMEAN for the geometric mean and HARMEAN for the harmonic mean.

There are other functions like AVERAGEA or AVERAGEIF that combine the power of logic with calculation. AVERAGEA is similar to the AVERAGE function, except that it also includes text and logical values in the selected cells: text counts as 0, TRUE as 1 and FALSE as 0. AVERAGEIF or AVERAGEIFS test logical conditions on particular cells, and only where a test is TRUE is the corresponding Average_range cell considered in the calculation.


COUNT Function: As the name suggests, it counts the no. of cells based on some criteria. There are several count functions in this family: COUNT( ) counts cells that have numbers in them, while COUNTA() counts cells that are not empty. Some other COUNT functions are listed in the table below with their usage.

Function     Explanation
COUNT        Counts the no. of cells that have numbers in them
COUNTA       Counts all non-empty cells
COUNTBLANK   Counts all blank cells
COUNTIF      Counts the no. of cells within a range that meet a condition
COUNTIFS     Counts the no. of cells within a range that meet a set of conditions
DCOUNT       Counts the no. of cells that contain numbers in a database
DCOUNTA      Counts the no. of non-empty cells in a database



Statistical Functions: Excel has lots of functions that are used extensively by data scientists when analysing data, applying various hypothesis tests and calculating various statistics. Excel 2016 has a special section under the Data tab which has almost all the statistical techniques. I will explain each and every one of them in detail with their utility and approach.

Path: Data -> Analyze -> Data Analysis


Pic3: F-Test- 2 sample variance Analysis for Height & Weight of 5 student

Concept of Probability & Probability distributions

Probability plays an important role in the estimation of various statistics and parameters, hence it is imperative to understand probability concepts before moving forward. Whenever you work on a problem related to statistics, there is always a "chance" that your results are acceptable or within an acceptable range.
Hypothesis testing is based entirely on the chance that a statement is supported by the results or not. Here too, different probability distributions come into the picture.

Probability: A probability is a measure of the chance of a favorable outcome. When all outcomes are equally likely, it can be calculated as the ratio of the no. of favorable outcomes to the total no. of outcomes. This is an important concept as it helps in predicting the chances of a future event, which is at the core of any predictive or prescriptive data analysis.
Many events do not fit this simple definition, but their probabilities can still be worked out using Bayes' theorem and other advanced probability concepts.

$$\text{Probability} = \frac{\text{No. of favorable outcomes}}{\text{Total no. of outcomes}}$$

It can be represented as
$$P(X) \in [0, 1]$$
which means the probability of any event 'X' lies between 0 and 1, including both.

To understand the concept of probability in a simpler manner, let's take the example of a railway station from where a passenger can go in only 2 directions, either North or South. You are standing near the ticket counter and a passenger comes to the ticket window. What is the probability that he boards a Northbound train?

Let's count: favorable outcomes = 1, i.e. the northbound train,
and total outcomes = 2, i.e. the northbound or the southbound train.

Assuming both directions are equally likely, the probability of this passenger taking a Northbound train is

$$P(\text{Northbound}) = \frac{\text{No. of favorable outcomes}}{\text{Total no. of outcomes}} = \frac{1}{2} = 0.5 \text{ or } 50\%$$

So there is a 50% chance, or a probability of 1/2 = 0.5, of him boarding a northbound train.

Please assume that a passenger, after buying a ticket, will definitely board the train going in his intended direction.

Now does that mean that if 2000 passengers come to the station, then 1000 will board a Northbound train and the other 1000 a Southbound train?
Not exactly! The probability is just a measure of the chance of a favorable outcome.

Probability does not mean that out of every 100 passengers exactly 50 will take a northbound train and the other 50 a southbound train; the split in any one group will vary. But when the experiment is repeated a large no. of times, the proportion of northbound passengers comes close to 0.5, and as the no. of passengers asked about their destination (northbound or southbound) grows without limit, this ratio approaches 0.5.
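This long-run behaviour can be tried out with a small simulation. The sketch below assumes each passenger independently goes north with probability 0.5; the seed, the checkpoints and the variable names are arbitrary choices made for the illustration.

```sas
data passengers;
    call streaminit(2024);                   /* arbitrary seed so the run is repeatable */
    do n = 1 to 100000;
        north + rand('bernoulli', 0.5);      /* adds 1 for a northbound passenger, 0 otherwise */
        if n in (10, 100, 1000, 10000, 100000) then do;
            share = north / n;               /* proportion of northbound passengers so far */
            output;
        end;
    end;
    keep n north share;
run;

proc print data=passengers noobs;
run;
```

For small N the share can be well away from 0.5, while for the larger checkpoints it typically settles close to 0.5.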

There are certain terminologies which one should be aware of while working with probability.

Experiment: An experiment is the occurrence of a random event.
Sample space: This is the set of all possible outcomes of an experiment.
Sample point: Sometimes referred to as a sample, this is a single possible outcome of an experiment. It is a unit element of the sample space.

Event: This is a set of one or more sample points or possible outcomes.

Mutually Exclusive Events: Events are said to be mutually exclusive if the occurrence of one of them rules out the occurrence of the others.

Non-Mutually Exclusive Events: Events are said to be non-mutually exclusive if the occurrence of one of them does not rule out the others, i.e. two or more of them can occur together.

Exhaustive Events:
