**SAS Arrays : Introduction**

A SAS array provides a simple, convenient way to process a group of variables in a SAS DATA step.

**Syntax**

Array array-name {number-of-elements} list-of-variables;

**Note:** You can use [ ], { }, or ( ) to enclose the number of elements in the ARRAY statement.

**Examples**

1. ARRAY ABC[5] a b c d e;

2. ARRAY ABC[*] a b c d e;

In the example above, SAS automatically counts the number of variables in the array.

3. ARRAY ABC[*] X1-X10;

Here the first element of the array refers to the X1 variable, the second to X2, and so on up to X10.

4. ARRAY ABC[*] $ X1-X10;

*If the variables are of character type, put a $ sign before the list of variables.*

**Sample Data**

SAS Array : Example

data temp;

input x1 x2 x3 x4$ x5$;

cards;

1 2 3 AA BB

2 3 4 AB CC

3 4 5 AC DD

4 5 6 AD EE

5 6 7 AE FF

6 7 8 AF GG

;

run;

**Example I : Replace numeric values greater than 3 with a missing value**

data test;

set temp;

array nvars {3} x1-x3;

do i = 1 to 3;

if nvars{i} > 3 then nvars{i} =.;

end;

run;

Output : Array Statement

**Why is i equal to 4 in the output data set?**

The first time the loop runs, the value of i is 1; the second time, 2; and the third time, 3. At the end of the third iteration, i is incremented to 4, which is greater than the stop value of 3, so the loop stops. The value of i written to the output data set is therefore 4, not 3, which was only the last value that still satisfied the stop condition.

Note : We can drop the variable "i" with a DROP statement or the DROP= data set option.
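As a minimal sketch of the two options in the note, assuming the temp data set created above (test_a and test_b are hypothetical output names used only for illustration):

```sas
/* Option 1: DROP statement inside the DATA step */
data test_a;
  set temp;
  array nvars {3} x1-x3;
  do i = 1 to 3;
    if nvars{i} > 3 then nvars{i} = .;
  end;
  drop i;   /* i is not written to test_a */
run;

/* Option 2: DROP= data set option on the output data set */
data test_b (drop = i);
  set temp;
  array nvars {3} x1-x3;
  do i = 1 to 3;
    if nvars{i} > 3 then nvars{i} = .;
  end;
run;
```

Both steps should produce the same output. In either case i remains available inside the DATA step, because the variable is only excluded when the observation is written.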

**Improved version of the above code**

data test;

set temp;

array nvars (*) _numeric_;

do i = 1 to dim(nvars);

if nvars{i} > 3 then nvars{i} =.;

end;

drop i;

run;

**Notes -**

- The "**_numeric_**" is used to specify all the numeric variables.
- The **DIM** function returns the number of elements (variables).

**Example II. : Extract first letter of all the character variables**

data test;

set temp;

array cvars (*) _character_;

do i = 1 to dim(cvars);

cvars{i} = substr(cvars{i},1,1);

end;

drop i;

run;

**Note -** The "_character_" is used to specify all the character variables.

**Example III. : Extract first letter and fill in the new character variables**

data test;

set temp;

array cvars (*) _character_;

array dvars (*) $ x6 X7;

do i = 1 to dim(cvars);

dvars{i} = substr(cvars{i},1,1) ;

end;

drop i;

run;
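Example III assumes that dvars has at least as many elements as cvars. If temp contained more than two character variables, the loop would reference an element of dvars that does not exist. One possible guard, shown here only as a sketch, is to stop at the shorter of the two arrays; whether skipping the extra variables is acceptable depends on your data.

```sas
data test;
  set temp;
  array cvars (*) _character_;   /* character variables defined so far: x4, x5 */
  array dvars (*) $ x6 x7;       /* new character variables */
  /* Loop only as far as the shorter array to avoid an out-of-range subscript */
  do i = 1 to min(dim(cvars), dim(dvars));
    dvars{i} = substr(cvars{i}, 1, 1);
  end;
  drop i;
run;
```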

**Example IV : Assign Initial Values in a SAS Array**

data abcd;

set temp;

array nvars (*) _numeric_;

array pvars (*) px1 px2 px3;

array pctinc {3} _temporary_ (1.1 , 1.2 ,1.3);

do i = 1 to dim(nvars);

pvars{i} = nvars{i} * pctinc{i};

end;

drop i;

run;

**Notes -**

- In the above example, we are multiplying variables' values with different numbers.
- When the keyword **_TEMPORARY_** is used in an ARRAY statement, data elements are created but are not stored in the output data set.

**Example V : Calculate Percentage Growth**

data abcd;

set temp;

array nvars(*) _numeric_;

array diff{2} _temporary_;

array percent{2};

do i = 1 to 2;

diff{i} = nvars{i +1} - nvars{i};

percent{i} = diff{i} / nvars{i} ;

end;

drop i;

run;

**Using the OF Operator in a SAS Array**

**The following two ways of computing the sum are equivalent :**

array gnp (*) x y z;

sumgnp = sum(of gnp(*));

**OR**

sumgnp = sum(x,y,z);

* Calculate the mean;

mean_score = mean(of gnp(*));

* Calculate the minimum;

min_score = min(of gnp(*));
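The statements above are fragments. To try them out, a complete DATA step along the following lines could be used. The data set name gnp_demo and the input values are made up for illustration.

```sas
data gnp_demo;              /* hypothetical data set name */
  input x y z;              /* made-up values, including one missing value */
  array gnp (*) x y z;
  sumgnp     = sum(of gnp(*));    /* same as sum(x,y,z) */
  mean_score = mean(of gnp(*));   /* same as mean(x,y,z) */
  min_score  = min(of gnp(*));    /* same as min(x,y,z) */
  cards;
10 20 30
5 . 15
;
run;
```

SUM, MEAN, and MIN ignore missing values, so the second observation is summarized from 5 and 15 only.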

**Suppose you are asked to create a flag for cases where the sum of variables x1, x2, and x3 is greater than 10.**

data test;

set temp;

array nvars (*) x1-x3;

if sum(of nvars(*)) > 10 then flag =1;

else flag=0;

run;

**DO OVER LOOP**

The DO OVER loop is one of the most useful DO loops. It processes every element of an array that is declared without a subscript, so it can be used when the index of the array is not needed.

data test;

set temp;

array nvars _numeric_;

do over nvars;

if nvars > 3 then nvars = .;

end;

run;

It's very useful and easily understandable for starters like me ... thank you ... keep posting

It is really good and understandable. Really helpful for people who are trying to get into SAS. Keep posting. Thanks :)

Good content, but confusing without actual figures to imagine the process.

The content is very good. Please keep it up.

Thanks for your teaching. It's easy to understand. Much appreciated.

The examples are easy to understand, thanks a lot for posting, but I didn't understand what the use of 1,1 is in this example.

data test;

set temp;

array cvars (*) _character_;

array dvars (*) $ x6 X7;

do i = 1 to dim(cvars);

dvars{i} = substr(cvars{i},1,1) ;

end;

drop i;

run;

The substring starts at character 1 and its length is 1.

Nice site--you showed me some PROC FREQ options that I wasn't aware of!

NOTE: If there are more than 2 character variables you'll get an error here, because it will try to reference an index that doesn't exist in the second array, which has only 2 members.

data test;

set temp;

array cvars (*) _character_;

array dvars (*) $ x6 X7;

do i = 1 to dim(cvars);

dvars{i} = substr(cvars{i},1,1) ;

end;

drop i;

run;

Can anyone explain the DO OVER loop in detail? What is meant by indexing of an array?

Indexing here means that we do not need to declare any variables as we did in the first code. If the operation needs to be performed on all the variables, then you do not need to define an array with an index; it will automatically pick up all the variables and work on them.