This tutorial explains how to remove duplicates based on one column while still returning all the columns.
data readin;
input ID Name $ Score;
cards;
1     David   45
1     David   74
2     Sam     45
2     Ram     54
3     Bane    87
3     Mary    92
3     Bane    87
4     Dane    23
5     Jenny   87
5     Ken     87
6     Simran  63
8     Priya   72
;
run;
Suppose you want to remove duplicates based on name but still return all the variables.
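Method 1 :
proc sql noprint;
/* monotonic() is an undocumented function that numbers rows as they are read */
create table tt (drop = row_num) as
select *, monotonic() as row_num
from readin
group by name
/* keep only the first row read for each name */
having row_num = min(row_num)
order by ID;
quit;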
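For comparison, keeping the first row per name is what the NODUPKEY option of PROC SORT does. The following is a minimal sketch, assuming the readin dataset above; the output name tt_sort is a hypothetical name chosen for illustration. It should keep one observation for each of the 10 distinct names.

proc sort data=readin out=tt_sort nodupkey;
/* one observation is kept per distinct value of name */
by name;
run;

/* NODUPKEY leaves the data sorted by name, so re-sort by ID
   to match the ORDER BY in the PROC SQL version */
proc sort data=tt_sort;
by ID;
run;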
Method 2 :
proc sql noprint;
create table tt as
select name, max(ID) as ID, max(Score) as Score
from readin
group by name;
quit;
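Method 2 might not produce the desired output: MAX is computed separately for each column, so a returned row can combine an ID and a Score that never appeared together in the input. You can also use MIN instead of MAX, but the same caveat applies.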
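To see why, here is a minimal sketch on hypothetical data. The demo table and its values are made up for illustration and are not part of the example above.

data demo;
input ID Name $ Score;
cards;
1 Amy 90
2 Amy 50
;
run;

proc sql;
/* each MAX is evaluated on its own column, so the result is
   ID = 2 and Score = 90, a combination found in no input row */
select name, max(ID) as ID, max(Score) as Score
from demo
group by name;
quit;

If you need one complete original row per name, Method 1 is the safer choice.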
About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 8 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Telecom and Human Resource.

4 Responses to "NODUPKEY with PROC SQL"
  1. sir, please display the result of query

  2. sir, please display the result of query

  3. Can you please explain the code here? how it is working?

  4. please explain
    not clear how it works

