Powered By Blogger

Friday, July 1, 2016

Remove Duplicate Rows from a Table in SQL Server

EMPLOYEE_IDATTENDANCE_DATE
A0012011-01-01
A0012011-01-01
A0022011-01-01
A0022011-01-01
A0022011-01-01
A0032011-01-01
So how can we delete those duplicate data?

Solution

First, insert an identity column in that table by using the following code:
ALTER TABLE dbo.ATTENDANCE ADD AUTOID INT IDENTITY(1,1)  
Now the table data will be like the following table:
EMPLOYEE_IDATTENDANCE_DATEAUTOID
A0012011-01-011
A0012011-01-012
A0022011-01-013
A0022011-01-014
A0022011-01-015
A0032011-01-016
Check the AUTOID column. Now we will start playing the game with this column.
Now use the following code to find out the duplicate rows that exist in the table.
SELECT * FROM dbo.ATTENDANCE WHERE AUTOID NOT IN (SELECT MIN(AUTOID) _
 FROM dbo.ATTENDANCE GROUP BY EMPLOYEE_ID,ATTENDANCE_DATE)  
The above code will give us the following result:
EMPLOYEE_IDATTENDANCE_DATEAUTOID
A0012011-01-012
A0022011-01-014
A0022011-01-015
Ultimately, these are the duplicate rows which we want to delete to resolve the issue. Use the following code to resolve it.
DELETE FROM dbo.ATTENDANCE WHERE AUTOID NOT IN (SELECT MIN(AUTOID) _
 FROM dbo.ATTENDANCE GROUP BY EMPLOYEE_ID,ATTENDANCE_DATE) 
Now check the data. No duplicate rows exist in the table.

No comments:

Post a Comment