CREATE TABLE #src (x int, y int, z int)
GO
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (10, 20, 30)
INSERT #src VALUES (100, 200, 300)
GO
SELECT * FROM #src
At this point, #src looks like this (I cannot seem to remove the odd white space in this table):
x y z
1 2 3
1 2 3
1 2 3
10 20 30
100 200 300
Now let's remove our duplicates:
DELETE TOP(2)
FROM #src
WHERE x = 1
SELECT * FROM #src
And we see the results:
x y z
1 2 3
10 20 30
100 200 300
In the DELETE TOP(n) part of the statement, n=#dupes - 1. In our case, we had 3 duplicates and removed 2 of them. Which brings up an important point - we used the "x=1" condition to identify the group, and we knew that it had exactly 3 rows in that dupe group. So if we generalize the problem, we may want to use dynamic SQL to build a solution:
; WITH DupeGroups AS (
SELECT
[x]
,[cnt] = COUNT(*)
FROM #src
GROUP BY [x]
HAVING COUNT(*) > 1
)
SELECT
[sql] = 'DELETE TOP(' + CONVERT(varchar, [cnt] - 1)
+ ') FROM #src WHERE [x] = ' + CONVERT(varchar, [x]) + ';'
FROM DupeGroupsThis can then be executed using sp_SqlExec to remove the duplicates.
No comments:
Post a Comment