Removing duplicate rows in PostgreSQL, Oracle and SQLite

Sometimes you need to remove nasty duplicate on a table, based on a subset of the column.
On every big database there is something called “rowid” which can be used to indentify a column in a unique way.
On PostgreSQL is called ctid, as we shall see:

Oracle way:

DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);

Postgresql equivalent is ctid, so you can try something like

delete from mytable
where exists (select 1
from mytable t2
where t2.name = mytable.name and
t2.address = mytable.address and
t2.zip = mytable.zip and
t2.ctid > mytable.ctid
);

SQLite demo:

create table dup_table (column1 text, column2 text, column3 text);

insert into dup_table values('1','2','3');
insert into dup_table values('1','2','3');
insert into dup_table values('1','3','3');

DELETE FROM dup_table
WHERE rowid not in
    (SELECT MIN(rowid)
    FROM dup_table
    GROUP BY column1, column2);

select count(*) from dup_table;

 

Also take a look at this old post on postgresql mailing list, if you incur in an older PostgreSQL version

Rev history
Added SQLITE on August 26th 2019
Published on Jun 16,2017

This entry was posted in English Content, Knowledgebase, Structured Query Language (SQL) and tagged , , , , , , , . Bookmark the permalink.