Removing duplicate rows in PostgreSQL, Oracle and SQLite

Jun 15, 2017 · 1 min read

en knowledgebase sql · evil oracle postgresql sql

Sometimes you need to remove nasty duplicate on a table, based on a subset of the column. On every big database there is something called “rowid” which can be used to indentify a column in a unique way. On PostgreSQL is called ctid, as we shall see:

Oracle way:

DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);

Postgresql equivalent is ctid, so you can try something like

delete from mytable
where exists (select 1
from mytable t2
where t2.name = mytable.name and
t2.address = mytable.address and
t2.zip = mytable.zip and
t2.ctid > mytable.ctid
);

SQLite demo:

create table dup_table (column1 text, column2 text, column3 text);

insert into dup_table values('1','2','3');
insert into dup_table values('1','2','3');
insert into dup_table values('1','3','3');

DELETE FROM dup_table
WHERE rowid not in
    (SELECT MIN(rowid)
    FROM dup_table
    GROUP BY column1, column2);

select count(*) from dup_table;

Also take a look at this old post on postgresql mailing list, if you incur in an older PostgreSQL version

Rev history Added SQLITE on August 26th 2019 Published on Jun 16,2017