Sometimes you need to remove nasty duplicate on a table, based on a subset of the column.
On every big database there is something called “rowid” which can be used to indentify a column in a unique way.
On PostgreSQL is called ctid, as we shall see:
Oracle way:
DELETE FROM your_table WHERE rowid not in (SELECT MIN(rowid) FROM your_table GROUP BY column1, column2, column3);
Postgresql equivalent is ctid, so you can try something like
delete from mytable where exists (select 1 from mytable t2 where t2.name = mytable.name and t2.address = mytable.address and t2.zip = mytable.zip and t2.ctid > mytable.ctid );
SQLite demo:
create table dup_table (column1 text, column2 text, column3 text); insert into dup_table values('1','2','3'); insert into dup_table values('1','2','3'); insert into dup_table values('1','3','3'); DELETE FROM dup_table WHERE rowid not in (SELECT MIN(rowid) FROM dup_table GROUP BY column1, column2); select count(*) from dup_table;
Also take a look at this old post on postgresql mailing list, if you incur in an older PostgreSQL version
Rev history
Added SQLITE on August 26th 2019
Published on Jun 16,2017