Removing duplicate rows in PostgreSQL, Oracle and SQLite

This entry is part 3 of 5 in the series SQLite

Sometimes you need to remove nasty duplicate on a table, based on a subset of the column.
On every big database there is something called “rowid” which can be used to indentify a column in a unique way.
On PostgreSQL is called ctid, as we shall see:

Oracle way:

DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);

Postgresql equivalent is ctid, so you can try something like

delete from mytable
where exists (select 1
from mytable t2
where t2.name = mytable.name and
t2.address = mytable.address and
t2.zip = mytable.zip and
t2.ctid > mytable.ctid
);

SQLite demo:

create table dup_table (column1 text, column2 text, column3 text);

insert into dup_table values('1','2','3');
insert into dup_table values('1','2','3');
insert into dup_table values('1','3','3');

DELETE FROM dup_table
WHERE rowid not in
    (SELECT MIN(rowid)
    FROM dup_table
    GROUP BY column1, column2);

select count(*) from dup_table;

 

Also take a look at this old post on postgresql mailing list, if you incur in an older PostgreSQL version

Rev history
Added SQLITE on August 26th 2019
Published on Jun 16,2017

wp-content/uploads/2021/03/2021_03_schema-macchine.png
This is a part of the SQLite Series
This entry was posted in English Content, Knowledgebase, Structured Query Language (SQL) and tagged , , , , , , , . Bookmark the permalink.