lundi 11 juillet 2011

ROWID : Détecter puis supprimer les doublons d'une table

Problème classique : détection puis suppression de doublons.


SQL> CREATE TABLE toto(code_pays CHAR(3), ville VARCHAR2(30));

Table created.

SQL> INSERT INTO toto VALUES ('FRA','PARIS');

1 row created.

SQL> INSERT INTO toto VALUES ('FRA','LYON');

1 row created.

SQL> INSERT INTO toto VALUES ('FRA','LYON');

1 row created.

SQL> INSERT INTO toto VALUES ('USA','DENVER');

1 row created.

SQL> SELECT code_pays, ville, COUNT(*) FROM toto GROUP BY code_pays, ville HAVING COUNT(*) > 1;

COD VILLE                            COUNT(*)
--- ------------------------------ ----------
FRA LYON                                    2

SQL> DELETE FROM toto a WHERE ROWID > (SELECT MIN(b.rowid) FROM toto b WHERE a.code_pays=b.code_pays AND a.ville=b.ville);

1 row deleted.

SQL> SELECT code_pays, ville, COUNT(*) FROM toto GROUP BY code_pays, ville HAVING COUNT(*) > 1;

no rows selected.

Cas où l'une des colonnes contient des valeurs nulles parmi les lignes en doublons

Exemple :

SQL> INSERT INTO toto VALUES (null,'PARIS');

1 row created.

SQL> INSERT INTO toto VALUES (null,'PARIS');

1 row created.

SQL> INSERT INTO toto VALUES (null,'PARIS');

1 row created.

SQL> SELECT code_pays, ville, COUNT(*) FROM toto GROUP BY code_pays, ville HAVING COUNT(*) > 1;

COD VILLE                            COUNT(*)
--- ------------------------------ ----------
    PARIS                                   3

SQL> DELETE FROM toto a WHERE ROWID > (
  2  SELECT MIN(b.rowid) FROM toto b
  3  WHERE a.code_pays=b.code_pays AND a.ville=b.ville
  4  );

0 rows deleted.

SQL> DELETE FROM toto a WHERE ROWID > (
  2  SELECT MIN(b.rowid) FROM toto b
  3  WHERE (a.code_pays=b.code_pays AND a.ville=b.ville)
  4  OR (
  5    a.code_pays IS NULL AND b.code_pays IS NULL
  6    AND a.ville=b.ville
  7  )
  8  );

2 rows deleted.

2 commentaires: