How do I find duplicate values in a table in Oracle?

Question

What is the simplest SQL statement that returns the duplicate values for a given column, along with the count of their occurrences, in an Oracle database table?

For example: I have a table JOBS with the column JOB_NUMBER. How can I find out whether I have any duplicate JOB_NUMBERs, and how many times each one is duplicated?

233

sql oracle duplicate-data

Author: Bill the Lizard, 2008-09-12

Source

13 answers

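Another way:

SELECT *
FROM table_name a
WHERE EXISTS (
  -- another row with the same value and a lower ROWID
  SELECT 1 FROM table_name b
  WHERE b.column_name = a.column_name
  AND b.rowid < a.rowid
)

This works well (fast enough) when there is an index on column_name. Because it returns the duplicate rows themselves (every copy except the one with the lowest ROWID), it is also a better way to delete or update duplicate rows.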
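
As a rough illustration of the delete use case, here is a minimal sketch that runs the same EXISTS pattern as a DELETE. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere; SQLite also exposes an implicit rowid). The jobs table and its sample values are hypothetical.

```python
import sqlite3

# SQLite stands in for Oracle here (assumption); both expose an implicit rowid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs (job_number) VALUES (?)",
    [(100,), (200,), (100,), (300,), (100,), (200,)],
)

# Delete every row that has an earlier row (lower rowid) with the same value,
# so only the first copy of each job_number survives.
conn.execute("""
    DELETE FROM jobs
    WHERE EXISTS (
        SELECT 1 FROM jobs b
        WHERE b.job_number = jobs.job_number
          AND b.rowid < jobs.rowid
    )
""")

remaining = [row[0] for row in conn.execute(
    "SELECT job_number FROM jobs ORDER BY job_number")]
print(remaining)  # [100, 200, 300]
```

In Oracle the same idea would normally be written with an alias on the target table, for example DELETE FROM jobs a WHERE EXISTS (... b.rowid < a.rowid).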

46

Author: Grrey,
2012-10-03 23:28:20

The simplest I can think of:

select job_number, count(*)
from jobs
group by job_number
having count(*) > 1;
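
To see what this query returns on concrete data, here is a small sketch using Python's built-in sqlite3 module in place of Oracle (an assumption for portability; the GROUP BY / HAVING syntax is the same). The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs (job_number) VALUES (?)",
    [(100,), (200,), (100,), (300,), (100,), (200,)],
)

# One row per duplicated job_number, with how many times it occurs.
duplicates = conn.execute("""
    SELECT job_number, COUNT(*)
    FROM jobs
    GROUP BY job_number
    HAVING COUNT(*) > 1
    ORDER BY job_number
""").fetchall()
print(duplicates)  # [(100, 3), (200, 2)]
```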

29

Author: JosephStyons,
2008-09-12 15:17:14

You don't even need the count in the returned columns if you don't need to know the actual number of duplicates. For example:

SELECT column_name
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1

15

Author: Evan,
2008-09-13 14:55:49

How about:

SELECT <column>, count(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;

Applied to the example above, it would look like:

SELECT job_number, count(*)
FROM jobs
GROUP BY job_number HAVING COUNT(*) > 1;

7

Author: Andrew,
2008-09-12 15:18:28

When several columns together identify a unique row (for example, in a relationship table), you can use the following.

Use the row ID. For example, take emp_dept (empid, deptid, startdate, enddate) and suppose empid and deptid together are meant to be unique and identify the row. In that case:

-- Per empid, count the rows whose (empid, deptid) pair occurs in more than one row
select oed.empid, count(oed.empid)
from emp_dept oed
where exists (select *
              from emp_dept ied
              where oed.rowid <> ied.rowid
                and ied.empid = oed.empid
                and ied.deptid = oed.deptid)
group by oed.empid
having count(oed.empid) > 1
order by count(oed.empid);

If the table has a primary key, use the primary key instead of rowid. For example, if id is the PK:

-- Same query, comparing primary keys instead of rowids
select oed.empid, count(oed.empid)
from emp_dept oed
where exists (select *
              from emp_dept ied
              where oed.id <> ied.id
                and ied.empid = oed.empid
                and ied.deptid = oed.deptid)
group by oed.empid
having count(oed.empid) > 1
order by count(oed.empid);
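
A different and simpler way to report duplicated column combinations (not this answer's self-join, but the plain GROUP BY from the other answers extended to two columns) might look like the sketch below. It uses Python's sqlite3 module as a stand-in for Oracle, and the emp_dept rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_dept (empid INTEGER, deptid INTEGER)")
conn.executemany(
    "INSERT INTO emp_dept (empid, deptid) VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 20), (2, 10), (2, 10), (2, 10), (3, 30)],
)

# Group on both columns so that each (empid, deptid) pair is counted separately.
duplicate_pairs = conn.execute("""
    SELECT empid, deptid, COUNT(*)
    FROM emp_dept
    GROUP BY empid, deptid
    HAVING COUNT(*) > 1
    ORDER BY empid, deptid
""").fetchall()
print(duplicate_pairs)  # [(1, 10, 2), (2, 10, 3)]
```

Grouping on every column of the intended key reports each duplicated pair with its own count, whereas grouping on empid alone merges all of an employee's departments into one number.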

5

Author: Jitendra Vispute,
2012-09-20 07:25:14

Running

-- assumes jobs has a unique id column; each duplicate pair is listed once
select j1.job_number, j1.id as id1, j2.id as id2
from   jobs j1 join jobs j2 on (j1.job_number = j2.job_number)
where  j1.id < j2.id

gives you the ids of the duplicate rows, one pair per result row.

4

Author: agnul,
2008-09-12 15:24:34

SELECT   SocialSecurity_Number, Count(*) no_of_rows
FROM     SocialSecurity 
GROUP BY SocialSecurity_Number
HAVING   Count(*) > 1
Order by Count(*) desc

4

Author: Wahid Haidari,
2013-04-05 07:10:57

I usually use the Oracle analytic function ROW_NUMBER().

Suppose you want to check for duplicates with respect to a unique index or primary key built on the columns (c1, c2, c3). Then you would go this way, fetching the ROWIDs of the rows whose ROW_NUMBER() within their group is greater than 1:

Select * From Table_With_Duplicates
      Where Rowid In
                    (Select rid
                       From (Select Rowid As rid,
                                    -- number the rows within each (c1, c2, c3) group
                                    ROW_NUMBER() Over (
                                            Partition By c1, c2, c3
                                            Order By Rowid
                                        ) nbLines
                               From Table_With_Duplicates) t2
                      Where nbLines > 1)
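
One detail worth checking with this approach is the partition expression. Partitioning by a concatenation such as c1 || c2 || c3 can treat distinct rows as duplicates when their pieces join into the same string, while listing the columns separately does not. The sketch below illustrates this with Python's sqlite3 module as a stand-in for Oracle (an assumption; it needs SQLite 3.25 or later for window functions). The table t, its rows, and the helper extra_copies are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT, c3 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("a", "bc", "x"),   # rowid 1
    ("ab", "c", "x"),   # rowid 2: differs from row 1 but concatenates to "abcx" too
    ("a", "bc", "x"),   # rowid 3: a true duplicate of row 1
])

def extra_copies(partition_expr):
    """Return the rowids numbered > 1 within each partition."""
    sql = f"""
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY {partition_expr}
                                      ORDER BY rowid) AS nb_lines
            FROM t
        )
        WHERE nb_lines > 1
        ORDER BY rid
    """
    return [row[0] for row in conn.execute(sql)]

by_columns = extra_copies("c1, c2, c3")
by_concat = extra_copies("c1 || c2 || c3")
print(by_columns)  # [3]
print(by_concat)   # [2, 3]
```

With separate columns only the true duplicate (rowid 3) is flagged; with the concatenated key, rowid 2 is flagged as well even though it is a different row.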

0

Author: J. Chomel,
2017-10-24 08:21:15

Here is an SQL query to do that:

select column_name, count(1)
from table_name
group by column_name
having count (column_name) > 1;

0

Author: Chaminda Dilshan,
2018-01-12 12:09:38

I know this is an old thread, but this may help someone.

If you need to print other columns of the table while checking for duplicates, use the following:

select * from table_name where column_name in
(select ing.column_name from table_name ing group by ing.column_name having count(*) > 1)
order by column_name desc;

You can also add extra filters in the WHERE clause if needed.
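
For a concrete picture of the output, here is a minimal sketch of the same IN-subquery pattern, run through Python's sqlite3 module as a stand-in for Oracle (an assumption). The jobs table with id and description columns is hypothetical sample data.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER, job_number INTEGER, description TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", [
    (1, 100, "paint"),
    (2, 200, "wire"),
    (3, 100, "sand"),
    (4, 300, "tile"),
])

# The subquery finds the duplicated values; the outer query returns the full
# rows that carry them, so the other columns can be inspected.
rows = conn.execute("""
    SELECT id, job_number, description
    FROM jobs
    WHERE job_number IN (
        SELECT job_number FROM jobs
        GROUP BY job_number
        HAVING COUNT(*) > 1
    )
    ORDER BY job_number, id
""").fetchall()
print(rows)  # [(1, 100, 'paint'), (3, 100, 'sand')]
```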

0

Author: Parth Kansara,
2018-07-23 07:57:48

Solution 1:

select * from emp
    where rowid not in
    (select max(rowid) from emp group by empno);

-1

Author: DoOrDie,
2016-02-11 07:01:27

You can also try something like this to list all the duplicate values in a table.

SELECT count(poid) 
FROM poitem 
WHERE poid = 50 
AND rownum < any (SELECT count(*)  FROM poitem WHERE poid = 50) 
GROUP BY poid 
MINUS
SELECT count(poid) 
FROM poitem 
WHERE poid in (50)
GROUP BY poid 
HAVING count(poid) > 1;

-1

Author: Stacker,
2017-05-12 17:06:38

score 502 · Accepted Answer

select column_name, count(column_name)
from table_name
group by column_name
having count (column_name) > 1;
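
One behavior to keep in mind when choosing between count(column_name) and count(*): count(column_name) skips NULLs, so a group of rows whose value is NULL is never reported, while count(*) counts the rows themselves. The sketch below shows the difference using Python's sqlite3 module as a stand-in for Oracle (an assumption; both follow the standard NULL handling for COUNT and GROUP BY). The sample data is made up.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs (job_number) VALUES (?)",
    [(100,), (100,), (None,), (None,), (200,)],
)

# COUNT(job_number) ignores NULLs, so the NULL group counts as 0.
with_count_column = set(conn.execute("""
    SELECT job_number, COUNT(job_number)
    FROM jobs
    GROUP BY job_number
    HAVING COUNT(job_number) > 1
"""))

# COUNT(*) counts rows, so the two NULL rows are reported as a duplicate group.
with_count_star = set(conn.execute("""
    SELECT job_number, COUNT(*)
    FROM jobs
    GROUP BY job_number
    HAVING COUNT(*) > 1
"""))

print(with_count_column)
print(with_count_star)
```

Which behavior is wanted depends on whether repeated NULLs should count as duplicates in the column being checked.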

502

Author: Bill the Lizard,
2008-09-12 15:13:46