¿Cómo puedo eliminar elementos duplicados de una matriz en Perl?


Tengo una matriz en Perl:

my @my_array = ("one","two","three","two","three");

¿Cómo elimino los duplicados de la matriz?

Author: Сухой27, 2008-08-11

10 answers

Puedes hacer algo como esto como se demuestra en perlfaq4 :

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Salidas:

one two three

Si desea utilizar un módulo, pruebe la función uniq desde List::MoreUtils

 147
Author: Greg Hewgill,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2014-06-21 01:39:29

La documentación de Perl viene con una buena colección de preguntas frecuentes. Su pregunta es frecuente:

% perldoc -q duplicate

La respuesta, copiar y pegar desde la salida del comando anterior, aparece a continuación:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;
 117
Author: John Siracusa,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2008-08-11 14:30:52

Install List:: MoreUtils from CPAN

Luego en su código:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);
 66
Author: Ranguard,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2016-05-15 14:32:32

Mi forma habitual de hacer esto es:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

Si usa un hash y agrega los elementos al hash. También tienes la ventaja de saber cuántas veces aparece cada elemento en la lista.

 22
Author: Xetius,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2014-07-20 16:35:58

Se puede hacer con un simple Perl one liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

El bloque PFM hace esto:

Los datos en @in se introducen en el MAPA. MAP construye un hash anónimo. Las claves se extraen del hash y se alimentan en @out

 7
Author: Hawk,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2011-11-09 21:30:21

La variable @ array es la lista con elementos duplicados

%seen=();
@unique = grep { ! $seen{$_} ++ } @array;
 6
Author: Sreedhar,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2013-07-15 19:02:12

Ese último fue bastante bueno. Solo lo retocaría un poco:

my @arr;
my @uniqarr;

foreach my $var ( @arr ){
  if ( ! grep( /$var/, @uniqarr ) ){
     push( @uniqarr, $var );
  }
}

Creo que esta es probablemente la forma más legible de hacerlo.

 3
Author: jh314,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2013-07-16 03:37:30

Método 1: Usar un hash

Lógica: Un hash puede tener solo claves únicas, por lo que iterar sobre matriz, asignar cualquier valor a cada elemento de matriz, manteniendo el elemento como clave de ese hash. Devuelve las claves del hash, es tu matriz única.

my @unique = keys {map {$_ => 1} @array};

Método 2: Extensión del método 1 para la reutilización

Mejor hacer una subrutina si se supone que debemos usar esta funcionalidad varias veces en nuestro código.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Método 3: Utilizar el módulo List::MoreUtils

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);
 2
Author: Kamal Nayan,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2017-05-09 15:29:44

Pruebe esto, parece que la función uniq necesita una lista ordenada para funcionar correctamente.

use strict;

# Helper function to remove duplicates in a list.
sub uniq {
  my %seen;
  grep !$seen{$_}++, @_;
}

my @teststrings = ("one", "two", "three", "one");

my @filtered = uniq @teststrings;
print "uniq: @filtered\n";
my @sorted = sort @teststrings;
print "sort: @sorted\n";
my @sortedfiltered = uniq sort @teststrings;
print "uniq sort : @sortedfiltered\n";
 0
Author: saschabeaumont,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2015-05-26 01:56:44

Usando el concepto de claves hash únicas:

my @array  = ("a","b","c","b","a","d","c","a","d");
my %hash   = map { $_ => 1 } @array;
my @unique = keys %hash;
print "@unique","\n";

Salida: a c b d

 0
Author: Sandeep_black,
Warning: date(): Invalid date.timezone value 'Europe/Kyiv', we selected the timezone 'UTC' for now. in /var/www/agent_stack/data/www/ajaxhispano.com/template/agent.layouts/content.php on line 61
2017-03-30 09:47:16