2013-07-04

I am running a canopy clustering job (using Mahout) on a Cloudera CDH4 cluster. The content to cluster is about 1M records (each record is under 1 KB in size). The whole Hadoop environment (including all nodes) runs in a VM with 4 GB of memory. The CDH4 installation uses the defaults. I get the following exception while running the job: a "GC overhead limit exceeded" error in a Cloudera Hadoop MapReduce job.

According to the exception, it seems the job client needs a larger JVM heap. However, there are several configuration options for the JVM heap size in Cloudera Manager. I changed "Client Java Heap Size in Bytes" from 256 MiB to 512 MiB, but it did not help. Do you have any tips or tricks on setting these heap-size options?
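In case it helps, here is a sketch of how heap overrides could be passed when launching the job from the `mahout` CLI. The input/output paths and the canopy thresholds (`-t1`/`-t2`) below are made-up placeholders, and the child-heap property shown is the Hadoop 1 (MRv1) name:

```shell
# Heap for the local job client (the JVM that submits the job).
export HADOOP_CLIENT_OPTS="-Xmx1g"

# Heap for each map/reduce child JVM, passed as a generic Hadoop option.
# Paths, distance measure, and t1/t2 values are placeholders only.
mahout canopy \
  -Dmapred.child.java.opts=-Xmx1g \
  -i /user/me/input-vectors \
  -o /user/me/canopy-output \
  -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
  -t1 500 -t2 250
```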

13/07/03 17:12:45 INFO input.FileInputFormat: Total input paths to process : 1 
13/07/03 17:12:46 INFO mapred.JobClient: Running job: job_201307031710_0001 
13/07/03 17:12:47 INFO mapred.JobClient: map 0% reduce 0% 
13/07/03 17:13:06 INFO mapred.JobClient: map 1% reduce 0% 
13/07/03 17:13:27 INFO mapred.JobClient: map 2% reduce 0% 
13/07/03 17:14:01 INFO mapred.JobClient: map 3% reduce 0% 
13/07/03 17:14:50 INFO mapred.JobClient: map 4% reduce 0% 
13/07/03 17:15:50 INFO mapred.JobClient: map 5% reduce 0% 
13/07/03 17:17:06 INFO mapred.JobClient: map 6% reduce 0% 
13/07/03 17:18:44 INFO mapred.JobClient: map 7% reduce 0% 
13/07/03 17:20:24 INFO mapred.JobClient: map 8% reduce 0% 
13/07/03 17:22:20 INFO mapred.JobClient: map 9% reduce 0% 
13/07/03 17:25:00 INFO mapred.JobClient: map 10% reduce 0% 
13/07/03 17:28:08 INFO mapred.JobClient: map 11% reduce 0% 
13/07/03 17:31:46 INFO mapred.JobClient: map 12% reduce 0% 
13/07/03 17:35:57 INFO mapred.JobClient: map 13% reduce 0% 
13/07/03 17:40:52 INFO mapred.JobClient: map 14% reduce 0% 
13/07/03 17:46:55 INFO mapred.JobClient: map 15% reduce 0% 
13/07/03 17:55:02 INFO mapred.JobClient: map 16% reduce 0% 
13/07/03 18:08:42 INFO mapred.JobClient: map 17% reduce 0% 
13/07/03 18:59:11 INFO mapred.JobClient: map 8% reduce 0% 
13/07/03 18:59:13 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_0, Status : FAILED 
Error: GC overhead limit exceeded 
13/07/03 18:59:23 INFO mapred.JobClient: map 9% reduce 0% 
13/07/03 19:00:09 INFO mapred.JobClient: map 10% reduce 0% 
13/07/03 19:01:49 INFO mapred.JobClient: map 11% reduce 0% 
13/07/03 19:04:25 INFO mapred.JobClient: map 12% reduce 0% 
13/07/03 19:07:48 INFO mapred.JobClient: map 13% reduce 0% 
13/07/03 19:12:48 INFO mapred.JobClient: map 14% reduce 0% 
13/07/03 19:19:46 INFO mapred.JobClient: map 15% reduce 0% 
13/07/03 19:29:05 INFO mapred.JobClient: map 16% reduce 0% 
13/07/03 19:43:43 INFO mapred.JobClient: map 17% reduce 0% 
13/07/03 20:49:36 INFO mapred.JobClient: map 8% reduce 0% 
13/07/03 20:49:38 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_1, Status : FAILED 
Error: GC overhead limit exceeded 
13/07/03 20:49:48 INFO mapred.JobClient: map 9% reduce 0% 
13/07/03 20:50:31 INFO mapred.JobClient: map 10% reduce 0% 
13/07/03 20:52:08 INFO mapred.JobClient: map 11% reduce 0% 
13/07/03 20:54:38 INFO mapred.JobClient: map 12% reduce 0% 
13/07/03 20:58:01 INFO mapred.JobClient: map 13% reduce 0% 
13/07/03 21:03:01 INFO mapred.JobClient: map 14% reduce 0% 
13/07/03 21:10:10 INFO mapred.JobClient: map 15% reduce 0% 
13/07/03 21:19:54 INFO mapred.JobClient: map 16% reduce 0% 
13/07/03 21:31:35 INFO mapred.JobClient: map 8% reduce 0% 
13/07/03 21:31:37 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_0, Status : FAILED 
java.lang.Throwable: Child Error 
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250) 
Caused by: java.io.IOException: Task process exit with nonzero status of 65. 
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237) 

13/07/03 21:32:09 INFO mapred.JobClient: map 9% reduce 0% 
13/07/03 21:33:31 INFO mapred.JobClient: map 10% reduce 0% 
13/07/03 21:35:42 INFO mapred.JobClient: map 11% reduce 0% 
13/07/03 21:38:41 INFO mapred.JobClient: map 12% reduce 0% 
13/07/03 21:42:27 INFO mapred.JobClient: map 13% reduce 0% 
13/07/03 21:48:20 INFO mapred.JobClient: map 14% reduce 0% 
13/07/03 21:56:12 INFO mapred.JobClient: map 15% reduce 0% 
13/07/03 22:07:20 INFO mapred.JobClient: map 16% reduce 0% 
13/07/03 22:26:36 INFO mapred.JobClient: map 17% reduce 0% 
13/07/03 23:35:30 INFO mapred.JobClient: map 8% reduce 0% 
13/07/03 23:35:32 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_1, Status : FAILED 
Error: GC overhead limit exceeded 
13/07/03 23:35:42 INFO mapred.JobClient: map 9% reduce 0% 
13/07/03 23:36:16 INFO mapred.JobClient: map 10% reduce 0% 
13/07/03 23:38:01 INFO mapred.JobClient: map 11% reduce 0% 
13/07/03 23:40:47 INFO mapred.JobClient: map 12% reduce 0% 
13/07/03 23:44:44 INFO mapred.JobClient: map 13% reduce 0% 
13/07/03 23:50:42 INFO mapred.JobClient: map 14% reduce 0% 
13/07/03 23:58:58 INFO mapred.JobClient: map 15% reduce 0% 
13/07/04 00:10:22 INFO mapred.JobClient: map 16% reduce 0% 
13/07/04 00:21:38 INFO mapred.JobClient: map 7% reduce 0% 
13/07/04 00:21:40 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_2, Status : FAILED 
java.lang.Throwable: Child Error 
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250) 
Caused by: java.io.IOException: Task process exit with nonzero status of 65. 
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237) 

13/07/04 00:21:50 INFO mapred.JobClient: map 8% reduce 0% 
13/07/04 00:22:27 INFO mapred.JobClient: map 9% reduce 0% 
13/07/04 00:23:52 INFO mapred.JobClient: map 10% reduce 0% 
13/07/04 00:26:00 INFO mapred.JobClient: map 11% reduce 0% 
13/07/04 00:28:47 INFO mapred.JobClient: map 12% reduce 0% 
13/07/04 00:32:17 INFO mapred.JobClient: map 13% reduce 0% 
13/07/04 00:37:34 INFO mapred.JobClient: map 14% reduce 0% 
13/07/04 00:44:30 INFO mapred.JobClient: map 15% reduce 0% 
13/07/04 00:54:28 INFO mapred.JobClient: map 16% reduce 0% 
13/07/04 01:16:30 INFO mapred.JobClient: map 17% reduce 0% 
13/07/04 01:32:05 INFO mapred.JobClient: map 8% reduce 0% 
13/07/04 01:32:08 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_2, Status : FAILED 
Error: GC overhead limit exceeded 
13/07/04 01:32:21 INFO mapred.JobClient: map 9% reduce 0% 
13/07/04 01:33:26 INFO mapred.JobClient: map 10% reduce 0% 
13/07/04 01:35:37 INFO mapred.JobClient: map 11% reduce 0% 
13/07/04 01:38:48 INFO mapred.JobClient: map 12% reduce 0% 
13/07/04 01:43:06 INFO mapred.JobClient: map 13% reduce 0% 
13/07/04 01:49:58 INFO mapred.JobClient: map 14% reduce 0% 
13/07/04 01:59:07 INFO mapred.JobClient: map 15% reduce 0% 
13/07/04 02:12:00 INFO mapred.JobClient: map 16% reduce 0% 
13/07/04 02:37:56 INFO mapred.JobClient: map 17% reduce 0% 
13/07/04 03:31:55 INFO mapred.JobClient: map 8% reduce 0% 
13/07/04 03:32:00 INFO mapred.JobClient: Job complete: job_201307031710_0001 
13/07/04 03:32:00 INFO mapred.JobClient: Counters: 7 
13/07/04 03:32:00 INFO mapred.JobClient: Job Counters 
13/07/04 03:32:00 INFO mapred.JobClient:  Failed map tasks=1 
13/07/04 03:32:00 INFO mapred.JobClient:  Launched map tasks=8 
13/07/04 03:32:00 INFO mapred.JobClient:  Data-local map tasks=8 
13/07/04 03:32:00 INFO mapred.JobClient:  Total time spent by all maps in occupied slots (ms)=11443502 
13/07/04 03:32:00 INFO mapred.JobClient:  Total time spent by all reduces in occupied slots (ms)=0 
13/07/04 03:32:00 INFO mapred.JobClient:  Total time spent by all maps waiting after reserving slots (ms)=0 
13/07/04 03:32:00 INFO mapred.JobClient:  Total time spent by all reduces waiting after reserving slots (ms)=0 
Exception in thread "main" java.lang.RuntimeException: java.lang.InterruptedException: Canopy Job failed processing vector 

Does your application need a lot of memory? If not, there may be a bug in your application that is consuming all the memory. – zsxwing


It runs Mahout's canopy clustering, so it should not be an application bug. I can see that each child JVM was allocated around 200 MB, which may not be enough in my case. – Robin


@zsxwing you should write it as "-Xmx1024M", precisely for this reason: you put one zero too many in there. That is 10.24 G –

Answer

0

You need to change your memory settings for Hadoop, since the memory allocated to Hadoop is not enough for the job you are running. Try increasing the heap memory and check again; under excessive memory usage, the operating system may kill the processes, which makes the job fail.

2

Mahout jobs consume a lot of memory. I do not know whether the mappers or the reducers are the culprits, but either way you will have to tell Hadoop to give them more RAM. "GC overhead limit exceeded" is just another way of saying "out of memory": it means the JVM has given up trying to reclaim the last 0.01% of available RAM.

How you set this is indeed a bit complex, since there are several properties and they changed in Hadoop 2. CDH4 can run either Hadoop 1 or 2; which one are you using? If I had to guess: set mapreduce.child.java.opts to -Xmx1g. But the right answer really depends on your version and your data.
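For reference, a minimal sketch of the relevant properties in `mapred-site.xml`, assuming the stock Hadoop property names (in Cloudera Manager the same values are usually set through the corresponding heap fields or a configuration safety valve rather than by editing the file directly):

```xml
<!-- MRv1 (Hadoop 1): one property covers both map and reduce JVMs. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1g</value>
</property>

<!-- MRv2/YARN (Hadoop 2): map and reduce heaps are set separately. -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1g</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1g</value>
</property>
```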
