J'essaie de démarrer H2O sur un cluster Hadoop. Malheureusement, cela ne fonctionne pas et me donne l'erreur que la classe water.hadoop.h2omapper n'est pas trouvée. L'environnement Hadoop est HDP dans la version 2.6 et comprend 5 nœuds, où 1 exécute le gestionnaire de ressources YARN et 3 nœuds sont des nœuds de données avec le client YARN. Les nœuds de données ont chacun des ressources de 32 Go de RAM et 4 cœurs de processeur chacun. Aucune autre application ne fonctionne sur eux. J'ai configuré un maximum de 16 Go et 3 cœurs par application YARN sur chaque nœud dans Ambari.Impossible de démarrer H2O sur Hadoop Cluster - ClassNotFound Exception
Je commence le cluster de H2O à partir du terminal (ai essayé sur tous les nœuds, même erreur partout) avec la sortie suivante:
[[email protected] h2o-3.14.0.6-hdp2.6]# sudo -u hdfs hadoop jar h2odriver.jar -nodes 3 -mapperXmx 6g -output h2o-test
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 192.168.20.35]
[Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 192.168.20.35:46619
(You can override these with -driverif and -driverport/-driverportrange.)
Memory Settings:
mapreduce.map.java.opts: -Xms6g -Xmx6g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dlog4j.defaultInitOverride=true
Extra memory percent: 10
mapreduce.map.memory.mb: 6758
17/10/13 07:49:14 INFO client.RMProxy: Connecting to ResourceManager at host2/192.168.20.34:8050
17/10/13 07:49:14 INFO client.AHSProxy: Connecting to Application History server at host2/192.168.20.34:10200
17/10/13 07:49:15 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
17/10/13 07:49:15 INFO mapreduce.JobSubmitter: number of splits:3
17/10/13 07:49:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507793796947_0002
17/10/13 07:49:15 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
17/10/13 07:49:15 INFO impl.YarnClientImpl: Submitted application application_1507793796947_0002
17/10/13 07:49:15 INFO mapreduce.Job: The url to track the job: http://host2:8088/proxy/application_1507793796947_0002/
Job name 'H2O_86929' submitted
JobTracker job ID is 'job_1507793796947_0002'
For YARN users, logs command is 'yarn logs -applicationId application_1507793796947_0002'
Waiting for H2O cluster to come up...
17/10/13 07:49:29 INFO client.RMProxy: Connecting to ResourceManager at host2/192.168.20.34:8050
17/10/13 07:49:29 INFO client.AHSProxy: Connecting to Application History server at host2/192.168.20.34:10200
----- YARN cluster metrics -----
Number of YARN worker nodes: 3
----- Nodes -----
Node: http://host5:8042 Rack: /default-rack, RUNNING, 1 containers used, 4,0/16,0 GB used, 1/3 vcores used
Node: http://host4:8042 Rack: /default-rack, RUNNING, 0 containers used, 0,0/16,0 GB used, 0/3 vcores used
Node: http://host3:8042 Rack: /default-rack, RUNNING, 0 containers used, 0,0/16,0 GB used, 0/3 vcores used
----- Queues -----
Queue name: default
Queue state: RUNNING
Current capacity: 0,11
Capacity: 1,00
Maximum capacity: 1,00
Application count: 1
----- Applications in this queue -----
Application ID: application_1507793796947_0002 (H2O_86929)
Started: hdfs (Fri Oct 13 07:49:15 CEST 2017)
Application state: FINISHED
Tracking URL: http://host2:8088/proxy/application_1507793796947_0002/
Queue name: default
Used/Reserved containers: 1/0
Needed/Used/Reserved memory: 4,0 GB/4,0 GB/0,0 GB
Needed/Used/Reserved vcores: 1/1/0
Queue 'default' approximate utilization: 4,0/48,0 GB used, 1/9 vcores used
----------------------------------------------------------------------
ERROR: Unable to start any H2O nodes; please contact your YARN administrator.
A common cause for this is the requested container size (6,6 GB)
exceeds the following YARN settings:
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mb
L'entrée d'erreur correspondante dans le journal du système pour l'application de fil :
2017-10-13 07:49:24,505 FATAL [IPC Server handler 1 on 40503] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1507793796947_0002_m_000002_0 - exited : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more
2017-10-13 07:49:24,506 INFO [IPC Server handler 1 on 40503] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1507793796947_0002_m_000002_0: Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more
2017-10-13 07:49:24,507 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1507793796947_0002_m_000002_0: Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more
Le journal complet est disponible here.
Toute aide serait appréciée.
Meilleures salutations, Markus
Désolé, il n'y a pas assez d'informations ici pour dire quelque chose de significatif. Essayez de décrire votre environnement en détail et d'inclure toutes vos commandes de ligne de commande, la sortie complète, les versions de tout et les journaux d'application de fil. – TomKraljevic
@TomKraljevic J'ai mis à jour mon poste avec des informations supplémentaires. J'espère que cela pourra aider. –