I'm running a 2-node Presto cluster pointing at Hive on EMR, configured with data on S3. The Presto cluster cannot run queries against the tables defined in Hive; it fails with "No nodes available to run query".
The Hive metadata is visible: in the CLI, I can describe the table claim1 and see its metadata.
Both nodes show up as active in the sys.node table.
When I run a query (select count(*) from claim1 where col1 = 'M'), I see a lot of logging on the coordinator node, ending with:
2014-05-19T22:19:32.760+0000 INFO HiveHdfsWalker-144 stdout 22:19:32.760 [HiveHdfsWalker-144] DEBUG c.a.s.s.m.t.XmlResponsesSaxParser - Examining listing for bucket: unzippeddata
2014-05-19T22:19:32.763+0000 INFO HiveHdfsWalker-144 stdout 22:19:32.763 [HiveHdfsWalker-144] DEBUG com.amazonaws.request - Received successful response: 200, AWS Request ID: 9B844EEC8586FF3B
2014-05-19T22:19:32.766+0000 DEBUG query-scheduler-8 com.facebook.presto.execution.SqlStageExecution Stage 20140519_221932_00005_mfhtx.1 is FAILED
2014-05-19T22:19:32.766+0000 DEBUG query-scheduler-6 com.facebook.presto.execution.SqlStageExecution Stage 20140519_221932_00005_mfhtx.0 is FAILED
2014-05-19T22:19:32.768+0000 DEBUG query-scheduler-7 com.facebook.presto.execution.QueryStateMachine Query 20140519_221932_00005_mfhtx is FAILED
2014-05-19T22:19:32.770+0000 ERROR Stage-20140519_221932_00005_mfhtx.1-126 com.facebook.presto.execution.SqlStageExecution Error while starting stage 20140519_221932_00005_mfhtx.1
com.facebook.presto.spi.PrestoException: No nodes available to run query
at com.facebook.presto.util.Failures.checkCondition(Failures.java:79) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.util.Failures.checkCondition(Failures.java:73) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.NodeScheduler$NodeSelector.computeAssignments(NodeScheduler.java:184) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.scheduleSourcePartitionedNodes(SqlStageExecution.java:631) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.startTasks(SqlStageExecution.java:549) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.access$200(SqlStageExecution.java:91) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution$4.run(SqlStageExecution.java:521) [presto-main-0.68.jar:0.68]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
2014-05-19T22:19:32.775+0000 INFO query-scheduler-7 com.facebook.presto.event.query.QueryMonitor TIMELINE: Query 20140519_221932_00005_mfhtx :: elapsed 483.00ms :: planning 74.12ms :: scheduling 409.00ms :: running 0.00ms :: finishing 409.00ms :: begin 2014-05-19T22:19:32.285Z :: end 2014-05-19T22:19:32.768Z
2014-05-19T22:19:32.872+0000 DEBUG task-notification-0 com.facebook.presto.execution.TaskStateMachine Task 20140519_221932_00005_mfhtx.0.0 is CANCELED
2014-05-19T22:19:32.880+0000 DEBUG 20140519_221932_00005_mfhtx.0.0-0-56 com.facebook.presto.execution.TaskExecutor Split 20140519_221932_00005_mfhtx.0.0-0 (start = 1400537972443, wall = 437 ms, cpu = 3 ms, calls = 2) is finished
... or, alternatively:
2014-05-19T22:22:43.972+0000 INFO HiveHdfsWalker-170 stdout 22:22:43.972 [HiveHdfsWalker-170] DEBUG c.a.s.s.m.t.XmlResponsesSaxParser - Parsing XML response document with handler: class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
2014-05-19T22:22:43.972+0000 INFO HiveHdfsWalker-170 stdout 22:22:43.972 [HiveHdfsWalker-170] DEBUG c.a.s.s.m.t.XmlResponsesSaxParser - Examining listing for bucket: unzippeddata
2014-05-19T22:22:43.976+0000 INFO HiveHdfsWalker-170 stdout 22:22:43.976 [HiveHdfsWalker-170] DEBUG com.amazonaws.request - Received successful response: 200, AWS Request ID: 476D4ABA552DAA66
2014-05-19T22:22:43.979+0000 DEBUG query-scheduler-17 com.facebook.presto.execution.SqlStageExecution Stage 20140519_222243_00007_mfhtx.1 is FAILED
2014-05-19T22:22:43.979+0000 ERROR Stage-20140519_222243_00007_mfhtx.1-160 com.facebook.presto.execution.SqlStageExecution Error while starting stage 20140519_222243_00007_mfhtx.1
com.facebook.presto.spi.PrestoException: No nodes available to run query
at com.facebook.presto.util.Failures.checkCondition(Failures.java:79) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.util.Failures.checkCondition(Failures.java:73) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.NodeScheduler$NodeSelector.computeAssignments(NodeScheduler.java:184) ~[presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.scheduleSourcePartitionedNodes(SqlStageExecution.java:631) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.startTasks(SqlStageExecution.java:549) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution.access$200(SqlStageExecution.java:91) [presto-main-0.68.jar:0.68]
at com.facebook.presto.execution.SqlStageExecution$4.run(SqlStageExecution.java:521) [presto-main-0.68.jar:0.68]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
2014-05-19T22:22:43.983+0000 DEBUG query-scheduler-13 com.facebook.presto.execution.SqlStageExecution Stage 20140519_222243_00007_mfhtx.0 is FAILED
2014-05-19T22:22:43.984+0000 DEBUG query-scheduler-14 com.facebook.presto.execution.QueryStateMachine Query 20140519_222243_00007_mfhtx is FAILED
2014-05-19T22:22:43.990+0000 INFO query-scheduler-14 com.facebook.presto.event.query.QueryMonitor TIMELINE: Query 20140519_222243_00007_mfhtx :: elapsed 233.00ms :: planning 60.01ms :: scheduling 173.00ms :: running 0.00ms :: finishing 173.00ms :: begin 2014-05-19T22:22:43.751Z :: end 2014-05-19T22:22:43.984Z
2014-05-19T22:22:44.088+0000 DEBUG task-notification-3 com.facebook.presto.execution.TaskStateMachine Task 20140519_222243_00007_mfhtx.0.0 is CANCELED
2014-05-19T22:22:44.102+0000 DEBUG 20140519_222243_00007_mfhtx.0.0-0-50 com.facebook.presto.execution.TaskExecutor Split 20140519_222243_00007_mfhtx.0.0-0 (start = 1400538163839, wall = 259 ms, cpu = 0 ms, calls = 2) is finished
The non-coordinator node sometimes (but not always) gets a few lines in its logs:
2014-05-19T22:19:32.340+0000 DEBUG task-notification-10 com.facebook.presto.execution.TaskStateMachine Task 20140519_222103_00006_mfhtx.0.0 is CANCELED
2014-05-19T22:19:32.352+0000 DEBUG 20140519_222103_00006_mfhtx.0.0-0-48 com.facebook.presto.execution.TaskExecutor Split 20140519_222103_00006_mfhtx.0.0-0 (start = 1400538063009, wall = 343 ms, cpu = 0 ms, calls = 2) is finished
An oversight on my part, thanks. Why was it able to query the table metadata without this configuration? – user717847
This rather confusing configuration option specifies which nodes should process splits for a given data source. For example, you might have one set of machines that sits closer to HDFS on the network and that you want to use only for processing Hive data, and another set that handles only the later stages of a query. However, this property is not a great way to accomplish that and it causes confusion, so we plan to remove it in a future release (and introduce a better way to accomplish the above if the need arises). –
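To make the comment above concrete, here is a minimal sketch of how one might check whether a node is allowed to process Hive splits. It assumes the option being discussed is the `datasources` property in `etc/config.properties` from Presto releases of that era; the thread does not name the property, so treat that as an assumption. The sample config strings and the helper names `parse_properties` and `processes_splits_for` are hypothetical and exist only for illustration.

```python
# Assumption: the option discussed in this thread is `datasources` in
# etc/config.properties (Presto 0.6x era). The sample configs are illustrative.


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def processes_splits_for(config_text, catalog):
    """Return True if the node's `datasources` list names the given catalog."""
    props = parse_properties(config_text)
    sources = [s.strip() for s in props.get("datasources", "").split(",")]
    return catalog in sources


# Hypothetical worker config that omits hive: no node would accept Hive splits.
WORKER_WITHOUT_HIVE = """
coordinator=false
datasources=jmx
"""

# Hypothetical worker config that lists hive alongside jmx.
WORKER_WITH_HIVE = """
coordinator=false
datasources=jmx,hive
"""

if __name__ == "__main__":
    for name, text in [("without hive", WORKER_WITHOUT_HIVE),
                       ("with hive", WORKER_WITH_HIVE)]:
        print(name, processes_splits_for(text, "hive"))
```

If no node in the cluster lists the catalog, the scheduler has nowhere to place the source splits, which would be consistent with the "No nodes available to run query" error in the logs above.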