sparklyr: spark_apply does not work in cluster mode

I combined two datasets with different numbers of rows using the cbind.na function from the qpcR package. Through sparklyr's spark_apply, the result comes back correctly on my local machine, but in cluster mode it fails with the error below.

Note: a single data structure (without combining two datasets) returns the result correctly both on the cluster and locally.
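
For reference, a minimal sketch of the kind of call that works locally but fails on the cluster (an assumption: the actual data, column names, and master URL were not posted):

    library(sparklyr)

    # a local connection works; the same code fails in cluster mode
    sc <- spark_connect(master = "local")

    # hypothetical input standing in for the first dataset
    sdf <- sdf_copy_to(sc, data.frame(x = 1:10), "tbl_x")

    result <- spark_apply(sdf, function(df) {
      # cbind.na pads the shorter column with NA instead of recycling;
      # qpcR must be installed on every worker node, not just the driver
      as.data.frame(qpcR:::cbind.na(df$x, c(100, 200, 300)))
    })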

Error : Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 111.0 failed 4 times, most recent failure: Lost task 0.3 in stage 111.0 (TID 229, 192.168.1.20, executor 1): java.lang.Exception: sparklyr worker rscript failure, check worker logs for details. 
    at sparklyr.Rscript.init(rscript.scala:56) 
    at sparklyr.WorkerRDD$$anon$2.run(rdd.scala:89) 

Driver stacktrace: 
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486) 
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) 
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) 
    at scala.Option.foreach(Option.scala:257) 
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658) 
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062) 
    at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1354) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) 
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362) 
    at org.apache.spark.rdd.RDD.take(RDD.scala:1327) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at sparklyr.Invoke$.invoke(invoke.scala:102) 
    at sparklyr.StreamHandler$.handleMethodCall(stream.scala:97) 
    at sparklyr.StreamHandler$.read(stream.scala:62) 
    at sparklyr.BackendHandler.channelRead0(handler.scala:52) 
    at sparklyr.BackendHandler.channelRead0(handler.scala:14) 
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) 
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) 
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) 
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) 
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) 
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) 
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) 
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) 
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) 
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) 
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480) 
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442) 
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) 
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144) 
    at java.lang.Thread.run(Thread.java:748) 
Caused by: java.lang.Exception: sparklyr worker rscript failure, check worker logs for details. 
    at sparklyr.Rscript.init(rscript.scala:56) 
    at sparklyr.WorkerRDD$$anon$2.run(rdd.scala:89) 

Answer:

If you use qpcR inside spark_apply, it may not work in cluster mode: your local machine may be running Windows while the cluster machines run Linux, and any package used in the closure has to be installed and loadable on every worker node. Better to try another approach, as sketched below.
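
A minimal sketch of one such alternative (an assumption, not necessarily the approach the asker ended up with): replicate the NA-padding behaviour of cbind.na in base R inside the worker function, so no extra package needs to exist on the Linux nodes:

    result <- spark_apply(sdf, function(df) {
      # base-R stand-in for qpcR:::cbind.na: extend both vectors to the
      # same length (growing length() pads the tail with NA), then bind
      a <- df$x
      b <- c(100, 200, 300)
      n <- max(length(a), length(b))
      length(a) <- n
      length(b) <- n
      data.frame(a = a, b = b)
    })

Since everything the function needs is defined inside the closure, nothing beyond base R is required on the workers.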

Yes!! I checked what you said: I tried another approach without the qpcR package, and it worked in sparklyr. Thanks, Diljish. – shyam