2017-01-30 1 views
0

J'utilise mrjob sur Amazon EMR. Cela fonctionne sans problème sur EMR 4.8.3, mais quand je l'exécute sur EMR 5.x (l'un d'entre eux), quelque chose va bonkers dans l'API de streaming hadoop et je reçois juste beaucoup d'erreurs. Mon programme est mrjob un programme très simple qui ne wordcount:mrjob ne fonctionne pas sur Amazon EMR 5.x, mais fonctionne sur EMR4.8.3

#!/usr/bin/python2.7 
from mrjob.job import MRJob 
import re 

WORD_RE = re.compile(r"[\w']+") 

class MRWordFreqCount(MRJob): 

    def mapper(self, _, line): 
     for word in WORD_RE.findall(line): 
      yield word.lower(), 1 

    def reducer(self, word, counts): 
     yield word, sum(counts) 

if __name__ == '__main__': 
    MRWordFreqCount.run() 

Je vois les mêmes problèmes avec Python3.4.

Voici les erreurs:

[[email protected] L03]$ python34 wordcount.py -r hadoop hamlet.txt 
Using configs in /home/hadoop/.mrjob.conf 
Using Hadoop version 2.7.3 
Looking for Hadoop streaming jar in /usr/lib/hadoop... 
Found Hadoop streaming jar: /usr/lib/hadoop/hadoop-streaming.jar 
Creating temp directory /tmp/wordcount.hadoop.20170202.034007.994458 
Copying local files to hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/files/... 
Running step 1 of 1... 
    packageJobJar: [] [/usr/lib/hadoop/hadoop-streaming-2.7.3-amzn-1.jar] /tmp/streamjob3049587170286741307.jar tmpDir=null 
    Connecting to ResourceManager at ip-172-31-42-125.ec2.internal/172.31.42.125:8032 
    Connecting to ResourceManager at ip-172-31-42-125.ec2.internal/172.31.42.125:8032 
    Loaded native gpl library 
    Successfully loaded & initialized native-lzo library [hadoop-lzo rev f7cb0596948c5bfd3e71d37b0f5bb21a19554666] 
    Total input paths to process : 1 
    number of splits:8 
    Submitting tokens for job: job_1486006378404_0001 
    Submitted application application_1486006378404_0001 
    The url to track the job: http://ip-172-31-42-125.ec2.internal:20888/proxy/application_1486006378404_0001/ 
    Running job: job_1486006378404_0001 
    Job job_1486006378404_0001 running in uber mode : false 
    map 0% reduce 0% 
    Task Id : attempt_1486006378404_0001_m_000001_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000003_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000000_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000005_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000004_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000002_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000006_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000007_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000001_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000003_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000004_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000002_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000006_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000007_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    Task Id : attempt_1486006378404_0001_m_000001_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000003_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000000_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000005_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

    Task Id : attempt_1486006378404_0001_m_000004_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 
Task Id : attempt_1486006378404_0001_m_000007_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

    map 100% reduce 100% 
    Job job_1486006378404_0001 failed with state FAILED due to: Task failed task_1486006378404_0001_m_000001 
Job failed as tasks failed. failedMaps:1 failedReduces:0 

    Job not successful! 
    Streaming Command Failed! 
Counters: 17 
    Job Counters 
     Data-local map tasks=8 
     Failed map tasks=25 
     Killed map tasks=7 
     Killed reduce tasks=3 
     Launched map tasks=30 
     Other local map tasks=22 
     Total megabyte-milliseconds taken by all map tasks=484107840 
     Total megabyte-milliseconds taken by all reduce tasks=0 
     Total time spent by all map tasks (ms)=336186 
     Total time spent by all maps in occupied slots (ms)=15128370 
     Total time spent by all reduce tasks (ms)=0 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total vcore-milliseconds taken by all map tasks=336186 
     Total vcore-milliseconds taken by all reduce tasks=0 
    Map-Reduce Framework 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
Scanning logs for probable cause of failure... 
Looking for history log in hdfs:///tmp/hadoop-yarn/staging... 
    Parsing history log: hdfs:///tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1486006378404_0001-1486006825373-hadoop-streamjob3049587170286741307.jar-1486006895554-0-0-FAILED-default-1486006832990.jhist 
Probable cause of failure: 

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 


(from line 143 of hdfs:///tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1486006378404_0001-1486006825373-hadoop-streamjob3049587170286741307.jar-1486006895554-0-0-FAILED-default-1486006832990.jhist) 

Step 1 of 1 failed: Command '['/usr/bin/hadoop', 'jar', '/usr/lib/hadoop/hadoop-streaming.jar', '-files', 'hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/files/mrjob.zip#mrjob.zip,hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/files/setup-wrapper.sh#setup-wrapper.sh,hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/files/wordcount.py#wordcount.py', '-input', 'hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/files/hamlet.txt', '-output', 'hdfs:///user/hadoop/tmp/mrjob/wordcount.hadoop.20170202.034007.994458/output', '-mapper', 'sh -ex setup-wrapper.sh python3 wordcount.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python3 wordcount.py --step-num=0 --reducer']' returned non-zero exit status 256 
[[email protected] L03]$ 
+0

Quelles sont les erreurs? –

+0

D'accord, j'ai ajouté les erreurs. – vy32

+0

La ligne 'shebang' est-elle correcte selon l'environnement? – franklinsijo

Répondre

3

mainteneur mrjob ici. Cela était dû à un peu d'étrangeté dans AMI 5.2.0+ et plus tard avec la commande sh.

La mise à niveau vers mrjob v0.5.9 devrait résoudre ce problème. Si vous êtes curieux, voir https://github.com/Yelp/mrjob/issues/1548 pour plus de détails.

+0

Wow, c'est fou. Merci pour l'info. – vy32