2011-05-23 3 views

Answer

9

With Boto you can do something like this:

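from boto.emr.connection import EmrConnection
from boto.emr.step import JarStep

# s3_query_file_uri, s3_log_uri, master_instance_type, slave_instance_type
# and num_instances are assumed to be defined earlier in your script.

# Step 1: install Hive on the cluster.
args1 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
     u'--base-path',
     u's3://us-east-1.elasticmapreduce/libs/hive/',
     u'--install-hive',
     u'--hive-versions',
     u'0.7']
# Step 2: run the Hive script stored at s3_query_file_uri.
args2 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
     u'--base-path',
     u's3://us-east-1.elasticmapreduce/libs/hive/',
     u'--hive-versions',
     u'0.7',
     u'--run-hive-script',
     u'--args',
     u'-f',
     s3_query_file_uri]
steps = []
for name, args in zip(('Setup Hive','Run Hive Script'),(args1,args2)):
    # Both steps are launched through the EMR script-runner jar.
    step = JarStep(name,
        's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
        step_args=args,
        #action_on_failure="CANCEL_AND_WAIT"
        )
    # The append must stay inside the loop so that both steps are added.
    steps.append(step)
# Kick off the job
# Note: `name` still holds the last step name from the loop
# ('Run Hive Script'), so that string becomes the job flow name.
jobid = EmrConnection().run_jobflow(name, s3_log_uri,
            steps=steps,
            master_instance_type=master_instance_type,
            slave_instance_type=slave_instance_type,
            num_instances=num_instances,
            hadoop_version="0.20")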
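
The two argument lists above differ only in their tail, so they could be built by one small helper. This is just a sketch: `hive_step_args` and `HIVE_LIB` are hypothetical names of my own, not part of Boto, and the helper only reproduces the exact lists shown in the answer.

```python
# Hypothetical helper (not part of Boto): builds the script-runner
# argument lists used in the answer above.
HIVE_LIB = u's3://us-east-1.elasticmapreduce/libs/hive/'


def hive_step_args(hive_version, script_uri=None, base_path=HIVE_LIB):
    """Return args for the install step (no script_uri) or the run step."""
    args = [base_path + u'hive-script', u'--base-path', base_path]
    if script_uri is None:
        # Same tail as args1: install the requested Hive version.
        args += [u'--install-hive', u'--hive-versions', hive_version]
    else:
        # Same tail as args2: run the Hive script at script_uri.
        args += [u'--hive-versions', hive_version,
                 u'--run-hive-script', u'--args', u'-f', script_uri]
    return args
```

With this, `args1` would be `hive_step_args(u'0.7')` and `args2` would be `hive_step_args(u'0.7', s3_query_file_uri)`.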

That worked, thanks to unthingable! – poiuy

My EMR job flow gets terminated due to VALIDATION_ERROR. Any ideas? – vks
