2017-03-07

I have a staging table for loading JSON data. There are two values in my JSON, both with string as their data type. If I keep them as bigint, a select on this table gives the error below. In short: I cannot work with a unixtimestamp column whose data type is string.

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors 
at [Source: [email protected]; line: 1, column: 21] 

If I change both to string, it works fine. But now, because these columns are strings, I am not able to use the from_unixtime method on them.

If I try to alter these column data types from string to bigint, I get the error below:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp 
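By default the metastore rejects column type changes it considers incompatible, which is what this error reports. One possible workaround (a sketch, assuming your Hive version supports the `hive.metastore.disallow.incompatible.col.type.changes` setting) is to relax that check before altering; recreating the table with the correct schema is usually the safer route:

```sql
-- Assumption: this metastore setting exists in your Hive version.
-- It disables the compatibility check that blocks string -> bigint changes.
set hive.metastore.disallow.incompatible.col.type.changes=false;
alter table ABC change uploadtimestamp uploadtimestamp bigint;
```

Note that changing the declared column type does not change the JSON text itself: if the value is still quoted in the file, the SerDe will keep throwing the same parse exception on read.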

Below is my create table:

create table ABC 
(
    uploadTimeStamp bigint 
    ,PDID   string 

    ,data   array 
        < 
         struct 
         < 
          Data:struct 
          < 
           unit:string 
           ,value:string 
           ,heading:string 
           ,loc:string 
           ,loc1:string 
           ,loc2:string 
           ,loc3:string 
           ,speed:string 
           ,xvalue:string 
           ,yvalue:string 
           ,zvalue:string 
          > 
          ,Event:string 
          ,PDID:string 
          ,`Timestamp`:string 
          ,Timezone:string 
          ,Version:string 
          ,pii:struct<dummy:string> 
         > 
        > 
) 
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile; 

My JSON:

{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]} 

Is there any way I can convert this string unixtimestamp to standard time, or work with bigint for these columns?


Which fields are you talking about? Please give their names and their definitions in the JSON –

Answer

  1. If you are talking about Timestamp and Timezone, you can define them as int/bigint types.
    If you look at their definition, you will see that there are no qualifiers (") around the values; therefore they are numeric types within the JSON document:

    "Timestamp":1488793268598,"Timezone":330


create external table myjson 
(
    uploadTimeStamp string 
    ,PDID   string 

    ,data   array 
        < 
         struct 
         < 
          Data:struct 
          < 
           unit:string 
           ,value:string 
           ,heading:string 
           ,loc3:string 
           ,loc:string 
           ,loc1:string 
           ,loc4:string 
           ,speed:string 
           ,x:string 
           ,y:string 
           ,z:string 
          > 
          ,EventID:string 
          ,PDID:string 
          ,`Timestamp`:bigint 
          ,Timezone:smallint 
          ,Version:string 
          ,pii:struct<dummy:string> 
         > 
        > 
) 
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile 
location '/tmp/myjson' 
; 

+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
| myjson.uploadtimestamp | myjson.pdid |                                                                                                                                                           myjson.data                                                                                                                                                           | 
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
|   1486631318873 |   123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] | 
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
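To get at the nested Timestamp values inside the data array, one option is to explode it (a sketch against the myjson table defined above; the aliases d and e are illustrative):

```sql
-- Explode the data array so each struct becomes its own row,
-- then read the nested fields from the exploded struct.
select m.uploadtimestamp
      ,e.eventid
      ,e.`timestamp`
from   myjson m
       lateral view explode(data) d as e;
```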

  1. Even though you have defined Timestamp as a string, you can still cast it to bigint before using it in a function that requires a bigint parameter.

    cast(`Timestamp` as bigint)


hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t; 

FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments 'timestamp': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: FUNC(bigint) FUNC(bigint, string) FUNC(int) FUNC(int, string)

hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t; 
OK 
1970-01-01 00:00:00
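One caveat: the sample Timestamp values (e.g. 1488793268598) look like epoch milliseconds, while from_unixtime expects seconds, so they would likely need dividing by 1000 first (assuming they really are milliseconds):

```sql
-- 1488793268598 ms -> 1488793268 s before passing to from_unixtime
select from_unixtime(cast('1488793268598' as bigint) div 1000);
```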