Alejandro de la Torre Alejandro de la Torre - 1 year ago 85
JSON Question

Creating Table on Amazon Athena for Map Attributes

I am trying to get all attributes from a JSON file originated on Amazon DynamoDB. Right now I sending the files to an Amazon S3 bucket and reading it with Amazon Athena. This is an example from a JSON file,

{
"Items": [
{
"payload": {
"phase": "T",
"tms_event": "2017-03-16 18:19:50",
"id_UM": 0,
"num_severity_level": 0,
"event_value": 1,
"int_status": 0
},
"deviceId": 6,
"tms_event": "2017-03-16 18:19:50"
}
]
}


I am looking to get deviceId, tms_event, phase, id_UM and event_value on separate fields. This is the table that I have but it doesnt separate the fields,

CREATE EXTERNAL TABLE IF NOT EXISTS default.iot_table (
`Items` ARRAY<
STRUCT<
`payload`: STRUCT<`phase`: string, `tms_event`: string, `id_UM`: int, `num_severity_level`: int, `event_value`: int, `int_status`: int>,
`deviceId`: int,
`tms_event`: string
>
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
) LOCATION 's3://iot-logging/'
TBLPROPERTIES ('has_encrypted_data'='false')


Thanks

Answer Source

The issue you are describing lies within the query you are executing. If you are just running a select *, the result is a combined JSON.

To select the elements as single columns you need to select them individually:

select item.deviceId,item.tms_event, item.payload.phase, 
       item.payload.id_UM, item.payload.event_value
from iot_table
CROSS JOIN UNNEST(iot_table.Items) AS t (item);

The cross join is needed due to the fact, that your first element is an array and can contain multiple Items.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download