I am trying to get all attributes from a JSON file originated on Amazon DynamoDB. Right now I sending the files to an Amazon S3 bucket and reading it with Amazon Athena. This is an example from a JSON file,
"tms_event": "2017-03-16 18:19:50",
"tms_event": "2017-03-16 18:19:50"
CREATE EXTERNAL TABLE IF NOT EXISTS default.iot_table (
`payload`: STRUCT<`phase`: string, `tms_event`: string, `id_UM`: int, `num_severity_level`: int, `event_value`: int, `int_status`: int>,
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
) LOCATION 's3://iot-logging/'
The issue you are describing lies within the query you are executing. If you are just running a
select *, the result is a combined JSON.
To select the elements as single columns you need to select them individually:
select item.deviceId,item.tms_event, item.payload.phase, item.payload.id_UM, item.payload.event_value from iot_table CROSS JOIN UNNEST(iot_table.Items) AS t (item);
The cross join is needed due to the fact, that your first element is an array and can contain multiple Items.