Ian Zhao Ian Zhao - 3 years ago 157
SQL Question

How to group columns into separate rows in SQL

I don't have a profound SQL background, recently encountered a problem with SQL that seems hard to do with JUST SQL.

I have a table

```

IMEI | DATE | A_1 | A_2 | A_3 | B_1 | B_2 | B_3
2132 | 09/21| 2 | 4 | 4 | 5 | 2 | 4
4535 | 09/22| 2 | 2 | 4 | 5 | 2 | 3
9023 | 09/21| 2 | 1 | 5 | 7 | 2 | 2


```

How can I group value of
A_1
,
A_2
etc in a way so I can achieve the this table. Basically, I would like to group certain columns in my table, and put them into different rows.

IMEI | DATE | MODULE | val_1 | val_2 | val_3
2132 | 09/21| A | 2 | 4 | 4
2132 | 09/21| B | 5 | 2 | 4
...


The goal is to have value under namespace
A
,
B
etc for a row to be separated into different rows in the new Table.

Also, any suggestions on where can I improve my SQL. any books I should keep as reference or any other resources I should use?

Thanks!

Answer Source

I love playing with data and questions like this!
Below can be considered as over-engineering but I think it is still an option when you don't know your columns names in advance but have a pattern you described or it can be useful just for learning as looks like you are looking for improving your SQL (based on tag for this question I assume you meant BigQuery SQL)

#standardSQL
WITH parsed AS (
  SELECT IMEI, DATE, 
    REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(0)], r'^"|"$', '') key, 
    REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(1)], r'^"|"$', '') value
  FROM `yourTable` t, 
  UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'[{}]', ''))) row
),
grouped AS (
  SELECT 
    IMEI, DATE, 
    REGEXP_EXTRACT(key, r'(.*)_') MODULE, 
    ARRAY_AGG(value ORDER BY CAST(REGEXP_EXTRACT(key, r'_(.*)') AS INT64)) AS vals
  FROM parsed
  WHERE key NOT IN ('IMEI', 'DATE')
  GROUP BY IMEI, DATE, MODULE
)
SELECT IMEI, DATE, MODULE, 
  vals[SAFE_OFFSET(0)] AS val_1,
  vals[SAFE_OFFSET(1)] AS val_2,
  vals[SAFE_OFFSET(2)] AS val_3,
  vals[SAFE_OFFSET(3)] AS val_4
FROM grouped
-- ORDER BY IMEI, DATE, MODULE

You can test / play with dummy data from your question

#standardSQL
WITH `yourTable` AS (
  SELECT 2132 IMEI, '09/21' DATE, 2 A_1, 4 A_2, 4 A_3, 5 B_1, 2 B_2, 4 B_3 UNION ALL
  SELECT 4535, '09/22', 2, 2 ,4, 5, 2, 3 UNION ALL
  SELECT 9023, '09/21', 2, 1 ,5, 7, 2, 2 
),
parsed AS (
  SELECT IMEI, DATE, 
    REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(0)], r'^"|"$', '') key, 
    REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(1)], r'^"|"$', '') value
  FROM `yourTable` t, 
  UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'[{}]', ''))) row
),
grouped AS (
  SELECT 
    IMEI, DATE, 
    REGEXP_EXTRACT(key, r'(.*)_') MODULE, 
    ARRAY_AGG(value ORDER BY CAST(REGEXP_EXTRACT(key, r'_(.*)') AS INT64)) AS vals
  FROM parsed
  WHERE key NOT IN ('IMEI', 'DATE')
  GROUP BY IMEI, DATE, MODULE
)
SELECT IMEI, DATE, MODULE, 
  vals[SAFE_OFFSET(0)] AS val_1,
  vals[SAFE_OFFSET(1)] AS val_2,
  vals[SAFE_OFFSET(2)] AS val_3,
  vals[SAFE_OFFSET(3)] AS val_4
FROM grouped
ORDER BY IMEI, DATE, MODULE

Output will be as below

Row IMEI    DATE    MODULE  val_1   val_2   val_3   val_4    
1   2132    09/21   A       2       4       4       null     
2   2132    09/21   B       5       2       4       null     
3   4535    09/22   A       2       2       4       null     
4   4535    09/22   B       5       2       3       null     
5   9023    09/21   A       2       1       5       null     
6   9023    09/21   B       7       2       2       null     
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download