Joe sal - 4 years ago 68
MySQL Question

Turn row values into columns and count repetitions of all possible values in MySQL

I have a table (built from a log file) with an email column and three other columns that hold the states of that user's interaction with a system. An email (user) may have 100 or 1,000 entries. Each entry contains a combination of those three values, and the combinations may repeat over and over for the same email and for others.
It looks something like this:

+-------+-----------+------+-------+
| email | val1      | val2 | val3  |
+-------+-----------+------+-------+
| jal@h | cast      | core | cam   |
| hal@b | little ja | qar  | ja sa |
| bam@t | cast      | core | cam   |
| jal@h | little ja | qar  | jaja  |
+-------+-----------+------+-------+


So the emails repeat, all the values repeat, and there are 40+ possible values for each column, all strings. I want to list each distinct email once, use every possible value as a column name, and put under it a count of how many times that value occurred for that particular email, like so:

+-------+------+------+-----+-----------+-----+-------+--------+
| email | cast | core | cam | little ja | qar | ja sa | blabla |
+-------+------+------+-----+-----------+-----+-------+--------+
| jal@h |   55 |    2 |  44 |       244 |   1 |   200 |     12 |
| hal@b |  900 |  513 | 101 |       146 |   2 |   733 |    833 |
| bam@t | 1231 |   33 | 433 |       411 | 933 |   833 |     53 |
+-------+------+------+-----+-----------+-----+-------+--------+


I have tried this in MySQL, but I only managed to count the total occurrences of one particular value for each email, not all possible values in each column:

SELECT
    DISTINCT email,
    COUNT(val1) AS "cast"
FROM table1
WHERE val1 = 'cast'
GROUP BY email


This query clearly doesn't do it, as it outputs only one value, 'cast', from the first column, 'val1'. What I'm looking for is for all distinct values in the first, second, and third columns to become column headers, with each row holding the total for that value for a given email (user).
There is a pivot table approach, but I couldn't get it to work.
I'm working with this data as a table in MySQL, but it is also available as a CSV file, so if it isn't possible with a query, Python would be a possible solution, and my preference after SQL.

Update
In Python, is it possible to output the data as:

+-------+------------------+------------+--------------------+
|       | val1             | val2       | val3               |
+-------+------+-----------+------+-----+-----+-------+------+
| email | cast | little ja | core | qar | cam | ja sa | jaja |
+-------+------+-----------+------+-----+-----+-------+------+
| jal@h |   55 |         2 |   44 | 244 |   1 |   200 |   12 |
| hal@b |  900 |       513 |  101 | 146 |   2 |   733 |  833 |
| bam@t | 1231 |        33 |  433 | 411 | 933 |   833 |   53 |
+-------+------+-----------+------+-----+-----+-------+------+


I'm not very familiar with Python.

Answer Source

If you use pandas, you can call value_counts after grouping your data frame by email, then unstack (pivot) the result to wide format:

(df.set_index("email").stack().groupby(level=0).value_counts()
   .unstack(level=1).reset_index().fillna(0))
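
For readers who want to try the chain end to end, here is a minimal sketch that builds df from the four sample rows in the question. The data frame construction and the name counts are assumptions for illustration; a real run would more likely load the CSV with pd.read_csv. The counts are cast to integers before reset_index so the email column is left alone.

```python
import pandas as pd

# Sample rows copied from the question's first table (illustrative only).
df = pd.DataFrame(
    {
        "email": ["jal@h", "hal@b", "bam@t", "jal@h"],
        "val1": ["cast", "little ja", "cast", "little ja"],
        "val2": ["core", "qar", "core", "qar"],
        "val3": ["cam", "ja sa", "cam", "jaja"],
    }
)

counts = (
    df.set_index("email")
    .stack()                 # long format: one row per (email, source column)
    .groupby(level=0)        # group by email
    .value_counts()          # count each distinct value per email
    .unstack(level=1)        # distinct values become columns
    .fillna(0)               # a value never seen for an email counts as 0
    .astype(int)
    .reset_index()
)

print(counts)
```

Note that this version counts a value regardless of which of the three columns it came from.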



To get the updated result, you can group by both the email and val* columns after the stack:

(df.set_index("email").stack().groupby(level=[0, 1]).value_counts()
   .unstack(level=[1, 2]).fillna(0).sort_index(axis=1))
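
The same sample rows can be used to check the two-level layout. This is a sketch under the same assumptions as before (df is built by hand from the question's table, and by_column is a name made up here); the resulting columns form a MultiIndex of (source column, value) pairs.

```python
import pandas as pd

# Sample rows copied from the question's first table (illustrative only).
df = pd.DataFrame(
    {
        "email": ["jal@h", "hal@b", "bam@t", "jal@h"],
        "val1": ["cast", "little ja", "cast", "little ja"],
        "val2": ["core", "qar", "core", "qar"],
        "val3": ["cam", "ja sa", "cam", "jaja"],
    }
)

by_column = (
    df.set_index("email")
    .stack()
    .groupby(level=[0, 1])   # group by email and by source column (val1..val3)
    .value_counts()
    .unstack(level=[1, 2])   # (source column, value) pairs become columns
    .fillna(0)
    .astype(int)
    .sort_index(axis=1)      # keep val1, val2, val3 columns together
)

print(by_column)

# A single cell is addressed with a (source column, value) tuple.
print(by_column.loc["jal@h", ("val1", "cast")])
```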

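
On the MySQL side of the question, MySQL has no PIVOT operator, so a common workaround is conditional aggregation: one SUM(...) expression per value. With 40+ values that is tedious to write by hand, so the sketch below generates the query text in Python. The function name build_pivot_query is hypothetical; the table name table1 and the column names come from the question. It assumes the values are trusted strings without backslashes (for example, the result of a SELECT DISTINCT) and that the val columns contain no NULLs, so it should not be used on arbitrary input.

```python
def build_pivot_query(values, table="table1", columns=("val1", "val2", "val3")):
    """Return a MySQL query with one counting column per entry in `values`."""
    select_parts = ["email"]
    for value in values:
        literal = value.replace("'", "''")  # escape single quotes in the literal
        alias = value.replace("`", "``")    # escape backticks in the alias
        # Each comparison evaluates to 1 or 0 in MySQL, so the sum per row is
        # the number of the three columns that hold this value.
        matches = " + ".join(f"({col} = '{literal}')" for col in columns)
        select_parts.append(f"SUM({matches}) AS `{alias}`")
    return (
        "SELECT " + ",\n       ".join(select_parts)
        + f"\nFROM {table}\nGROUP BY email"
    )


# Hypothetical usage with two of the values from the question.
print(build_pivot_query(["cast", "little ja"]))
```

The list of values would typically be collected first by selecting the distinct values of val1, val2, and val3, and the generated text then run as an ordinary query.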
