Matt Matt - 2 months ago 14
Python Question

How do I re-organise data into a new dataframe in pandas that displays the changes in data in this way?

I have started with two separate dataframes; one retrieved from a MySQL database (df_database) and another that has been created following a web scrape. The web scrape dataframe has already been split into two - df_new (rows not currently in the database) and df_existing (rows that already exist in the database).

From here, I have analysed the df_exsiting dataframe to find where there have been changes two columns in the df_existing dataframe (when compared to the data in df_database) that I am interested in, and saved the results in a new dataframe called df_changes.

An extract of df_changes looks as below (the dataframe is showing up as an HTML table because tried showing the data as it appears in Jupyter Notebook):

enter image description here

It's possible that the ticker and name may both have changed, or just one of them.

What I would like to achieve is a new dataframe set out as below:

enter image description here

I can't figure out how to achieve this. Any help?



    id_vars=['unique_identifier', 'version'],
    value_vars=['ticker', 'name']
).set_index(['unique_identifier', 'variable', 'version']) \

enter image description here