user5359531 user5359531 - 6 months ago 25
Python Question

Mutable indexed heterogeneous data structure?

Is there a data class or type in Python that matches these criteria?

I am trying to build an object that looks something like this:

  • ExperimentData

    • ID 1

      • sample_info_1:
        character string

      • sample_info_2:
        character string

      • Dataframe_1:
        pandas data frame

      • Dataframe_2:
        pandas data frame

    • ID 2

      • (etc.)

Right now, I am using a dict to hold the object ('ExperimentData'), which contains a namedtuple for each ID. Each namedtuple has a named field for each piece of data attached to the sample. This keeps all the IDs indexed, and keeps all of the fields under each ID indexed as well.

However, I need to update and/or replace the entries under each ID during downstream analysis. Since a namedtuple is immutable, this does not seem to be possible.
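
For illustration, here is a minimal sketch of the setup described above. The type name Sample, the field values, and the tiny data frames are made-up placeholders, not the real data. It shows that assigning to a namedtuple field fails:

```python
from collections import namedtuple

import pandas as pd

# Hypothetical record type; field names follow the outline above.
Sample = namedtuple(
    "Sample",
    ["sample_info_1", "sample_info_2", "Dataframe_1", "Dataframe_2"],
)

# Outer dict keyed by ID, one namedtuple per ID (placeholder values).
experiment_data = {
    "ID 1": Sample(
        "control",
        "batch A",
        pd.DataFrame({"x": [1, 2]}),
        pd.DataFrame({"y": [3, 4]}),
    ),
}

# Fields can be read by name, but assigning to one raises AttributeError.
try:
    experiment_data["ID 1"].sample_info_1 = "treated"
    assignment_failed = False
except AttributeError:
    assignment_failed = True
```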

Is there a better implementation of this?


You could use a dict of dicts instead of a dict of namedtuples. Dicts are mutable, so you'll be able to modify the inner dicts.
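
A minimal sketch of the dict-of-dicts layout, assuming the field names from the question; the values and the small data frames are placeholders invented for the example:

```python
import pandas as pd

# Outer dict keyed by ID, inner dict keyed by field name.
experiment_data = {
    "ID 1": {
        "sample_info_1": "control",
        "sample_info_2": "batch A",
        "Dataframe_1": pd.DataFrame({"x": [1, 2]}),
        "Dataframe_2": pd.DataFrame({"y": [3, 4]}),
    },
}

# Replace a string entry in place.
experiment_data["ID 1"]["sample_info_1"] = "treated"

# Replace a whole DataFrame with an updated copy.
df1 = experiment_data["ID 1"]["Dataframe_1"]
experiment_data["ID 1"]["Dataframe_1"] = df1.assign(x2=df1["x"] * 2)

# New IDs can be added later in the same way.
experiment_data["ID 2"] = {
    "sample_info_1": "control",
    "sample_info_2": "batch B",
    "Dataframe_1": pd.DataFrame({"x": [5, 6]}),
    "Dataframe_2": pd.DataFrame({"y": [7, 8]}),
}
```

Lookups stay indexed at both levels, e.g. experiment_data["ID 1"]["Dataframe_2"]. The trade-off compared with a namedtuple is that nothing enforces the same set of field names across the inner dicts.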

Given what you said in the comments about Dataframe_1 and Dataframe_2 each having a comparable structure across IDs, you could also combine them into larger DataFrames. Add a column to each DataFrame containing the value of sample_info_1 repeated across all rows, and likewise for sample_info_2. Then concat all the Dataframe_1s into one big DataFrame, and do the same for the Dataframe_2s, so that all your data ends up in two DataFrames. (Depending on the structure of those DataFrames, you could even join them into one.)
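
A rough sketch of that approach for Dataframe_1, using made-up placeholder data. The extra ID column is my own assumption, added so that rows from different samples can still be told apart after concatenation:

```python
import pandas as pd

# Placeholder per-ID data; only Dataframe_1 is shown for brevity.
experiment_data = {
    "ID 1": {
        "sample_info_1": "control",
        "sample_info_2": "batch A",
        "Dataframe_1": pd.DataFrame({"x": [1, 2]}),
    },
    "ID 2": {
        "sample_info_1": "treated",
        "sample_info_2": "batch B",
        "Dataframe_1": pd.DataFrame({"x": [5, 6, 7]}),
    },
}

frames = []
for sample_id, fields in experiment_data.items():
    # assign() broadcasts each scalar across all rows of that frame.
    frames.append(
        fields["Dataframe_1"].assign(
            ID=sample_id,
            sample_info_1=fields["sample_info_1"],
            sample_info_2=fields["sample_info_2"],
        )
    )

# Stack the per-ID frames into one DataFrame with a fresh row index.
all_df1 = pd.concat(frames, ignore_index=True)

# Entries for a single ID can then be updated in place with a boolean mask.
all_df1.loc[all_df1["ID"] == "ID 1", "x"] = 0
```

The same loop would be repeated for Dataframe_2 to get the second combined DataFrame.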