I am new to Python and to some of its libraries (numpy, pandas).
I have found a lot of documentation on how numpy ndarrays, pandas series and python dictionaries work.
But because of my inexperience with Python, I have had a really hard time knowing when to use each one of them. And I haven't found any best practices that would help me understand and decide when it is better to use each type of data structure.
As a general matter, are there any best practices for deciding whether a specific data set should be loaded into one of these three data structures?
The rule of thumb that I usually apply: use the simplest data structure that still satisfies your needs. If we rank the data structures from simplest to most complex, it usually ends up like this: dictionaries / lists first, then numpy arrays, then pandas series / dataframes.
So first consider dictionaries / lists. If these allow you to do all the data operations that you need, then you are done. If not, start considering numpy arrays. Some typical reasons for moving to numpy arrays are:
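As one illustration of the step from lists to numpy arrays (an example I am assuming is typical, not a complete list of reasons): element-wise arithmetic on a list needs an explicit loop or comprehension, while a numpy array applies the operation to every element at once. The variable names and values below are made up for the sketch.

```python
import numpy as np

# Hypothetical data: a plain list is fine for storage and iteration.
prices = [10.0, 12.5, 9.75]

# With a list, element-wise math needs an explicit loop or comprehension.
doubled_list = [p * 2 for p in prices]

# A numpy array holds the same values with a single dtype...
prices_arr = np.array(prices)

# ...and arithmetic is vectorized: it applies to every element at once.
doubled_arr = prices_arr * 2
mean_price = prices_arr.mean()

print(doubled_list)
print(doubled_arr)
print(mean_price)
```

If a comprehension like the one above is all you ever need, the list is probably still the simpler choice.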
Then there are also some typical reasons for going beyond numpy arrays to the more complex, but also more powerful, pandas series / dataframes:
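As one illustration of what pandas adds on top of numpy (again an example I am assuming is typical, with made-up data): a Series carries labels, and arithmetic between two Series lines values up by label rather than by position. A minimal sketch:

```python
import pandas as pd

# Hypothetical data: labels appear in a different order,
# and "plums" is missing from the second quarter.
sales_q1 = pd.Series({"apples": 10, "pears": 5, "plums": 7})
sales_q2 = pd.Series({"pears": 6, "apples": 12})

# Values are matched by label, not by position.
# fill_value=0 treats a label missing on one side as zero.
total = sales_q1.add(sales_q2, fill_value=0)

print(total)
```

Doing the same with two bare numpy arrays would require you to keep track of which position corresponds to which item yourself.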