Rodolfo Orozco - 3 years ago 68

Python Question

I am new to Python and to some of its libraries (numpy, pandas).

I have found a lot of documentation on **how** numpy ndarrays, pandas series and Python dictionaries work.

But because of my inexperience in Python, I have had a really hard time knowing **when** to use each of them.

As a general matter, are there any best practices for deciding whether a specific data set should be loaded into one of these three data structures?

Thanks!

Answer Source

The rule of thumb that I usually apply: **use the simplest data structure that still satisfies your needs**. If we rank the data structures from simplest to most complex, it usually ends up like this:

- Dictionaries / lists
- Numpy arrays
- Pandas series / dataframes

So first consider dictionaries / lists. If these allow you to do all data operations that you need, then all is fine. If not, start considering numpy arrays. Some typical reasons for moving to numpy arrays are:

- Your data is 2-dimensional (or higher). Although nested dictionaries/lists can be used to represent multi-dimensional data, in most situations numpy arrays will be more efficient.
- You have to perform a lot of numerical calculations. As already pointed out by *zhqiat*, numpy will give a significant speed-up in this case. Furthermore, numpy arrays come bundled with a large number of mathematical functions.
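
To make the two reasons above concrete, here is a minimal sketch that computes column means of a small 2-dimensional data set, first with nested lists and then with a numpy array. The `readings` values and all variable names are made up for illustration.

```python
import numpy as np

# Hypothetical 2-D data: 3 rows (e.g. sensors) x 4 columns (e.g. readings).
readings = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0, 8.0],
]

# Plain Python: transpose with zip(*...) and average each column by hand.
list_col_means = [sum(col) / len(col) for col in zip(*readings)]

# numpy: the same data as an ndarray; axis=0 averages down each column.
arr = np.array(readings)
np_col_means = arr.mean(axis=0)

# Element-wise arithmetic and per-row reductions need no explicit loops.
scaled = arr * 2.0
row_max = arr.max(axis=1)
```

For a data set this small the nested lists are perfectly fine. The numpy version starts to pay off once the array is large or the calculations pile up.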

Then there are also some typical reasons for going beyond numpy arrays and to the more-complex but also more-powerful pandas series/dataframes:

- You have to merge multiple data sets with each other, or do reshaping/reordering of your data. This diagram gives a nice overview of all the 'data wrangling' operations that pandas allows you to do.
- You have to import data from or export data to a specific file format like Excel, HDF5 or SQL. Pandas comes with convenient import/export functions for this.
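
As a rough illustration of both points, the sketch below merges two small dataframes, reshapes the result, and writes it to an in-memory SQLite database. The data, column names and the table name `cities` are all invented for the example, and SQLite stands in for whatever SQL database you actually use.

```python
import sqlite3

import pandas as pd

# Two hypothetical data sets that share a "city" key.
population = pd.DataFrame(
    {"city": ["Lima", "Quito", "Bogota"], "population_m": [10.0, 2.8, 7.9]}
)
altitude = pd.DataFrame(
    {"city": ["Quito", "Bogota", "La Paz"], "altitude_m": [2850, 2640, 3640]}
)

# Merge: keep only the cities that appear in both data sets.
merged = population.merge(altitude, on="city", how="inner")

# Reshape: turn the two measurement columns into long (tidy) form.
long_form = merged.melt(id_vars="city", var_name="measure", value_name="value")

# Export to SQL and read part of it back.
conn = sqlite3.connect(":memory:")
merged.to_sql("cities", conn, index=False)
round_trip = pd.read_sql("SELECT city, altitude_m FROM cities ORDER BY city", conn)
conn.close()
```

Doing the same with dictionaries or numpy arrays would mean writing the key matching and the file handling yourself, which is usually the point at which pandas becomes worth its extra complexity.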
