Itack - 3 years ago
Python Question

Trained Machine Learning model is too big

We have trained an Extra Trees model for a regression task. Our model consists of 3 Extra Trees regressors, each having 200 trees of depth 30, with a ridge regression on top of the 3 Extra Trees.
We train our model for several hours and pickle the trained model (the entire class object) for later use. However, the saved model is far too big, about 140 GB!
Is there a way to reduce the size of the saved model? Is there any configuration in pickle that could help, or any alternative to pickle?
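For context, pickle itself offers little control over output size beyond the protocol version, but the pickle stream can be compressed transparently. A minimal sketch using only the standard library (the `model` dict is a hypothetical stand-in for the actual fitted estimator):

```python
import gzip
import pickle

# Hypothetical stand-in for the trained model object; replace with
# the actual fitted estimator.
model = {"weights": list(range(1000))}

# Write the pickle through gzip to shrink the file on disk.
with gzip.open("model.pkl.gz", "wb") as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

# Loading is symmetric: gzip decompresses on the fly.
with gzip.open("model.pkl.gz", "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True
```

How much this helps depends on the redundancy in the serialized tree arrays; it reduces disk size but not the in-memory footprint.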

Answer Source

In the worst case (perfectly balanced binary trees), you will have 3 * 200 * (2^30 - 1) = 644,245,093,800 nodes, or roughly 600 GiB even if each node cost only 1 byte to store. I think that 140 GB is a pretty decent size in comparison.
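The arithmetic above can be checked directly. This sketch assumes each of the 3 ensembles holds 200 perfect binary trees whose 30 levels give 2^30 − 1 nodes apiece:

```python
# 3 Extra Trees ensembles x 200 trees each, modeled as perfect binary
# trees: a tree with 30 levels has 2**30 - 1 nodes in total.
n_ensembles = 3
trees_per_ensemble = 200
nodes_per_tree = 2**30 - 1

total_nodes = n_ensembles * trees_per_ensemble * nodes_per_tree
print(total_nodes)              # 644245093800

# At one byte per node, convert to GiB (2**30 bytes).
print(total_nodes / 2**30)      # ~600 GiB
```

Real scikit-learn nodes store several fields (feature index, threshold, child pointers, value), so the actual per-node cost is tens of bytes, which makes 140 GB for a shallower-than-worst-case forest plausible.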

Edit: Bad maths.
