We have trained an Extra Trees model for a regression task. Our model consists of 3 Extra Trees ensembles, each having 200 trees of depth 30. On top of the 3 ensembles, we use a ridge regression.
We train our model for several hours and pickle the trained model (the entire class object) for later use. However, the saved trained model is too big, about 140 GB!
Is there a way to reduce the size of the saved model? Are there any configurations in pickle that could help, or any alternative to pickle?
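For reference, the saving step looks roughly like this (the `StackedModel` class and the file path are placeholders for our actual code):

```python
import pickle

# Placeholder for our trained object: 3 Extra Trees ensembles
# plus the ridge regression stacked on top.
class StackedModel:
    pass

model = StackedModel()

# Plain pickle dump: no compression, so the file size mirrors the
# in-memory size of every tree node.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)
```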
In the worst case (fully grown binary trees of depth 30), each tree has 2^30 - 1 ≈ 1.07 * 10^9 nodes, so you would have

3 * 200 * (2^30 - 1) ≈ 6.4 * 10^11 nodes, or

roughly 600 GiB assuming each node would only cost 1 byte to store. I think that 140 GB is a pretty decent size in comparison.