RaviTej310 - 3 months ago 25

Python Question

I am using Jupyter Notebook and Python 2.7 from Anaconda. I have an approximately 250,000-dimensional data set which I need to compress to n lower dimensions. I am using scikit-learn's TSNE. When running the TSNE for

`n=5`

`n=10`

`n=50`

`"The kernel appears to have died."`

My TSNE function:

```
def tsne_to_n_dimensions(n):
    start = timer()
    # tsne
    print diff_df.shape
    tsne = sklearn.manifold.TSNE(n_components=n, verbose=2)
    data_nd_tsne = tsne.fit_transform(diff_df)
    # calculate stuff from data_nd_tsne
    return stuff
```

`diff_df` is a global pandas DataFrame.

I have gone through this and this, but couldn't find a solution.

Answer Source

I have found a solution using `python-bhtsne`, which is also an implementation of the *Barnes-Hut t-Distributed Stochastic Neighbor Embedding* approach.

It is very easy to use and even provides an option to get the same output in every run of `tsne` with the same parameters, something I could not achieve with the `scikit` implementation.

It is a Python wrapper for the original implementation by Laurens van der Maaten.

So basically you'll just need to do the following instead of the regular `TSNE` from `scikit`:

```
from bhtsne import tsne
data_nd_tsne = tsne(diff_df)
```
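
If you need a specific number of output dimensions and repeatable results, a call along these lines should work. This is only a minimal sketch: it assumes that `tsne` accepts `dimensions` and `rand_seed` keyword arguments (check the version you have installed), and that the wrapper expects a NumPy array rather than a DataFrame, which is why `.values` is used. `reduce_with_bhtsne` is a hypothetical helper name, not part of the library.

```python
def reduce_with_bhtsne(data, n, seed=42):
    """Hypothetical helper: embed `data` into `n` dimensions with python-bhtsne."""
    # Imported here so the dependency is only needed when the helper is called.
    from bhtsne import tsne

    # A pandas DataFrame exposes its underlying NumPy array as `.values`;
    # anything without that attribute (e.g. an ndarray) is passed through unchanged.
    array = getattr(data, "values", data)

    # ASSUMPTION: `dimensions` and `rand_seed` are keyword arguments of tsne().
    # Passing the same seed on every call is what should make runs repeatable.
    return tsne(array, dimensions=n, rand_seed=seed)
```

Inside `tsne_to_n_dimensions` this would be called as `data_nd_tsne = reduce_with_bhtsne(diff_df, n)`.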