Yaoi Dirty - 6 months ago 62

Python Question

I am using tflearn to build a CNN.

However, my inputs have different numbers of rows (but the same number of columns).

For example, I have 100 inputs.

The first input has shape 4*9, but the second and third are 1*9.

I am not sure how to shape and feed this data using input_data().

Answer

First of all, you need to know exactly what your training sample is. I am not sure what you mean by *"input"*: does one input mean one sample, or does one row of an input mean one sample?

If one input is one sample, you are in some trouble, because almost all CNNs (and most other machine learning algorithms) require every sample to have the same *shape*. Since some samples have more rows than others, one option is to crop the extra rows from the larger samples; another is simply to drop the samples with fewer rows (so that the remaining samples keep as much of their data as possible). A more complicated approach would be to run a PCA over the samples that have more rows (and share the same number of rows), then use only the principal components for all samples, if possible.
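The crop-or-drop idea can be sketched with plain NumPy. This is only an illustration under assumptions: the helper name `crop_to_fixed_rows`, the `target_rows` parameter, and the toy arrays are made up here, and a trailing channel axis is added on the assumption that the network expects 4-D input.

```python
import numpy as np

def crop_to_fixed_rows(samples, target_rows):
    """Drop samples with fewer than `target_rows` rows and crop the rest.

    `samples` is a list of 2-D arrays that share the same number of columns.
    (Hypothetical helper, not part of tflearn.)
    """
    # Keep only samples that are tall enough, cut down to the first rows.
    kept = [s[:target_rows] for s in samples if s.shape[0] >= target_rows]
    if not kept:
        raise ValueError("no sample has at least target_rows rows")
    # Stack into one array of shape (n_kept, target_rows, n_cols).
    batch = np.stack(kept).astype(np.float32)
    # Add a channel axis: (n_kept, target_rows, n_cols, 1).
    return batch[..., np.newaxis]

# Toy data mirroring the question: one 4*9 input and two 1*9 inputs.
samples = [np.ones((4, 9)), np.ones((1, 9)), np.ones((1, 9))]

cropped = crop_to_fixed_rows(samples, target_rows=1)  # crop everything to 1 row
dropped = crop_to_fixed_rows(samples, target_rows=4)  # keep only the 4-row input
print(cropped.shape, dropped.shape)
```

With a fixed shape like this, the input layer would presumably be declared with a matching shape, for example `input_data(shape=[None, 1, 9, 1])` for the cropped case.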

If one row is one sample, you can simply merge all the data into one big array and process it the usual way.
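A minimal sketch of that merge, assuming each input is a NumPy array with 9 columns. The per-input labels are hypothetical (the question does not mention labels); they only show how a label per input could be expanded to a label per row.

```python
import numpy as np

# Toy inputs mirroring the question: 4*9, 1*9 and 1*9.
inputs = [np.zeros((4, 9)), np.zeros((1, 9)), np.zeros((1, 9))]
# Hypothetical labels, one per input.
input_labels = np.array([0, 1, 0])

# Stack all rows into a single (total_rows, 9) array.
rows = np.vstack(inputs)

# Repeat each input's label once for every row that input contributed.
row_counts = [x.shape[0] for x in inputs]
row_labels = np.repeat(input_labels, row_counts)

print(rows.shape, row_labels.shape)
```

Each row is now an independent sample with 9 features, so the data has a consistent shape no matter how many rows each original input had.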

Source (Stackoverflow)