Joseph P Nardone - 2 months ago 26
Python Question

Numpy Savetxt Overwriting, Cannot Figure Out Where to Place Loop

I am creating a program that calculates correlations within my customers' data. I want to write the correlation values to a CSV so I can analyze the data further.

I have successfully gotten my program to loop through all the customers (12 months of data each) while calculating their individual correlations for multiple arrangements. I can see this if I print to the dialog.

However, when I try to save using numpy.savetxt, I only get the final values I calculate.

I think I have put my for loop in the wrong place. Where should it go? I have looked at other questions, but they didn't shed much light on it.

EDIT: I have tried aligning the write with both the outer for loop and the inner for loop as suggested; both yielded the same results.

for x_customer in range(0,len(overalldata),12):

    for x in range(0,13,1):
        cust_months = overalldata[0:x,1]
        cust_balancenormal = overalldata[0:x,16]
        cust_demo_one = overalldata[0:x,2]
        cust_demo_two = overalldata[0:x,3]
        num_acct_A = overalldata[0:x,4]
        num_acct_B = overalldata[0:x,5]
        #Correlation Calculations
        demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
        demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
        demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
        demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
        demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
        demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]

        result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b)]

    result_correlation_combined = emptylist.append([result_correlation])
    cust_delete_list = [0,(x_customer),1]
    overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)

numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation

Answer

This portion of the code is just sloppy:

                result_correlation = [(demo_one_corr_balance),...]

        result_correlation_combined = emptylist.append([result_correlation])
        cust_delete_list = [0,(x_customer),1]
        overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)

numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation

You set result_correlation in the innermost loop, and then you use it in the final save and print. Obviously it will print only the result of the last iteration.
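To see the overwriting in isolation, here is a minimal sketch with made-up values (the name result and the numbers are only for illustration, they are not from your data). A name assigned inside a loop keeps only whatever the last pass stored in it:

```python
# Hypothetical values, just to show the overwrite pattern.
result = None
for x in range(3):
    result = [x, x * 10]  # replaced on every pass through the loop

# Once the loop has finished, only the final assignment is left.
print(result)
```

Anything you want to keep from earlier passes has to be stored somewhere that accumulates, such as a list you append to.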

Meanwhile you append it to emptylist (assigning the result to result_correlation_combined) outside of the x loop, near the end of the x_customer loop. But you don't do anything with the list.
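A related point about that append line: list.append modifies the list in place and returns None, so the name on the left of the assignment does not hold the collected values. A small sketch, with collected and returned as stand-in names:

```python
collected = []  # stands in for emptylist

# append mutates the list and hands back None
returned = collected.append([1.0, 0.5])

print(returned)   # None
print(collected)  # [[1.0, 0.5]]
```

So the accumulated rows live in the list you appended to, not in the variable you assigned the call to.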

And finally in the x_customer loop you play with overalldata, but I don't see any further use.

Forget about the savetxt for now, and get the data collection straight.
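One possible way to organize the collection is sketched below. It rests on assumptions I am guessing from the question: each customer occupies 12 consecutive rows, the column indices are the ones in your code, and you want one row of six correlations per customer. The helper names (customer_correlations, collect_all) and the random stand-in data are mine, not part of your program. It slices each customer's block instead of deleting rows, appends one row per customer, and writes the file once at the end.

```python
import os
import tempfile

import numpy as np

MONTHS = 12  # assumed rows per customer, as described in the question


def customer_correlations(block):
    """Return six correlations for one customer's block of rows."""
    balance = block[:, 16]
    demo_one, demo_two = block[:, 2], block[:, 3]
    acct_a, acct_b = block[:, 4], block[:, 5]
    pairs = [
        (balance, demo_one), (balance, demo_two),
        (acct_a, demo_one), (acct_b, demo_one),
        (acct_a, demo_two), (acct_b, demo_two),
    ]
    return [np.corrcoef(a, b)[1, 0] for a, b in pairs]


def collect_all(overalldata):
    """Build one row of correlations per customer."""
    rows = []
    for start in range(0, len(overalldata), MONTHS):
        block = overalldata[start:start + MONTHS]
        rows.append(customer_correlations(block))  # accumulate, don't overwrite
    return np.array(rows)


# Synthetic stand-in for overalldata: 3 customers x 12 months, 17 columns.
rng = np.random.default_rng(0)
overalldata = rng.random((3 * MONTHS, 17))

all_corr = collect_all(overalldata)
print(all_corr.shape)  # (3, 6): one row per customer

# Save once, after the loops. A temp directory is used here only so the
# sketch runs anywhere; use your own path, e.g. 'correlationoutput.csv'.
out_path = os.path.join(tempfile.gettempdir(), 'correlationoutput.csv')
np.savetxt(out_path, all_corr, delimiter=',')
```

If you really do need the growing windows from your inner x loop, the same pattern applies: append inside whichever loop produces a row you want to keep, and call savetxt only after all loops have finished.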
