dizzyLife - 1 year ago 119

Python Question

I've been working on Python for around 2 months now so I have a OK understanding of it.

My goal is to create a matrix using CSV data, then populating that matrix from the data in the 3rd column of that CSV file.

I came up with this code thus far:

`import csv`

import csv

def readcsv(csvfile_name):

with open(csvfile_name) as csvfile:

file=csv.reader(csvfile, delimiter=",")

#remove rubbish data in first few rows

skiprows = int(input('Number of rows to skip? '))

for i in range(skiprows):

_ = next(file)

#change strings into integers/floats

for z in file:

z[:2]=map(int, z[:2])

z[2:]=map(float, z[2:])

print(z[:2])

return

After removing the rubbish data with the above code, the data in the CSV file looks like this:

`Input:`

1 1 51 9 3

1 2 39 4 4

1 3 40 3 9

1 4 60 2 .

1 5 80 2 .

2 1 40 6 .

2 2 28 4 .

2 3 40 2 .

2 4 39 3 .

3 1 10 . .

3 2 20 . .

3 3 30 . .

3 4 40 . .

. . . . .

The output should look like this:

`1 2 3 4 . .`

1 51 39 40 60

2 40 28 40 39

3 10 20 30 40

.

.

There are about a few thousand rows and columns in this CSV file, however I'm only interested is the first 3 columns of the CSV file. So the first and second columns are basically like co-ordinates for the matrix, and then populating the matrix with data in the 3rd column.

After lots of trial and error, I realised that numpy was the way to go with matrices. This is what I tried thus far with example data:

`left_column = [1, 2, 1, 2, 1, 2, 1, 2]`

middle_column = [1, 1, 3, 3, 2, 2, 4, 4]

right_column = [1., 5., 3., 7., 2., 6., 4., 8.]

import numpy as np

m = np.zeros((max(left_column), max(middle_column)), dtype=np.float)

for x, y, z in zip(left_column, middle_column, right_column):

x -= 1 # Because the indicies are 1-based

y -= 1 # Need to be 0-based

m[x, y] = z

print(m)

#: array([[ 1., 2., 3., 4.],

#: [ 5., 6., 7., 8.]])

However, it is unrealistic for me to specify all of my data in my script to generate the matrix. I tried using the generators to pull the data out of my CSV file but it didn't work well for me.

I learnt as much numpy as I could, however it appears like it requires my data to already be in matrix form, which it isn't.

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

This is my solution using only the csv library, and working with the index\position in the csv (using an offset which I use to mantain memory on the current row)

```
import csv
with open('test.csv', 'r') as csvfile:
spamreader = csv.reader(csvfile, delimiter=',')
list_of_list = []
j=0
lines = [line for line in spamreader]
for i in range(len(lines)):
list_ = []
if(len(lines)<=i+j):
break;
first = lines[i+j][0]
while(first == lines[i+j][0]):
list_.append(lines[i+j][2])
j+=1
if(len(lines)<=i+j):
break;
j-=1
list_of_list.append(list(map(float,list_)))
maxlen = len(max(list_of_list))
print("\t"+"\t".join([str(el) for el in range(1,maxlen+1)])+"\n")
for i in range(len(list_of_list)):
print(str(i+1)+"\t"+"\t".join([str(el) for el in list_of_list[i]])+"\n")
```

Anyway the solution posted by Saullo is more elegant

This is my output:

```
1 2 3 4 5
1 51.0 39.0 40.0 60.0 80.0
2 40.0 28.0 40.0 39.0
3 10.0 20.0 30.0 40.0
```

I wrote a new version of the code with an iterator, because the csv is too big to fit in memory

```
import csv
with open('test.csv', 'r') as csvfile:
spamreader = csv.reader(csvfile, delimiter=',')
list_of_list = []
line1 = next(spamreader)
first = line1[0]
list_ = [line1[2]]
for line in spamreader:
while(line[0] == first):
list_.append(line[2])
try:
line = next(spamreader)
except :
break;
list_of_list.append(list(map(float,list_)))
list_ = [line[2]]
first = line[0]
maxlen = len(max(list_of_list))
print("\t"+"\t".join([str(el) for el in range(1,maxlen+1)])+"\n")
for i in range(len(list_of_list)):
print(str(i+1)+"\t"+"\t".join([str(el) for el in list_of_list[i]])+"\n")
```

Anyway probably you need to work on the Matrix in chunks (and doing swaps) because probably the data wan't fit in a 2d array

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**