labmat labmat - 1 month ago 6
Python Question

Read numbers from many text files and average them in Python

I have just started with Python, and I have about 6000 .txt files, each containing a few numbers in a column, like:

file1.txt:


2

43

78


file2.txt:


98

12


and so on


I want to read them all into one array and calculate its mean, i.e. the mean of (2, 43, 78, 98, 12, ...): all numbers from all files should give a single mean.
When I read and store them, they look like:


['2, 43, 78', '98, 12',..]


(I got rid of the '\n'.)
But when I use
ave = sum(a)/float(len(a))
I get an error.
What am I doing wrong?
Is there anything I missed or another way to do this?

Code:

import fnmatch
import os

rootPath = 'D:/Data'
pattern = '*.txt'
all_data = []
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        #print( filename )
        name = os.path.join(root, filename)
        str = open(name, 'r').read()
        #print str
        all_data.append(str)
a=[item.replace('\n', ' ') for item in all_data]
#print a
for val in a:
    values = map(float, val.split(", "))
    ave = sum(values)/len(values)
    print ave


I get the error:


invalid literal for float()

Answer

sum("abc") is not defined. Neither is sum("2, 43"). sum works only on numeric types.
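To see why, here is a small sketch (written for Python 3; try_sum is a hypothetical helper, not part of the question's code). sum starts from the integer 0, so its first step is 0 + '2, 43, 78', which raises a TypeError:

```python
def try_sum(items):
    """Return sum(items), or a short message if the items cannot be added."""
    try:
        return sum(items)
    except TypeError as exc:
        # sum() begins with the int 0, and int + str is not defined.
        return "TypeError: %s" % exc


print(try_sum(['2, 43, 78', '98, 12']))  # prints a TypeError message
print(try_sum([2.0, 43.0, 78.0]))        # prints 123.0
```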

You need to split each line first and convert the pieces to a numeric type (I used float here, because then the sum is a float, so there is no need to convert the len to a float):

rows = ['2, 43, 78', '98, 12']
total_sum = total_len = 0
for row in rows:
    # Split on commas; float() ignores the whitespace around each piece.
    values = map(float, row.split(','))
    total_sum += sum(values)
    total_len += len(values)
print total_sum/total_len

For Python 3.x, replace the print statement with print(total_sum/total_len) and wrap the map in list(), because map returns an iterator there and len is not defined for it.
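With those two changes applied, a Python 3 sketch of the same loop could look like this. It assumes the rows are comma-separated, as in the sample list, so it splits on ','; the name average is only there for illustration:

```python
rows = ['2, 43, 78', '98, 12']
total_sum = 0.0
total_len = 0
for row in rows:
    # list() materializes the map iterator so that len() works in Python 3.
    values = list(map(float, row.split(',')))
    total_sum += sum(values)
    total_len += len(values)

average = total_sum / total_len
print(average)
```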

This is similar to what @VadimK has in his answer, but it avoids list addition and just keeps a running sum and count instead.
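Putting it together with the directory walk from the question, a rough Python 3 sketch might look like the following. It assumes each file holds one number per line, as described in the question, and skips blank lines, which float() cannot parse. The function name mean_of_files and its parameters are made up for illustration:

```python
import fnmatch
import os


def mean_of_files(root_path, pattern='*.txt'):
    """Return the mean of all numbers in matching files under root_path.

    Assumes one number per line; blank lines are skipped.
    """
    total_sum = 0.0
    total_len = 0
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            with open(os.path.join(root, filename), 'r') as handle:
                for line in handle:
                    text = line.strip()
                    if text:  # skip blank lines
                        total_sum += float(text)
                        total_len += 1
    if total_len == 0:
        raise ValueError('no numbers found under %r' % root_path)
    return total_sum / total_len
```

You would call it as mean_of_files('D:/Data'). Reading line by line keeps only a running sum and count in memory, so the 6000 files never have to be held in one list.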