labmat labmat - 8 months ago 34
Python Question

Read numbers from many text files and average them in Python

I have just started learning Python, and I have about 6000 .txt files, each containing a few numbers in a column, like:








and so on

I want to read them, store them in an array, and calculate the mean.
The mean should cover all numbers from all files, i.e. the mean of (2, 43, 78, 98, 12, ...) should be a single value.
When I read and store them, they look like:

['2, 43, 78', '98, 12',..]

... (I got rid of the '\n')
But when I use
ave = sum(a)/float(len(a))
I get an error.
What am I doing wrong?
Is there anything I missed or another way to do this?


import fnmatch
import os

rootPath = 'D:/Data'
pattern = '*.txt'
all_data = []
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        #print( filename )
        name = os.path.join(root, filename)
        str = open(name, 'r').read()
        #print str
a=[item.replace('\n', ' ') for item in all_data]
#print a
for val in a:
    values = map(float, val.split(", "))
    ave = sum(values)/len(values)
    print ave

I get error:

invalid literal for float()


sum("abc") raises a TypeError, and so does sum("2, 43") or sum over a list of such strings. sum is meant for numbers, not strings.
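
A small Python 3 sketch of that failure, using the sample rows from the question. The helper name describe_sum is hypothetical and exists only for this illustration:

```python
rows = ['2, 43, 78', '98, 12']  # sample data from the question


def describe_sum(items):
    """Hypothetical helper: return the sum, or the exception name if sum() fails."""
    try:
        return sum(items)
    except TypeError as exc:
        # sum() starts from 0, and 0 + '2, 43, 78' is not a valid operation
        return type(exc).__name__


print(describe_sum(rows))               # strings: sum() fails
print(describe_sum([2.0, 43.0, 78.0]))  # numbers: sum() works
```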

You need to split each row first and convert the pieces to numbers (I used float here, so the sum is already a float and there is no need to convert the len to a float):

rows = ['2, 43, 78', '98, 12']
total_sum = total_len = 0
for row in rows:
    values = map(float, row.split(','))  # split on commas; float() ignores the surrounding spaces
    total_sum += sum(values)
    total_len += len(values)
print total_sum/total_len

For Python 3.x replace print total_sum/total_len with print(total_sum/total_len) and wrap the map in list(), because map returns an iterator there and len is not defined for it.
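
With those two changes applied, a Python 3 version might look like the sketch below. It assumes the rows are comma-separated, as in the sample data; the variable name mean is introduced here for illustration:

```python
rows = ['2, 43, 78', '98, 12']  # sample data from the question

total_sum = 0.0
total_len = 0
for row in rows:
    # list() is needed in Python 3: map() returns a lazy iterator with no len()
    values = list(map(float, row.split(',')))
    total_sum += sum(values)
    total_len += len(values)

mean = total_sum / total_len  # one mean over all numbers from all rows
print(mean)
```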

This is similar to what @VadimK has in his answer, but it avoids list concatenation and just keeps running totals instead.
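
Combined with the directory walk from the question, the same running-total idea could be sketched as follows for Python 3. This is only one possible arrangement: the function name mean_of_files is hypothetical, and the code assumes each file holds plain numbers separated by newlines, spaces, or commas.

```python
import fnmatch
import os


def mean_of_files(root_path, pattern='*.txt'):
    """Hypothetical helper: mean of every number in matching files under root_path."""
    total_sum = 0.0
    total_len = 0
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            with open(os.path.join(root, filename)) as handle:
                # Treat commas as whitespace so both "2\n43" and "2, 43" parse
                tokens = handle.read().replace(',', ' ').split()
            total_sum += sum(float(token) for token in tokens)
            total_len += len(tokens)
    if total_len == 0:
        # Assumption: an empty data set is an error rather than a mean of 0
        raise ValueError('no numbers found under %r' % root_path)
    return total_sum / total_len
```

Calling mean_of_files('D:/Data') would then give a single mean across all files, without holding all 6000 files' contents in memory at once.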