user1801867 - 2 months ago
Python Question

Reading text file and matching against a value threshold

I have a large number of .txt files (N > 1000) containing data of interest. I want to identify the files whose "mean" value exceeds a given threshold (say, 0.5) and print the name of each such file. The data in each file are organized like this:

[
    {
        "parameter": {
            "max": 0.6640571758027143,
            "mean": 0.13404294175225137,
            "min": 0.0,
            "std": 0.09435715828616785
        },
        {
            "intensity": [
                {
                    "max": [
                        3.1719575216784217
                    ],
                    "mean": [
                        -3.552713678800501e-17
                    ],
                    "min": [
                        -2.707115982837323
                    ],
                    "std": [
                        1.0000000000000004
                    ]
...


To make matters slightly more complicated, I only wish to read the "mean" value for the "parameter" and not for "intensity".

My idea was to read each file inside a for loop, roughly like this:

subjects = [allmyfilenames]
for subj in subjects:
    file = open('C:/%s.txt' % subj, 'r')
    for line in file.readlines():
        print line


From there, I am a bit lost. How might I identify the correct line to use in matching against my threshold (0.5)?

Answer

Try something like this. I wasn't entirely sure of your data format, but it should work for the sample shown above, assuming each file is valid JSON. Not tested.

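import json

subjects = [allmyfilenames]
for subj in subjects:
    with open('C:/%s.txt' % subj, 'r') as datafile:
        # Parse the whole file as JSON instead of scanning it line by line
        data = json.load(datafile)
    # Only the "parameter" block is checked; "intensity" is never touched
    if data[0]['parameter']['mean'] > 0.5:
        print(subj)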
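
With more than a thousand files, a single malformed or missing file would abort the loop above. Here is a slightly more defensive sketch of the same idea, written for Python 3. The function name files_above_threshold and its parameters paths and threshold are made up for illustration, and it still assumes every file is a JSON list whose first element holds the "parameter" block:

```python
import json


def files_above_threshold(paths, threshold=0.5):
    """Return the paths whose data[0]['parameter']['mean'] exceeds threshold.

    Hypothetical helper: assumes each file is a JSON list whose first
    element is an object with a "parameter" block, as in the sample.
    """
    matches = []
    for path in paths:
        try:
            with open(path, 'r') as datafile:
                data = json.load(datafile)
            # Read only the "parameter" mean; "intensity" is never looked at.
            mean = float(data[0]['parameter']['mean'])
        except (OSError, ValueError, KeyError, IndexError, TypeError) as err:
            # Report and skip files that are missing, are not valid JSON,
            # or do not have the expected layout.
            print('skipping %s: %s' % (path, err))
            continue
        if mean > threshold:
            matches.append(path)
    return matches
```

You could call it as files_above_threshold(['C:/%s.txt' % subj for subj in subjects]) and print each returned path. Note that the sample you posted is not valid JSON as excerpted, so if the real files look exactly like that, they would all be reported as skipped.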