theguyty - 9 days ago
Python Question

Python read from specific parts of a text file

I have a text file that looks like this:

1
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
2
Subpop 0 best fitness of run: Fitness: Standardized=61.0 Adjusted=0.016129032258064516 Hits=28
3
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
4
Subpop 0 best fitness of run: Fitness: Standardized=70.0 Adjusted=0.014084507042253521 Hits=19
5
Subpop 0 best fitness of run: Fitness: Standardized=72.0 Adjusted=0.0136986301369863 Hits=17
6
Subpop 0 best fitness of run: Fitness: Standardized=67.0 Adjusted=0.014705882352941176 Hits=22
7
Subpop 0 best fitness of run: Fitness: Standardized=65.0 Adjusted=0.015151515151515152 Hits=24
8
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
9
Subpop 0 best fitness of run: Fitness: Standardized=78.0 Adjusted=0.012658227848101266 Hits=11
10
Subpop 0 best fitness of run: Fitness: Standardized=65.0 Adjusted=0.015151515151515152 Hits=24


I am trying to use Python to extract the numbers from the "Standardized" and "Hits" fields of each line and put them in two separate lists, but I am unfamiliar with reading from files in Python. What would be the best way to do this?

Answer

We do not usually write code for people, but this looks like it might not be homework. I also want to state an important point.

A file is an iterable of newline-terminated strings. A list of newline-terminated strings is also such an iterable. So start with a list for development, and switch to an opened file later, once the code works on the in-code list. Not doing this is, in my opinion, a big mistake and a source of problems.
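To illustrate the point, here is a minimal sketch. The helper name count_payoff_lines is made up for this example, and io.StringIO stands in for an opened file; the same function accepts either source because it only needs something it can loop over.

```python
import io

# One junk line and one payoff line, taken from the sample data
text = (
    "1\n"
    "Subpop 0 best fitness of run: Fitness: Standardized=73.0 "
    "Adjusted=0.013513513513513514 Hits=16\n"
)

def count_payoff_lines(lines):
    """Count the lines that carry fitness data (hypothetical helper)."""
    return sum(1 for line in lines if 'Standardized=' in line)

as_list = text.splitlines(keepends=True)  # in-code list, used during development
as_file = io.StringIO(text)               # file-like object, iterates line by line

list_count = count_payoff_lines(as_list)
file_count = count_payoff_lines(as_file)
print(list_count, file_count)
```

Both calls go through the same loop, so code developed against the list should carry over to an opened file unchanged.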

Next, iterate over the lines and skip the 'junk' ones. Then parse the payoff lines and process the extracted data however you need. How to parse depends on the problem; below I use the str.splitlines and str.split methods.

file = '''\
1
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
2
Subpop 0 best fitness of run: Fitness: Standardized=61.0 Adjusted=0.016129032258064516 Hits=28
3
Subpop 0 best fitness of run: Fitness: Standardized=73.0 Adjusted=0.013513513513513514 Hits=16
4
Subpop 0 best fitness of run: Fitness: Standardized=70.0 Adjusted=0.014084507042253521 Hits=19
5
Subpop 0 best fitness of run: Fitness: Standardized=72.0 Adjusted=0.0136986301369863 Hits=17
'''.splitlines(keepends=True)

stand = []
hits = []

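for line in file:
    # The run-number lines are short; the payoff lines are much longer
    if len(line) < 50:
        continue
    # Splitting on '=' puts each value at the start of the following field
    fields = line.split('=')
    stand.append(float(fields[1].split()[0]))  # value after 'Standardized='
    hits.append(int(fields[3].split()[0]))     # value after 'Hits='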

print(stand)
print(hits)
# prints
# [73.0, 61.0, 73.0, 70.0, 72.0]
# [16, 28, 16, 19, 17]
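Once the loop works on the in-code list, switching to a real file could look like the sketch below. The function name parse_fitness and the temporary file are assumptions made for illustration; in practice you would pass your own file's path to open(). The sketch also filters on the text 'Standardized=' rather than on line length, which is a variation on the code above, not part of it.

```python
import os
import tempfile

def parse_fitness(lines):
    """Return (stand, hits) lists from any iterable of lines (hypothetical helper)."""
    stand, hits = [], []
    for line in lines:
        if 'Standardized=' not in line:  # skip the run-number lines
            continue
        fields = line.split('=')
        stand.append(float(fields[1].split()[0]))
        hits.append(int(fields[3].split()[0]))
    return stand, hits

sample = (
    "1\n"
    "Subpop 0 best fitness of run: Fitness: Standardized=73.0 "
    "Adjusted=0.013513513513513514 Hits=16\n"
    "2\n"
    "Subpop 0 best fitness of run: Fitness: Standardized=61.0 "
    "Adjusted=0.016129032258064516 Hits=28\n"
)

# Write the sample to a temporary file so the sketch runs on its own
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# The opened file is passed in exactly where the list was used before
with open(path) as f:
    stand, hits = parse_fitness(f)
os.remove(path)

print(stand)
print(hits)
# prints
# [73.0, 61.0]
# [16, 28]
```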