Aquiles Páez - 1 month ago
Python Question

Python 2.7: Variable "is not defined"

I'm using PhysioNet's database for some tasks related to ECG signal analysis. I want to read .MAT files, extract the MLII readings from each file (located along row 1), convert the signal to mV using "gain" and "base" (given in the .INFO file also supplied by PhysioNet), and finally print the signal values and its period.

I wanted to write a script that could do all of those things for all the files in one folder. Before this, I wrote one in which I could do everything mentioned above, and it worked nicely.

But the script that is supposed to handle all the .mat and .info files in my folder is giving me problems with the variables. I tried using the 'global' statement at the very beginning of my sequence of IFs, but I kept getting a similar error message.

This is the code:

import os
import scipy.io as sio
import numpy as np
import re
import matplotlib.pyplot as plt

for file in os.listdir('C:blablablablabla\Multiple .mat files'):
    if file.endswith(".mat"):
        file_name=os.path.splitext(file)
        ext_txt=".txt"
        ext_info=".info"
        if file.endswith(".info"):
            f=open(file_name[0]+ext_info,'r')
            k=f.read()
            f.close()
            j=re.findall('\d+', k)
            Fs=j[9]
            gain=j[13]
            base=j[14]

        RawData=sio.loadmat(file)
        signalVectors=RawData['val']
        [a,b]=signalVectors.shape
        signalVectors_2=np.true_divide((signalVectors-gain),base)
        ecgSignal=signalVectors_2[1,1:]
        T=np.true_divide(np.linspace(1,b,num=b-1),Fs)
        txt_data=np.array([ecgSignal, T])
        txt_data=txt_data.T
        f=open(file_name[0]+ext_name,'w')
        np.savetxt(file_name[0]+ext_txt,txt_data,fmt=['%.8f','%.8f'])
        f.close()


The error message I get is:

  File "C:blablablablabla\Multiple .mat files\ecg_mat_multi.py", line 24, in <module>
    signalVectors_2=np.true_divide((signalVectors-gain),base)
NameError: name 'gain' is not defined


The problem comes with the variables 'gain', 'base' and 'Fs'. I tried to define them as global variables, but that didn't make a difference. Can you help me fix this error, please?

Thanks a lot for your time and help.

EDIT 1: copied the error message below the script.
EDIT 2: Changed the post title and erased additional questions.

Answer

Use two loops, and extract the info before processing the data files:

for filepath in os.listdir('C:blablablablabla\Multiple .mat files'):
    if filepath.endswith(".info"):
        Fs, gain, base = get_info(filepath)
        break
for file in os.listdir('C:blablablablabla\Multiple .mat files'):
    if file.endswith(".mat"):
        file_name=os.path.splitext(file)
        ...
        RawData=sio.loadmat(file)
        signalVectors=RawData['val']
        ...
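
For context on why the original loop raises the NameError: a name only exists once an assignment to it has actually run. If the data is processed before the `.info` branch has executed (or that branch never executes at all), `gain` is simply unbound, and declaring it `global` at module level changes nothing. Here is a minimal sketch of that situation; the file names and the value 200 are made up for illustration and no files are read:

```python
# Hypothetical directory listing; the .mat name happens to come first.
names = ["100m.mat", "100m.info"]

error = None
try:
    for name in names:
        if name.endswith(".info"):
            gain = 200            # bound only if this branch runs
        if name.endswith(".mat"):
            scaled = 1000 - gain  # 'gain' has not been assigned yet
except NameError as exc:
    error = str(exc)

print(error)

# Collecting the info in a first pass guarantees the value exists
# before any .mat name is handled in the second pass.
info = {}
for name in names:
    if name.endswith(".info"):
        info["gain"] = 200
ready = [name for name in names if name.endswith(".mat") and "gain" in info]
print(ready)
```

Run as a script, the first part catches the same kind of NameError as in the question, while the second part only processes `.mat` names once the info is available. That is the reason for splitting the work into two loops.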

I was working off your first edit, so I'll include this even though the question has since been streamlined. This is a sample .info file:

# foo.info
Source: record mitdb/100 Start: [00:00:10.000]
val has 2 rows (signals) and 3600 columns (samples/signal)
Duration: 0:10
Sampling frequency: 360 Hz Sampling interval: 0.002777777778 sec
Row  Signal  Gain  Base  Units
1    MLII    200   1024  mV
2    V5      200   1024  mV

To convert from raw units to the physical units shown
above, subtract 'base' and divide by 'gain'.
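
The conversion note at the end of the .info file can be sketched as below. Note that it is `(raw - base) / gain`, which is the opposite order from the `(signalVectors-gain)/base` in the question. The function name `to_physical` and the raw sample values are made up for illustration; `gain` and `base` are the MLII values from the sample file.

```python
def to_physical(raw, gain, base):
    """Convert raw units to physical units (mV here): (raw - base) / gain.

    float(gain) forces true division, which matters on Python 2.7.
    The same expression also works elementwise if raw is a NumPy array.
    """
    return (raw - base) / float(gain)

gain, base = 200, 1024           # MLII row of the sample .info file
raw_samples = [1024, 1224, 824]  # made-up raw values
millivolts = [to_physical(v, gain, base) for v in raw_samples]
print(millivolts)  # [0.0, 1.0, -1.0]
```

The numbers have to be real numbers for this to work: `re.findall` in the question returns strings, so `gain` and `base` would need converting with `int` or `float` first.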

I would also write a function that returns the info you want. Using a function to extract the info makes the code in your loop more readable and it makes it easier to test the extraction.

Since the file is well structured, you could probably iterate over the lines and extract the info by counting lines and using str.split and slices.
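
A rough sketch of that `str.split` idea follows. It is a variation that keys on the content of each line rather than on its position, and it assumes the layout of the sample file above (whitespace-separated columns, integer gain and base). The name `parse_info_lines` is hypothetical.

```python
def parse_info_lines(text):
    """Return (Fs, gain, base) for the MLII row by splitting lines on whitespace."""
    fs = gain = base = None
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("Sampling frequency:"):
            # fields: ['Sampling', 'frequency:', '360', 'Hz', ...]
            fs = int(fields[2])
        elif len(fields) >= 4 and fields[1] == "MLII":
            # fields: ['1', 'MLII', '200', '1024', 'mV']
            gain, base = int(fields[2]), int(fields[3])
    if None in (fs, gain, base):
        raise ValueError("could not find Fs, gain and base in the .info text")
    return fs, gain, base

sample = """Source: record mitdb/100 Start: [00:00:10.000]
val has 2 rows (signals) and 3600 columns (samples/signal)
Duration: 0:10
Sampling frequency: 360 Hz Sampling interval: 0.002777777778 sec
Row\tSignal\tGain\tBase\tUnits
1\tMLII\t200\t1024\tmV
2\tV5\t200\t1024\tmV
"""
print(parse_info_lines(sample))  # (360, 200, 1024)
```

Because `str.split()` with no argument splits on any run of whitespace, this does not care whether the columns are separated by tabs or spaces.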

This function uses regex patterns to extract the info:

# regex patterns
hz_pattern = r'frequency: (\d+) Hz'
mlii_pattern = r'MLII\t(\d+)\t(\d+)'

def get_info(filepath):
    with open(filepath) as f:
        info = f.read()
    match = re.search(hz_pattern, info)
    Fs = match.group(1)
    match = re.search(mlii_pattern, info)
    gain, base = match.groups()
    return map(int, (Fs, gain, base))
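
To check the patterns without touching the disk, the matching can be split out into a helper that works on text. This is only a sketch: `parse_info` is a hypothetical name, the sample string assumes the columns are tab-separated (which is what the `\t` in the MLII pattern expects), and the explicit `None` check is an addition so that a non-matching file gives a clear error instead of an `AttributeError`.

```python
import re

# Same patterns as above.
HZ_PATTERN = r'frequency: (\d+) Hz'
MLII_PATTERN = r'MLII\t(\d+)\t(\d+)'

def parse_info(text):
    """Return (Fs, gain, base) as ints, or raise ValueError if a pattern fails."""
    hz = re.search(HZ_PATTERN, text)
    mlii = re.search(MLII_PATTERN, text)
    if hz is None or mlii is None:
        raise ValueError("unexpected .info layout")
    return int(hz.group(1)), int(mlii.group(1)), int(mlii.group(2))

sample = ("Sampling frequency: 360 Hz Sampling interval: 0.002777777778 sec\n"
          "Row\tSignal\tGain\tBase\tUnits\n"
          "1\tMLII\t200\t1024\tmV\n")
print(parse_info(sample))  # (360, 200, 1024)
```

If the real files turn out to use spaces instead of tabs, replacing `\t` with `\s+` in the MLII pattern should cover both cases.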