david_doji david_doji - 3 months ago 8
Python Question

Select certain files from a directory

In the same directory I have several files. Some of them are sample measurements and others are references. They look like this:

blablabla_350.dat
blablabla_351.dat
blablabla_352.dat
blablabla_353.dat
...
blablabla_100.dat
blablabla_101.dat
blablabla_102.dat


The ones ending in 350 to 353 are my samples, and the ones ending in 100, 101 and 102 are the references. The good thing is that the samples and the references are each numbered consecutively.

I would like to separate them into two different lists, samples and references.

One idea would be something like this (not working yet):

import glob

samples = []
references = []

ref = raw_input("Enter first reference name: ")
num_refs = raw_input("How many references are? ")

ref = sorted(glob.glob(ref+num_refs))

samples = sorted(glob.glob(*.dat)) not in references


So the reference list would take the first name specified plus the files that follow it (as many as the number specified). All the rest would be samples.
Any ideas on how to write this in Python?

Answer

You can use glob.glob('*.dat') to get a list of all of the files and then slice that list according to your criteria. The slice will begin at the index of the first reference name, and be as large as the number of references.

Extract that slice to get your references. Delete that slice to get your samples.

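import glob

ref = raw_input("Enter first reference name: ")              # e.g. blablabla_100.dat
num_refs = int(raw_input("How many references are there? ")) # e.g. 3

# Sorted list of every .dat file in the current directory
all_files = sorted(glob.glob('*.dat'))

# The references are num_refs consecutive entries starting at the named file
# (index() raises ValueError if that name is not in the list)
first_ref = all_files.index(ref)
ref_files = all_files[first_ref:first_ref+num_refs]

# Copy the full list, then delete the reference slice; the rest are samples
sample_files = list(all_files)
del sample_files[first_ref:first_ref+num_refs]

print ref_files, sample_files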

Result:

['blablabla_100.dat', 'blablabla_101.dat', 'blablabla_102.dat'] ['blablabla_350.dat', 'blablabla_351.dat', 'blablabla_352.dat', 'blablabla_353.dat']
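
For anyone on Python 3, where raw_input and the print statement no longer exist, the same slicing idea can be sketched as a small function. This is only an illustration: the name split_files and its parameters are made up here, and the file names are hard-coded instead of read with glob so the snippet runs without any files on disk.

```python
def split_files(all_files, first_ref, num_refs):
    """Split file names into (references, samples).

    all_files: iterable of file names (hypothetical parameter name)
    first_ref: name of the first reference file
    num_refs:  how many consecutive files, starting at first_ref, are references
    """
    ordered = sorted(all_files)
    start = ordered.index(first_ref)  # raises ValueError if first_ref is absent
    end = start + num_refs
    references = ordered[start:end]             # the slice itself
    samples = ordered[:start] + ordered[end:]   # everything outside the slice
    return references, samples


# Example names taken from the question, in arbitrary order
names = [
    "blablabla_350.dat", "blablabla_351.dat", "blablabla_352.dat",
    "blablabla_353.dat", "blablabla_100.dat", "blablabla_101.dat",
    "blablabla_102.dat",
]
references, samples = split_files(names, "blablabla_100.dat", 3)
print(references)
print(samples)
```

With real files you would pass glob.glob('*.dat') as the first argument. Building samples from the two outer slices leaves the sorted list untouched, so there is nothing to delete afterwards.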