swe_mattias - 5 months ago
Ruby Question

List of pairs, find those that are not paired

I read a list of pairs in the filesystem (Linux)...
UniqueDocument.xml
UniqueDocument.pdf

I need to find the entries that do not have an XML file, then fetch the missing file.

I have been trying os.list with a regex, and Dir() in Ruby, but haven't found a suitable solution. I can't get to the end... my mind blocks me.

Answer

In Ruby,

# Get arrays of base names (no directory, no extension) for the pdf and xml files
pdf = Dir.glob("test/*.pdf").map { |f| File.basename(f, '.pdf') }
xml = Dir.glob("test/*.xml").map { |f| File.basename(f, '.xml') }

# Subtract xml from pdf to get the names that have a pdf file but no xml
p pdf - xml
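
Since the question also mentions fetching the missing file, it may help to turn the resulting base names back into the XML paths that are expected but absent. The following is only a sketch: the method name missing_xml_paths and its dir parameter are hypothetical, and it assumes lowercase .pdf and .xml extensions and a directory name without glob metacharacters.

```ruby
# Hypothetical helper: the name and the `dir` parameter are illustrative,
# not part of the original answer.
def missing_xml_paths(dir)
  # Base names (no directory, no extension) for each file type.
  pdf = Dir.glob(File.join(dir, "*.pdf")).map { |f| File.basename(f, ".pdf") }
  xml = Dir.glob(File.join(dir, "*.xml")).map { |f| File.basename(f, ".xml") }

  # Names that have a pdf but no xml, mapped back to the xml path that is missing.
  (pdf - xml).sort.map { |name| File.join(dir, "#{name}.xml") }
end

# Print one missing xml path per line for the folder used in the answer.
missing_xml_paths("test").each { |path| puts path }
```

How each missing file is then fetched depends on where the XML files come from, which the question does not say.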

How does it work?

  1. Dir.glob("test/*.pdf")

returns an array with the paths of all pdf files in the folder test. It looks like ["test/foo.pdf", ...].

  2. File.basename('test/foo.pdf', '.pdf')

returns the file name without the directory and without the extension. In this case, it will return 'foo'.

  3. Dir.glob("test/*.pdf").map {|f| File.basename(f, '.pdf')}

returns an array of file names without extension, taking only pdf files.

  4. pdf - xml

returns all strings that are in pdf but not in xml.
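
The steps above can also be tried without touching the filesystem. In this small sketch the path lists are made-up stand-ins for what Dir.glob might return, and the method name names_without_xml is hypothetical.

```ruby
# Hypothetical helper: takes two arrays of paths instead of reading a folder.
def names_without_xml(pdf_paths, xml_paths)
  # Strip the directory and the given extension from each path.
  pdf = pdf_paths.map { |f| File.basename(f, ".pdf") }
  xml = xml_paths.map { |f| File.basename(f, ".xml") }

  # Array difference keeps the elements of pdf that do not appear in xml.
  pdf - xml
end

# Made-up sample paths, standing in for the output of Dir.glob.
p names_without_xml(["test/foo.pdf", "test/bar.pdf"], ["test/foo.xml"])
# prints ["bar"]
```

Passing the paths in as arguments keeps the comparison logic separate from the directory listing, which makes it easy to check on sample data.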
