בנימן הגלילי - 6 months ago
Ruby Question

Match comma, space and [A-z] with an optional second comma (regex)

I have files named

Author_1999.pdf
Authorone, Authortwo_1999.pdf
Authorone, Authortwo, Authorthree.pdf
Arian, Nachmias, Amir_2002.pdf
Author, Review, Source_2015(2).pdf
Avraham, Hacohen_1930.pdf


that were produced by the reference manager Mendeley. I need them all in the format Authorone1999.pdf or Authorone1999(2).pdf. My regex only matches the last ", Author" segment and the underscore:

/(, )+[A-z ]*,?[A-z]*,?-?[A-z]*_/



How can I also match the optional first ", Authortwo" segment? There are never more than two commas, because longer author lists produced names like

Authorone, et al._1999.pdf


and I've already cleaned those up.

Answer

Here's a simple solution:

/^(?<author>[a-z]+).*_(?<year>[\d()]+)/i


This will store the author and year into two named capture groups.
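
To show how the two capture groups could be turned into the requested names, here is a minimal sketch that applies the regex above to the file names from the question. The helper name short_name is hypothetical, and keeping the original extension via File.extname is an assumption about the desired output. The sketch only prints the proposed names and does not touch any files.

```ruby
# Regex from the answer: the first author at the start of the name,
# and the year (with an optional "(n)" suffix) after the underscore.
PATTERN = /^(?<author>[a-z]+).*_(?<year>[\d()]+)/i

# Hypothetical helper: builds the new name from the two named groups.
# Returns nil when the file name has no "_year" part to match.
def short_name(filename)
  match = PATTERN.match(filename)
  return nil unless match

  "#{match[:author]}#{match[:year]}#{File.extname(filename)}"
end

# Sample names taken from the question.
names = [
  "Author_1999.pdf",
  "Authorone, Authortwo_1999.pdf",
  "Authorone, Authortwo, Authorthree.pdf",
  "Arian, Nachmias, Amir_2002.pdf",
  "Author, Review, Source_2015(2).pdf",
  "Avraham, Hacohen_1930.pdf"
]

# Dry run: print old and proposed names without renaming anything.
names.each do |name|
  new_name = short_name(name)
  puts "#{name} -> #{new_name || '(no match, left unchanged)'}"
end
```

Names without an underscore and year, such as Authorone, Authortwo, Authorthree.pdf, do not match and are left alone. Once the printed names look right, the dry run could be swapped for File.rename(name, new_name) on the matching files.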