Is there any benefit in using compile for regular expressions in Python?
h = re.compile('hello')
re.match('hello', 'hello world')
I've had a lot of experience running a compiled regex 1000s of times versus compiling on-the-fly, and have not noticed any perceivable difference. Obviously, this is anecdotal, and certainly not a great argument against compiling, but I've found the difference to be negligible.
After a quick glance at the actual Python 2.5 library code, I see that Python internally compiles AND CACHES regexes whenever you use them anyway (including calls to
re.match()), so you're really only changing WHEN the regex gets compiled, and shouldn't be saving much time at all - only the time it takes to check the cache (a key lookup on an internal
From module re.py (comments are mine):
def match(pattern, string, flags=0): return _compile(pattern, flags).match(string) def _compile(*key): # Does cache check at top of function cachekey = (type(key),) + key p = _cache.get(cachekey) if p is not None: return p # ... # Does actual compilation on cache miss # ... # Caches compiled regex if len(_cache) >= _MAXCACHE: _cache.clear() _cache[cachekey] = p return p
I still often pre-compile regular expressions, but only to bind them to a nice, reusable name, not for any expected performance gain.