BorrajaX - 5 months ago
Python Question

Compare version strings

I am walking a directory that contains eggs to add those eggs to the sys.path. If there are two versions of the same .egg in the directory, I want to add only the latest one.

I have a regular expression, r"^(?P<eggName>\w+)-(?P<eggVersion>[\d\.]+)-.+\.egg$", to extract the name and version from the filename. The problem is comparing the version number, which is a string like 2.3.1.

Since I'm comparing strings, 2 sorts above 10, but that's not correct for versions.

>>> "2.3.1" > "10.1.1"
True


I could do some splitting, parsing, casting to int, etc., and I would eventually get a workaround. But this is Python, not Java. Is there an elegant way to compare version strings?

Answer

Use distutils.version.

>>> from distutils.version import LooseVersion, StrictVersion
>>> LooseVersion("2.3.1") < LooseVersion("10.1.2")
True
>>> StrictVersion("2.3.1") < StrictVersion("10.1.2")
True
>>> StrictVersion("1.3.a4")
Traceback (most recent call last):
...
ValueError: invalid version number '1.3.a4'

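Note that PEP 386 proposed replacing LooseVersion and StrictVersion with a NormalizedVersion class. That PEP was later superseded by PEP 440, and distutils itself was deprecated in Python 3.10 (PEP 632) and removed in Python 3.12, so on current interpreters the session above needs an older Python or a PEP 440 implementation such as packaging.version.Version.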
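Applied to the original problem, picking the newest egg per name could look roughly like the sketch below. It reuses the regular expression from the question and, so that it also runs where distutils is unavailable, swaps in a plain tuple-of-ints sort key instead of LooseVersion. The helper names (version_key, latest_eggs) and the sample filenames are made up for illustration.

```python
import re

# Pattern from the question: captures the egg name and its dotted version.
EGG_RE = re.compile(r"^(?P<eggName>\w+)-(?P<eggVersion>[\d\.]+)-.+\.egg$")


def version_key(version):
    """Turn '2.3.1' into (2, 3, 1) so that versions compare numerically."""
    return tuple(int(part) for part in version.split(".") if part)


def latest_eggs(filenames):
    """Return {egg name: filename}, keeping only the highest version of each egg."""
    best = {}
    for filename in filenames:
        match = EGG_RE.match(filename)
        if match is None:
            continue  # not an egg filename the pattern recognises
        name = match.group("eggName")
        key = version_key(match.group("eggVersion"))
        if name not in best or key > best[name][0]:
            best[name] = (key, filename)
    return {name: filename for name, (key, filename) in best.items()}


# Hypothetical directory listing.
files = ["foo-2.3.1-py2.7.egg", "foo-10.1.1-py2.7.egg", "bar-1.0-py2.7.egg", "README.txt"]
print(latest_eggs(files))
# {'foo': 'foo-10.1.1-py2.7.egg', 'bar': 'bar-1.0-py2.7.egg'}
```

The tuple key only handles purely numeric versions, which is all the question's pattern lets through; anything with pre-release tags needs one of the version classes instead.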


As the Python docs are empty, here are the relevant docstrings (based on Python 3.3) for reference, nicked from the source:

Every version number class implements the following interface:

  • the 'parse' method takes a string and parses it to some internal representation; if the string is an invalid version number, 'parse' raises a ValueError exception
  • the class constructor takes an optional string argument which, if supplied, is passed to 'parse'
  • __str__ reconstructs the string that was passed to 'parse' (or an equivalent string -- ie. one that will generate an equivalent version number instance)
  • __repr__ generates Python code to recreate the version number instance
  • _cmp compares the current instance with either another instance of the same class or a string (which will be parsed to an instance of the same class, thus must follow the same rules)
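To make that interface concrete, here is a minimal sketch of a class that follows it. SimpleVersion is a hypothetical name, not something in distutils, and it assumes versions are plain dot-separated integers.

```python
from functools import total_ordering


@total_ordering
class SimpleVersion:
    """Hypothetical dotted-integer version class following the interface above."""

    def __init__(self, vstring=None):
        # The constructor takes an optional string and hands it to parse().
        if vstring is not None:
            self.parse(vstring)

    def parse(self, vstring):
        # Internal representation: a tuple of ints; bad input raises ValueError.
        try:
            self.version = tuple(int(part) for part in vstring.split("."))
        except ValueError:
            raise ValueError("invalid version number %r" % vstring)

    def __str__(self):
        # Rebuild a string that parses back to an equivalent instance.
        return ".".join(str(part) for part in self.version)

    def __repr__(self):
        # Python code that recreates this instance.
        return "SimpleVersion(%r)" % str(self)

    def _cmp(self, other):
        # Accept a plain string by parsing it with the same rules.
        if isinstance(other, str):
            other = SimpleVersion(other)
        if self.version == other.version:
            return 0
        return -1 if self.version < other.version else 1

    # The rich comparisons delegate to _cmp; total_ordering derives the rest.
    def __eq__(self, other):
        return self._cmp(other) == 0

    def __lt__(self, other):
        return self._cmp(other) < 0


print(SimpleVersion("2.3.1") < "10.1.2")  # True
print(repr(SimpleVersion("2.3.1")))       # SimpleVersion('2.3.1')
```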

StrictVersion

Version numbering for anal retentives and software idealists. Implements the standard interface for version number classes as described above. A version number consists of two or three dot-separated numeric components, with an optional "pre-release" tag on the end. The pre-release tag consists of the letter 'a' or 'b' followed by a number. If the numeric components of two version numbers are equal, then one with a pre-release tag will always be deemed earlier (lesser) than one without.

The following are valid version numbers (shown in the order that would be obtained by sorting according to the supplied cmp function):

0.4       0.4.0  (these two are equivalent)
0.4.1
0.5a1
0.5b3
0.5
0.9.6
1.0
1.0.4a3
1.0.4b1
1.0.4

The following are examples of invalid version numbers:

1
2.7.2.2
1.3.a4
1.3pl1
1.3c4

The rationale for this version numbering system will be explained in the distutils documentation.
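The StrictVersion rules above are simple enough to sketch with one regular expression and a sort key. This is an illustration of the rules as the docstring states them, not the distutils implementation; STRICT_RE and strict_key are hypothetical names.

```python
import re

# Two or three numeric components, then an optional 'a'/'b' tag plus a number.
STRICT_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?(?:([ab])(\d+))?", re.ASCII)


def strict_key(vstring):
    """Return a sort key for vstring, or raise ValueError if it breaks the rules."""
    match = STRICT_RE.fullmatch(vstring)
    if match is None:
        raise ValueError("invalid version number %r" % vstring)
    major, minor, patch, tag, tag_num = match.groups()
    numbers = (int(major), int(minor), int(patch or 0))  # a missing patch counts as 0
    if tag is None:
        prerelease = (1,)  # final releases sort after any pre-release
    else:
        prerelease = (0, tag, int(tag_num))  # 'a' before 'b', then by number
    return numbers + prerelease


versions = ["1.0.4", "0.5", "0.5b3", "1.0", "0.5a1", "0.4.1"]
print(sorted(versions, key=strict_key))
# ['0.4.1', '0.5a1', '0.5b3', '0.5', '1.0', '1.0.4']
```

With this key, 0.4 and 0.4.0 compare equal, and each of the invalid examples listed above (1, 2.7.2.2, 1.3.a4, 1.3pl1, 1.3c4) raises ValueError.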


LooseVersion

Version numbering for anarchists and software realists. Implements the standard interface for version number classes as described above. A version number consists of a series of numbers, separated by either periods or strings of letters. When comparing version numbers, the numeric components will be compared numerically, and the alphabetic components lexically. The following are all valid version numbers, in no particular order:

1.5.1
1.5.2b2
161
3.10a
8.02
3.4j
1996.07.12
3.2.pl0
3.1.1.6
2g6
11g
0.960923
2.2beta29
1.13++
5.5.kw
2.0b1pl0

In fact, there is no such thing as an invalid version number under this scheme; the rules for comparison are simple and predictable, but may not always give the results you want (for some definition of "want").
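The LooseVersion description boils down to splitting the string into numeric and alphabetic components. A rough sketch of that splitting step follows; loose_components is a hypothetical helper written for illustration, not the distutils API.

```python
import re

# Split on runs of digits, runs of lowercase letters, or single periods.
COMPONENT_RE = re.compile(r"(\d+|[a-z]+|\.)")


def loose_components(vstring):
    """Break a version string into int and str components, dropping periods."""
    components = []
    for part in COMPONENT_RE.split(vstring):
        if not part or part == ".":
            continue  # skip empty split results and the separators themselves
        try:
            components.append(int(part))  # numeric components compare numerically
        except ValueError:
            components.append(part)       # everything else compares lexically
    return components


print(loose_components("1.5.2b2"))     # [1, 5, 2, 'b', 2]
print(loose_components("1996.07.12"))  # [1996, 7, 12]
print(loose_components("2.2beta29"))   # [2, 2, 'beta', 29]
```

Comparing two such lists works component by component, which is where the "may not always give the results you want" caveat bites: in this sketch, under Python 3, comparing [1, 5, 2, 'b', 2] with [1, 5, 2, 1] raises a TypeError, because a string component ends up compared with an integer.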
