Kerbiter - 4 months ago
Python Question

Regex to match only part of certain line

I have some config file from which I need to extract only some values. For example, I have this:

PART
{
title = Some Title
description = Some description here. // this 2 params are needed
tags = qwe rty // don't need this param
...
}


I need to extract the value of a certain param, for example the value of description. How do I do this in Python 3 with a regex?

Answer

The better approach would be to use an established configuration file system. Python has built-in support for INI-like files in the configparser module.
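As a rough illustration of the configparser route, here is a minimal sketch. It assumes the file has been rewritten as INI, with a [PART] section header in place of the PART { ... } block (configparser cannot read the brace syntax as-is). The names INI_TEXT and read_part_value are hypothetical, and treating // as an inline comment marker is an assumption based on the sample file.

```python
import configparser

# Hypothetical INI version of the question's file: a [PART] section header
# replaces the "PART { ... }" block, which configparser cannot parse directly.
INI_TEXT = """\
[PART]
title = Some Title
description = Some description here. // needed
tags = qwe rty
"""


def read_part_value(ini_text, key, section="PART"):
    """Return the value of `key` in `section`, or None if either is missing."""
    # Assumption: "//" starts an inline comment, as in the sample file.
    parser = configparser.ConfigParser(inline_comment_prefixes=("//",))
    parser.read_string(ini_text)
    # fallback=None avoids an exception for a missing section or option.
    return parser.get(section, key, fallback=None)


print(read_part_value(INI_TEXT, "description"))
```

For a file on disk, parser.read(path) would take the place of read_string.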

However, if you just need the text that follows description = in that file, you could do this:

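def get_value_for_key(key, file):
    """Return the text after 'key =' on the first matching line, or None."""
    with open(file) as f:
        for line in f:
            # Ignore leading indentation before comparing.
            line = line.lstrip()
            if line.startswith(key + " ="):
                # Split on the first "=" only, so the value may contain "=".
                return line.split("=", 1)[1].lstrip()
    return None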

You can use it with a call like get_value_for_key("description", "myfile.txt"). The function returns None if nothing is found. It assumes each entry is written with a space and an equals sign after the key name, e.g. key = value.

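This avoids regular expressions altogether and preserves any whitespace on the right side of the value, including the line's trailing newline. (If that's not important to you, you can use strip instead of lstrip.) Note that everything after the equals sign is returned, so an inline // comment like the one in your example stays part of the value.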
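Since the function returns the whole remainder of the line, the description line in the sample file would come back with its // comment attached. Below is a sketch of a variant that drops it, still using plain string methods. The name get_clean_value and the comment_marker parameter are hypothetical, and it takes an iterable of lines (an open file works too) so the example can run without a file on disk. It assumes // never appears inside a real value; a URL, for instance, would be cut short.

```python
def get_clean_value(key, lines, comment_marker="//"):
    """Return the value for `key` without a trailing comment, or None."""
    for line in lines:
        line = line.lstrip()
        if line.startswith(key + " ="):
            value = line.split("=", 1)[1]
            # Keep only the text before the first comment marker, if any.
            value = value.partition(comment_marker)[0]
            # strip() also removes the trailing newline and padding.
            return value.strip()
    return None


sample = [
    "PART\n",
    "{\n",
    "    title = Some Title\n",
    "    description = Some description here. // this 2 params are needed\n",
    "}\n",
]
print(repr(get_clean_value("description", sample)))
```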

Why avoid regular expressions? They add overhead and complexity that this task doesn't need; simple string matching is enough here. It also avoids importing a module and keeps your code simpler. But really, I'd recommend converting to a supported configuration file format.
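For comparison, a regex version of the same lookup might look like the sketch below. It is not part of the approach recommended here; the function name get_value_with_regex is hypothetical, and it works on the file's text rather than a path. It needs import re and a pattern that is harder to read than startswith, which is the trade-off in question.

```python
import re


def get_value_with_regex(key, text):
    """Hypothetical regex counterpart: return the text after 'key =' or None."""
    # ^[ \t]* allows leading indentation, re.escape() guards against regex
    # metacharacters in the key, and (.*) captures the rest of the line.
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r" =[ \t]*(.*)$", re.MULTILINE
    )
    match = pattern.search(text)
    # "." does not match a newline, so the line ending is not captured.
    return match.group(1) if match else None


sample = (
    "PART\n"
    "{\n"
    "    title = Some Title\n"
    "    description = Some description here.\n"
    "}\n"
)
print(get_value_with_regex("description", sample))
```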