Ben - 2 months ago
Python Question

How to use regex to disallow non-digits but allow a dot in Python?

I am new to regex and am trying to remove all non-digits from a string while keeping the dot (.):

x = ['ABCD, EFGH ', ' 20.9&dog; ', ' IJKLM />']


So far I have tried the following:

>>> import re
>>> re.sub(r"\D", "", x[1])
'209'


However I am trying to get the following outcome:

20.9


Thanks.

Answer

You want an inverted character class:

re.sub(r"[^\d.]", "", x)
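Applied to the list from the question, this can be sketched as follows. It is only a minimal illustration, assuming each element should be cleaned separately; the names NON_NUMERIC and cleaned are made up for this example and do not come from the question.

```python
import re

# Sample data from the question: a list of strings, not a single string.
x = ['ABCD, EFGH ', ' 20.9&dog; ', ' IJKLM />']

# Inverted character class: match anything that is neither a digit nor a dot.
NON_NUMERIC = re.compile(r"[^\d.]")

# re.sub works on one string at a time, so clean each element separately.
cleaned = [NON_NUMERIC.sub("", s) for s in x]
print(cleaned)  # ['', '20.9', '']

# Keep only the elements that still contain something.
print([s for s in cleaned if s])  # ['20.9']
```

Note that the class keeps every dot, so an input such as "v1.2.3." would come out as "1.2.3." rather than as a valid number.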

Note that [^0-9.] and [^\d.] are not the same in Python 3: for str patterns, \d matches every Unicode decimal digit, not just 0123456789 (unless the re.ASCII flag is set):

>>> import re, textwrap
>>> print(textwrap.fill(
...    "".join(x for x in (chr(y) for y in range(0x110000))
...            if re.match(r"\d", x)),
...    break_long_words=True, width=10))
0123456789
٠١٢٣٤٥٦٧٨٩
۰۱۲۳۴۵۶۷۸۹
߀߁߂߃߄߅߆߇߈߉
०१२३४५६७८९
০১২৩৪৫৬৭৮৯
੦੧੨੩੪੫੬੭੮੯
૦૧૨૩૪૫૬૭૮૯
୦୧୨୩୪୫୬୭୮୯
௦௧௨௩௪௫௬௭௮௯
౦౧౨౩౪౫౬౭౮౯
೦೧೨೩೪೫೬೭೮೯
൦൧൨൩൪൫൬൭൮൯
෦෧෨෩෪෫෬෭෮෯
๐๑๒๓๔๕๖๗๘๙
໐໑໒໓໔໕໖໗໘໙
༠༡༢༣༤༥༦༧༨༩
၀၁၂၃၄၅၆၇၈၉
႐႑႒႓႔႕႖႗႘႙
០១២៣៤៥៦៧៨៩
᠐᠑᠒᠓᠔᠕᠖᠗᠘᠙
᥆᥇᥈᥉᥊᥋᥌᥍᥎᥏
᧐᧑᧒᧓᧔᧕᧖᧗᧘᧙
᪀᪁᪂᪃᪄᪅᪆᪇᪈᪉
᪐᪑᪒᪓᪔᪕᪖᪗᪘᪙
᭐᭑᭒᭓᭔᭕᭖᭗᭘᭙
᮰᮱᮲᮳᮴᮵᮶᮷᮸᮹
᱀᱁᱂᱃᱄᱅᱆᱇᱈᱉
᱐᱑᱒᱓᱔᱕᱖᱗᱘᱙
꘠꘡꘢꘣꘤꘥꘦꘧꘨꘩
꣐꣑꣒꣓꣔꣕꣖꣗꣘꣙
꤀꤁꤂꤃꤄꤅꤆꤇꤈꤉
꧐꧑꧒꧓꧔꧕꧖꧗꧘꧙
꧰꧱꧲꧳꧴꧵꧶꧷꧸꧹
꩐꩑꩒꩓꩔꩕꩖꩗꩘꩙
꯰꯱꯲꯳꯴꯵꯶꯷꯸꯹
０１２３４５６７８９
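If only the ASCII digits 0-9 should survive, one option is to spell out the range; another is to pass the re.ASCII flag, which restricts \d to [0-9]. A small sketch of the difference, using a made-up sample string (ASCII digits followed by three Arabic-Indic digits):

```python
import re

# Hypothetical sample: "20.9", a space, then the Arabic-Indic digits 0, 1, 2.
s = "20.9 \u0660\u0661\u0662"

# Default (Unicode) matching for str patterns: \d also covers the Arabic-Indic digits.
unicode_digits = re.sub(r"[^\d.]", "", s)

# Explicit range: only 0-9 and the dot survive.
ascii_range = re.sub(r"[^0-9.]", "", s)

# re.ASCII restricts \d to [0-9], so this behaves like the explicit range.
ascii_flag = re.sub(r"[^\d.]", "", s, flags=re.ASCII)

print(len(unicode_digits))  # 7: "20.9" plus the three Arabic-Indic digits
print(ascii_range)          # 20.9
print(ascii_flag)           # 20.9
```

Which variant is appropriate depends on whether non-ASCII digits in the input should count as digits.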