user3449212 user3449212 - 1 year ago 89
Python Question

Using regex extract all digit and word numbers

I am trying to extract all numbers from a text, whether they are written as words or as digits.

import re
text = 'one two three 10 number'
numbers = "(^a(?=\s)|one|two|three|four|five|six|seven|eight|nine|ten| \
eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen| \
eighteen|nineteen|twenty|thirty|forty|fifty|sixty|seventy|eighty| \
ninety)"
print(re.search(numbers, text).group(0))

This gives me only the first number word.

My expected result is ['one', 'two', 'three', '10'].

How can I modify it so that I get all the number words as well as the digit numbers in a list?

Answer Source

There are several issues here:

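  • The pattern is split across several lines, so the padding spaces end up inside it; it should be used with the VERBOSE flag (add (?x) at the start) so that whitespace is ignored
  • The nine alternative will match the nine in ninety, so you should either put the longer values first, or use word boundaries \b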
  • Declare the pattern with a raw string literal to avoid issues like parsing \b as a backspace and not a word boundary
  • To match digits, you may add a |\d+ branch to your number matching group
  • To match multiple non-overlapping occurrences of the substrings inside the input string, you need to use re.findall (or re.finditer), not re.search, which returns only the first match
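
To make these points concrete, here is a small sketch of three of the pitfalls. The sample strings and variable names are made up for illustration and are not taken from the question:

```python
import re

# Without word boundaries, the shorter alternative wins inside a longer word.
loose = re.findall(r"nine|ninety", "ninety")
# With \b on both sides, only whole words match.
strict = re.findall(r"\b(?:nine|ninety)\b", "ninety")

# In a normal string literal "\b" is a backspace character (\x08);
# in a raw string it stays a backslash followed by the letter b.
backspace = "\b"
boundary = r"\b"

# re.search stops at the first match; re.findall collects all of them.
sample = "one two 10"
first = re.search(r"\d+|\b(?:one|two)\b", sample).group(0)
every = re.findall(r"\d+|\b(?:one|two)\b", sample)

print(loose, strict, first, every)
```

Here loose contains only 'nine', while strict contains 'ninety'; first is the single string 'one', while every is the full list of matches.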

Here is my suggestion:

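import re
text = 'one two three 10 number eleven eighteen ninety  \n '
numbers = r"""(?x)    # free-spacing mode: whitespace and comments are ignored
    ^a(?=\s)          # a at the start of the string, before whitespace
  | \d+               # one or more digits
  | \b(?:one|two|three|four|five|six|seven|eight|nine|ten|eleven|twelve|
        thirteen|fourteen|fifteen|sixteen|seventeen|eighteen|nineteen|twenty|
        thirty|forty|fifty|sixty|seventy|eighty|ninety)\b  # whole number words
"""
print(re.findall(numbers, text))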
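
The other option mentioned in the list above, putting the longer values first instead of relying on word boundaries, can also be done programmatically. The following is only a rough sketch and not part of the original suggestion: the shortened word list WORDS and the names alternation and pattern are hypothetical, and re.finditer is used here to show that it also gives access to match positions.

```python
import re

# Illustrative, shortened word list (hypothetical); extend as needed.
WORDS = ["nine", "ninety", "nineteen", "six", "sixty", "sixteen"]

# Sort longest first so that "ninety" is tried before "nine".
alternation = "|".join(sorted(map(re.escape, WORDS), key=len, reverse=True))
pattern = re.compile(r"\d+|(?:" + alternation + r")")

text = "ninety 9 nineteen six"
# finditer yields match objects, which carry positions as well as the text.
matches = [(m.group(0), m.start()) for m in pattern.finditer(text)]
print(matches)
```

Note that without \b this variant still matches number words embedded in longer words (for example the six in sixpence), so word boundaries remain the safer choice when that matters.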

See Python demo

And here is a regex demo.
