user2744486 - 1 month ago
Python Question

How to remove leading underscores and numbers in a string in Python

I need to sanitize some strings by removing invalid leading (non-alphabetic) characters from them. For example:

"3_hello" -> "hello"
"_hello" -> "hello"
"__hello" -> "hello"
"++hello" -> "hello"


Is there a quick way to use re to complete the task?

Answer

You can try the pattern ^[^A-Za-z]* with re.sub. Some test cases:

import re
re.sub('^[^A-Za-z]*', '', "3_hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "_hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "++hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "__hello")
# 'hello'
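  • The first ^ anchors the match at the beginning of the string;
  • Inside the character class [], a leading ^ negates the set, so [^A-Za-z] matches any character that is not an ASCII letter;
  • The greedy quantifier * extends the match over every non-letter character from the beginning of the string up to the first letter, and re.sub replaces that run with the empty string.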
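
If the same cleanup is applied to many strings, the pattern can be compiled once and wrapped in a small helper. The following is only a sketch: the names LEADING_NON_LETTERS, LEADING_NON_LETTERS_UNICODE and strip_leading_non_letters are made up for illustration, and the Unicode-aware variant is an assumption about what you might want, not part of the original answer.

```python
import re

# Same pattern as in the answer, compiled once (hypothetical constant name).
LEADING_NON_LETTERS = re.compile(r'^[^A-Za-z]*')

# Assumed variant: \W, \d and _ together cover every character that is not
# a letter, so accented and other non-ASCII letters are kept.
LEADING_NON_LETTERS_UNICODE = re.compile(r'^[\W\d_]*')


def strip_leading_non_letters(text, ascii_only=True):
    """Remove every character that precedes the first letter in text."""
    pattern = LEADING_NON_LETTERS if ascii_only else LEADING_NON_LETTERS_UNICODE
    return pattern.sub('', text)


for sample in ["3_hello", "_hello", "__hello", "++hello", "hello_1", "123"]:
    print(repr(sample), '->', repr(strip_leading_non_letters(sample)))
```

Two edge cases are worth keeping in mind. A string that contains no letters at all, such as "123", comes back empty. With the ASCII-only pattern a leading non-ASCII letter is also treated as invalid, so "éclair" becomes "clair"; passing ascii_only=False in this sketch keeps it intact.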