Harrison Harrison - 3 months ago 12
Python Question

Rearrange position of words in string conditionally

I've spent the last few months developing a program that my company uses to clean and geocode addresses on a large scale (~5,000/day). It works adequately; however, certain address formats that I see daily are causing issues for me.

Addresses with a format such as this

park avenue 1

are causing issues with my geocoding. My thought process to tackle this issue is as follows:


  1. Split the address into a list

  2. Find the index of my delimiter word in the list. The delimiter words are words such as avenue, street, road, etc. I have a list of these delimiters called patterns.

  3. Check whether the word immediately following the delimiter is composed only of digits and has a length of 4 or less. If the number is longer than 4 digits, it is likely to be a zip code, which I do not need. If it is 4 digits or fewer, it will most likely be the house number.

  4. If the word meets the criteria that I explained in the previous step, I need to move it to the first position in the list.

  5. Finally, I will put the list back together into a string.
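
The check described in step 3 can be isolated as a small predicate. This is only a minimal sketch: the function name `looks_like_house_number` and the `max_digits` parameter are illustrative and not part of the original program.

```python
def looks_like_house_number(token, max_digits=4):
    """Return True if token is all digits and at most max_digits long."""
    return token.isdigit() and len(token) <= max_digits


print(looks_like_house_number('1'))      # True
print(looks_like_house_number('10001'))  # False: five digits, likely a zip code
print(looks_like_house_number('12b'))    # False: not purely digits
```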



Here is my initial attempt at putting my thoughts into code:

patterns = ['avenue', 'street', 'road']  # my list of delimiters
address = 'park avenue 1'  # this is an example address
address = address.split(' ')
for pattern in patterns:
    if pattern not in address:
        continue  # list.index would raise ValueError for a missing word
    location = address.index(pattern) + 1
    if location < len(address) and address[location].isdigit() and len(address[location]) <= 4:
        # here is where i'm getting a bit confused
        # what would be a good way to go about moving the word to the first position in the list
        pass
address = ' '.join(address)


Any help would be appreciated. Thank you folks in advance.

Answer

Make the string address[location] into a one-element list by wrapping it in brackets, then concatenate the slices before and after it.

address = [address[location]] + address[:location] + address[location+1:]

An example:

address = ['park', 'avenue', '1']
location = 2
address = [address[location]] + address[:location] + address[location+1:]

print(' '.join(address)) # => '1 park avenue'
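
Putting the steps from the question together with this reordering, a complete version might look like the sketch below. The function name `move_house_number_first` and the default delimiter tuple are assumptions for illustration. It uses `list.pop` and `list.insert` instead of slice concatenation, which gives the same result, and it assumes the delimiter words appear in lowercase as in the example address.

```python
def move_house_number_first(address, patterns=('avenue', 'street', 'road')):
    """Move a short number that follows a delimiter word to the front."""
    words = address.split()
    # Iterate over a copy without the last word, since the last word
    # has nothing after it to inspect.
    for i, word in enumerate(words[:-1]):
        candidate = words[i + 1]
        if word in patterns and candidate.isdigit() and len(candidate) <= 4:
            # pop removes the number; insert places it at the front
            words.insert(0, words.pop(i + 1))
            break
    return ' '.join(words)


print(move_house_number_first('park avenue 1'))      # 1 park avenue
print(move_house_number_first('park avenue 10001'))  # park avenue 10001
print(move_house_number_first('1 park avenue'))      # 1 park avenue
```

Addresses that have no delimiter word, or whose trailing number is longer than four digits, are returned unchanged.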