peaceandiago - 8 months ago
Python Question

How to remove all regex-matched URLs from the list without TypeError: expected a string or other character buffer object

My file contains different URLs, some of which I am trying to get rid of.

I was able to isolate those using regex:

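import re

# NOTE: the right-hand sides marked "assumed" are missing from the source;
# the calls shown there are a guess at the original intent.
generate_links = re.compile('http://(.*)') #compile all http links
generate_links2 = re.compile('(.*)/eng/(.*)') #compile all english url
with open (r"VAC\queue.txt", "r") as queued_list, open('newqueue.txt','w') as queued_list_updated:
    for links in queued_list:
        url = ""
        services_url = ""
        valid_url = ""
        match = generate_links2.match(links)  # assumed
        if match is not None:
            url = match.group(0)  # assumed
            generate_links3 = re.compile('(.*)/services/(.*)') #compile all services links
            match2 = generate_links3.match(url)  # assumed
            if match2 is not None:
                services_url = match2.group(0)  # assumed
                print services_url
                generate_links4 = re.compile('(.*)/search?(.*)') #compiled error links
                match3 = generate_links4.match(services_url)  # assumed; matched all error links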

But how do I apply the matched error links back to the list, so that they are removed or replaced?

So the expected results would be:


If you want to get rid of URLs containing 'search?', try:

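from __future__ import print_function

# File names are taken from the question; adjust them to your own paths.
# `in` is a reserved word in Python, so the handles need other names.
with open(r"VAC\queue.txt") as infile, open("newqueue.txt", "w") as outfile:
    # Keep only the lines that do not contain 'search?'
    cured_urls = [line for line in infile if 'search?' not in line]

    for url in cured_urls:
        # Each line already ends with a newline, so suppress print's own
        print(url, file=outfile, end='')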
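
The same filter can also be written with a compiled regex, which ties back to the patterns in the question. This is only a minimal sketch and not the asker's code: `drop_matching` and the sample URLs are hypothetical, introduced here for illustration. Note that the `?` is escaped, because an unescaped `?` in a pattern like `'/search?'` makes the preceding `h` optional instead of matching a literal question mark.

```python
import re


def drop_matching(lines, pattern):
    """Hypothetical helper: return the lines that the pattern does NOT match."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line) is None]


# Made-up sample data, for illustration only.
sample_lines = [
    "http://example.com/eng/services/a\n",
    "http://example.com/eng/search?q=x\n",
    "http://example.com/eng/services/b\n",
]

# Escape '?' so it matches a literal question mark.
kept = drop_matching(sample_lines, r"/search\?")
print(kept)
```

Testing each line and then keeping or skipping it means no match object (or `None`) is ever passed to a function that expects a string, which is a common source of this kind of TypeError.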