nemja nemja - 3 years ago 193
Python Question

Removing everything after "?" python

I get an error when I try to remove everything after the "?" in a set of scraped links:

code:

from selenium import webdriver
import pandas as pd
import time
from datetime import datetime
from collections import OrderedDict
import re

browser = webdriver.Firefox()
browser.get('https://www.kickstarter.com/discover?ref=nav')
categories = browser.find_elements_by_class_name('category-container')

category_links = []
for category_link in categories:
    category_links.append((str('https://www.kickstarter.com'),
                           category_link.find_element_by_class_name('bg-white').get_attribute('href')))
print(category_links)
for i in category_link:
    category_links2 = re.sub('?$', '', category_link)
    print(category_links2)


error:
TypeError: 'FirefoxWebElement' object is not iterable

How can I solve this issue?

Answer Source
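  1. You need to iterate over category_links (the list), not category_link. After the first loop finishes, category_link still refers to the last FirefoxWebElement, which is not iterable; that is what raises the TypeError.

  2. You need to apply re.sub to the loop variable i, not to category_link. Each i is a (base URL, href) tuple, so the string to clean is i[1]. Note also that '?$' is not a valid pattern: ? is a regex quantifier and has to be escaped as \?.

  3. For a simple task like this, I recommend splitting on ? with str.split and keeping the first piece:


for i in category_links:
    # i is a (base URL, href) tuple; keep only the part of the href before the first "?"
    category_links2 = i[1].split('?')[0]
    print(category_links2)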
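
If you want to compare approaches without launching a browser, here is a minimal sketch of three ways to drop the query string. The sample URLs and the helper names (strip_query_split, strip_query_regex, strip_query_urlsplit) are made up for illustration; only the (base URL, href) tuple shape is taken from the question.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical sample data, shaped like category_links in the question:
# a list of (base URL, href) tuples. The hrefs are invented examples.
category_links = [
    ("https://www.kickstarter.com",
     "https://www.kickstarter.com/discover/categories/art?ref=discover_index"),
    ("https://www.kickstarter.com",
     "https://www.kickstarter.com/discover/categories/comics"),
]


def strip_query_split(url):
    # Keep only the text before the first "?"; URLs without "?" are unchanged.
    return url.split("?")[0]


def strip_query_regex(url):
    # "?" is a regex quantifier, so it is escaped; ".*$" removes the rest of the string.
    return re.sub(r"\?.*$", "", url)


def strip_query_urlsplit(url):
    # Parse the URL and rebuild it with an empty query (this also drops any #fragment).
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Clean every href in the list; the base URL in each tuple is ignored here.
cleaned = [strip_query_split(href) for _, href in category_links]
print(cleaned)
```

All three helpers give the same result on URLs like these. str.split is the shortest; urlsplit is probably the safer choice if the links may also carry a #fragment.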