Alan Alan - 3 months ago 15
Python Question

Finding anchor tags within a table using selenium chromedriver

I am trying to build an application that automates downloading several anime episodes, and I am stuck. So far I've been able to locate the episode links using the following code:

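from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def get_episodes(driver):
    # Wait up to 10 seconds for at least one matching link to be present.
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//a[contains(@title,'Episode')]")))
    episodes = driver.find_elements_by_xpath("//a[contains(@title,'Episode')]")
    del episodes[-1]
    episodes = list(reversed(episodes))
    return episodes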


However, I recently found out that not every episode contains the word 'Episode' in its link text. As such, I am trying to figure out another way to get every link to an episode. The page is built around a table, and each link is located inside a <td> element.

I've thought of gathering all the <td> elements and then getting their children (or rather, their single child) using CSS selectors. However, this won't work either, because the page contains more <td> elements than meet the eye.

Here's an example page for reference. I am a noob as far as Selenium is concerned and not very familiar with its API, so I don't know exactly what I am looking for. Any suggestions are appreciated.

Answer

You're on the right track, but you may be over-thinking this a bit. Why not just target the table that we know has the episodes, then use a list comprehension to grab all the episode links?

def get_episodes(driver):
    # Limit the search to the table that holds the episode list.
    episode_table = driver.find_element_by_class_name('listing')
    # Collect the href of every link inside that table.
    episode_links = [i.get_attribute('href') for i in episode_table.find_elements_by_tag_name('a')]
    print(episode_links)

>>>['http://kissanime.to/Anime/Death-Note-Dub/Episode-037?id=97557', 'http://kissanime.to/Anime/Death-Note-Dub/Episode-036?id=97556', 'http://kissanime.to/Anime/Death-Note-Dub/Episode-035?id=97555', 'http://kissanime.to/Anime/Death-Note-Dub/Episode-034?id=97554',etc..]
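If you are on a newer Selenium release, note that the find_element_by_* helpers used above were deprecated in Selenium 4 and later removed, so the same idea would have to be written with the generic find_elements(strategy, value) call. Below is a rough sketch of that, not a drop-in replacement. It assumes the episode list is a table element with the class 'listing' (as targeted above) and that every episode link sits inside a <td>, as described in the question. The name get_episode_links and the table_class parameter are made up for illustration. In real code you would normally import By from selenium.webdriver.common.by and pass By.CSS_SELECTOR; the plain string is used here only so the snippet runs without Selenium installed.

```python
# Sketch only: assumes Selenium 4+, where locators are passed as
# (strategy, value) instead of calling find_element_by_* helpers.
# "css selector" is the string behind Selenium's By.CSS_SELECTOR constant.
CSS_SELECTOR = "css selector"


def get_episode_links(driver, table_class="listing"):
    """Return the href of every link in the episode table, oldest first.

    `table_class` is a hypothetical parameter; 'listing' is the class
    name used for the episode table in the answer above.
    """
    # One scoped selector: anchors with an href that sit inside a <td>
    # of the table carrying the given class. Other <td> elements on the
    # page are ignored because they are outside that table.
    selector = f"table.{table_class} td a[href]"
    anchors = driver.find_elements(CSS_SELECTOR, selector)
    links = [a.get_attribute("href") for a in anchors]
    # The sample output above lists the newest episode first, so reverse
    # the list to get the episodes in ascending order.
    return links[::-1]
```

Calling get_episode_links(driver) after the page has loaded should give the same URLs as the list above, just in ascending episode order. If the table is rendered late, you would still want an explicit wait (like the WebDriverWait in the question) before calling it.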