Ekaterina1234 - 4 months ago
Python Question

How would you simplify this program? Python

I wrote this program, whose purpose is to visit the 18th link in a page's list of links and then, on the new page, visit the 18th link again.

This program works as intended, but it's a little repetitive and inelegant.

I was wondering if you have any ideas on how to make it simpler, without using any functions. If I wanted to repeat the process 10 or 100 times, this would become very long.

Thanks for any suggestions!

# Note - this code must run in Python 2.x and you must download
# http://www.pythonlearn.com/code/BeautifulSoup.py
# Into the same folder as this program

import urllib
from BeautifulSoup import *

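url = raw_input('Enter - ')
if len(url) < 1 :
    # Assumed default: the body of this if is missing from the post;
    # the URL is the one used in the answer below
    url = 'http://python-data.dr-chuck.net/known_by_Oluwanifemi.html'
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)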

# Retrieve all of the anchor tags
tags = soup('a')
urllist = list()
count = 0
loopcount = 0
for tag in tags:
    count = count + 1
    tg = tag.get('href', None)
    if count == 18:
        print count, tg
        # Assumed missing line: store the link so urllist[0] below is defined
        urllist.append(tg)

url2 = (urllist[0])
html2 = urllib.urlopen(url2).read()
soup2 = BeautifulSoup(html2)

tags2 = soup2('a')
count2 = 0
for tag2 in tags2:
    count2 = count2 + 1
    tg2 = tag2.get('href', None)
    if count2 == 18:
        print count2, tg2


This is what you could do.

import urllib
from BeautifulSoup import *

# raw_input returns the typed text as a string; Python 2's input() would try to evaluate it
url_1 = raw_input('') or 'http://python-data.dr-chuck.net/known_by_Oluwanifemi.html'

html_1 = urllib.urlopen(url_1).read()
soup_1 = BeautifulSoup(html_1)

# The 18th anchor tag is at index 17, so no counting loop is needed
tags_1 = soup_1('a')
url_retr1 = tags_1[17].get('href', None)

html_2 = urllib.urlopen(url_retr1).read()
soup_2 = BeautifulSoup(html_2)

tags_2 = soup_2('a')
url_retr2 = tags_2[17].get('href', None)
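
To repeat the process 10 or 100 times without writing any functions, the same steps can go inside a single for loop that overwrites url on every pass. Below is a minimal sketch of that idea, not a drop-in replacement. So that it runs offline, it makes a few assumptions that are not in the original: a made-up pages dictionary stands in for the web, a regular expression stands in for BeautifulSoup, and the names position, repeat and visited are only illustrative. The fake pages hold three links each, so it follows the 3rd link instead of the 18th.

```python
import re

# Hypothetical stand-in for the web: 'pageN.html' links to the next three pages.
pages = {}
for n in range(10):
    links = ['<a href="page%d.html">link</a>' % (n + k) for k in (1, 2, 3)]
    pages['page%d.html' % n] = ' '.join(links)

url = 'page0.html'   # starting page
position = 3         # 1-based link to follow (18 in the question)
repeat = 3           # number of hops (2 in the question, could be 10 or 100)
visited = [url]      # every page seen so far, in order

for hop in range(repeat):
    # In the real program this would be: urllib.urlopen(url).read()
    html = pages[url]
    # In the real program this would be: BeautifulSoup(html)('a'), then .get('href', None)
    hrefs = re.findall(r'href="([^"]+)"', html)
    # Overwrite url so the next pass starts from the page just found
    url = hrefs[position - 1]
    visited.append(url)
    print('%d %s' % (hop + 1, url))
```

Because url is reassigned at the end of each pass, changing repeat is the only edit needed to go from 2 hops to 100. With the real pages you would swap the two marked lines back to the urllib and BeautifulSoup calls and set position to 18.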