DeA DeA - 3 months ago 15
Python Question

Python retrieving value from URL

I'm trying to write a Python script that checks money.rediff.com for a particular stock price and prints it. I know this can be done easily with their API, but I want to learn how urllib2 works, so I'm trying to do it the old-fashioned way. However, I'm stuck on how to use urllib2. Many online tutorials told me to use "Inspect element" on the value I need and then split the string to get it. All the examples in the videos have values wrapped in HTML tags that are easy to split on, but mine looks like this:

<div class="f16">
<span id="ltpid" class="bold" style="color: rgb(0, 0, 0); background: rgb(255, 255, 255);">6.66</span> &nbsp;
<span id="change" class="green">+0.50</span> &nbsp;

<span id="ChangePercent" style="color: rgb(130, 130, 130); font-weight: normal;">+8.12%</span>
</div>


I only need to extract the "6.66" from the second line. How do I go about doing this? I'm very new to urllib2 and Python. All help will be greatly appreciated. Thanks in advance.

Answer

You can certainly do this with just urllib2 and perhaps a regular expression, but I'd encourage you to use better tools, namely requests and Beautiful Soup.

Here's a complete program to fetch a quote for "Tata Motors Ltd.":

from bs4 import BeautifulSoup
import requests

html = requests.get('http://money.rediff.com/companies/Tata-Motors-Ltd/10510008').content

soup = BeautifulSoup(html, 'html.parser')
quote = float(soup.find(id='ltpid').get_text())

print(quote)
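
If installing third-party packages is not an option, the same "find the element by its id" idea can be roughly sketched with the standard library's html.parser (Python 3). This is a different technique from Beautiful Soup, not a drop-in replacement. The TextById class and text_by_id function are hypothetical helpers, and the sample markup is copied from the question, so the live page may differ. The sketch does not handle void tags such as <br> nested inside the target element.

```python
from html.parser import HTMLParser

# Sample markup copied from the question; the live page may differ.
SAMPLE_HTML = '''<div class="f16">
<span id="ltpid" class="bold" style="color: rgb(0, 0, 0); background: rgb(255, 255, 255);">6.66</span> &nbsp;
<span id="change" class="green">+0.50</span> &nbsp;
<span id="ChangePercent" style="color: rgb(130, 130, 130); font-weight: normal;">+8.12%</span>
</div>'''


class TextById(HTMLParser):
    """Collect the text inside the first element with a given id (hypothetical helper)."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # greater than 0 while inside the target element
        self.found = False   # set once the target element has been opened
        self.done = False    # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # A nested tag inside the target element.
            self.depth += 1
        elif dict(attrs).get('id') == self.target_id:
            self.depth = 1
            self.found = True

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, target_id):
    """Return the stripped text of the element with target_id, or None if it is absent."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return ''.join(parser.chunks).strip() if parser.found else None


print(float(text_by_id(SAMPLE_HTML, 'ltpid')))
```

For real pages Beautiful Soup is still the more forgiving choice, since it copes with malformed markup that this sketch does not try to handle.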

EDIT

Here's a Python 2 version using just urllib2 and re:

import re
import urllib2

html = urllib2.urlopen('http://money.rediff.com/companies/Tata-Motors-Ltd/10510008').read()

quote = float(re.search('<span id="ltpid"[^>]*>([^<]*)', html).group(1))

print quote
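
On Python 3, urllib2 no longer exists; its urlopen lives in urllib.request, and read() returns bytes that need decoding before the regular expression can be applied. A minimal sketch of the same approach follows. The extract_quote and fetch_quote names are hypothetical helpers, the UTF-8 fallback is an assumption about the page's encoding, and the sample string is a shortened version of the markup from the question.

```python
import re
import urllib.request

# Same pattern as the Python 2 version, written as a raw string.
LTP_PATTERN = re.compile(r'<span id="ltpid"[^>]*>([^<]*)')


def extract_quote(html):
    """Return the price inside the span with id="ltpid" as a float, or None if absent."""
    match = LTP_PATTERN.search(html)
    if match is None:
        return None
    return float(match.group(1).strip())


def fetch_quote(url, timeout=10):
    """Download the page and extract the quote (hypothetical helper; needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # read() returns bytes in Python 3; falling back to UTF-8 is an assumption.
        charset = response.headers.get_content_charset() or 'utf-8'
        html = response.read().decode(charset, errors='replace')
    return extract_quote(html)


# Offline check against markup shaped like the snippet in the question.
sample = '<span id="ltpid" class="bold" style="color: rgb(0, 0, 0);">6.66</span> &nbsp;'
print(extract_quote(sample))
```

To query the live site, call fetch_quote with the same URL used above. Note that float() will raise ValueError if the page ever formats the price with thousands separators, so that case would need extra handling.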