utkarsh utkarsh - 3 months ago 16
Python Question

webbrowser module searching url with absolute path

I want to open a website to download resume from it, but following code tries to get to absolute path instead of just url:

import webbrowser
soup = BeautifulSoup(webbrowser.open('www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0'),"lxml")


generates the following error:

gvfs-open: /home/utkarsh/Documents/Extract_Resume/www.indeed.com/r/Prabhanshu-
Pandit/dee64d1418e20069?sp=0:
error opening location: Error when getting information for file
'/home/utkarsh/Documents/Extract_Resume/www.indeed.com/r/Prabhanshu-
Pandit/dee64d1418e20069?sp=0': No such file or directory


Clearly it is taking the home address and trying to search that on web which will not be present. What am I doing wrong here? Thanks in advance

Answer Source

I suppose you are confusing the usage of Beautiful Soup and webbrowser together. Webbrowser it is not needed to access the page.

From Documentation

Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn't take much code to write an application

Adapting the tutorial example to your task to print the resume in output

from bs4 import BeautifulSoup
import requests
url = "www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0"
r  = requests.get("http://" +url)
data = r.text
soup = BeautifulSoup(data, "html.parser")
print soup.find("div", {"id": "resume"})