I've seen many ways to initialize a BeautifulSoup object here on SO. As far as I can see, you can pass either a string of HTML or some other object. For instance, it's common to use
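import urllib.request
from bs4 import BeautifulSoup

soup1 = BeautifulSoup(url_html, "html.parser")  # 1st way: url_html is a string of HTML
print(soup1.find("p").text)  # can get the text "asdas"
soup2 = BeautifulSoup(urllib.request.urlopen(url).read(), "html.parser")  # 2nd way: pass the bytes returned by read()
soup3 = BeautifulSoup(urllib.request.urlopen(url), "html.parser")  # 3rd way: pass the file-like response itself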
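urlopen() returns an open file-like object. The constructor of BeautifulSoup checks whether it got a file or a string (to be precise, it duck-types rather than checking the type: it tests hasattr(markup, "read")). If it got a file, it simply calls its read() method and carries on with the result, exactly as if you had passed that string yourself.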
This is a common pattern in Python libraries that deal with large amounts of user-provided text data.
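As a rough illustration of that pattern (this is not BeautifulSoup's actual code; load_markup is a hypothetical helper made up for this sketch), a function can accept either a string or a file-like object by looking for a read attribute:

```python
import io


def load_markup(markup):
    """Return the markup itself, reading it first if it is file-like.

    Hypothetical helper for illustration only; it mimics the
    hasattr(markup, "read") check described above.
    """
    if hasattr(markup, "read"):
        # File-like object (e.g. the result of urlopen()): slurp it whole.
        markup = markup.read()
    return markup


from_string = load_markup("<p>asdas</p>")
from_file = load_markup(io.StringIO("<p>asdas</p>"))
print(from_string == from_file)  # both calls end up with the same string
```

With this approach the caller never has to care which of the two forms was passed in.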
In BeautifulSoup's case there is no practical difference between passing the file object and passing the result of read(). Other libraries might do something more intelligent with a file object, e.g. read it in chunks rather than load it into memory en bloc.
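What "something more intelligent" could look like is sketched below, assuming a hypothetical helper named iter_chunks and an arbitrary chunk size. It is not taken from any particular library; it only shows how a file-like object can be consumed piece by piece, while an in-memory string is passed through as is:

```python
import io


def iter_chunks(source, chunk_size=8192):
    """Yield pieces of source without loading a whole file at once.

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    if hasattr(source, "read"):
        # File-like object: read a bounded amount at a time.
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty result means end of file
                break
            yield chunk
    else:
        # Already a string/bytes object in memory: nothing to save.
        yield source


chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```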