Luca Ashok - 29 days ago
Python Question

I need help scraping a text file from a download button with Python + Beautiful Soup

Hi, I'm very new to scraping and am trying to do it with Python and Beautiful Soup.

I need to get the text file for each deck on this site: http://magic.wizards.com/en/articles/archive/mtgo-standings/competitive-standard-constructed-league-2016-11-08

Each deck has a little download button that downloads a text file.

Thanks so much!

Answer

The page submits a form located near the download button. The form is filled in by this function:

wiz_bean_content_deck_list_generate_file

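It takes the innerHTML of the "h4" from the card and puts it in the "title" field, then takes each ".card-count" and ".card-name a" and concatenates them into one string, with a break marker after every card: output += count + " " + name + breakStr;. In the example below the break marker shows up as %5Bb%5D, which is the URL-encoded form of [b]. So you can just make a POST request to http://magic.wizards.com/decklist with these fields (just an example):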

title: Mogged%20(5-0)
content: 1%20Liliana%2C%20the%20Last%20Hope%5Bb%5D4%20Cryptbreaker%5Bb%5D4%20Haunted%20Dead%5Bb%5D4%20Insolent%20Neonate%5Bb%5D4%20Prized%20Amalgam%5Bb%5D4%20Scrapheap%20Scrounger%5Bb%5D4%20Voldaren%20Pariah%5Bb%5D4%20Cathartic%20Reunion%5Bb%5D4%20Fiery%20Temper%5Bb%5D2%20Lightning%20Axe%5Bb%5D2%20Unlicensed%20Disintegration%5Bb%5D4%20Foreboding%20Ruins%5Bb%5D5%20Mountain%5Bb%5D4%20Smoldering%20Marsh%5Bb%5D10%20Swamp%5Bb%5D%5Bb%5D%5Bb%5D1%20Lightning%20Axe%5Bb%5D1%20Liliana%2C%20the%20Last%20Hope%5Bb%5D1%20Unlicensed%20Disintegration%5Bb%5D3%20Collective%20Brutality%5Bb%5D3%20Distended%20Mindbender%5Bb%5D2%20Kalitas%2C%20Traitor%20of%20Ghet%5Bb%5D3%20Transgress%20the%20Mind%5Bb%5D1%20Vampiric%20Rites%5Bb%5D

and you will get your file. Don't forget to set these headers:

X-DevTools-Emulate-Network-Conditions-Client-Id
Origin

Without them you will get an HTML page saying 'not logged in' instead of the file.
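To tie the steps above together, here is a rough Python sketch, not a tested scraper. It assumes each deck's HTML fragment contains the "h4", ".card-count" and ".card-name a" elements mentioned above, that the break marker is the literal string [b] (decoded from the example), and that the server accepts ordinary form-encoded values. The function names, the Origin value and the client_id placeholder are my own assumptions. It only prepares the request and does not send it.

```python
from urllib.parse import urlencode, quote
from urllib.request import Request

DECKLIST_URL = "http://magic.wizards.com/decklist"  # endpoint named in the answer
BREAK = "[b]"  # "%5Bb%5D" in the example content decodes to this


def extract_deck(deck_html):
    """Return (title, [(count, name), ...]) for one deck's HTML fragment."""
    # Imported here so the helpers below can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(deck_html, "html.parser")
    title = soup.find("h4").get_text(strip=True)
    counts = [el.get_text(strip=True) for el in soup.select(".card-count")]
    names = [el.get_text(strip=True) for el in soup.select(".card-name a")]
    return title, list(zip(counts, names))


def build_fields(title, cards):
    """Mirror `output += count + " " + name + breakStr` for every card."""
    content = "".join(f"{count} {name}{BREAK}" for count, name in cards)
    return {"title": title, "content": content}


def build_request(fields, client_id):
    """Prepare (but do not send) the POST request with the two headers."""
    # quote_via=quote encodes spaces as %20, like the example values.
    body = urlencode(fields, quote_via=quote).encode("ascii")
    headers = {
        "Origin": "http://magic.wizards.com",  # assumed value
        # Placeholder: the answer names this header but gives no value.
        "X-DevTools-Emulate-Network-Conditions-Client-Id": client_id,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(DECKLIST_URL, data=body, headers=headers, method="POST")


# Two cards from the example deck, typed in by hand for illustration.
fields = build_fields(
    "Mogged (5-0)",
    [("1", "Liliana, the Last Hope"), ("4", "Cryptbreaker")],
)
request = build_request(fields, client_id="REPLACE-ME")
```

To actually download, you would pass request to urllib.request.urlopen and write the response body to a file. For the real page, first fetch the article HTML, find each deck's container (the selector for that is something you need to look up in the page source) and feed it to extract_deck. Note that the example content above has extra [b] markers before the sideboard, which this sketch does not reproduce.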