Hi, I'm very new to scraping and am trying to do it with Python and Beautiful Soup.
I need to get the text file for each deck on this site: http://magic.wizards.com/en/articles/archive/mtgo-standings/competitive-standard-constructed-league-2016-11-08
Each deck has a small download button that downloads a text file.
Thanks so much!
The download button submits a form that sits next to it. The form is filled in by a JavaScript function on the page.
That function takes the innerHTML of the deck's "h4" and puts it in the "title" field, then takes each ".card-count" and ".card-name a" and concatenates them into one string, with a line break (breakStr) after every card:
output += count + " " + name + breakStr;
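The same title and content strings can be rebuilt in Python with Beautiful Soup. This is only a rough sketch: the selectors "h4", ".card-count" and ".card-name a" come from the description above, but `deck_fields` is a made-up helper name, it assumes you pass in the HTML of a single deck block, and it assumes the break string is "[b]" (the decoded form of the %5Bb%5D in the example request). It also does not add the extra blank lines the site puts between main deck and sideboard.

```python
from bs4 import BeautifulSoup

# Assumption: breakStr is "[b]", i.e. the decoded %5Bb%5D from the example request.
BREAK_STR = "[b]"


def deck_fields(deck_html):
    """Build the 'title' and 'content' form fields from one deck's HTML.

    Hypothetical helper: expects the markup of a single deck block.
    """
    soup = BeautifulSoup(deck_html, "html.parser")
    # The deck name is the text of the first <h4> in the block.
    title = soup.find("h4").get_text(strip=True)
    # Counts and names are read separately and paired in document order.
    counts = [el.get_text(strip=True) for el in soup.select(".card-count")]
    names = [el.get_text(strip=True) for el in soup.select(".card-name a")]
    # Same shape as: output += count + " " + name + breakStr;
    content = "".join(f"{c} {n}{BREAK_STR}" for c, n in zip(counts, names))
    return {"title": title, "content": content}
```

To use it, fetch the article page, find each deck block (that container selector is something you would need to check in the page source), and call `deck_fields(str(block))` for every block.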
So you can just send a POST request to http://magic.wizards.com/decklist with these fields (the values are just an example):
title: Mogged%20(5-0)
content: 1%20Liliana%2C%20the%20Last%20Hope%5Bb%5D4%20Cryptbreaker%5Bb%5D4%20Haunted%20Dead%5Bb%5D4%20Insolent%20Neonate%5Bb%5D4%20Prized%20Amalgam%5Bb%5D4%20Scrapheap%20Scrounger%5Bb%5D4%20Voldaren%20Pariah%5Bb%5D4%20Cathartic%20Reunion%5Bb%5D4%20Fiery%20Temper%5Bb%5D2%20Lightning%20Axe%5Bb%5D2%20Unlicensed%20Disintegration%5Bb%5D4%20Foreboding%20Ruins%5Bb%5D5%20Mountain%5Bb%5D4%20Smoldering%20Marsh%5Bb%5D10%20Swamp%5Bb%5D%5Bb%5D%5Bb%5D1%20Lightning%20Axe%5Bb%5D1%20Liliana%2C%20the%20Last%20Hope%5Bb%5D1%20Unlicensed%20Disintegration%5Bb%5D3%20Collective%20Brutality%5Bb%5D3%20Distended%20Mindbender%5Bb%5D2%20Kalitas%2C%20Traitor%20of%20Ghet%5Bb%5D3%20Transgress%20the%20Mind%5Bb%5D1%20Vampiric%20Rites%5Bb%5D
and you will get your file back. Don't forget to set the request headers.
Without them you will get an HTML "not logged in" page instead of the file.
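Here is a rough sketch of that POST request using only the standard library. The URL and the two field names are from above; `build_request` and `download_deck` are made-up helper names, and the `headers` dict is left for you to fill in, since which headers the site requires is not listed here (copying the ones your browser sends when you click the button is a reasonable starting point).

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

DECKLIST_URL = "http://magic.wizards.com/decklist"


def build_request(title, content, headers):
    """Build the POST request for one deck (hypothetical helper).

    `title` and `content` are the plain, unencoded strings; `headers`
    is whatever dict of request headers the site turns out to need.
    """
    # quote_via=quote encodes spaces as %20, like the example fields.
    body = urlencode({"title": title, "content": content}, quote_via=quote)
    all_headers = {"Content-Type": "application/x-www-form-urlencoded"}
    all_headers.update(headers)
    return Request(DECKLIST_URL, data=body.encode("ascii"),
                   headers=all_headers, method="POST")


def download_deck(title, content, headers, path):
    """Send the request and save the response body to `path`."""
    with urlopen(build_request(title, content, headers)) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return data
```

If the saved file turns out to be an HTML page rather than a deck list, the headers are the first thing to check.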