brumbrum brumbrum - 5 months ago 56
PHP Question

Perform clicks and log out of a website using cURL and PHP

I used cURL to log in to a website. The natural next question is how to perform clicks on buttons and then eventually log out. For example, JavaScript has the click() function. What does PHP use? Thanks for any clues.

I am following a book on web scraping. In it, the author logs in to its publisher's website. The book is old and out of date. Moreover, it says nothing about logging out. This is the publisher: https://www.packtpub.com/

Answer

You can't click a button using PHP alone. PHP doesn't work like that. PHP can download the HTML of a webpage, but it can't perform actions like a browser can.

If you want to do that, you will need a headless browser, which is a browser that runs without a visible window. It can do most things a regular browser can do. PhantomJS and CasperJS are two options for this.

There are also PHP libraries that use PhantomJS, for example PHP PhantomJS. Personally, I've never done this with PHP, but I do use PhantomJS and CasperJS on a regular basis.

Alternatively, what you can do with PHP is parse the DOM for links or buttons and replicate the HTTP request that is made when they are clicked.
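As a rough illustration of the parsing step, here is a minimal sketch using PHP's bundled DOM extension. The HTML snippet, the /logout form, the token field and the extractTargets() helper are all made up for the example; on a real site you would feed in the HTML that cURL downloaded, and the markup will look different.

```php
<?php
// Hypothetical HTML standing in for a page already downloaded with cURL.
$html = <<<'HTML'
<html><body>
  <a href="/contactus">Contact us</a>
  <form action="/logout" method="post">
    <input type="hidden" name="token" value="abc123">
    <button type="submit">Log out</button>
  </form>
</body></html>
HTML;

/**
 * Collect link targets and form descriptions from an HTML string.
 * (Illustrative helper, not part of any library.)
 */
function extractTargets(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Every <a href="..."> is a request you could replicate with a GET.
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $links[] = $a->getAttribute('href');
        }
    }

    // Every <form> describes the request its submit button would trigger.
    $forms = [];
    foreach ($doc->getElementsByTagName('form') as $form) {
        $fields = [];
        foreach ($form->getElementsByTagName('input') as $input) {
            if ($input->hasAttribute('name')) {
                $fields[$input->getAttribute('name')] = $input->getAttribute('value');
            }
        }
        $forms[] = [
            'action' => $form->getAttribute('action'),
            'method' => strtoupper($form->getAttribute('method') ?: 'GET'),
            'fields' => $fields,
        ];
    }

    return ['links' => $links, 'forms' => $forms];
}

$targets = extractTargets($html);
print_r($targets);
```

The form entry gives you the URL, the method and the fields to send, which is everything a "click" on its submit button amounts to at the HTTP level.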

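For example, if there's a link that goes to /contactus, you simply send a GET request to that page using cURL. The response will be the page's source code and, if you ask cURL for them, the headers. A button that submits a form works the same way, except that the request is usually a POST carrying the form's fields.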
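A sketch of what such a request might look like with PHP's cURL extension follows. The example.com URL, the cookie file location and the two helper functions are placeholders chosen for illustration; the cookie file is assumed to be the same one you used when logging in.

```php
<?php
/**
 * Build a cURL handle for a GET request that reuses the login cookies.
 * (Illustrative helper; the option choices are one reasonable setup.)
 */
function buildGetRequest(string $url, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the response instead of printing it
        CURLOPT_HEADER         => true,        // include response headers in the returned string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects, as a browser would
        CURLOPT_COOKIEFILE     => $cookieFile, // send cookies saved at login
        CURLOPT_COOKIEJAR      => $cookieFile, // write updated cookies back when the handle is released
    ]);
    return $ch;
}

/**
 * Send the request and return headers plus body as one string.
 */
function fetchPage(string $url, string $cookieFile): string
{
    $ch = buildGetRequest($url, $cookieFile);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $response;
}

// Placeholder values; replace them with the real site and your own cookie file.
$cookieFile = sys_get_temp_dir() . '/cookies.txt';
$ch = buildGetRequest('https://example.com/contactus', $cookieFile);

// To actually "click" the link, call:
// echo fetchPage('https://example.com/contactus', $cookieFile);
```

For a form button, the same handle would additionally set CURLOPT_POST and CURLOPT_POSTFIELDS with the form's fields.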

I am currently working on a project that uses CasperJS, PHP and Redis to create a rather complex scraper/automation/analysis tool for a large social network.

As a side note, some sites rely heavily on JavaScript, and cURL alone may not be enough. You can get around this by parsing the JavaScript files and applying some other advanced tricks, but believe me, you do not want to go down this route. That is why I use CasperJS on occasion. It's slower, but that's all we've got at the moment.

As for logging out: delete your cookie file. Done.
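In code, that could be as small as the sketch below. The forgetSession() helper and the temporary file are invented for the demonstration; in practice you would pass the path you gave to CURLOPT_COOKIEJAR when logging in.

```php
<?php
/**
 * "Log out" locally by deleting the cookie file cURL was using.
 * Returns true if the file is gone afterwards. (Illustrative helper.)
 */
function forgetSession(string $cookieFile): bool
{
    if (!file_exists($cookieFile)) {
        return true; // nothing to delete, so there is no session to forget
    }
    return unlink($cookieFile);
}

// Demonstration with a throwaway file standing in for a real cookie jar.
$cookieFile = tempnam(sys_get_temp_dir(), 'cookies_');
file_put_contents($cookieFile, "# Netscape HTTP Cookie File\n");

$loggedOut = forgetSession($cookieFile);
var_dump($loggedOut);
```

Note that this only throws away your side of the session. The server keeps its copy until it expires, so if the site has a logout link and that matters to you, request that URL with cURL first and delete the file afterwards.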
