I have created a php script to import rss feed into the database. The feed which is huge (from year 2004 to 2010, approx 2 million records) has to be inserted into the database. I have been running the script in browser but the pace it is inserting (approx. 1 per second) i doubt it takes another 20-25 days to input this data even if i run it 24 hrs a day. I have tried it running on different browser windows at the same time and have finished only 70000 records in last two days. I am not sure how the server would react if i run 10-12 instances of it simultaneously.
A programmer at my client's end says that i could run it directly on the server through command line. Could anyone tell me how much difference it would make if i run it through command line? Also what is the command line syntax to run it? I am on apache, php/mysql. I tried out over the web for a similar answer but they seem quite confusing to me as i am not a system administrator or that good in linux although i have done tasks like svn repositories and installing some apache modules on server in the past so i hope i could manage this if someone tell me how to do it.
Difference in speed: Minimal. All you save on is blocking on NET I/O and connection (and the apache overhead which is negligible).
How to do it:
bash> php -f /path/to/my/php/script.php
You may only have the
php5-mod package installed which is php for apache, you may have to install the actual command line interpreter, however a lot of distros install both. Personally I think you have an efficiency problem in the algorithm. Something taking days and days seems like it could be sped up by caching & worst-case performance analysis (Big-O notation).
Also, php vanilla isn't very fast, there's lots of ways to make it really fast, but if you're doing heavy computation, you should consider c/c++, C#/Mono (Maybe), possibly python (can be pre-compiled, may not actually be much faster).
But the exploration of these other outlets is highly recommended.