Bot for Spidering
$30-100 USD
Paid on delivery
I need a tool for spidering websites.
I use wget now, but I want the tool to run from my website. I have shared hosting on Linux (Red Hat Enterprise Linux 4) and can provide shell access if needed. I can also create a separate hosting account for this.
You can use wget or similar free software to build the tool. I want a user-friendly interface where I enter the URL of a website, and the tool then spiders all of its web pages.
The configuration should allow for:
- excluding certain file extensions (like audio, video, images)
- excluding some paths
- saving all files with the .htm extension
- server-friendly scheduling of the page requests
Wget lets me do all of this and more.
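To make the configuration options above concrete, here is a minimal sketch of how a web form's inputs might be turned into a wget command line. It is only an illustration under stated assumptions: the function name `build_wget_command`, its parameters, and the `mirror` output directory are hypothetical, and Python is used purely for the example. One caveat: wget's `-E` option appends `.html`, not `.htm`, so an exact `.htm` suffix would probably need a separate rename step that is not shown here.

```python
from urllib.parse import urlparse


def build_wget_command(url, exclude_extensions=(), exclude_paths=(),
                       wait_seconds=2, output_dir="mirror"):
    """Return a wget argument list for a polite recursive crawl of `url`.

    All names here are hypothetical; only the wget options are real.
    """
    # Accept only web URLs so a form field cannot smuggle in other input.
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError("URL must start with http:// or https://")

    cmd = [
        "wget",
        "--recursive",                    # follow links found on each page
        "--no-parent",                    # stay below the starting directory
        "-E",                             # add .html to HTML pages lacking it
        "--wait=" + str(wait_seconds),    # pause between requests
        "--random-wait",                  # vary the pause between requests
        "--directory-prefix=" + output_dir,
    ]
    if exclude_extensions:
        # e.g. mp3,avi,jpg : files with these suffixes are not kept
        cmd.append("--reject=" + ",".join(exclude_extensions))
    if exclude_paths:
        # e.g. /forum,/private : these directories are not crawled
        cmd.append("--exclude-directories=" + ",".join(exclude_paths))
    cmd.append(url)
    return cmd
```

Returning a list of arguments rather than a single string means the command can be run without a shell, which avoids quoting problems when the URL comes from a form.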
Alternatively, if you can teach me how to run wget from my hosting server, that will serve the purpose too.
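As a rough sketch of one way to run wget on the hosting server, a script could start it as a background process and send its output to a log file. This assumes Python 3 is available on the host, which may not hold on an older Red Hat release, and that the hosting plan permits long-running processes. The function name `launch_in_background` and the log path are hypothetical.

```python
import subprocess


def launch_in_background(command, log_path):
    """Start `command` without waiting for it and return the Popen object.

    `command` is an argument list such as ["wget", "--recursive", url].
    """
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdout=log_file,             # progress messages go to the log
        stderr=subprocess.STDOUT,    # wget reports progress on stderr
        stdin=subprocess.DEVNULL,    # never wait for keyboard input
        start_new_session=True,      # detach from the caller's session (POSIX)
    )
    log_file.close()  # the child process keeps its own copy of the handle
    return process
```

A call might look like `launch_in_background(["wget", "--recursive", "--wait=2", "http://example.com/"], "crawl.log")`, after which the log file can be checked to see how far the crawl has progressed.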
Thanks for your time
Eshwar
Project ID: #3672843