Need a skilled Java developer to debug, fix, and enhance an existing image web crawler. The following tasks must be completed:
Debug:
1) Debug and fix a memory leak (Java heap error) in the crawler code.
2) Every once in a while the app hangs even though memory is still available. Need to understand whether this is related to the memory leak.
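To help tell whether the hang is related to the heap problem, one option is to log thread states and heap use together whenever the app appears stuck. The following is only a minimal sketch, assuming the crawler runs on a standard JVM. `HangProbe` is a hypothetical name and is not part of the existing code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical diagnostic helper; not taken from the existing crawler. */
final class HangProbe {

    /** Builds a short text report: deadlock status, thread states, heap in use. */
    static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // findDeadlockedThreads() returns null when no deadlock is found.
        long[] deadlocked = mx.findDeadlockedThreads();
        if (deadlocked == null) {
            out.append("no deadlock detected\n");
        } else {
            out.append("deadlocked threads: ").append(deadlocked.length).append('\n');
        }

        // One line per live thread; many threads stuck in the same state
        // (for example, waiting on a network read) point to a hang unrelated to memory.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append(info.getThreadName())
               .append(" -> ")
               .append(info.getThreadState())
               .append('\n');
        }

        // Heap currently in use, to compare against the configured maximum.
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb = rt.maxMemory() / (1024 * 1024);
        out.append("heap used (MB): ").append(usedMb)
           .append(" of max ").append(maxMb).append('\n');
        return out.toString();
    }
}
```

If the report shows plenty of free heap while threads are blocked or deadlocked, the hang is probably a separate problem from the leak.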
Enhance:
1) When the user stops the app, a subsequent restart should continue from the last scanned URL and not from the project's starting URL. For example, if the starting URL is [login to view URL] and the crawler was stopped while it was scanning http://www.amazon.com/Kindle-eReader-eBook-Reader-e-Reader-Special-Offers/dp/B0051QVESA/ref=amb_link_356991982_1?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=browse&pf_rd_r=10CQ1DAT4KJ1GZASSBYJ&pf_rd_t=101&pf_rd_p=1330783002&pf_rd_i=283155, the restart should begin from that last URL and not start fresh from [login to view URL].
2) The crawler should be able to capture and follow relative URLs.
3) When given a sub-directory as the starting point, the scan should begin from that directory and work inwards, not from the main domain. For example, if [login to view URL] is the starting URL, the scan starts there and stays within that directory, scanning deeper. It does not go up to [login to view URL] or [login to view URL], and it does not start from [login to view URL].
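To illustrate what the three enhancements above mean, here is a rough sketch. It is not the existing crawler's code. The class names `Checkpoint` and `CrawlScope`, the checkpoint file, and the `example.com` URLs are all assumptions made for illustration. It also assumes the starting URL ends with `/` when it names a directory. Otherwise the scope falls back to the parent directory of the starting page.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for item 1: remembers the last scanned URL across restarts. */
final class Checkpoint {
    private final Path file;

    Checkpoint(Path file) { this.file = file; }

    /** Writes to a temp file first, then moves it, so a stop mid-write leaves the old value. */
    void save(String lastUrl) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, lastUrl.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Returns the saved URL, or the project's starting URL when nothing was saved yet. */
    String loadOrDefault(String startUrl) throws IOException {
        if (!Files.exists(file)) return startUrl;
        String saved = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return saved.isEmpty() ? startUrl : saved;
    }
}

/** Hypothetical helper for items 2 and 3: relative links and directory scope. */
final class CrawlScope {
    private final URI start;
    private final String basePath;

    CrawlScope(String startUrl) {
        this.start = URI.create(startUrl).normalize();
        String path = pathOf(start);
        // Scope is the starting URL's directory: everything up to the last '/'.
        this.basePath = path.substring(0, path.lastIndexOf('/') + 1);
    }

    /**
     * Resolves an absolute or relative link against the page it was found on.
     * URI.create throws IllegalArgumentException on malformed links (e.g. raw spaces),
     * so real pages would need cleanup or a try/catch around this call.
     */
    static URI resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href.trim()).normalize();
    }

    /** True only for URLs on the same host and at or below the starting directory. */
    boolean inScope(URI candidate) {
        if (candidate.getHost() == null || start.getHost() == null) return false;
        return candidate.getHost().equalsIgnoreCase(start.getHost())
                && pathOf(candidate).startsWith(basePath);
    }

    private static String pathOf(URI u) {
        String p = u.getPath();
        return (p == null || p.isEmpty()) ? "/" : p;
    }
}
```

With a starting URL of `http://example.com/books/fiction/`, a link `covers/1.jpg` found on `http://example.com/books/fiction/index.html` resolves to a URL inside the scope, while `../history/` and `/` resolve to URLs outside it and would be skipped. Saving only the last URL is the minimum the task asks for. A fuller solution would probably also persist the queue of pending URLs and the set of visited ones.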