Closed

Cloudera Search Engine & Nutch - Detailed in attachment

Hello!

I'm looking for someone to configure a job search engine similar to [url removed, login to view], but smaller.

PROJECT MAIN OBJECTIVE

To develop a fully customizable, fully operational and easy to manage (through a graphical user interface) job search engine like [url removed, login to view], similar in capacity, accuracy, search speed and features.

I'm thinking about two main objectives:

Configure Nutch & HBase with hcode to crawl and store pages (pages will be crawled and stored in RAW HTML format). The Nutch crawler should be able to crawl around 10,000 websites daily.

The collected pages will then be separated into two categories, JOB POSTING PAGES and NON JOB POSTING PAGES, using Apache Mahout, GATE or UIMA, whichever achieves the best accuracy and speed.
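
As a rough illustration of the JOB POSTING / NON JOB POSTING split described above, here is a minimal sketch in Java. It does not use Apache Mahout, GATE or UIMA; it swaps in a plain keyword heuristic only to show the shape of the step. The class name, the cue phrases and the threshold are all assumptions made for illustration and are not part of the project specification.

```java
import java.util.List;
import java.util.Locale;

public class JobPageClassifier {
    // Hypothetical cue phrases; a trained model would replace this list.
    private static final List<String> CUES = List.of(
        "apply now", "job description", "responsibilities",
        "qualifications", "salary", "full-time");

    // Hypothetical threshold: distinct cues needed to label a page as a job posting.
    private static final int MIN_CUES = 2;

    /** Crudely removes script/style blocks and tags, then lowercases the text. */
    public static String visibleText(String rawHtml) {
        return rawHtml
            .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
            .replaceAll("<[^>]+>", " ")
            .toLowerCase(Locale.ROOT);
    }

    /** Counts how many distinct cue phrases appear in the visible text. */
    public static int cueCount(String rawHtml) {
        String text = visibleText(rawHtml);
        int count = 0;
        for (String cue : CUES) {
            if (text.contains(cue)) {
                count++;
            }
        }
        return count;
    }

    /** Returns true when the page contains at least MIN_CUES distinct cues. */
    public static boolean isJobPosting(String rawHtml) {
        return cueCount(rawHtml) >= MIN_CUES;
    }

    public static void main(String[] args) {
        String job = "<html><body><h1>Java Developer</h1>"
            + "<p>Job description: build crawlers. Apply now.</p></body></html>";
        String other = "<html><body><h1>About us</h1>"
            + "<p>Our company history.</p></body></html>";
        System.out.println(isJobPosting(job));
        System.out.println(isJobPosting(other));
    }
}
```

A heuristic like this would only serve as a baseline to compare against whichever of the three tools is finally chosen.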

JOB POSTING PAGES in RAW HTML FORMAT will then be pushed into a CDH 5 machine running Cloudera Express & Cloudera Search (single node) and indexed, so that web users can access the Solr index through a very simple interface (see [url removed, login to view]). The Cloudera Express & Search CDH 5 machine should be configured on a single node and in such a way that it permits very fast search and management of about 10 million RAW HTML pages.
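
To make the "very simple interface" idea a little more concrete, here is a minimal sketch in Java of how a web front end might build a request to Solr's standard `select` handler. The base URL, the collection name `jobs` and the class name are assumptions for illustration; the real values depend on how the Cloudera Search node is configured, and the field searched by `q` depends on the collection's default field settings.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class JobSearchQuery {
    /**
     * Builds a Solr select URL for a free-text user query.
     * solrBase and collection are hypothetical, deployment-specific values.
     */
    public static String selectUrl(String solrBase, String collection,
                                   String userQuery, int rows) {
        // URL-encode the user input so spaces and special characters are safe.
        String q = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return solrBase + "/" + collection + "/select?q=" + q
            + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // Assumed local Solr endpoint and collection name.
        System.out.println(
            selectUrl("http://localhost:8983/solr", "jobs", "java developer", 10));
    }
}
```

The front end would then fetch this URL and render each returned document as a link to the original job posting.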

Web user: the query should be very simple and very, very fast. After a query, web users should see a list like the one below and be able to follow the links to the original job posting websites:

[url removed, login to view]

This is a very brief form of the project.

PLEASE CHECK THE DETAILED VERSION OF THE PROJECT ATTACHED AND LET ME KNOW OF YOUR COMMENTS AND OFFER.

Kind regards,

CHRISTIAN

Skills: Apache Solr, Hadoop, HTML5, Java, Map Reduce


About the Employer:
( 0 reviews ) Romania

Project ID: #6237686

12 freelancers are bidding on average $3858 for this job

leadconcept

Kindly ignore the bid amount; it is just a placeholder to submit the proposal and start communication with you to discuss further. Our estimate is higher than your set budget, so would you be flexible on it?

$3092 USD in 40 days
(3 Reviews)
7.1
mitss

Hello, I checked your requirement details and the attached file. We can develop your website, similar to the [url removed, login to view] job search concept, with a CMS admin. Please check our developed website and mobile f...

$3157 USD in 55 days
(13 Reviews)
6.2
seekdeveloper

Hi, I have read your post and understood your requirement. I have good experience in handling Java/WordPress/Magento/Joomla/Drupal/HTML5/CSS3/PHP/Yii framework/JavaScript/MySQL. Kindly go through my works...

$3092 USD in 30 days
(15 Reviews)
6.1
mmadi

Hi, I am interested in your project and I'll be happy to do it for you. I have rich experience in scraping, curl, regular expressions, DOM and Selenium RC, captchas, IPs, Solr and big data. I worked for travelfox.c...

$2368 USD in 30 days
(4 Reviews)
5.2
sergioes

Thanks for inviting me to your project, but it has been a long time since I used Nutch. I guess I could get back to it, but I would need some extra time. Regards, Sergio.

$3157 USD in 30 days
(48 Reviews)
4.9
freelancerj2ee

I have done a couple of Hadoop/cloud computing related projects: [url removed, login to view] [url removed, login to view]...

$3222 USD in 30 days
(13 Reviews)
4.2
amitchaps

Hi, this is Amit Chaphekar. I have been working on freelancer for a long time and would like to work on this project. Kindly ping me if you are interested in talking with me. We can discuss mutual specific questi...

$2222 USD in 30 days
(2 Reviews)
3.2
aekpani

Dear Valued Customer, this is a placeholder bid and timeline, which we will adjust once we have your precise scope of work. That would enable us to assess your requirements and advise an optimum solution that tailore...

$4105 USD in 90 days
(2 Reviews)
3.1
shreeanuya

Hello, we are a web development, software and mobile application company with over 3 years of experience on all platforms. We have done over 500 sites and over 10 software products so far. We work for quality and...

$3157 USD in 30 days
(0 Reviews)
0.0
icelanceer

Hello, we are discussing your project with our team. If you would seriously like to work with us, please contact us. Note: the finishing time of the project can change depending on the concept of the project. T...

$3000 USD in 7 days
(0 Reviews)
0.0
veerac

Our domain expertise is commerce, which involves indexing structured and unstructured content. I want to set expectations clearly. We have used Solr (based on Lucene), Lucene with the Java API, easy ask and endicia search engin...

$11111 USD in 90 days
(0 Reviews)
0.0
mkaruppuswamy

Hi, we, "GM IT Software Services", specialize in Java/J2EE project deliveries. We have over 14 years in IT software services. We commit to 100% performance and on-time, within-budget delivery, with more than 30+...

$2500 USD in 30 days
(0 Reviews)
0.0
aditya1989gmail

Hi, I went through your project description along with the attachment and understand that you are looking for people who have previous experience in creating JOB POSTING websites. I have done such website projects with...

$5268 USD in 160 days
(0 Reviews)
0.0
swathi112

Hello, greetings from Globussoft Technologies. We've gone through your specifications and understood your requirements. After reading it all, we can say that yes, we can easily develop this application for all the plat...

$2777 USD in 30 days
(0 Reviews)
0.0