
RSS feed Lists and web page parser

$30-250 USD

Completed
Posted about 10 years ago


Paid on delivery
I am looking to build a program that checks a list of RSS feeds every day. Each RSS feed will contain URLs. The URLs from the feeds then need to be crawled, and all text needs to be parsed into keywords and totaled.

I will be installing this on a local Ubuntu Linux computer. The final product will be zipped up and sent to me with instructions for any extra plugins I need to install to make sure it runs properly. It must run on a MySQL database. The software can be written in either PHP or Python. All of the pieces I need are below.

-------

"List Admin" feature: Lists are just a name used as an identifier. I want to be able to create separate Lists. Each List can have RSS feeds added to or removed from it. So List A could have 5 RSS feeds, List B could have 9 RSS feeds, etc. An admin tool needs to be able to add/remove feeds for each List.

"Ignore common words" admin: I do not want to bother counting common words. An admin tool needs to exist to add/remove words to ignore. Examples: a, the, and, an, she, he, etc. The report piece below will then IGNORE all words in this list. These basic words should not be listed in the reports.

Every 8 hours by Cron: A program will run that checks the RSS feeds for new links within each feed, goes to each link, and crawls all text content on that page. The program will parse out every word, count how many times each word appears on that page, and put the totals into a database table. So if the word “horse” shows up 7 times on that particular URL, it will record that number. Each List’s RSS feeds will need to be kept separate, so List A won’t have data from List B. Each URL found in an RSS feed only needs to be parsed 1 time, so a record should be kept of all URLs previously crawled.

Reports: A simple report feature needs to exist that can analyze trending “hot” keywords by Hour, Day, Week, Month, or Year. So if a certain keyword showed up 38 times in one day for List A (totaled across all RSS feeds in the List), and it is the most popular keyword for the day, it would need to show up at the top of this report for the “Day”. For example:

Report for current week:
Horse 38
Donkey 22
Cow 11
Cat 8

Report for current day:
Pig 13
Dog 7
Rabbit 3

The report feature should show separately analyzed data for each List.

Data that needs to be saved:
- List Names
- RSS feeds for each account name
- URLs found in each RSS feed
- Whether a URL has previously been crawled
- Parsed words
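The counting and report steps described above can be sketched in Python, one of the two languages the posting allows. This is only an illustration under stated assumptions, not the delivered program: the names `TextExtractor`, `count_words`, and `top_keywords` are hypothetical, the tokenizer only recognizes ASCII letters, and feed fetching, the MySQL tables, and the record of already-crawled URLs are left out.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects page text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def count_words(html, stopwords):
    """Return a Counter of lowercase words on one page, minus ignored words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Assumption: a "word" is a run of ASCII letters, optionally with one
    # internal apostrophe (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
    return Counter(w for w in words if w not in stopwords)


def top_keywords(page_counts, limit=10):
    """Total the per-page counters for one List and rank the keywords."""
    total = sum(page_counts, Counter())
    return total.most_common(limit)
```

For instance, calling `count_words` on each crawled page of List A with the stopword set `{"a", "the", "and"}`, then passing the resulting counters to `top_keywords`, would give a ranking in the same shape as the sample reports (keyword, then total).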
Project ID: 5748681

About the project

10 proposals
Remote project
Active 10 yrs ago

Awarded to:
I have seven years of experience working professionally with Python and web software. I develop using Ubuntu. I propose developing this as follows:
- All options are controlled using a friendly command-line Python tool.
- Options would be:
-- add list
-- remove list
-- add url to list
-- remove url from list
-- show lists
-- show urls in list
-- add stopword (ignored word)
-- remove stopword
-- show stopwords
-- crawl all lists
-- generate report for given list and time period
- Help text is of course available, giving usage examples in case you forget.
- All URLs scanned from the feeds are normalized and inserted into the database, initially with an uncrawled status.
- Page text will be extracted in a straightforward way: the body text will be extracted and all tags stripped away. Dynamic content (Ajax etc.) will not be scraped.
- Words will be inserted in the db once for each currently active time period. Time periods that have expired will be deleted.
- It needs to be specified whether time periods are rolling or absolute. For instance, is 'current month' the last 30 days or just the few days we've had in April? Or do you perhaps want to aggregate for the current time period and the previous one? This could be different for different time periods.
Let me know if you have any questions. I'm looking forward to hearing from you.
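Two points in this proposal, URL normalization and the rolling-versus-absolute question, can be illustrated with a short Python sketch. It is a hedged illustration rather than the freelancer's actual code: `normalize_url` and `period_start` are hypothetical names, the normalization rules (lowercase host, drop the fragment, default the path to "/") are assumptions, and a rolling month and year are assumed here to mean 30 and 365 days.

```python
from datetime import datetime, timedelta
from urllib.parse import urlsplit, urlunsplit

# Assumed lengths for rolling windows; the posting does not define them.
ROLLING_SPANS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}


def normalize_url(url):
    """Reduce trivially different spellings of a URL to one form."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    # Lowercasing the whole netloc is a simplification; the fragment is
    # dropped because it does not change which page is fetched.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def period_start(now, period, rolling):
    """Return the start of the reporting window that ends at `now`."""
    if period not in ROLLING_SPANS:
        raise ValueError("unknown period: %r" % period)
    if rolling:
        return now - ROLLING_SPANS[period]
    if period == "hour":
        return now.replace(minute=0, second=0, microsecond=0)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "day":
        return midnight
    if period == "week":
        # Assumption: absolute weeks start on Monday.
        return midnight - timedelta(days=now.weekday())
    if period == "month":
        return midnight.replace(day=1)
    return midnight.replace(month=1, day=1)
```

With an arbitrary example time of 10 April 2014, 15:30, the absolute 'current month' would start on 1 April, while the rolling one would start 30 days earlier, on 11 March at 15:30. Storing the normalized form of each URL is one way to keep the "parsed only 1 time" rule from being defeated by a differing fragment or host capitalization.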
$200 USD in 7 days
5.0 (2 reviews)
3.1
10 freelancers are bidding on average $196 USD for this job
Hi, we have excellent design and website development skills and have completed over 700 projects at Freelancer.com. We will provide one year of hosting service along with this project, absolutely free. Thanks, Gopisoft Private Limited
$155 USD in 3 days
4.8 (50 reviews)
5.5
Hello, I am well experienced in writing scrapers in Python. I can make this for you as described, in Python code. Thank you.
$250 USD in 2 days
4.9 (25 reviews)
5.0
Hi, I am experienced in native PHP, Zend (1, 2), CI, Laravel, and jQuery, as well as Python (Django) and core Java. Repository: Git, SVN. Skype: brij420. Coding style: - System design, database design, and documentation - Development following coding standards - Testing - Deployment with documentation. Please contact me on Skype. Hope to hear from you soon. Regards, Brijesh
$177 USD in 5 days
5.0 (1 review)
2.9
Hello, I propose a Python/Django solution. Please tell me if you are interested so that we can discuss details. Thanks
$155 USD in 5 days
5.0 (3 reviews)
2.1
Hi, we have gone through your requirements. We are ready to do your work without any initial payment. PM us, or add me on Skype: dotventure2013. We have experts in Joomla, Ecommerce, Drupal, WordPress, WooCommerce, Magento, responsive sites using Bootstrap, Android, iOS, DotNet, MVC, SQL, PHP, MySQL, VoIP, Unity, video editing, etc. My name is Keshav, the Business Development Executive, and I will help you manage your project from concept to completion with proper communication, meeting quality standards while also understanding the project scope, time, and budget. • We have strong knowledge and 6 years of expertise in ERP, CRM, web development, and game development. • I will be available on email and chat as per the client's timings and convenience. Thanks and regards, Keshav
$277 USD in 3 days
0.0 (0 reviews)
0.0

About the client

Flag of UNITED STATES
Los Angeles, United States
5.0
11
Payment method verified
Member since Jan 8, 2014
