Extract word by word all text and metadata from an HTML page.
€150-500 EUR
Cancelled
Posted over 5 years ago
Paid on delivery
Extract, word by word, all text and metadata from an HTML page. Save billions of HTML pages in a structured way in a SQL database so that you can analyze words and tags, while minimizing the storage space required, maximizing performance, and still being able to reconstruct each HTML page with the same text, including punctuation marks and tags.
See instructions, data to be extracted and expected results of an example at:
[login to view URL]
You can use a popular, well-maintained HTML DOM parser such as AngleSharp.
Coding must be done in C# using async methods. It must return results quickly and efficiently.
Please state in your answer:
1. Whether you have experience with this, and how
2. Which parser you would use
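As a rough illustration of the storage scheme described above, the following is a minimal sketch and not the requested C# deliverable. It is written in Python with the standard-library sqlite3 module purely so that it is self-contained. The table names (token, page_token), the function names, and the regex tokenizer are all assumptions made for illustration; a real implementation would take its tokens from a DOM parser such as AngleSharp. Each distinct word, tag, punctuation mark, or whitespace run is stored once, and a page is stored as an ordered list of token ids, so the original text can be rebuilt exactly.

```python
import re
import sqlite3

# Hypothetical tokenizer: a regex stand-in for a real HTML parser.
# Every character falls into exactly one token, so joining the tokens
# reproduces the input string unchanged.
TOKEN_RE = re.compile(r"<[^>]+>|\w+|\s+|.", re.S)


def tokenize(html):
    return TOKEN_RE.findall(html)


def kind_of(tok):
    """Classify a token as 'tag', 'space', 'word' or 'punct'."""
    if len(tok) > 1 and tok[0] == "<" and tok[-1] == ">":
        return "tag"
    if tok.isspace():
        return "space"
    if re.fullmatch(r"\w+", tok):
        return "word"
    return "punct"


def create_schema(conn):
    # token: one row per distinct (kind, text) pair, shared by all pages.
    # page_token: the ordered sequence of token ids that makes up a page.
    conn.executescript(
        """
        CREATE TABLE token (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            text TEXT NOT NULL,
            UNIQUE (kind, text)
        );
        CREATE TABLE page_token (
            page_id  INTEGER NOT NULL,
            pos      INTEGER NOT NULL,
            token_id INTEGER NOT NULL REFERENCES token(id),
            PRIMARY KEY (page_id, pos)
        ) WITHOUT ROWID;
        """
    )


def store_page(conn, page_id, html):
    for pos, tok in enumerate(tokenize(html)):
        kind = kind_of(tok)
        # Insert the token only if it is not already in the dictionary.
        conn.execute(
            "INSERT OR IGNORE INTO token (kind, text) VALUES (?, ?)",
            (kind, tok),
        )
        (token_id,) = conn.execute(
            "SELECT id FROM token WHERE kind = ? AND text = ?",
            (kind, tok),
        ).fetchone()
        conn.execute(
            "INSERT INTO page_token (page_id, pos, token_id) VALUES (?, ?, ?)",
            (page_id, pos, token_id),
        )
    conn.commit()


def load_page(conn, page_id):
    """Rebuild the page text by joining its tokens in stored order."""
    rows = conn.execute(
        "SELECT t.text FROM page_token p "
        "JOIN token t ON t.id = p.token_id "
        "WHERE p.page_id = ? ORDER BY p.pos",
        (page_id,),
    )
    return "".join(text for (text,) in rows)


def word_counts(conn):
    """Example analysis: occurrences of each word across all stored pages."""
    rows = conn.execute(
        "SELECT t.text, COUNT(*) FROM page_token p "
        "JOIN token t ON t.id = p.token_id "
        "WHERE t.kind = 'word' GROUP BY t.text"
    )
    return dict(rows)
```

To try it, open an in-memory database with `sqlite3.connect(":memory:")`, call `create_schema`, store a page with `store_page(conn, 1, html)`, and check that `load_page(conn, 1)` returns the same string. Storing each token once keeps the dictionary small when pages repeat words and tags, at the cost of one join on reconstruction; whether that trade-off holds at billions of pages would need measuring.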
With respect to your project, I would like to inform you that I will be able to work on it.
I have experience with making HTTP requests, using libraries such as:
HttpWebRequest
HttpClient
RestClient
WebClient, etc.
For parsing I use HtmlAgilityPack, which I find more powerful, but if you want to use AngleSharp we can go with that. I would suggest going ahead with HtmlAgilityPack.
€300 EUR in 7 days
3.1 (2 reviews)
12 freelancers are bidding on average €405 EUR for this job
Hello,
I can help you extract the words from the HTML code.
I have been developing scrapers for 5 years.
1. If you have experience with this and how
I am used to working in C#.
2. Which parser you would use
I always use the HtmlAgilityPack package.
Please contact me.
I would like to work with you over the long term.
Thanks.
Hello, thanks for your post; it matches my experience well.
HTML parser and extractor development is a very good fit for me.
I'm going to use Python to scrape the data.
My relevant skills are:
C# Programming, HTML5
I would like to discuss this further with you.
I look forward to working together in partnership on your project and into the future.
Regards
Hi
I'm a C# developer with 12 years of experience with the .NET Framework and related tools.
A SQL database is not well suited to data at this scale. To store a huge number of records, on the order of billions, you need to choose a database that fits your use case.
It depends: if you need a search engine, Elasticsearch is a good choice, and if you need to store the pages really fast, you should use a wide-column database engine such as Hypertable or Cassandra.
If you want to know which database is right for you, you will need to tell us more about your project.
Please send me a message so we can discuss the details.
Hello!
Nice to meet you.
I am Richeng Wu.
I have read your post carefully.
I am a web expert with 7 years of experience in web development.
I already have experience in developing something like this.
If you hire me, I will give you excellent results with a small amount of money in the shortest time.
Please award me your project.
Thanks