Company Name Matching Engine

Cancelled Posted Mar 31, 2012 Paid on delivery

$30-5000 USD

I often need to match company names between two separate large CSV files. Matching company names well is not a trivial task. Various algorithms and techniques should be considered, including Levenshtein edit distance, Smith-Waterman distance, Jaccard token distance, and weighting common company-name tokens differently from uncommon ones.
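As a rough illustration of two of the measures named above, here is a minimal Python sketch of Levenshtein edit distance and Jaccard token distance. The function names and the tokenization rule (lowercase, drop periods, split on non-alphanumeric characters) are assumptions made for this example, not requirements from the posting.

```python
import re


def levenshtein(a, b):
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def tokens(name):
    """Assumed tokenization: lowercase, drop periods so 'L.L.C' becomes 'llc'."""
    cleaned = name.lower().replace(".", "")
    return set(re.findall(r"[a-z0-9]+", cleaned))


def jaccard_distance(a, b):
    """1 minus (shared tokens / all tokens); 0.0 means identical token sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)
```

On the example names that follow, neither unweighted measure is enough on its own: "DSZ Investments, LLC" and "DSG Investments, LLC" are only one edit apart and have a Jaccard distance of 0.5, while "DSZ Investments, LLC" and "D.S.Z Investment Company" come out further apart at 0.8.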

For example, provided company names such as:

DSZ Investments, LLC

D.S.Z Investment Company

DSZ Investments, L.L.C

DSG Investments, LLC

The first three should be considered the same company, but the fourth should be considered a separate company even though its edit distance from the others is very small. A common token such as "Company" must carry very low weight in the match, whereas the uncommon token "DSG" must carry a much heavier weight because of its rarity.
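One possible way to get this behaviour is to weight each token by its inverse document frequency (IDF) and compare names by cosine similarity. This is a sketch under assumptions, not necessarily the approach in the attached document: the extra corpus names are made up for illustration, the trailing-"s" stripping is a crude stand-in for real normalization, and in practice the weights would be computed from the names in the two input files.

```python
import math
import re
from collections import Counter

# The four names from the posting plus made-up filler names (hypothetical).
CORPUS = [
    "DSZ Investments, LLC",
    "D.S.Z Investment Company",
    "DSZ Investments, L.L.C",
    "DSG Investments, LLC",
    "Acme Holdings Company",
    "Northwind Investment Company",
    "Blue River Investments, LLC",
    "Harbor Trading Company, LLC",
    "Summit Capital, LLC",
    "Cedar Investments Company",
]


def tokenize(name):
    """Lowercase, drop periods, split on non-alphanumerics, strip a plural 's'."""
    cleaned = name.lower().replace(".", "")
    toks = re.findall(r"[a-z0-9]+", cleaned)
    return {t[:-1] if len(t) > 3 and t.endswith("s") else t for t in toks}


def build_idf(names):
    """Map each token to log(N / number of names containing it)."""
    doc_freq = Counter(tok for name in names for tok in tokenize(name))
    return {tok: math.log(len(names) / df) for tok, df in doc_freq.items()}


def similarity(a, b, idf, unseen_weight):
    """Cosine similarity of IDF-weighted token vectors, in the range 0 to 1."""
    wa = {t: idf.get(t, unseen_weight) for t in tokenize(a)}
    wb = {t: idf.get(t, unseen_weight) for t in tokenize(b)}
    dot = sum(wa[t] * wb[t] for t in wa.keys() & wb.keys())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


IDF = build_idf(CORPUS)
UNSEEN = math.log(len(CORPUS))  # tokens absent from the corpus count as rare
```

With this small corpus, "DSZ Investments, LLC" against "D.S.Z Investment Company" scores roughly 0.8, while "DSZ Investments, LLC" against "DSG Investments, LLC" scores roughly 0.1, because the rare tokens "dsz" and "dsg" dominate and the frequent tokens "investment", "llc" and "company" contribute little. A match threshold would still need tuning on real data.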

A highly relevant document is attached to this post. The principles in it should be codified and integrated into the project.

Experience doing this type of matching or designing these types of algorithms would be very helpful. I work in a Unix environment and am looking for a command-line tool that can be run from the bash shell.
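To make the command-line requirement concrete, here is a hedged skeleton of such a tool in Python. Everything about the interface is an assumption for illustration: the script name `match_names.py`, the `--column` and `--threshold` options, and the output layout are hypothetical, and the scorer uses the standard library's `difflib.SequenceMatcher` purely as a placeholder for a real weighted matcher.

```python
import argparse
import csv
import difflib
import sys


def read_names(path, column):
    """Return the non-empty values of one column from a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row[column] for row in csv.DictReader(handle) if row.get(column)]


def score(a, b):
    """Placeholder similarity in [0, 1]; swap in a token-weighted scorer here."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matches(names_a, names_b, threshold):
    """For each name in A, return (name, best candidate in B or '', score)."""
    results = []
    for a in names_a:
        best = max(names_b, key=lambda b: score(a, b), default=None)
        best_score = score(a, best) if best is not None else 0.0
        if best_score >= threshold:
            results.append((a, best, round(best_score, 3)))
        else:
            results.append((a, "", round(best_score, 3)))
    return results


def main(argv):
    parser = argparse.ArgumentParser(description="Match company names across two CSV files.")
    parser.add_argument("file_a")
    parser.add_argument("file_b")
    parser.add_argument("--column", default="company", help="header of the name column")
    parser.add_argument("--threshold", type=float, default=0.8)
    args = parser.parse_args(argv)

    rows = best_matches(read_names(args.file_a, args.column),
                        read_names(args.file_b, args.column),
                        args.threshold)
    writer = csv.writer(sys.stdout)
    writer.writerow(["name_a", "name_b", "score"])
    writer.writerows(rows)
    return 0


# Run only when arguments are supplied; argparse reports any that are missing.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

From bash this might be invoked as `python3 match_names.py a.csv b.csv --column company > matches.csv`. Note that comparing every name in one file against every name in the other is quadratic, so truly large files would likely need some form of blocking or indexing first.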

Please review the attached document and let's get the conversation going. Canned replies will be ignored.

Thanks for your interest in this project.

Script Install, Shell Script

Project ID: #2727519

About the project

1 proposal Remote project Active Apr 22, 2012

1 freelancer is bidding on average $636 for this job

AnkSoftware

See private message.

$635.8 USD in 20 days

(4 Reviews)

5.0
