What’s in a (Company) Name?
By Ram Anantha, Infoglide Director of Product Management
Matching company names in a database may seem like a simple task. HP is HP, right? Oh wait, HP is also Hewlett Packard or maybe Hewlett-Packard or Hewlett-Packard Company. And EDS is now also HP due to an acquisition. Oh, and by the way, EDS stands for Electronic Data Systems. Yikes!
Matching company names turns out to be quite complicated, which is one reason some companies can get away with not paying their workers’ compensation premiums simply by going “out of business” and starting up again under a different name.
There are still some things, like applying context to data and reading emotional cues, that human beings do better than computers (phew!). But software has gotten pretty smart, and identity resolution technology is able to make a lot of the same connections that the human mind would. Some examples of the types of variation in data that identity resolution technology can resolve include:
- Suffix variation - Wells Fargo & Company vs. Wells Fargo & CO
- Common substitutions - Department of Defense vs. Dept of Defense
- Company short forms - Federal Express vs. Fedex
- Missing/inserted tokens - Allied Waste Industries Inc vs. Allied Waste Inc
- Token transposition - Law Offices of Dale, Fischer & Cobb vs. Law Offices of Dale, Cobb & Fischer
- Two-character transposition - Weis Markets vs. Wies Markets
- Spelling equivalents - Cedar Shopping Centers vs. Cedar Shopping Centres
- Phonetic similarity - Filene’s Basement vs. Philene’s Basement
- International equivalents - Cemex sociedad anónima bursátil de capital variable vs. Cemex s.a.b. de c.v.
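To make a few of these variation types concrete, here is a minimal Python sketch using only the standard library. It is not how any particular identity resolution product works; the substitution table, the 0.85 threshold, and the function names `normalize`, `similarity`, and `is_match` are all assumptions made up for illustration. It handles suffix variation, common substitutions, spelling equivalents, token transposition, and small character transpositions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical substitution table; a real system would use a much larger,
# curated dictionary of suffixes, abbreviations, and spelling variants.
SUBSTITUTIONS = {
    "co": "company",
    "dept": "department",
    "inc": "incorporated",
    "centres": "centers",
}


def normalize(name: str) -> list[str]:
    """Lowercase, drop punctuation, expand known substitutions, sort tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    # Sorting makes the comparison insensitive to token order.
    return sorted(SUBSTITUTIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Return a 0..1 character-level similarity of the normalized names."""
    left = " ".join(normalize(a))
    right = " ".join(normalize(b))
    return SequenceMatcher(None, left, right).ratio()


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as the same company above an assumed threshold."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    pairs = [
        ("Wells Fargo & Company", "Wells Fargo & CO"),
        ("Department of Defense", "Dept of Defense"),
        ("Law Offices of Dale, Fischer & Cobb",
         "Law Offices of Dale, Cobb & Fischer"),
        ("Weis Markets", "Wies Markets"),
    ]
    for a, b in pairs:
        print(f"{a!r} vs. {b!r}: {similarity(a, b):.2f}")
```

A sketch like this does not cover short forms such as Federal Express vs. Fedex, phonetic similarity, or names linked by an acquisition; those typically call for alias tables or phonetic encodings on top of string comparison.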
And when other attributes (e.g., street addresses, phone numbers, executive management) are taken into consideration, the quality of the matches can be further enhanced. Fast, reliably accurate searching of company names, including discovery of duplicates and relationship links, is a basic need for many business and government applications. Applying sophisticated identity resolution technology can remove and prevent confusion that would otherwise hamper critical applications.
