Archive for the ‘Name Matching’ Category

Identity Resolution Daily Links 2011-01-23

Sunday, January 23rd, 2011

[Post from Infoglide] Financial Services Has a Growing Problem: Internal Fraud

“The Aite Group recently authored a report entitled ‘Internal Fraud: The Devil Within.’ After surveying 35 fraud and product executives at financial institutions across the U.S. and Canada, they concluded that internal fraud is a severe and growing problem that often goes undetected and almost always flies under the radar of public scrutiny.”

Bloor: There’s identity resolution and then there’s identity resolution

“The second type of identity resolution is similar but different. The classic example is in police work. Here you want to know that some particular criminal has fifteen different aliases, say. Moreover, under each of those identities he or she will have multiple contacts and you may want to do social network analysis against those contacts to see who else might have criminal tendencies.”

Chicago Sun Times: Police sensing crime before it happens

“In October, the Chicago Police Department’s new crime-forecasting unit was analyzing 911 calls for service and produced an intelligence report predicting a shooting would happen soon on a particular block on the South Side. Three minutes later, it did, police officials say. That got police Supt. Jody Weis thinking. He wondered if the department could produce intelligence reports even quicker. Next time, officers might have an hour’s notice before a shooting — instead of just a few minutes.”

KERO23: Ten People Indicted In Wide-Ranging Real Estate Scam

“The indictment alleges that, from approximately January 2004 to September 2007, the defendants perpetrated a scheme to defraud mortgage lenders by submitting fraudulent loan applications with material misrepresentations, including misrepresentations concerning the borrower’s income, assets, employment status, and intent to use the home as the borrower’s primary residence… The scheme involved more than $20 million in losses to lenders.”

Entity Identity Management

Friday, January 14th, 2011

By John Talburt, PhD, CDMP, Director, UALR Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ)

First, let me wish everyone a Happy and Prosperous New Year.  Also, since my last post, my book Entity Resolution and Information Quality has been published and is now available from Morgan Kaufmann Publishing (http://mkp.com/news/entity-resolution-and-information-quality).

What is entity identity management? It simply means that an ER system can store and maintain a record of identity information that persists over time.  Entity identity management is essential for an ER engine to operate in identity resolution or identity capture mode and for it to maintain persistent entity identifiers.

As you may recall from previous discussions, an identity resolution ER system starts with a set of known (asserted) identities and attempts to determine if a given entity reference refers to one of these known entities.  On the other hand, an identity capture ER system starts with a blank slate and tries to construct an identity based on the (equivalent) references it processes.

Two important concepts here bear further discussion.  One is the structure used to represent the identity of an entity.  The second, somewhat more philosophical, is the question of what constitutes entity identity.

There are two commonly used approaches to representing identity in ER systems – one is an attribute-level structure sometimes called a “merge identity” and the other is a reference-level structure sometimes called a “cluster identity.”  The difference between a merge identity and a cluster identity can be illustrated by a simple example.

Suppose we have a system where entity references have three attributes A, B, and C, and that we are given two specific entity references R1=(a1, b1, c1) and R2=(a2, b2, c1), where a1 and a2 are values for attribute A, b1 and b2 values for attribute B, and c1 a value for attribute C.  Finally assume that references R1 and R2 are determined to be equivalent references (i.e. references to the same real-world entity).  In the merge identity approach, the entity identity EM referenced by R1 and R2 would be represented as

EM=[A:{a1, a2}, B:{b1, b2}, C:{c1}]

This means that for identity EM the A attribute can take on either the value a1 or a2, the B attribute either the value b1 or b2, and the C attribute only the value c1.  In a merge identity the binding between the values a1 and b1 that was expressed by their co-occurrence in the reference R1 is lost.  Similarly the binding between a2 and b2 expressed by R2 is no longer present in EM.

In a cluster identity structure, the original reference binding between attribute values is preserved.  In the cluster identity approach, the entity identity EC referenced by R1 and R2 would be represented as

EC=[(A:a1, B:b1, C:c1), (A:a2, B:b2, C:c1)]

Thus, for identity EC the attributes A, B, and C can only take on the combinations of values given by the original references R1 and R2. There are advantages and disadvantages to both approaches, but most significantly they can lead to different resolutions for the same set of references.
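As a rough illustration of the two structures just described, here is a minimal Python sketch. The function names (merge_identity, cluster_identity, merge_allows, cluster_allows) and the choice of dicts, sets, and lists are assumptions made for this example; the post defines only the notation, not an implementation.

```python
# Hypothetical sketch: names and data layout are illustrative assumptions.
ATTRS = ("A", "B", "C")

def merge_identity(refs):
    """Attribute-level structure: one set of observed values per attribute."""
    return {attr: {ref[i] for ref in refs} for i, attr in enumerate(ATTRS)}

def cluster_identity(refs):
    """Reference-level structure: every original reference is kept intact."""
    return [dict(zip(ATTRS, ref)) for ref in refs]

def merge_allows(identity, a, b):
    """In a merge identity, any observed A value may pair with any observed B value."""
    return a in identity["A"] and b in identity["B"]

def cluster_allows(identity, a, b):
    """In a cluster identity, A and B must co-occur in one stored reference."""
    return any(r["A"] == a and r["B"] == b for r in identity)

R1 = ("a1", "b1", "c1")
R2 = ("a2", "b2", "c1")

EM = merge_identity([R1, R2])
EC = cluster_identity([R1, R2])

print(merge_allows(EM, "a1", "b2"))    # True: the a1-b1 binding has been lost
print(cluster_allows(EC, "a1", "b2"))  # False: no stored reference pairs a1 with b2
```

The pair (a1, b2) never occurred in any reference, yet the merge structure accepts it while the cluster structure does not, which is the loss of binding described above.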

To illustrate, let’s continue with the preceding example by supposing that the systems using the merge identity and the cluster identity both use the same two resolution rules.  Rule 1 is that the two references are considered equivalent if they agree (exact match) on Attribute C.  Rule 2 is that they are equivalent if they agree (exact match) on both Attributes A and B.

Now suppose that each system processes a third entity reference R3=(a1, b2, c2).  Using the two rules just discussed, the merge identity system would resolve R3 as equivalent to the identity EM represented by references R1 and R2.  Rule 1 does not apply, because c2 is not among the C values of EM, but Rule 2 does: R3 agrees with EM on attribute A (a1) and also on attribute B (b2).  On the other hand, R3 would not resolve to the identity EC in the cluster identity system.  R3 does not satisfy either Rule 1 or Rule 2 with respect to either of the references R1 and R2 that make up the cluster identity EC: it shares a1 with R1 and b2 with R2, but matches neither reference on both A and B, and its C value c2 differs from c1.
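The R3 example can be checked with a short Python sketch. This is only one possible reading of the rules: it assumes that "agreeing" with a merge identity means the reference's value appears in that attribute's value set, and that resolving to a cluster identity means satisfying a rule against at least one stored reference. The function names are hypothetical.

```python
# Hypothetical sketch of Rule 1 (exact match on C) and Rule 2 (exact match on
# both A and B) applied to the two identity structures.

def resolves_to_merge(ref, em):
    """Compare the reference against the per-attribute value sets."""
    a, b, c = ref
    rule1 = c in em["C"]
    rule2 = a in em["A"] and b in em["B"]
    return rule1 or rule2

def resolves_to_cluster(ref, ec):
    """Compare the reference against each stored reference individually."""
    a, b, c = ref
    return any(c == rc or (a == ra and b == rb) for (ra, rb, rc) in ec)

EM = {"A": {"a1", "a2"}, "B": {"b1", "b2"}, "C": {"c1"}}
EC = [("a1", "b1", "c1"), ("a2", "b2", "c1")]
R3 = ("a1", "b2", "c2")

print(resolves_to_merge(R3, EM))    # True, via Rule 2
print(resolves_to_cluster(R3, EC))  # False, neither rule holds for R1 or R2
```

The same rules and the same references give different outcomes only because of how the identity is stored.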

Merge identities and cluster identities both represent valid, but different, approaches to identity management.  To some extent they also represent two different ways of thinking about entity identity.  I plan to discuss the concept of the entity identity further in the next post.

Identity Resolution Daily Links 2011-01-11

Tuesday, January 11th, 2011

By the Infoglide Team

BND.com: Insurance fraud investigators begin probe into workers’ comp claims at Menard

“A total of 389 guards and other workers have filed more than 500 claims, including about 290 still pending. About 230 of these claimed injury for the underlying cause of ‘repetitive trauma,’ including carpal tunnel syndrome, an injury of the wrist. The prison employs about 760 workers, of which 567 are guards. ‘The Department of Insurance is investigating recent questions raised in connection with workers’ compensation claims filed against the state of Illinois at the Menard Correctional Center,’ department spokesman Louis Pukelis said Tuesday in a written statement.”

HSToday: Fusion Centers: Tough Tightrope 

“As states and localities have put up fusion centers designed precisely to overcome this, however, they’ve had to face a different challenge: ensuring not only the quantity but the quality of information they collect and report. In candid conversations with Homeland Security Today, leading privacy advocates, scholars and state law enforcement and federal officials addressed some of the key facets of this challenge, as well as steps that can be taken to ensure that fusion centers live up to their full potential as a counterterrorism tool.”

StarNewsOnline: North Carolina collects big from Medicaid fraudsters

“North Carolina’s Medicaid fraud investigators pulled in millions last year through dozens of cases of fraud and patient abuse, the state’s attorney general’s office reported Monday. The office’s Medicaid Investigations Unit prosecuted 22 criminal convictions and 18 civil settlements, recovering $53.5 million, during the federal fiscal year that ended Sept. 30, according to a press release from N.C. Attorney General Roy Cooper.”

ReadWriteWeb: What Cloud Computing Means For Small Businesses

“Needless to say, it’s a huge deal. Gartner recently put cloud computing at the top of its list of top strategic technologies for 2011 and it’s far from the only expert extolling the glory of the Web-hosted software and infrastructure. For small businesses, the significance of this primarily comes down to cost. In many cases, using cloud-based infrastructure is cheaper than running and maintaining one’s own physical servers.”

Identity Resolution Daily Links 2011-01-09

Sunday, January 9th, 2011

[Post from Infoglide] You Can’t Handle the Truth

“We have a new Congress and a new House majority leader as of this week’s swearing in ceremony. The current House majority party (R) plans to pass a bill to repeal the ‘Obamacare’ bill passed during the last session by the former House majority party (D).  Both parties make ‘fact based’ arguments about why killing or keeping the bill will reduce the deficit, yet both can’t be right. This isn’t a political blog, and I’m not going to take a side on this issue. What struck me is how often we use ‘facts’ to bolster our argument, with ‘facts’ defined as any real data that can be massaged or misinterpreted to suggest that our desired outcome appears to be the best one.”

The Washington Post: The Navigator: Does Secure Flight program mean more money for the airlines?

“When she arrived at the screening area, her husband’s incorrect name had already been checked against a list of potential security threats and had passed. Once passengers receive their boarding passes, the Secure Flight process is already complete, according to the TSA.”

LinkedIn: Data Quality of Gender / Sex Codes and the Impacts on Identity Data Matching

“Identity matching requires matching practitioners to decide which collection of fields best allows the correct matching of one record with another. The choice can be made from fields such as name, date of birth, address details, sex / gender, and even unique identifier values (when they exist). The use of sex / gender in that process might be seen in a slightly different light.”

nj.com: Bill would allow people to buy New Jersey Lottery tickets electronically

“Under the bill, the commission would establish procedures for the payment of winning tickets holders, which may include crediting amounts won to a player’s account or direct deposit into a player’s account at a financial institution… The commission would also be directed to ensure that the program includes security measures to protect against fraud, prevent wagering by underage persons and protect the personal and financial information of players.”

Looking Back on 2010

Thursday, December 23rd, 2010

Looking back over the past year, we’re especially grateful for relationships we’ve built and grown with customers and partners. Despite a less than stellar economy, 2010 provided another good year of growth for Infoglide Software.

2010 also proved to be a year of accelerated visibility for identity resolution and entity analytics in general. Industry consolidation moves (e.g., IBM’s March acquisition of Initiate Systems) demonstrate the critical importance of entity resolution in the new era of Big Data that has been developing.

For the readers of IdentityResolutionDaily, please accept our thanks for your continuing interest and participation in the exciting growth of this market. 2011 promises to be a year of continued change and challenge, and we look forward to the opportunities it offers.

We’ll start with new posts again in January.

Happy Holidays, and Best Wishes for a Wonderful 2011!

Mike Shultz
CEO, Infoglide Software

Identity Resolution Daily Links 2010-12-21

Tuesday, December 21st, 2010

By the Infoglide Software Team

Cliffview Pilot: Fighting crime with modern tools amid budget cuts

“Professional analysts and law enforcement officers from more than 15 different agencies including the FBI, ATF, DEA, US Marshall’s, Homeland Security, and state and county partners work from one large room to put out intelligence products in a truly collaborative environment that defines New Jersey’s fusion center. Products include crime mapping with predictive analysis to help local departments know when and where crimes are likely to occur in the future.”

Thomasville Times-Enterprise: Pharmacist fraud

“Morgan’s prison sentence will be followed by three years of supervised release. Morgan was ordered to pay restitution of $2,804,462. Morgan, 64, was convicted in October 2008, of 69 counts of health care fraud, following a two-week jury trial in Albany. Michael J. Moore, U.S. attorney for the Middle District of Georgia, said the indictment charged that for a period of several years ending in August 2007, Morgan, a registered pharmacist and the owner of Thrift Center Pharmacy in Camilla, executed a scheme to defraud the Georgia Medicaid program, which is jointly funded with state and federal funds.”

FATF: Money Laundering Using Trusts and Company Service Providers [PDF]

“TCSPs are often involved in some way in the establishment and administration of most legal persons and arrangements; and accordingly in many jurisdictions they play a key role as the gatekeepers for the financial sector. This report provides a number of case studies which demonstrate that TCSPs have often been used, wittingly or unwittingly, in the conduct of money laundering activities.”

Identity Resolution Daily Links 2010-12-19

Sunday, December 19th, 2010

[Post from Infoglide] Big Data and Entity Resolution (part 2)

“We talked a week ago about the rapidly emerging market space called Big Data. One statistic that opened my eyes is Gartner’s prediction that the volume of new data generated by enterprises will grow by 650% in the next five years, and 80% of that will be unstructured data! The 451Group’s definition of Big Data describes a growing need for non-traditional processes that can treat massive amounts of data as a whole, thereby making it impossible to use many traditional tools and techniques.”

KXAN.com: A look inside new crime-fighting tool

InformationWeek Healthcare: Medicare Expands Analytic Tools To Fight Fraud

“These tools will integrate many of the agency’s pilot programs into the National Fraud Prevention Program and complement the work of the joint HHS and Department of Justice Health Care Fraud Prevention and Enforcement Action Team (HEAT). ‘Preventing fraud is more effective than the old “pay and chase” model of fighting fraud after a sham provider has been paid and disappeared,’ CMS administrator Donald Berwick said in a statement. ‘By using new predictive modeling analytic tools we are better able to expand our efforts to save the millions — and possibly billions — of dollars wasted on waste, fraud, and abuse.’”

InformationWeek: The Morphing IT Budget: It’s About More Than Opex

“Concerns that internal initiatives, and the CIO’s clout, will be gutted and most funds redirected to the cloud are overstated–for now. But we are at an inflection point: IT has money to spend, but it can’t be allocated using the same old budget process that’s kept us in a rut of dedicating a third or more of our resources to keeping the lights on. Business leaders have little patience for high-priced, long-term IT slogs. They’ve seen massive 18-month projects fail and experienced success with lightweight software-as-a-service offerings. CIOs must look at each expenditure and think, ‘Will this buy us flexibility and advance the business?’”

Identity Resolution Daily Links 2010-12-14

Tuesday, December 14th, 2010

By the Infoglide Software Team

American Medical Software: Electronic Medical Records Use Over Majority

“Results from the National Ambulatory Medical Care Survey (NAMCS) show that between 2009 and 2010, the percentage of physicians reporting having an electronic medical record/electronic health record (EMR/EHR) system that meets the criteria of a basic system increased by 14% and a fully functional system increased by 46%.”

avanade: Global Survey: The Impact of Big Data

“In the global marketplace, businesses, suppliers and customers are creating and consuming vast amounts of information. Gartner predicts that enterprise data in all forms will grow 650 percent over the next five years. According to IDC, the world’s volume of data doubles every 18 months. This flood of data, often referred to as ‘information overload,’ ‘data deluge’ and ‘big data,’ clearly creates a challenge for business leaders.”

Gartner: Technology Trends You Can’t Afford to Ignore

  1. Virtualization
  2. Data Deluge
  3. Energy and Green IT
  4. Complex Resource Tracking
  5. Consumerization and Social Software
  6. Unified Communications
  7. Mobile and Wireless
  8. System Density
  9. Mashups and Portals
  10. Cloud Computing

Identity Resolution Daily Links 2010-12-12

Sunday, December 12th, 2010

[Post from Infoglide] Big Data and Entity Resolution

“Early this year, Gartner suggested that a ‘data deluge’ has begun. In his recent Dataspora Blog post about ‘Big Data’ and what it means, author Michael Driscoll presents a unique and interesting perspective on the massive amounts of data being generated and stored. According to The 451 Group’s definition…”

Mastering Data Management: Entity Resolution & MDM: Interchangeable?

“Gartner released a report in November entitled, ‘Top 10 Technology Trends Impacting Information Infrastructure, 2011.’ Two of the top ten trends were ‘Entity Resolution and Analysis’ and ‘Master Data Management.’”

KeysNet.com: Keys whistleblowers awarded $88 Million

“According to Taxpayers Against Fraud, a nonprofit group based in Washington, D.C., National Medical then launched a campaign to force Ven-A-Care out of business. But in fighting back, Ven-A-Care staff discovered National Medical was paying kickbacks to doctors who prescribed medicines and services that weren’t needed, then billing Medicare and Medicaid exorbitant sums far in excess of what the medicines and services cost. The Justice Department eventually got a $486 million settlement from National Medical — and Ven-A-Care received $40 million as its reward under the False Claims Act.”

Yahoo News: Russian Banks Report $3.8 Trillion in Suspicious Transactions This Year

“Russian financial institutions reported 120 trillion roubles (2.44 trillion pounds) of suspicious transactions to the anti-money laundering watchdog in the first nine months of 2010, the Kommersant daily reported on Monday.”

Big Data and Entity Resolution

Thursday, December 9th, 2010

By Mike Betron, Infoglide Software Director of Marketing

Early this year, Gartner suggested that a “data deluge” has begun. In his recent Dataspora Blog post about “Big Data” and what it means, author Michael Driscoll presents a unique and interesting perspective on the massive amounts of data being generated and stored. According to The 451 Group’s definition,

“Big data is a term applied to data sets that are large, complex and dynamic (or a combination thereof) and for which there is a requirement to capture, manage and process the data set in its entirety, such that it is not possible to process the data using traditional software tools and analytic techniques within tolerable time frames.”

While the term “Big Data” continues to evolve, no one disputes that there are unique problems associated with capturing and using it, and part of the challenge derives from the multiple disparate sources of data.

[Figure: Where does big data come from?]

Source: Avanade Global Survey: The Business Impact of Big Data, November 2010

In trying to get a handle on the issue’s impact, Avanade published in November the results of an August “survey of 543 C-level executives and IT decision-makers in 17 countries”. Some of the more interesting findings are:

  • The “data deluge” is real and is a source of frustration to many business and government leaders.
  • A majority believe that the data deluge “fundamentally changes the way their businesses operate.”
  • 46 percent of companies report they have made an inaccurate business decision as a result of bad or outdated data.

Highly scalable entity resolution technology will play a key role in solving the Big Data problem. We’ll talk more about this in a later post.

