Archive for the ‘Data Quality’ Category

Identity Resolution Daily Links 2011-01-23

Sunday, January 23rd, 2011

[Post from Infoglide] Financial Services Has a Growing Problem: Internal Fraud

“The Aite Group recently authored a report entitled ‘Internal Fraud: The Devil Within.’ After surveying 35 fraud and product executives at financial institutions across the U.S. and Canada, they concluded that internal fraud is a severe and growing problem that often goes undetected and almost always flies under the radar of public scrutiny.”

Bloor: There’s identity resolution and then there’s identity resolution

“The second type of identity resolution is similar but different. The classic example is in police work. Here you want to know that some particular criminal has fifteen different aliases, say. Moreover, under each of those identities he or she will have multiple contacts and you may want to do social network analysis against those contacts to see who else might have criminal tendencies.”

Chicago Sun Times: Police sensing crime before it happens

“In October, the Chicago Police Department’s new crime-forecasting unit was analyzing 911 calls for service and produced an intelligence report predicting a shooting would happen soon on a particular block on the South Side. Three minutes later, it did, police officials say. That got police Supt. Jody Weis thinking. He wondered if the department could produce intelligence reports even quicker. Next time, officers might have an hour’s notice before a shooting — instead of just a few minutes.”

KERO23: Ten People Indicted In Wide-Ranging Real Estate Scam

“The indictment alleges that, from approximately January 2004 to September 2007, the defendants perpetrated a scheme to defraud mortgage lenders by submitting fraudulent loan applications with material misrepresentations, including misrepresentations concerning the borrower’s income, assets, employment status, and intent to use the home as the borrower’s primary residence… The scheme involved more than $20 million in losses to lenders.”

Identity Resolution Daily Links 2010-11-23

Tuesday, November 23rd, 2010

By the Infoglide Staff

Tim Estes: Information Systems in an Entity-Centric World

Gartner: Four Converging Trends That Will Change the Face of IT and Business

“Gartner has identified four broad trends that will change IT, and the economy, in the next 10 years:

  1. Cloud
  2. Business impact of social computing
  3. Context Aware Computing
  4. Pattern Based Strategy”

WSJ Health Blog: Web-Based Electronic Health Record Safety Registry Launches

“Even if EHRs reduce the risk of errors overall, they may produce entirely new ones, Edward Fotsch, CEO of PDR Network, which will provide network operations for the new reporting system, tells the Health Blog. For example, EHRs may cut the risk of failing to alert a patient to an abnormal test result, but confusing user interfaces may produce their own mistakes and need tinkering.”

Community of Experts: Identities and Entities: Resolution or Dissolution?

“Even with these differences, a human can rapidly determine that they refer to the same individual for two reasons. The first is that the values that differ across the pair of records are not too different from each other, and the second is that there seems to be enough support from across each pair of attributes to assert some degree of similarity.”

Identity Resolution Daily Links 2010-11-18

Thursday, November 18th, 2010

By the Infoglide Staff

24-7: Medicare Claims Database Highlighting Fraud and Abuse

“According to the Centers for Medicare and Medicaid Services, the program made nearly $24 billion in improper payments in 2009, almost doubling the previous years’ rate. The price of fraud, however, runs even higher. A CBS report notes that Medicare fraud costs taxpayers an estimated $60 billion a year, and some estimates put the figure at nearly $100 billion.”

Information Week: Business Intelligence: How To Get Agile

“For the last two years, respondents to our survey have cited several information management-related problems among the top barriers to adopting BI tools company-wide. Data quality problems are cited most often, by 55% in both 2009 and this year, followed by ease-of-use challenges, and integration and compatibility with existing platforms. Among the people directly responsible for information management, the biggest impediments to success are accessing relevant, timely, reliable data (59%); cleansing, deduping, and ensuring consistent data (51%); and integrating data (49%).”

MyFoxDFW: U.S. Congressman Reacts to Undercover Medicare Investigation

“Billions of your tax dollars are lost every year to healthcare fraud. In fact, the tab is $36 million a day for Medicare fraud alone. U.S. Congressman Michael Burgess of Lewisville watched FOX 4’s undercover investigation into the practices of a home health care recruiter. Today, FOX 4’s Becky Oliver spoke with Congressman Burgess.”

Identity Resolution Daily Links 2010-11-15

Monday, November 15th, 2010

By the Infoglide Team

Main Justice: Eric Holder’s Prepared Remarks at Health Care Fraud Prevention Summit

“In just the last fiscal year, we obtained settlements and judgments of more than $2.5 billion in False Claims Act matters alleging health care fraud. This marked a new record – and an increase of more than 60 percent from fiscal year 2009. We also opened more than 2,000 new criminal and civil health-care fraud investigations, reached an all-time high in the number of health-care fraud defendants charged, stopped numerous large-scale fraud schemes in their tracks, and returned more than $2.5 billion to the Medicare Trust Fund and more than $800 million to cash-strapped state Medicaid programs.”

SearchDataManagement.com: Gartner Magic Quadrant ranks MDM software vendors

“Gartner reports that due to the sluggish economy, customer demand for MDM software is growing at a significantly slower rate than years past. But it is growing. The analyst firm predicts that the overall market for MDM software will increase from $1 billion in 2008 to $2.9 billion by 2013. Gartner also predicts that by 2010, investments in MDM software will lead to an 80% reduction in costs associated with managing redundant data.”

The Crime Report: Fusion Centers Could Face Budget Issues As States Cut Back

“Some of the nation’s 72 fusion centers–where federal, state, and local law enforcement agencies share data on terrorism and crime threats–may face budget problems in the nation’s tough economic conditions. Ross Ashley of the National Fusion Center Association, which represents the centers, says that some newly elected governors must be convinced of the centers’ worth. The agencies typically do not have line-item budgets and are dependent on allocations from various levels of government to operate.”

Sponsoring ICIQ This Weekend

Thursday, November 11th, 2010

By Mike Betron, Infoglide Software Director of Marketing

Infoglide Software is a proud sponsor of the 15th International Conference on Information Quality (ICIQ). The 2010 edition of this annual event is being hosted this weekend by the George W. Donaghey College of Engineering and Information Technology at the University of Arkansas at Little Rock. Researchers from all over the world will convene to share the results of their efforts.

The organizer of the event is John Talburt, PhD, founder and director of the Center for Advanced Research in Entity Resolution and Information Quality (ERIQ). Infoglide has sponsored the ERIQ lab and the Information Quality graduate program in recent years.

If you’re attending, we’ll be there and look forward to meeting you in Little Rock.

Identity Resolution Daily Links 2010-10-30

Saturday, October 30th, 2010

[Post from Infoglide] Absentee Ballot Fraud

“We’re currently in the heat of the election season. No matter how impeccable the record of any candidate that the major parties put forward, minions of the opposing parties go to great lengths to uncover an embarrassing incident that can be exposed (or even an incident that can be twisted to appear embarrassing) in order to influence voters away from voting for that candidate. While the populace is reasonably good at figuring these tricks out, even more disturbing are the stories involving voter fraud.”

Rob Karel’s Blog: Discussing The Forrester Wave™: Enterprise Data Quality Platforms, Q4 2010

“Also, many data quality vendors specialize and provide depth of expertise in a focused part of the data quality market such as postal address verification (e.g., Experian QAS, Melissa DATA), matching or identity resolution [e.g., Infoglide Software, Netrics (acquired by TIBCO Software), and Pervasive Software], and data profiling (e.g., Ab Initio and Business Data Quality).”

Providence Journal: Deportee charged in identity theft case

“The R.I. State Fusion Center, a state police unit that tracks information on homeland security and crime, assisted in the investigation through the use of facial recognition software that determined that Medrano had been previously issued a Massachusetts identity document in his real name.”

Aviation News Today: November 1 Ends Grace Period For Secure Flight Data Submissions

“While TSA’s watch-list matching takes seconds and can be completed up until the time of departure, the agency cautions passengers that a boarding pass will not be issued until the airline submits complete passenger data to Secure Flight. The agency noted that, despite the crackdown, minor variations in the name on the boarding pass and ID, like middle initials, should not present problems at checkpoints.”

Identity Resolution Daily Links 2010-10-12

Tuesday, October 12th, 2010

By the Infoglide Team

Hays Daily News: Making the move to electronic records a natural fit for clinic

“Beginning in 2015, providers who have not successfully demonstrated meaningful use will face cuts in the amount of Medicare reimbursement they receive. It will begin with 99-percent payment in 2015, and drop to 97 percent by 2017, according to information from the Centers for Medicare and Medicaid Services. ‘So if your practice has not implemented an EHR and have meaningful use, you’re going to get reimbursed less dollars for the same service as someone who does,’ Brull said.”

GigaOM: Jeff Jonas Video on How Data Makes Corporations Dumb

“‘Information is being created faster than organizations can make sense of it,’ he says. The gap between the growth of information and understanding is widening because the tools for understanding are not scaling as fast as the growth in data and information.  ‘As computers are getting faster and the world is getting more sensors, the organizations have been getting dumber,’ he said. ‘The percentage of what is knowable is on a decline.’”

Identity Resolution Daily Links 2010-10-10

Sunday, October 10th, 2010

[Post from Infoglide] OYSTER: A Configurable ER Engine

“Now that I have finished the four-part series on linking methods, I would like to talk about one of my pet projects, OYSTER.  It stands for Open sYSTem Entity Resolution, a project to build a configurable, open-source entity resolution.  Although I am somewhat hesitant to announce a system that is not yet available to readers, it does exist and has been a valuable teaching tool in my ER class.  A run-time version (Java JAR file) will be available soon on the ERIQ website, and the source code should be available on Source Forge by the end of the year.”

DATAMONITOR: Bad data costing US businesses $700 billion a year

“Madan Sheina, author of the report and an Ovum lead analyst, said: ‘Bad data is a growing problem for businesses due to the sheer volume and pace at which it is now moved between organisations. We now estimate that bad data costs US companies 30 per cent of their revenues – a massive $700 billion per year and a figure that is set to increase.’”

thestar.com: Watchdog warns criminals, terrorists could abuse new payment methods

“‘FINTRAC anticipates that the FATF will publish a public report on this work later in 2010,’ it said. Over the past few years, prepaid cards and Internet payment services have only been identified in a minority of domestic money laundering and terrorist financing cases. In 2008-2009, for instance, Internet-based payment services were involved in roughly 4 per cent of all disclosed cases, FINTRAC said in its report.”

Identity Resolution Daily Links 2010-09-03

Friday, September 3rd, 2010

[Post from Infoglide] Reference Linking Methods - Part 4

“In the direct matching, transitive linking, and association analysis methods discussed in previous posts, the evidence for establishing a link comes from the references themselves, either as attribute values or relationships with other references.  A link created in this way is also called an inferred link. But in almost any ER context, some pairs of equivalent references (i.e. that refer to the same entity) will have insufficient evidence available in the references themselves to make that determination, thereby leaving them as unlinked false negatives.”

Liliendahl on Data Quality: Out of Facebook

“Doing ‘Social Master Data Management’ will become an integrated part of customer master data management offering both opportunities for approaching a ‘single version of the truth’ and some challenges in doing so. Of course privacy is a big issue.”

CRN: SMB Cloud Spending To Approach $100 Billion By 2014

“Total cloud-related information and communications technology spending among SMBs globally surpassed $52 billion in 2009, representing just 6 percent of total worldwide SMB ICT spending. But AMI predicts that that will nearly double over a five-year period.”

Media Newswire: Owner of illegal money transmitting business sentenced to 2 years in prison, ordered to forfeit $690K

“According to court documents, between Jan. 1, 2004 and Dec. 31, 2008, Lemine, owner of Sorrento Grocery in Sorrento, Fla., cashed more than $4 million in checks from a local construction company in return for a fee of between 1 and 1.5 percent of the checks’ face value. He did so knowing that the owners of the construction company were attempting by cashing the checks through the grocery to conceal their employment of illegal aliens, avoid paying worker’s compensation and employment taxes, and hide income from state and federal tax officials.”

Reference Linking Methods - Part 4

Thursday, September 2nd, 2010

By John Talburt, PhD, CDMP, Director, UALR Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ)

This is the last in a series of four posts that discuss four methods for linking references.  These methods are:

  1. Direct matching
  2. Transitive linking
  3. Linking by association
  4. Asserted linking

In the direct matching, transitive linking, and association analysis methods discussed in previous posts, the evidence for establishing a link comes from the references themselves, either as attribute values or relationships with other references.  A link created in this way is also called an inferred link.

But in almost any ER context, some pairs of equivalent references (i.e., references to the same entity) will have insufficient evidence available in the references themselves to make that determination, leaving them as unlinked false negatives.  For example, in the previous post we discussed how it might be possible to discover, through association analysis, that the references to Mary Smith on Oak St and Mary Smith on Elm St are equivalent.  But if the collateral evidence of the shared address association were not available, then the link could not have been inferred.

A different way to approach this problem is through asserted linking.  An asserted link between two references is based on prior knowledge that they are equivalent.  For this reason, creating links in this way is also called knowledge-based linking, and ER systems that use this method of resolution are called knowledge-based ER systems.

An asserted link often takes the form of a single record carrying the attribute values of two non-matching references.  The assertion about Mary Smith’s change of address might be something like:

The Mary Smith previously residing at 123 Oak is now residing at 456 Elm.

It reflects the knowledge that references to Mary Smith on Oak Street and Mary Smith on Elm Street are equivalent independent of any similarity or dissimilarity between their corresponding attribute values.
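As a rough illustration, the assertion above can be pictured as one record that names both references. The sketch below is only an assumption about how such a record might look; the class and field names (Reference, AddressChangeAssertion, old_address, new_address) are hypothetical and are not taken from any particular ER system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A reference as it might arrive for resolution (hypothetical shape)."""
    name: str
    address: str


@dataclass(frozen=True)
class AddressChangeAssertion:
    """One record carrying the attribute values of two non-matching references."""
    name: str
    old_address: str
    new_address: str

    def references(self):
        # The two references that this assertion declares equivalent.
        return (
            Reference(self.name, self.old_address),
            Reference(self.name, self.new_address),
        )

    def covers(self, ref):
        # True if the assertion says something about this reference.
        return ref in self.references()


# "The Mary Smith previously residing at 123 Oak is now residing at 456 Elm."
assertion = AddressChangeAssertion("Mary Smith", "123 Oak", "456 Elm")
```

Note that the record says nothing about how similar "123 Oak" and "456 Elm" are; the equivalence is simply stated.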

So where do these assertions come from? Not out of thin air. An assertion like this could have been self-reported, acquired from public records, or obtained from a commercial data provider, such as a magazine subscription service. If this knowledge were acquired and provisioned in the ER identity management system before a reference to either Mary Smith on Oak Street or Mary Smith on Elm Street was processed, then both references would be recognized as equivalent and could be linked at the time they were processed, regardless of the order in which they were received. Jeff Jonas calls ER systems that have this property “sequence neutral.”
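The order-independence described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any commercial system works: assertions are stored in a union-find structure, and AssertedLinker, assert_equivalent, resolve, and link_in_order are all hypothetical names introduced for the example.

```python
class AssertedLinker:
    """Hypothetical knowledge-based linker backed by a union-find forest."""

    def __init__(self):
        self._parent = {}  # reference key -> parent key in the forest

    def _find(self, key):
        # Unknown keys start out as their own single-member entity.
        self._parent.setdefault(key, key)
        while self._parent[key] != key:
            # Path halving keeps later lookups short.
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def assert_equivalent(self, key_a, key_b):
        """Provision prior knowledge that two references are equivalent."""
        root_a, root_b = self._find(key_a), self._find(key_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def resolve(self, key):
        """Return the entity representative for an incoming reference."""
        return self._find(key)


def link_in_order(linker, references):
    """Process references in the given order and return their entity ids."""
    return {ref: linker.resolve(ref) for ref in references}


oak = ("MARY SMITH", "123 OAK")
elm = ("MARY SMITH", "456 ELM")

linker = AssertedLinker()
linker.assert_equivalent(oak, elm)  # assertion is in place before processing

oak_first = link_in_order(linker, [oak, elm])
elm_first = link_in_order(linker, [elm, oak])
```

Because the assertion is provisioned before either reference is processed, both processing orders assign the two references to the same entity.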

Asserted linking is not just theoretical.  For example, Acxiom® Corporation has made asserted linking the backbone of its AbiliTec® CDI technology that manages billions of assertions for U.S. consumers alone.

The disadvantage of asserted linking is that acquiring, storing, and managing the assertions is a non-trivial activity. Asserted linking divides the overall ER process into two concurrent processes. One is a foreground process for resolving equivalence and applying links. The other is a background process that acquires and integrates assertions into the identity management system. Of course, timing is critical. If an assertion is not acquired and available before the references that need it are processed, their equivalence will not be recognized and they will not be linked.
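The timing problem can be made concrete with a small sketch. It assumes a simplified model in which the foreground process hands out an entity id at the moment a reference is processed and never revisits it; IdentityStore, add_assertion, and process are hypothetical names, and real systems may well repair late links after the fact.

```python
class IdentityStore:
    """Hypothetical identity store shared by the two processes."""

    def __init__(self):
        self._known = {}   # reference key -> entity id
        self._next_id = 1

    def _new_id(self):
        entity_id = f"E{self._next_id}"
        self._next_id += 1
        return entity_id

    def add_assertion(self, key_a, key_b):
        """Background process: integrate an assertion into the store."""
        entity_id = (
            self._known.get(key_a)
            or self._known.get(key_b)
            or self._new_id()
        )
        self._known[key_a] = entity_id
        self._known[key_b] = entity_id

    def process(self, key):
        """Foreground process: link a reference using what is known right now."""
        if key not in self._known:
            self._known[key] = self._new_id()
        return self._known[key]


oak = ("MARY SMITH", "123 OAK")
elm = ("MARY SMITH", "456 ELM")

# Assertion available in time: both references receive the same entity id.
on_time = IdentityStore()
on_time.add_assertion(oak, elm)
on_time_ids = (on_time.process(oak), on_time.process(elm))

# Assertion arrives too late: the ids already issued differ (a false negative).
late = IdentityStore()
late_ids = (late.process(oak), late.process(elm))
late.add_assertion(oak, elm)
```

In the second case the links have already been applied with two different ids by the time the background process integrates the assertion, which is the unlinked false negative the paragraph above warns about.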

In the next post, I plan to discuss the role of ER in entity-based information exchange systems, sometimes called “information hubs.”

