“According to the Centers for Medicare and Medicaid Services, the program made nearly $24 billion in improper payments in 2009, almost doubling the previous year’s rate. The price of fraud, however, runs even higher. A CBS report notes that Medicare fraud costs taxpayers an estimated $60 billion a year, and some estimates put the figure at nearly $100 billion.”
“For the last two years, respondents to our survey have cited several information management-related problems among the top barriers to adopting BI tools company-wide. Data quality problems are cited most often, by 55% in both 2009 and this year, followed by ease-of-use challenges, and integration and compatibility with existing platforms. Among the people directly responsible for information management, the biggest impediments to success are accessing relevant, timely, reliable data (59%); cleansing, deduping, and ensuring consistent data (51%); and integrating data (49%).”
“Billions of your tax dollars are lost every year to healthcare fraud. In fact, the tab is $36 million a day for Medicare fraud alone. U.S. Congressman Michael Burgess of Lewisville watched FOX 4’s undercover investigation into the practices of a home health care recruiter. Today, FOX 4’s Becky Oliver spoke with Congressman Burgess.”
Infoglide Software is a proud sponsor of the 15th International Conference on Information Quality (ICIQ). The 2010 edition of this annual event is being hosted this weekend by the George W. Donaghey College of Engineering and Information Technology at the University of Arkansas at Little Rock. Researchers from all over the world will convene to share the results of their efforts.
The organizer of the event is John Talburt, PhD, founder and director of the Center for Advanced Research in Entity Resolution and Information Quality (ERIQ). Infoglide has sponsored the ERIQ lab and the Information Quality graduate program in recent years.
If you’re attending, we’ll be there and look forward to meeting you in Little Rock.
“We’re currently in the heat of the election season. No matter how impeccable the record of any candidate that the major parties put forward, minions of the opposing parties go to great lengths to uncover an embarrassing incident that can be exposed (or even an incident that can be twisted to appear embarrassing) in order to influence voters away from voting for that candidate. While the populace is reasonably good at figuring these tricks out, even more disturbing are the stories involving voter fraud.”
“Also, many data quality vendors specialize and provide depth of expertise in a focused part of the data quality market such as postal address verification (e.g., Experian QAS, Melissa DATA), matching or identity resolution [e.g., Infoglide Software, Netrics (acquired by TIBCO Software), and Pervasive Software], and data profiling (e.g., Ab Initio and Business Data Quality).”
“The R.I. State Fusion Center, a state police unit that tracks information on homeland security and crime, assisted in the investigation through the use of facial recognition software that determined that Medrano had been previously issued a Massachusetts identity document in his real name.”
“While TSA’s watch-list matching takes seconds and can be completed up until the time of departure, the agency cautions passengers that a boarding pass will not be issued until the airline submits complete passenger data to Secure Flight. The agency noted that, despite the crackdown, minor variations in the name on the boarding pass and ID, like middle initials, should not present problems at checkpoints.”
“This is the third in a series of four posts that discuss four methods for linking references. These methods are:
Direct matching
Transitive linking
Linking by association
Asserted linking
In the last post I discussed transitive linking, and why it is essential for producing a unique and deterministic outcome of an ER process. In this post I will discuss the third method, linking by association.”
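The transitive linking mentioned in the excerpt above can be illustrated with a small union-find sketch. This is not the quoted author's implementation; the function name `transitive_clusters` and the sample reference IDs are assumptions made for illustration. The idea is that if reference A directly matches B, and B directly matches C, all three land in one cluster, and the result does not depend on the order in which the matched pairs are processed.

```python
def transitive_clusters(references, direct_matches):
    """Group references so that directly matched pairs, and anything
    chained to them through other matches, share one cluster."""
    # Every reference starts as the root of its own one-member cluster.
    parent = {ref: ref for ref in references}

    def find(ref):
        # Walk up to the cluster root, shortening the path along the way.
        while parent[ref] != ref:
            parent[ref] = parent[parent[ref]]
            ref = parent[ref]
        return ref

    for a, b in direct_matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root under the smaller one.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    clusters = {}
    for ref in references:
        clusters.setdefault(find(ref), []).append(ref)
    # Sort members and clusters so the output is the same for any pair order.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sample data: r1 matches r2, r2 matches r3, r4 matches nothing.
references = ["r1", "r2", "r3", "r4"]
direct_matches = [("r1", "r2"), ("r2", "r3")]
clusters = transitive_clusters(references, direct_matches)
print(clusters)
```

Here r1 and r3 never match directly, yet they end up in the same cluster because both match r2.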
“There are many types of relationships that are discovered as a by-product of entity resolution, such as households or families. These terms take on different meaning depending on the subject area and the business situation. For example, we can examine parent-child and sibling relationships associated with individuals, we can look at components such as paper clips or screws that are in the same ‘family,’ or we can look at corporate ownership relationships that reflect families of companies. Alternatively, we can look at other types of relationships – individuals belonging to the same health club, components manufactured from the same type of metal, or companies that share the same board members.”
“In an 80-page civil complaint, the United States Attorney’s Office claims 51-year-old Doctor Robert Ritchea, a physician, not only allowed an unlicensed medical assistant to inject patients with pain medications, but also improperly billed Medicare for the treatments. The complaint also alleges Ritchea over-billed Medicare by more than $2.2 million in over 4,300 separate claims over a period of four years.”
“If you know that 123 Main Street in Anytown is a single family house there is a high probability that this is the same real world individual. But if you know that 123 Main Street in Anytown is a building used as a nursing home, a campus or that this entrance has many apartments or other kind of units, then it is not so certain that these records represent the same real world individual (at least not if the name is John Smith). So this example highlights the importance of using external reference data in data matching.”
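As a rough sketch of the point in the excerpt above, the snippet below scores two records that share a name and an address, and lets external reference data about the address type raise or lower the confidence that they describe the same person. The lookup tables, the weights, and the function name `same_person_score` are all hypothetical values chosen for illustration, not figures from the quoted post or from any product.

```python
# Hypothetical external reference data: what kind of building is at an address.
ADDRESS_TYPES = {
    "123 Main Street, Anytown": "single_family",
    "500 Elm Avenue, Anytown": "nursing_home",
}

# Hypothetical confidence that two same-name records at one address are one person.
ADDRESS_WEIGHT = {
    "single_family": 0.9,
    "multi_unit": 0.4,
    "nursing_home": 0.3,
}


def same_person_score(rec_a, rec_b, default_weight=0.5):
    """Return a confidence that two records refer to the same individual."""
    if rec_a["name"] != rec_b["name"] or rec_a["address"] != rec_b["address"]:
        return 0.0
    # Unknown addresses fall back to a neutral weight.
    kind = ADDRESS_TYPES.get(rec_a["address"])
    return ADDRESS_WEIGHT.get(kind, default_weight)


house_a = {"name": "John Smith", "address": "123 Main Street, Anytown"}
house_b = {"name": "John Smith", "address": "123 Main Street, Anytown"}
home_a = {"name": "John Smith", "address": "500 Elm Avenue, Anytown"}
home_b = {"name": "John Smith", "address": "500 Elm Avenue, Anytown"}

print(same_person_score(house_a, house_b))
print(same_person_score(home_a, home_b))
```

The same name-and-address agreement yields a higher score at the single family house than at the nursing home, which is the effect the excerpt describes.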
“During 2010, independent/standalone data quality vendors (Clavis, Pitney Bowes, Human Inference and Trillium) will focus on name and address cleansing as they struggle against better-funded match/merge and data profiling capabilities increasingly integrated with megavendor MDM. Also at this time, a dearth of non-aligned matching algorithms (such as those from Digital Trowel, Infoglide, Omikron and Uniserve) will engender ‘algorithm envy’ among disenfranchised MDM providers.”
“Rockland County Legislator Ed Day, R-New City, has called for a review of Medicaid spending by the county that would also determine whether enough is being done to prevent and detect Medicaid fraud. ‘Medicaid expenditures represent an amount that is 110 percent of all the property taxes collected here in Rockland,’ said Day.”
“The most significant area of concern is organized crime. Canadian Security Intelligence Service estimates that there are about 750 organized crime groups operating in Canada and 80% of them are involved in the illicit drug trade. The cross-border movement of currency was identified as a continued concern.”
“Who’d have thought that iTunes could be used for money laundering? Yet that is exactly what five men in Great Britain were recently jailed for. Using stolen credit card numbers, they bought £750,000 in vouchers, then sold them at cheaper prices over eBay. Methods of money laundering continue to evolve.”
“So the question is if authorities may have avoided losing 5 billion taxpayer Euros if some identity resolution including automated fuzzy connection checks and real world checks was implemented. I know that you are so much more enlightened on what could have been done when the scam is discovered, but I actually think that there may be a lot of other billions of Euros (Pounds, Dollars, Rupees) to avoid losing out there by making some decent identity resolution.”
“In a 2008 study conducted by Kroll Fraud Solutions/HIMSS Analytics to better understand the status of patient data security at hospitals, the hospitals surveyed reported an average level of preparedness to deal with a security breach of 5.88 on a one to seven ascending scale.19 Yet the same study indicated that only 56 percent of these hospitals had notified patients whose information was compromised as a result of a security breach.”
“The ‘official use only’ bulletin, produced by the Northern California Regional Intelligence Center, a partnership of federal, state, and local agencies originally set up to deal with drug trafficking, is entitled ‘Al-Qa’ida in the Arabian Peninsula’s Online Rhetoric Signals Shift in Intentions.’”
“I don’t think anyone knows what product is the best match engine, because I don’t think that all match engines have been benchmarked with a representative set of data.”
“It’s important to realize that SOA is really a rather loose collection of best practices. It’s not necessarily a well-defined list where you have some checklist of things to do SOA and if you miss one, you’re not doing SOA. What’s happening is architecture teams are incorporating SOA best practices into various other initiatives.”
“The U.S. Transportation Security Administration is on track to assume watchlist matching from all U.S. carriers by the end of May, only slightly behind its March 31 U.S. implementation target for the Secure Flight passenger prescreening system, according to a U.S. Government Accountability Office report. The Secure Flight program also calls for TSA to assume watchlist matching from foreign carriers, and the agency already is working with 19 airlines outside the United States to do so. Five of those carriers are fully functional within the program, and an additional 14 are testing, GAO reported.”
“Currently 80 to 90 percent of all medical records are stored on paper. The goal is to have an electronic health record for everyone in the U.S. by 2014. Electronic health records are expected to greatly reduce the number of medical errors, which is significant. Each year in the United States, as many as 100,000 people die in hospitals because of such errors. That’s the equivalent of one major airline crash every single day of every single year.”
“One of the oldest phrases in computer science seems to still be in vogue. ‘Garbage in, garbage out’ (GIGO) is a term coined during the early days of the computing industry. It pointed out that the value of computer systems of the day was entirely dependent upon their input data. No amount of processing power could produce a right answer from bad data. Fast forward many decades…”
“The concept of MDM is a good one, and many companies have piloted MDM projects over the last few years. Now research firm Baseline Consulting says that many companies are beginning to move beyond their MDM pilot systems. Baseline Consulting co-founder Jill Dyche said that ‘the fact that data quality, data governance, and data enrichment processes may accompany an MDM initiative make it all the more attractive as an enterprise solution.’”
“The success of the fusion center program,” said the report, “is dependent on the infrastructure that enables state and local fusion centers to have access to each other’s information as well as to the appropriate federal databases. The fusion center program and the Nationwide Suspicious Activity Report Initiative (NSI) rely on the concept of shared space architecture, where the fusion centers replicate data from their systems to an external server under their control, making the decision on what to share totally under their control.”
“No matter how the rules shake out, EHR implementation in the United States is a foregone conclusion, Blumenthal said. He sees the skills of collecting, using, searching and sharing health data electronically becoming part of the assumed professional skill set for health care providers, just as using a stethoscope is now. In the next five to 10 years, hospitals will use their robust EHR systems to recruit physicians; solo physicians who succeed in implementing EHR will sell their practices more easily when the time comes, but solo physicians still using paper will not be able to sell their practices at all.”
One of the oldest phrases in computer science seems to still be in vogue. “Garbage in, garbage out” (GIGO) is a term coined during the early days of the computing industry. It pointed out that the value of computer systems of the day was entirely dependent upon their input data. No amount of processing power could produce a right answer from bad data.
Fast forward many decades. The same phrase is still used today to emphasize the importance of data quality in many application areas (e.g., healthcare). While high quality data remains important, two factors influence me to say that GIGO is not the absolute rule that it once was: (1) advancements in the evolution of software and hardware technology, and (2) the emergence of whole classes of new applications targeting fraud detection.
What happens when the quality of data is “enhanced”? Processes like data transformation, data cleansing, and de-duplication filter out information that is unnecessary and confusing. Names, addresses, and other attributes are standardized. Duplicate records are deleted. Links to “bad” data are broken. Master records, aka “golden records”, are created for use by multiple systems.
While this has great value for traditional systems, it can devastate fraud detection efforts. For example, discovering and evaluating multiple addresses during fraud analysis is crucial in finding and prosecuting perpetrators of fraud. Or conversely, standardizing multiple forms and instances of someone’s name held in multiple data sources may remove vital clues and break a forensic chain of evidence. We sometimes refer to the result as data deterioration.
So “garbage in, garbage out” is still an operative phrase for most software systems, but for entity resolution, we’ve found repeatedly that “one man’s garbage is another man’s treasure.”
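To make the contrast concrete, here is a minimal sketch, using invented sample records and hypothetical helper names (`normalize_name`, `golden_records`, `addresses_by_person`). It is not how any particular product works. It compares a cleansing step that collapses records into one golden record with an analysis step that keeps every address seen for the same person.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase, drop periods, and collapse whitespace (illustrative only)."""
    return " ".join(name.lower().replace(".", "").split())


# Invented sample records: one person, three spellings, two addresses.
records = [
    {"name": "John Q. Smith", "address": "12 Oak St"},
    {"name": "john q smith", "address": "98 Pine Rd"},
    {"name": "JOHN Q SMITH", "address": "12 Oak St"},
]


def golden_records(records):
    """Cleansing-style de-duplication: keep the first record per name, drop the rest."""
    golden = {}
    for rec in records:
        golden.setdefault(normalize_name(rec["name"]), rec)
    return golden


def addresses_by_person(records):
    """Fraud-analysis view: keep every distinct address seen for each name."""
    seen = defaultdict(set)
    for rec in records:
        seen[normalize_name(rec["name"])].add(rec["address"])
    return seen


print(golden_records(records))
print(dict(addresses_by_person(records)))
```

The golden record keeps a single address, so the second address disappears from view. The second function retains both, which is the kind of clue a fraud analyst would want to see.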
“In the last post we examined how entity resolution (ER) systems are actually implemented, starting with the most basic merge/purge process and heterogeneous join systems. Both of these approaches focus on collecting equivalent references from among the sources provided, either as a large batch of references in a single file, or through queries against a federation of databases…”
“Knowing what we know now, would the U.S. be able to stop another attack like that of Christmas Day 2009? This is certainly the question on the minds of many Americans today. It is also one that Jamie McIntyre, veteran journalist and blogger for Military.com, had the opportunity to ask of Rand Beers, Under Secretary for National Protection and Programs Directorate from DHS, at a Heritage Foundation National Security Bloggers Luncheon.”
“In the middle of all of this are software providers, primarily IBM InfoSphere Identity Insight Solutions, Infoglide (which is providing software for the DHS) and Informatica… Identity recognition and resolution systems enable organizations to use data matches to gain a better understanding of identity across multiple systems. This could include not just individual identities but also networks and relationships: that is, who people know and how they are connected.”
“It’s been a heady couple of months in the IT infrastructure market, as any independent company that wasn’t tied down seemed to be swept up in a whirlwind of M&A activity. Independent data integration specialist Informatica, a 4,000-customer company in business since 1993, announced in January that it had acquired Siperian for $130 million.”
Infoglide Software provides entity resolution and analysis solutions for retail, banking, insurance, government, and law enforcement. Without the need for data cleansing or warehousing, Infoglide Software's Identity Resolution Engine™ (IRE) analyzes all of the information relating to individuals and/or entities from multiple sources of data and then applies...